Sequentially, a player chooses an action and his adversary chooses a response. At each time step, the pair of decisions results in a vector payoff. Blackwell’s Approachability Theorem gives a necessary and sufficient condition under which, regardless of the adversary’s decisions, the player can make the sequence of average vector payoffs approach a given convex set.
The Weighted Majority Algorithm is a randomized algorithm used to learn the ‘best’ action amongst a fixed reference set.
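To make the idea concrete, here is a minimal sketch of one common randomized variant (often called Hedge or exponential weights), not necessarily the exact form used in the post. It assumes losses lie in [0, 1] and are revealed for every action after each round; the function name, the learning rate `eta`, and the toy data are all hypothetical choices made for illustration.

```python
import math
import random


def randomized_weighted_majority(losses, eta=0.5, seed=0):
    """Run a randomized weighted majority (Hedge-style) learner.

    losses: list of rounds; each round is a list of losses in [0, 1],
            one entry per action (assumed fully observed).
    eta:    learning rate (an illustrative default, not a tuned value).
    Returns the list of sampled actions and the final weights.
    """
    rng = random.Random(seed)
    n_actions = len(losses[0])
    weights = [1.0] * n_actions
    choices = []
    for round_losses in losses:
        # Sample an action with probability proportional to its weight.
        action = rng.choices(range(n_actions), weights=weights, k=1)[0]
        choices.append(action)
        # Multiplicatively shrink each weight according to its observed loss.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, round_losses)]
    return choices, weights


# Toy data: action 1 never incurs loss, the other two always lose 1.
toy_losses = [[1.0, 0.0, 1.0] for _ in range(50)]
choices, weights = randomized_weighted_majority(toy_losses)
best = max(range(len(weights)), key=weights.__getitem__)
print(best)  # index of the action with the largest final weight
```

On this toy data the weight of the loss-free action stays at 1 while the others decay geometrically, so the sampling distribution concentrates on it.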
- The Hamilton-Jacobi-Bellman Equation.
- Heuristic derivation of the HJB equation.
- Continuous-time dynamic programs
- The HJB equation; a heuristic derivation; and proof of optimality.
- Markov Decision Problems; Bellman’s Equation; Two examples
- Dynamic Programs; Bellman’s Equation; An example.
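
As a rough illustration of Bellman’s equation for a discounted Markov decision problem, the sketch below solves it by value iteration, a standard technique chosen here for brevity rather than taken from the posts. The data layout (`transitions[s][a]` as a list of `(probability, next_state)` pairs, `rewards[s][a]` as a number), the discount factor `beta`, and the two-state example are all assumptions made for illustration.

```python
def value_iteration(transitions, rewards, beta=0.9, tol=1e-10, max_iter=10_000):
    """Iterate the Bellman operator until the values stop changing.

    transitions[s][a]: list of (probability, next_state) pairs.
    rewards[s][a]:     immediate reward for action a in state s.
    beta:              discount factor in [0, 1).
    """
    n_states = len(transitions)
    values = [0.0] * n_states
    for _ in range(max_iter):
        # Bellman update: best immediate reward plus discounted expected value.
        new_values = [
            max(
                rewards[s][a]
                + beta * sum(p * values[s2] for p, s2 in transitions[s][a])
                for a in range(len(transitions[s]))
            )
            for s in range(n_states)
        ]
        if max(abs(x - y) for x, y in zip(new_values, values)) < tol:
            return new_values
        values = new_values
    return values


# Hypothetical two-state example: action 0 stays put, action 1 switches state.
# Staying in state 1 pays 1; every other move pays 0.
transitions = [
    [[(1.0, 0)], [(1.0, 1)]],  # state 0
    [[(1.0, 1)], [(1.0, 0)]],  # state 1
]
rewards = [
    [0.0, 0.0],  # state 0
    [1.0, 0.0],  # state 1
]
V = value_iteration(transitions, rewards, beta=0.5)
print(V)
```

With `beta = 0.5`, staying in state 1 forever is worth 1 / (1 - 0.5) = 2, and moving there from state 0 is worth 0.5 × 2 = 1, so the iteration should settle near those values.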