Wiley, 2022. — 1138 p. — ISBN 9781119815037.
Reinforcement Learning and Stochastic Optimization offers a single canonical framework that can model any sequential decision problem using five core components: state variables, decision variables, exogenous information variables, the transition function, and the objective function. The book highlights twelve types of uncertainty that might enter a model, and it organizes the diverse set of methods for making decisions into four fundamental classes of policies that span every method suggested in the academic literature or used in practice.
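The five elements can be made concrete with a short sketch. The following is a minimal illustration (not the book's code), assuming a toy inventory problem with an order-up-to rule; the function names, parameter theta, and all numeric values are placeholders chosen for this example.

```python
# A minimal sketch of the five-element framework on a toy inventory problem.
# All names and numbers here are illustrative assumptions, not the book's notation.
import random

random.seed(0)

T = 20                   # horizon
price, cost = 8.0, 5.0   # revenue per unit sold, purchase cost per unit


def policy(state, theta=10):
    """Decision x_t: a simple order-up-to rule parameterized by theta."""
    return max(0, theta - state)


def exogenous(t):
    """Exogenous information W_{t+1}: random demand revealed after the decision."""
    return random.randint(0, 12)


def transition(state, decision, demand):
    """Transition function: next inventory after ordering and satisfying demand."""
    return max(0, state + decision - demand)


def contribution(state, decision, demand):
    """One-period contribution C(S_t, x_t, W_{t+1})."""
    sales = min(state + decision, demand)
    return price * sales - cost * decision


# Objective: sample-path total contribution obtained by following the policy.
state, total = 5, 0.0                 # initial state S_0
for t in range(T):
    decision = policy(state)          # x_t from the policy
    demand = exogenous(t)             # W_{t+1}
    total += contribution(state, decision, demand)
    state = transition(state, decision, demand)   # S_{t+1}

print(f"Sample-path objective for this policy: {total:.1f}")
```

Searching over theta (policy search) or replacing the order-up-to rule with one of the other policy classes would change only the policy function, while the other four elements of the model stay fixed.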
Sequential Decision Problems
Canonical Problems and Applications
Online Learning
Introduction to Stochastic Search
Derivative-Based Stochastic Search
Stepsize Policies
Derivative-Free Stochastic Search
State-Dependent Problems
Modeling Sequential Decision Problems
Uncertainty Modeling
Designing Policies
Policy Function Approximations and Policy Search
Cost Function Approximations
Exact Dynamic Programming
Backward Approximate Dynamic Programming
Forward ADP I: The Value of a Policy
Forward ADP II: Policy Optimization
Forward ADP III: Convex Resource Allocation Problems
Direct Lookahead Policies
Multiagent Modeling and Learning