Product Overview
This 4th edition is a major revision of Vol. II of the leading two-volume dynamic programming textbook by Bertsekas, and contains a substantial amount of new material, as well as a reorganization of old material. The length has increased by more than 60% from the third edition, and most of the old material has been restructured and/or revised. Volume II now numbers more than 700 pages and is larger in size than Vol. I. It can arguably be viewed as a new book!
Approximate DP has become the central focus of Vol. II, and occupies more than half of the book (the last two chapters, and large parts of Chapters 1-3). Thus one may also view Vol. II as a follow-up to the author's 1996 book "Neuro-Dynamic Programming" (coauthored with John Tsitsiklis). The present book focuses to a great extent on new research that became available after 1996. On the other hand, the textbook style of the book has been preserved, and some material has been explained at an intuitive or informal level, while referring to the journal literature or the Neuro-Dynamic Programming book for a more mathematical treatment.
As the book's focus shifted, increased emphasis was placed on new or recent research in approximate DP and simulation-based methods, as well as on asynchronous iterative methods, in view of the central role of simulation, which is by nature asynchronous. Much of this material is an outgrowth of research conducted in the six years since the previous edition. Some of the highlights, in the order in which they appear in the book, are:
(a) A broad spectrum of simulation-based, approximate value iteration, policy iteration, and Q-learning methods based on projected equations and aggregation.
(b) New policy iteration and Q-learning algorithms for stochastic shortest path problems with improper policies.
(c) Reliable Q-learning algorithms for optimistic policy iteration.
(d) New simulation techniques for multistep methods, such as geometric and free-form sampling, based on generalized weighted Bellman equations.
(e) Computational methods for generalized/abstract discounted DP, including convergence analysis and error bounds for approximations.
(f) Monte Carlo linear algebra methods, which extend the approximate DP methodology to broadly applicable problems involving large-scale regression and systems of linear equations.
The book includes a substantial number of examples and exercises; detailed solutions to many of the exercises are posted on the internet. It was developed through teaching graduate courses at M.I.T., and is supported by a large amount of educational material, such as slides and videos, posted on the MIT OpenCourseWare site and on the author's and the publisher's web sites.
Contents: 1. Discounted Problems - Theory. 2. Discounted Problems - Computational Methods. 3. Stochastic Shortest Path Problems. 4. Undiscounted Problems. 5. Average Cost per Stage Problems. 6. Approximate Dynamic Programming - Discounted Models. 7. Approximate Dynamic Programming - Nondiscounted Models and Generalizations.