Hamilton–Jacobi–Bellman equation

In optimal control theory, the Hamilton–Jacobi–Bellman (HJB) equation gives a necessary and sufficient condition for optimality of a control with respect to a loss function. It is, in general, a nonlinear partial differential equation in the value function, which means its solution is the value function itself. Once this solution is known, it can be used to obtain the optimal control by taking the maximizer (or minimizer, when a cost is being minimized) of the Hamiltonian involved in the HJB equation.
The equation is a result of the theory of dynamic programming, which was pioneered in the 1950s by Richard Bellman and coworkers. The connection to the Hamilton–Jacobi equation from classical physics was first drawn by Rudolf Kálmán. In discrete-time problems, the corresponding difference equation is usually referred to as the Bellman equation.
While classical variational problems, such as the brachistochrone problem, can be solved using the Hamilton–Jacobi–Bellman equation, the method can be applied to a broader spectrum of problems. Furthermore, it can be generalized to stochastic systems, in which case the HJB equation is a second-order partial differential equation. A major drawback, however, is that the HJB equation admits classical solutions only for a sufficiently smooth value function, which is not guaranteed in most situations. Instead, the notion of a viscosity solution is required, in which conventional derivatives are replaced by subderivatives.

Optimal control problems

Consider the following problem in deterministic optimal control over the time period $[0,T]$:
$$V(x(0),0) = \min_u \left\{ \int_0^T C[x(t),u(t)]\,dt + D[x(T)] \right\}$$
where $C[\cdot]$ is the scalar cost rate function and $D[\cdot]$ is a function that gives the bequest value at the final state, $x(t)$ is the system state vector, $x(0)$ is assumed given, and $u(t)$ for $0 \le t \le T$ is the control vector that we are trying to find.
The system must also be subject to
$$\dot{x}(t) = F[x(t),u(t)]$$
where $F[\cdot]$ gives the vector determining physical evolution of the state vector over time.
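To make the objective concrete, the following minimal sketch evaluates the total cost of a given control law by simulating the state equation with a forward Euler step. The function name `total_cost`, the `policy` argument, and the scalar test problem (dynamics equal to the control, cost rate $x^2 + u^2$, zero bequest) are illustrative assumptions, not part of the original text.

```python
import math


def total_cost(F, C, D, policy, x0, T, n_steps=10_000):
    """Approximate the integral of C along a trajectory, plus D at the final state.

    F(x, u): state derivative, C(x, u): cost rate, D(x): bequest value,
    policy(x, t): hypothetical feedback law returning the control u.
    Forward Euler is an assumption chosen for brevity, not accuracy.
    """
    dt = T / n_steps
    x, cost = x0, 0.0
    for k in range(n_steps):
        t = k * dt
        u = policy(x, t)
        cost += C(x, u) * dt   # accumulate running cost
        x += F(x, u) * dt      # advance the state
    return cost + D(x)


# Assumed scalar example: dx/dt = u, C = x^2 + u^2, D = 0, x(0) = 1, T = 1.
F = lambda x, u: u
C = lambda x, u: x * x + u * u
D = lambda x: 0.0

cost_do_nothing = total_cost(F, C, D, lambda x, t: 0.0, 1.0, 1.0)
cost_feedback = total_cost(F, C, D, lambda x, t: -x, 1.0, 1.0)
print(cost_do_nothing, cost_feedback)
```

In this assumed example the feedback law $u = -x$ gives a lower cost than applying no control, but neither is claimed to be optimal; finding the control that minimizes this quantity is what the HJB equation addresses.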

The partial differential equation

For this simple system, the Hamilton–Jacobi–Bellman partial differential equation is
$$\dot{V}(x,t) + \min_u \left\{ \nabla V(x,t) \cdot F(x,u) + C(x,u) \right\} = 0$$
subject to the terminal condition
$$V(x,T) = D(x),$$
where $\dot{V}(x,t)$ denotes the partial derivative of $V$ with respect to the time variable $t$. Here $a \cdot b$ denotes the dot product of the vectors $a$ and $b$, and $\nabla V(x,t)$ the gradient of $V$ with respect to the variables $x$.
The unknown scalar $V(x,t)$ in the above partial differential equation is the Bellman value function, which represents the cost incurred from starting in state $x$ at time $t$ and controlling the system optimally from then until time $T$.
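As an illustration, consider a hypothetical scalar problem that is not part of the original text: dynamics $F(x,u) = u$, cost rate $C(x,u) = x^2 + u^2$, and bequest $D(x) = 0$. Under these assumptions the minimization inside the HJB equation can be carried out in closed form, and a quadratic guess for $V$ reduces the partial differential equation to an ordinary one:

```latex
\begin{aligned}
&\dot{V} + \min_u \left\{ V_x\, u + x^2 + u^2 \right\} = 0, \qquad V(x,T) = 0, \\
&u^* = -\tfrac{1}{2} V_x
  \quad\Longrightarrow\quad \dot{V} + x^2 - \tfrac{1}{4} V_x^2 = 0, \\
&V(x,t) = P(t)\,x^2
  \quad\Longrightarrow\quad \dot{P} = P^2 - 1, \quad P(T) = 0
  \quad\Longrightarrow\quad P(t) = \tanh(T - t), \\
&u^*(x,t) = -\tanh(T - t)\, x .
\end{aligned}
```

Here $V_x$ is shorthand for $\partial V / \partial x$. The resulting control is a time-varying linear feedback, which is typical of problems with linear dynamics and quadratic cost.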

Deriving the equation

Intuitively, the HJB equation can be derived as follows. If $V(x(t),t)$ is the optimal cost-to-go function, then by Richard Bellman's principle of optimality, going from time $t$ to $t + dt$, we have
$$V(x(t),t) = \min_u \left\{ V(x(t+dt),t+dt) + \int_t^{t+dt} C(x(s),u(s))\,ds \right\}.$$
Note that the Taylor expansion of the first term on the right-hand side is
$$V(x(t+dt),t+dt) = V(x(t),t) + \dot{V}(x(t),t)\,dt + \nabla V(x(t),t) \cdot \dot{x}(t)\,dt + o(dt),$$
where $o(dt)$ denotes the terms in the Taylor expansion of higher order than one in little-o notation. Then if we subtract $V(x(t),t)$ from both sides, divide by $dt$, and take the limit as $dt$ approaches zero, we obtain the HJB equation defined above.
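The final steps of this heuristic argument can be written out as follows. This is a sketch that assumes $V$ is smooth enough for the expansion to hold and that the running cost over a short interval is $C(x,u)\,dt + o(dt)$; it uses $\dot{x} = F(x,u)$ and the fact that $V(x(t),t)$ does not depend on $u$, so it can be moved outside the minimum.

```latex
\begin{aligned}
V(x,t) &= \min_u \left\{ V(x,t) + \dot{V}(x,t)\,dt
          + \nabla V(x,t) \cdot F(x,u)\,dt + C(x,u)\,dt + o(dt) \right\} \\
0 &= \dot{V}(x,t)\,dt
     + \min_u \left\{ \nabla V(x,t) \cdot F(x,u)\,dt + C(x,u)\,dt \right\} + o(dt) \\
0 &= \dot{V}(x,t)
     + \min_u \left\{ \nabla V(x,t) \cdot F(x,u) + C(x,u) \right\}
     \qquad (dt \to 0).
\end{aligned}
```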

Solving the equation

The HJB equation is usually solved backwards in time, starting from $t = T$ and ending at $t = 0$.
When solved over the whole of state space and $V(x,t)$ is continuously differentiable, the HJB equation is a necessary and sufficient condition for an optimum when the terminal state is unconstrained. If we can solve for $V$, then we can find from it a control $u$ that achieves the minimum cost.
In the general case, the HJB equation does not have a classical solution. Several notions of generalized solutions have been developed to cover such situations, including viscosity solutions, minimax solutions, and others.
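The backward-in-time procedure can be sketched numerically. The code below is a minimal illustration, not a production solver: it uses a semi-Lagrangian dynamic programming step (one of several possible discretizations, chosen here as an assumption) on a hypothetical scalar problem with dynamics $\dot{x} = u$, cost rate $x^2 + u^2$, and zero bequest, for which the value function $V(x,0) = \tanh(T)\,x^2$ is known in closed form. The function name, grid sizes, and bounds are all illustrative choices.

```python
import numpy as np


def solve_hjb_backward(T=1.0, n_steps=100, x_max=3.0, n_x=301,
                       u_max=3.0, n_u=121):
    """Approximate V(x, 0) on a grid by stepping from t = T back to t = 0.

    Each step applies the discrete principle of optimality
        V(x, t) ~ min_u { C(x, u) dt + V(x + F(x, u) dt, t + dt) }
    with F(x, u) = u and C(x, u) = x^2 + u^2 (assumed test problem).
    """
    x = np.linspace(-x_max, x_max, n_x)    # state grid
    u = np.linspace(-u_max, u_max, n_u)    # candidate controls
    dt = T / n_steps

    V = np.zeros(n_x)                      # terminal condition V(x, T) = D(x) = 0
    x_next = x[:, None] + u[None, :] * dt  # Euler step of dx/dt = u
    running = (x[:, None] ** 2 + u[None, :] ** 2) * dt

    for _ in range(n_steps):
        # Linear interpolation of V at the successor states; np.interp clamps
        # points that leave the grid to the boundary values.
        V_next = np.interp(x_next.ravel(), x, V).reshape(x_next.shape)
        V = np.min(running + V_next, axis=1)
    return x, V


x_grid, V0 = solve_hjb_backward()
i = int(np.argmin(np.abs(x_grid - 1.0)))
print("numerical V(1, 0):", V0[i], " closed form tanh(1):", np.tanh(1.0))
```

The two printed values should agree only approximately, since the scheme carries time-stepping, interpolation, and control-grid errors. A brute-force minimum over a control grid scales poorly with the state dimension, which is one reason such direct grid methods are limited to low-dimensional problems.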

Extension to stochastic problems

The idea of solving a control problem by applying Bellman's principle of optimality and then working out an optimizing strategy backwards in time can be generalized to stochastic control problems. Consider, similarly to the above, the problem
$$\min_u \mathbb{E}\left\{ \int_0^T C(t,X_t,u_t)\,dt + D(X_T) \right\}$$
now with $(X_t)_{t \in [0,T]}$ the stochastic process to optimize and $(u_t)_{t \in [0,T]}$ the steering. By first using Bellman and then expanding $V(X_t,t)$ with Itô's rule, one finds the stochastic HJB equation
$$\min_u \left\{ \mathcal{A} V(x,t) + C(t,x,u) \right\} = 0,$$
where $\mathcal{A}$ represents the stochastic differentiation operator, and subject to the terminal condition
$$V(x,T) = D(x).$$
Note that the randomness has disappeared. In this case, a solution $V$ of the latter does not necessarily solve the primal problem; it is only a candidate, and a further verifying argument is required. This technique is widely used in financial mathematics to determine optimal investment strategies in the market.
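The stochastic differentiation operator is left abstract above. As a sketch under an added assumption, suppose the state follows a controlled Itô diffusion with drift $\mu$ and diffusion coefficient $\sigma$ (both symbols are introduced here for illustration and do not appear in the original text). Itô's rule then gives the operator explicitly, and the second-order term shows why the stochastic HJB equation is a second-order partial differential equation:

```latex
\begin{aligned}
dX_t &= \mu(X_t,u_t)\,dt + \sigma(X_t,u_t)\,dW_t, \\
\mathcal{A} V(x,t) &= \frac{\partial V}{\partial t}
  + \mu(x,u) \cdot \nabla V
  + \frac{1}{2}\,\operatorname{tr}\!\left( \sigma(x,u)\,\sigma(x,u)^{\mathsf T}\,\nabla^2 V \right), \\
0 &= \min_u \left\{ \frac{\partial V}{\partial t}
  + \mu(x,u) \cdot \nabla V
  + \frac{1}{2}\,\operatorname{tr}\!\left( \sigma\sigma^{\mathsf T}\,\nabla^2 V \right)
  + C(t,x,u) \right\}.
\end{aligned}
```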

Application to LQG Control

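As an example, we can look at a system with linear stochastic dynamics and quadratic cost. If the system dynamics is given by
$$dx_t = (a x_t + b u_t)\,dt + \sigma\,dw_t,$$
and the cost accumulates at rate $C(x_t,u_t) = r(t)\,u_t^2/2 + q(t)\,x_t^2/2$, the HJB equation is given by
$$-\frac{\partial V(x,t)}{\partial t} = \frac{1}{2} q(t)\,x^2 + \frac{\partial V(x,t)}{\partial x}\,a x - \frac{b^2}{2 r(t)} \left( \frac{\partial V(x,t)}{\partial x} \right)^2 + \frac{\sigma^2}{2} \frac{\partial^2 V(x,t)}{\partial x^2}$$
with optimal action given by
$$u_t = -\frac{b}{r(t)} \frac{\partial V(x,t)}{\partial x}.$$
Assuming a quadratic form for the value function, we obtain the usual Riccati equation for the Hessian of the value function, as is standard in linear-quadratic-Gaussian control.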
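The quadratic ansatz can be made concrete for a scalar system. The sketch below assumes a scalar state with drift coefficient `a`, control gain `b`, noise intensity `sigma`, constant weights `q` and `r` on the state and control costs, a zero terminal cost, and a value function of the form $V(x,t) = \tfrac{1}{2} p(t)\,x^2 + s(t)$. All of these choices, and the function name `riccati_backward`, are illustrative assumptions. Substituting the ansatz into the HJB equation gives $-\dot{p} = q + 2 a p - (b^2/r)\,p^2$ and $-\dot{s} = \tfrac{1}{2}\sigma^2 p$, which the code integrates backwards from $t = T$ with a classical Runge-Kutta step.

```python
import math


def riccati_backward(a, b, q, r, sigma, T, n_steps=1000):
    """Return (p(0), s(0)) for V(x, t) = 0.5 * p(t) * x**2 + s(t).

    Integrates in reversed time tau = T - t, where
        dp/dtau = q + 2*a*p - (b**2 / r) * p**2
        ds/dtau = 0.5 * sigma**2 * p
    starting from p = s = 0 (assumed zero terminal cost).
    """
    def rhs(p):
        return q + 2.0 * a * p - (b * b / r) * p * p, 0.5 * sigma * sigma * p

    h = T / n_steps
    p, s = 0.0, 0.0
    for _ in range(n_steps):
        # Classical fourth-order Runge-Kutta; the s-equation depends only on p.
        k1p, k1s = rhs(p)
        k2p, k2s = rhs(p + 0.5 * h * k1p)
        k3p, k3s = rhs(p + 0.5 * h * k2p)
        k4p, k4s = rhs(p + h * k3p)
        p += h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        s += h * (k1s + 2.0 * k2s + 2.0 * k3s + k4s) / 6.0
    return p, s


# Assumed parameters: no drift, unit gain and weights, sigma = 0.5, T = 1.
p0, s0 = riccati_backward(a=0.0, b=1.0, q=1.0, r=1.0, sigma=0.5, T=1.0)
gain = -(1.0 / 1.0) * p0   # feedback u = -(b / r) * p(0) * x at time 0
print(p0, s0, gain)
```

Under this ansatz the noise intensity enters only through the offset $s(t)$, so the feedback gain is the same as in the corresponding deterministic problem; the noise raises the expected cost without changing the control law.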
