# Lecture SS 22 Numerical Simulation

## Optimal Control and Reinforcement Learning

• Lecturer: Prof. Jochen Garcke
• Contact for exercises: Dinesh Kannan
• Location: Room 2.035, Friedrich-Hirzebruch-Allee 7
• Time: Tuesday, 10:15 - 11:45, and Thursday, 8:30 - 10:00
• Exercise: Thursday, 10:15 - 12:00
• Location for the exercise: Room 2.035, Friedrich-Hirzebruch-Allee 7
• Office hours: By appointment

Exercise sheets: Please check eCampus regularly.

### Content of the lecture

#### Theory and Numerics for Hamilton-Jacobi-Bellman Equations

The first part of the lecture concerns Semi-Lagrangian approximation schemes for first-order PDEs, with a special focus on Hamilton-Jacobi (HJ) equations, reviewing their construction and theory on model equations. The analysis of Hamilton-Jacobi equations requires the analytical tool of viscosity solutions, which we will introduce at the beginning. One of the most typical applications of the theory of HJ equations is in the field of optimal control problems and differential games. Via the Dynamic Programming Principle (DPP), many optimal control problems can be characterized by means of the associated value function, which in turn can be shown to be the unique viscosity solution of a PDE of convex HJ type, usually called the Bellman equation, the Dynamic Programming equation, or the Hamilton-Jacobi-Bellman (HJB) equation.
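As an illustration, consider the infinite-horizon discounted problem, which is one standard model case. The notation ($f$ for the dynamics, $\ell$ for the running cost, $\lambda > 0$ for the discount rate, $A$ for the control set) is assumed here for the sketch and is not fixed by this page. The controlled dynamics and the value function read

$$\dot y(t) = f(y(t), a(t)), \quad y(0) = x, \qquad v(x) = \inf_{a(\cdot)} \int_0^\infty \ell(y(t), a(t))\, e^{-\lambda t}\, dt.$$

The DPP states that, for any $\tau > 0$,

$$v(x) = \inf_{a(\cdot)} \left\{ \int_0^\tau \ell(y(t), a(t))\, e^{-\lambda t}\, dt + e^{-\lambda \tau}\, v(y(\tau)) \right\}.$$

Expanding formally to first order in $\tau$, dividing by $\tau$ and letting $\tau \to 0$ gives the HJB equation

$$\lambda v(x) + \sup_{a \in A} \left\{ -f(x, a) \cdot \nabla v(x) - \ell(x, a) \right\} = 0.$$

The Hamiltonian is a supremum of functions that are affine in $\nabla v$, which is why the equation is of convex HJ type. Since $v$ is in general not differentiable, the equation has to be understood in the viscosity sense.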

At the numerical level, the Semi-Lagrangian approximation mimics the method of characteristics: for every grid node, it looks for the foot of the characteristic curve passing through that node and follows this curve for a single time step. To derive a numerical method from this general idea, several ingredients have to be put together, mainly a technique for ODEs to track the characteristics and a reconstruction technique to recover pointwise values of the numerical solution.
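The two ingredients can be sketched on the simplest model equation, the linear advection equation $u_t + c(x)\, u_x = 0$ on a periodic interval. This is a minimal sketch under stated assumptions, not the scheme as it will be developed in the lecture: the function names are hypothetical, the characteristics are tracked with one explicit Euler step, and the reconstruction is piecewise linear interpolation via `numpy.interp`.

```python
import numpy as np


def semi_lagrangian_step(u, x, velocity, dt, length):
    """One semi-Lagrangian step for u_t + c(x) u_x = 0 on a periodic domain.

    Assumptions of this sketch: explicit Euler for the characteristic ODE
    and piecewise linear (P1) reconstruction.
    """
    # Ingredient 1 (ODE technique): foot of the characteristic through each node,
    # obtained by following the characteristic backwards for one time step.
    feet = x - dt * velocity(x)
    # Ingredient 2 (reconstruction): evaluate the old solution at the feet
    # by periodic linear interpolation between grid nodes.
    return np.interp(feet, x, u, period=length)


def advect(u0, x, velocity, dt, n_steps, length):
    """Apply n_steps semi-Lagrangian steps starting from the initial data u0."""
    u = u0.copy()
    for _ in range(n_steps):
        u = semi_lagrangian_step(u, x, velocity, dt, length)
    return u


# Usage: constant speed c = 1 on [0, 1), so the profile returns to its
# initial position at time t = 1.
length = 1.0
x = np.linspace(0.0, length, 100, endpoint=False)
u0 = np.sin(2.0 * np.pi * x)
dt = 0.05  # five grid cells per step: larger than an explicit CFL limit would allow
u_final = advect(u0, x, lambda s: np.ones_like(s), dt, n_steps=20, length=length)
```

In this usage example the time step is chosen so that every foot lands on a grid node, so the interpolation introduces no error beyond rounding. For a general time step, linear interpolation adds numerical diffusion, but the step remains stable because each new value is a convex combination of old values.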

#### Reinforcement Learning

In the reinforcement learning (RL) setting, we consider a system that interacts with an a priori (at least partially) unknown environment and learns "from experience", i.e. the underlying first-order PDE is not perfectly known, and its effects have to be approximated during learning. In its basic form, reinforcement learning is very general; it is studied in many other disciplines, such as game theory, control theory, operations research, information theory, simulation-based optimization, multi-agent systems, swarm intelligence, statistics, and genetic algorithms. We will address RL from the viewpoint of HJB equations, Semi-Lagrangian schemes, and function approximation, including Deep Learning approaches if time allows.
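To make "learning from experience" concrete, the following is a minimal sketch of tabular Q-learning on a toy chain problem. Everything in it is an assumption made for illustration (the five-state chain, the reward, the constants, and all names), and it is not meant as the method used in the lecture. The learner never reads the transition rule directly; it only uses sampled transitions and rewards returned by `step`, and the update is a sampled version of the dynamic programming fixed-point iteration.

```python
import numpy as np

N_STATES = 5        # states 0..4; state 4 is the terminal goal (toy assumption)
ACTIONS = (-1, +1)  # action 0 moves left, action 1 moves right
GAMMA = 0.9         # discount factor
ALPHA = 0.5         # learning rate


def step(state, action_index):
    """Toy environment: returns one sampled transition (next state, reward, done)."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(n_episodes=500, seed=0):
    """Tabular Q-learning with a uniformly random behaviour policy (off-policy)."""
    rng = np.random.default_rng(seed)
    q = np.zeros((N_STATES, len(ACTIONS)))
    for _ in range(n_episodes):
        state, done = 0, False
        while not done:
            action = rng.integers(len(ACTIONS))  # explore uniformly at random
            next_state, reward, done = step(state, action)
            # Sampled Bellman target: reward plus discounted best value at next state.
            target = reward if done else reward + GAMMA * q[next_state].max()
            q[state, action] += ALPHA * (target - q[state, action])
            state = next_state
    return q


q_table = q_learning()
# Greedy policy in the non-terminal states, read off from the learned table.
greedy_policy = q_table[:-1].argmax(axis=1)
```

Because the toy environment is deterministic, the learned values for moving right approach $\gamma^{3-s}$ in state $s$, and the greedy policy moves right in every non-terminal state. A random behaviour policy is sufficient here because Q-learning is off-policy; for larger problems the table is replaced by a function approximation.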

### Prerequisites

The content of the two lectures on Numerische Mathematik from the second year of the bachelor's program is expected. In particular, knowledge of (nonlinear) optimization and numerical methods for ODEs is recommended; the (German) lecture notes from the 2020 course are available on request. Furthermore, (Lagrange) interpolation is expected, and function discretization by finite elements is helpful, although for HJB equations one cannot use the mathematical ideas from the numerical solution of PDEs (e.g. Sobolev spaces or Galerkin methods play no role here). Parts of the prerequisites will be refreshed in the exercises, which must be solved in groups of at most two people. In the second half, we might do some numerical exercises or experiments for reinforcement learning using existing Python-based frameworks.

### Exams

### Selected Literature

• Falcone, M., & Ferretti, R. Semi-Lagrangian Approximation Schemes for Linear and Hamilton-Jacobi Equations, SIAM, 2014.
• Sutton, R., & Barto, A. Reinforcement Learning, MIT Press, 1998. Draft of the second edition.
• Bertsekas, D. Dynamic Programming and Optimal Control Vol. II, Approximate Dynamic Programming, 4th Edition, Athena Scientific, 2012.