# V4E2 - Numerical Simulation: Optimal Control and Reinforcement Learning

### Prof. Dr. Jochen Garcke

#### Assistant: Glenn Byrenheid

• Location: Room 6.020, Wegelerstr. 6
• Time: Tuesday, 10:15 - 11:45, and Thursday, 8:30 - 10:00
• Exercise: Thursday, 10:15 - 11:45
• Office hours: by appointment, e-mail garcke@ins.uni-bonn.de

### Content of the lecture:

• Theory and Numerics for Hamilton-Jacobi-Bellman Equations

The first part of the lecture concerns Semi-Lagrangian approximation schemes for first-order PDEs, with a special focus on Hamilton-Jacobi (HJ) equations, reviewing their construction and theory on model equations. The analysis of Hamilton-Jacobi equations requires the analytical tool of viscosity solutions, which we will introduce at the beginning. One of the most typical applications of the theory of HJ equations is in the field of optimal control problems and differential games. Via the Dynamic Programming Principle (DPP), many optimal control problems can be characterized by means of the associated value function, which in turn can be shown to be the unique viscosity solution of a PDE of convex HJ type, usually called the Bellman equation, the Dynamic Programming equation, or the Hamilton-Jacobi-Bellman (HJB) equation.
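As an illustration, consider the infinite-horizon discounted problem, one common model setting. The notation here is an assumption made for this sketch and is not taken from the lecture: dynamics $\dot y(t) = f(y(t), a(t))$ with $y(0) = x$, controls $a(t) \in A$, running cost $g$, and discount rate $\lambda > 0$. The value function, the DPP, and the resulting HJB equation then read:

$$
v(x) = \inf_{a(\cdot)} \int_0^\infty g\bigl(y(t), a(t)\bigr)\, e^{-\lambda t} \, dt ,
$$

$$
v(x) = \inf_{a(\cdot)} \left\{ \int_0^\tau g\bigl(y(t), a(t)\bigr)\, e^{-\lambda t} \, dt + e^{-\lambda \tau}\, v\bigl(y(\tau)\bigr) \right\} \quad \text{for all } \tau > 0 ,
$$

$$
\lambda v(x) + \sup_{a \in A} \bigl\{ -f(x, a) \cdot Dv(x) - g(x, a) \bigr\} = 0 , \qquad x \in \mathbb{R}^d .
$$

The Hamiltonian $H(x, p) = \sup_{a \in A} \{ -f(x, a) \cdot p - g(x, a) \}$ is a supremum of functions that are affine in $p$ and is therefore convex in $p$, which is the sense in which the equation is of convex HJ type.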

At the numerical level, the Semi-Lagrangian approximation mimics the method of characteristics by looking for the foot of the characteristic curve passing through every node and following this curve for a single time step. To derive a numerical method from this general idea, several ingredients must be combined, mainly a technique for ODEs to track characteristics and a reconstruction technique to recover pointwise values of the numerical solution.
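The two ingredients can be seen in a minimal sketch for the linear model equation $u_t + c(x)\, u_x = 0$ on a periodic grid. This is an illustration under stated assumptions, not the scheme developed in the lecture: the function name `semi_lagrangian_step`, the explicit Euler step for the characteristics, the piecewise linear reconstruction, and all parameter values are choices made here.

```python
import numpy as np


def semi_lagrangian_step(u, x, velocity, dt, period):
    """Advance u_t + velocity(x) * u_x = 0 by one time step on a periodic grid."""
    # ODE ingredient: one explicit Euler step backward in time from every node
    # approximates the foot of the characteristic passing through that node.
    feet = x - dt * velocity(x)
    # Reconstruction ingredient: piecewise linear interpolation of the nodal
    # values at the feet (np.interp wraps the feet into the periodic domain).
    return np.interp(feet, x, u, period=period)


def unit_speed(s):
    """Assumed velocity field for the example: constant speed 1."""
    return np.ones_like(s)


# Illustrative setup (all values are assumptions): a Gaussian bump on [0, 1).
n = 100
x = np.linspace(0.0, 1.0, n, endpoint=False)
u0 = np.exp(-100.0 * (x - 0.5) ** 2)
dt = 2.5 / n  # deliberately larger than the grid spacing

u = u0.copy()
for _ in range(40):  # 40 * dt = 1.0, i.e. one full period
    u = semi_lagrangian_step(u, x, unit_speed, dt, period=1.0)
```

Because each new nodal value is a convex combination of old nodal values, the update cannot create new extrema, even for time steps larger than the grid spacing. The price of linear reconstruction is numerical diffusion: after one period the bump is smeared compared with `u0`.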

• Reinforcement Learning

In the reinforcement learning setting, we consider a system in interaction with an a priori (at least partially) unknown environment, which learns "from experience", i.e. the underlying first-order PDE is not perfectly known, but its effects have to be approximated during learning. In its basic form, reinforcement learning is very general and is studied in many other disciplines, such as game theory, control theory, operations research, information theory, simulation-based optimization, multi-agent systems, swarm intelligence, statistics, and genetic algorithms. We will address RL from the viewpoint of HJB equations, Semi-Lagrangian schemes, and function approximation, including Deep Learning approaches if time allows.
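Learning "from experience" can be illustrated in a discrete setting with tabular Q-learning, a standard RL algorithm that is used here as an example and is not necessarily the method treated in the lecture. The sketch never forms the Bellman equation explicitly; it only uses sampled transitions. The five-state chain, the reward, the function names `environment_step` and `q_learning`, and all parameter values are hypothetical.

```python
import numpy as np

N_STATES = 5        # chain 0, 1, 2, 3, 4; state 4 is the terminal goal
ACTIONS = (-1, +1)  # action 0 moves left, action 1 moves right
GAMMA = 0.9         # discount factor


def environment_step(state, action):
    """Hypothetical environment: the learner only observes its outputs."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.5, max_steps=200, seed=0):
    """Tabular Q-learning with a uniformly random behaviour policy."""
    rng = np.random.default_rng(seed)
    q = np.zeros((N_STATES, len(ACTIONS)))
    for _ in range(episodes):
        state = int(rng.integers(0, N_STATES - 1))  # random non-terminal start
        for _step in range(max_steps):
            action = int(rng.integers(0, len(ACTIONS)))  # pure exploration
            next_state, reward, done = environment_step(state, action)
            # Sampled Bellman update: move Q(s, a) toward
            # r + gamma * max_a' Q(s', a'), with no bootstrap at the goal.
            target = reward + (0.0 if done else GAMMA * q[next_state].max())
            q[state, action] += alpha * (target - q[state, action])
            if done:
                break
            state = next_state
    return q


q_table = q_learning()
greedy_policy = q_table[:-1].argmax(axis=1)  # greedy action per non-terminal state
```

Q-learning is off-policy, so the greedy policy read off from `q_table` can be optimal even though the data were generated by random actions. In this deterministic example the optimal value of moving right from state `s` is `GAMMA ** (3 - s)`.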

### News:

Attached you can find corrected remarks for the end of the proof of Theorem 45.

### Exercise sheets:

The homework has to be handed in at the beginning of the following exercise session. Solutions are not handed in individually; instead, students form groups of two, and each group hands in one joint written report. Each student needs to present solutions in the exercise session at least once.

| Nr | Link | Due | Remarks and errata |
|---|---|---|---|
| 1 | sheet_1.pdf | 27.04.2017 | |
| 2 | sheet_2.pdf | 04.05.2017 | |
| 3 | sheet_3_2.pdf | 11.05.2017 | Exercise 1: Assume additionally the (uniform) continuity of the Hamiltonian H in the first and second variable. |
| 4 | sheet_4.pdf | 18.05.2017 | |
| 5 | sheet_5.pdf | 30.05.2017 | To be handed in on Tuesday, 30.05.2017, after the lecture. |
| 6 | sheet_6_2.pdf | 13.06.2017 | Exercise 1 updated. Opportunity to resubmit it with Sheet 7. |
| 7 | sheet_7.pdf | 20.06.2017 | |
| 8 | sheet_8.pdf | 27.06.2017 | |
| 9 | sheet_9.pdf | 04.07.2017 | |
| 10 | sheet_10.pdf | 13.07.2017 | Exercise 2 updated. Assumption concerning global extreme points added. |
| 11 | sheet_11.pdf | 20.07.2017 | Exercise 1 updated. |

### Prerequisites:

The content of the two lectures on Numerische Mathematik from the second year of the bachelor studies is expected. In particular, knowledge of (nonlinear) optimization and numerical methods for ODEs is recommended; the (German) lecture notes from the course in 2015 are available on request. Furthermore, (Lagrange) interpolation is expected. Function discretization by finite elements is helpful, although for HJB equations one cannot use the mathematical ideas from the field of numerical solution of PDEs (e.g. Sobolev spaces or Galerkin methods do not play a role here). Parts of the prerequisites will be refreshed in the exercises. In the second half we might do some numerical exercises / experiments for reinforcement learning using existing Python-based frameworks.

### Exams:

The oral exams will take place between 01.08.17 and 04.08.17. Admission to the oral exam is based on the homework assignments and requires 50% of the points from the exercise sheets.

### Selected Literature:

• Falcone, M., & Ferretti, R. Semi-Lagrangian Approximation Schemes for Linear and Hamilton-Jacobi Equations, SIAM, 2014.
• Sutton, R., & Barto, A. Reinforcement Learning: An Introduction, MIT Press, 1998. Draft of the second edition.
• Bertsekas, D.P. Dynamic Programming and Optimal Control, Vol. II, 4th Edition: Approximate Dynamic Programming, Athena Scientific, 2012.