# Legacy display Course

This is an archived course. The content might be broken.

#### Lecture course in summer semester 2017:

# V4E2 - Numerical Simulation

Optimal Control and Reinforcement Learning

### Prof. Dr. Jochen Garcke

#### Assistant: Glenn Byrenheid

| | |
|---|---|
| Location: | Room 6.020, Wegelerstr. 6 |
| Time: | Tuesday, 10:15 - 11:45; Thursday, 8:30 - 10:00 |
| Exercise: | Thursday, 10:15 - 11:45 |
| Office hours: | by appointment, e-mail garcke@ins.uni-bonn.de |

### Content of the lecture:

- Theory and Numerics for Hamilton-Jacobi-Bellman Equations
The first part of the lecture concerns Semi-Lagrangian approximation schemes for first-order PDEs, with a special focus on Hamilton-Jacobi (HJ) equations, reviewing their construction and theory on model equations. The analysis of Hamilton-Jacobi equations requires the analytical tool of viscosity solutions, which we will introduce at the beginning. One of the most typical applications of the theory of HJ equations is in the field of optimal control problems and differential games. Via the Dynamic Programming Principle (DPP), many optimal control problems can be characterized by means of the associated value function, which in turn can be shown to be the unique viscosity solution of a PDE of convex HJ type, usually called the Bellman equation, the Dynamic Programming equation, or the Hamilton-Jacobi-Bellman (HJB) equation.
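For orientation, the objects named above can be written out for one common model case. This is only a sketch for the infinite horizon problem with discount rate $\lambda > 0$; the symbols $f$ (dynamics), $g$ (running cost), $A$ (control set), $\lambda$ and $y_x$ are notation assumed here and need not match the notation of the lecture. With the controlled dynamics $\dot y(s) = f(y(s), a(s))$, $y(0) = x$, whose solution is denoted $y_x(\cdot\,; a)$, the value function is

$$
v(x) = \inf_{a(\cdot)} \int_0^\infty g\big(y_x(s; a), a(s)\big)\, e^{-\lambda s}\, ds .
$$

The Dynamic Programming Principle states that, for every $\tau > 0$,

$$
v(x) = \inf_{a(\cdot)} \left\{ \int_0^\tau g\big(y_x(s; a), a(s)\big)\, e^{-\lambda s}\, ds + e^{-\lambda \tau}\, v\big(y_x(\tau; a)\big) \right\} ,
$$

and letting $\tau \to 0$ formally leads to the HJB equation

$$
\lambda v(x) + \sup_{a \in A} \big\{ -f(x, a) \cdot Dv(x) - g(x, a) \big\} = 0 .
$$

The left-hand side is a supremum of functions that are affine in $Dv$, which is why the equation is of convex HJ type.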

At the numerical level, the Semi-Lagrangian approximation mimics the method of characteristics: it looks for the foot of the characteristic curve passing through every node and follows this curve for a single time step. To derive a numerical method from this general idea, several ingredients have to be put together, mainly a technique for ODEs to track the characteristics and a reconstruction technique to recover pointwise values of the numerical solution.
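The two ingredients can be illustrated on the simplest model equation. The following is a minimal sketch, not the scheme from the lecture: it assumes the linear advection equation $u_t + c\,u_x = 0$ with constant speed $c$ on a periodic grid, uses one explicit Euler step to find the foot of each characteristic, and uses piecewise linear interpolation for the reconstruction. The function names `semi_lagrangian_step` and `solve_advection` are hypothetical.

```python
import numpy as np


def semi_lagrangian_step(u, x, speed, dt, period=1.0):
    """One semi-Lagrangian step for u_t + speed * u_x = 0 on a periodic grid."""
    # Ingredient 1 (ODE technique): follow the characteristic through each
    # node backwards for one time step. For constant speed a single explicit
    # Euler step locates the foot exactly.
    feet = x - speed * dt
    # Ingredient 2 (reconstruction): recover the values at the feet, which
    # are in general not grid nodes, by piecewise linear interpolation.
    return np.interp(feet, x, u, period=period)


def solve_advection(u0, x, speed, dt, n_steps):
    """Apply n_steps semi-Lagrangian steps to the initial data u0."""
    u = np.array(u0, dtype=float)
    for _ in range(n_steps):
        u = semi_lagrangian_step(u, x, speed, dt)
    return u


# Assumed test setup: 200 nodes on [0, 1), one full period of transport.
x = np.linspace(0.0, 1.0, 200, endpoint=False)
u0 = np.sin(2.0 * np.pi * x)
u = solve_advection(u0, x, speed=1.0, dt=0.0125, n_steps=80)
print("max deviation after one period:", np.max(np.abs(u - u0)))
```

In this setup the time step corresponds to 2.5 grid cells per step, so the sketch also shows that the scheme does not need a CFL-type restriction on the time step. The linear interpolation slightly damps the profile; a higher-order reconstruction would reduce this effect.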

- Reinforcement Learning
In the reinforcement learning setting, we consider a system that interacts with an a priori (at least partially) unknown environment and learns "from experience", i.e. the underlying first-order PDE is not perfectly known, but its effects have to be approximated during learning. In its basic form, reinforcement learning is very general and is studied in many other disciplines, such as game theory, control theory, operations research, information theory, simulation-based optimization, multi-agent systems, swarm intelligence, statistics, and genetic algorithms. We will address RL from the viewpoint of HJB equations, Semi-Lagrangian schemes and function approximation, including, if time allows, Deep Learning approaches.
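Learning "from experience" can be sketched with tabular Q-learning, a standard method that approximates the solution of the discrete-time Bellman equation from sampled transitions only. This is an illustration under assumptions, not material from the lecture: the chain environment, the function names `step` and `q_learning`, and all parameter values are hypothetical.

```python
import numpy as np

N_STATES = 6        # states 0..5 on a line; state 5 is the goal (toy problem)
ACTIONS = (-1, +1)  # move one state to the left or to the right
GAMMA = 0.9         # discount factor


def step(state, action_index):
    """Environment: the learner never sees this rule, only its samples."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(n_episodes=500, alpha=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy exploration policy."""
    rng = np.random.default_rng(seed)
    q = np.zeros((N_STATES, len(ACTIONS)))
    for _ in range(n_episodes):
        state = int(rng.integers(N_STATES - 1))  # random non-terminal start
        for _ in range(100):  # cap on the episode length
            if rng.random() < epsilon:
                a = int(rng.integers(len(ACTIONS)))  # explore
            else:
                a = int(np.argmax(q[state]))         # exploit
            next_state, reward, done = step(state, a)
            # Sampled Bellman target: reward plus discounted best next value.
            target = reward if done else reward + GAMMA * np.max(q[next_state])
            q[state, a] += alpha * (target - q[state, a])
            state = next_state
            if done:
                break
    return q


q = q_learning()
greedy_policy = np.argmax(q[:-1], axis=1)  # greedy action per non-terminal state
print(greedy_policy)
```

The table `q` plays the role of the value function here. For larger or continuous state spaces the table is replaced by a function approximation.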

### News:

Attached you can find corrected remarks for the end of the proof of Theorem 45.

### Exercise sheets:

The homework has to be handed in at the beginning of the following exercise session. Solutions are not handed in individually; instead, students form groups of two, and each group hands in one written report. Each student has to present solutions in the exercise session at least once.

| Nr. | Link | Due | Remarks and errata |
|---|---|---|---|
| 1 | sheet_1.pdf | 27.04.2017 | |
| 2 | sheet_2.pdf | 04.05.2017 | |
| 3 | sheet_3_2.pdf | 11.05.2017 | Exercise 1: assume additionally the (uniform) continuity of the Hamiltonian H in the first and second variable. |
| 4 | sheet_4.pdf | 18.05.2017 | |
| 5 | sheet_5.pdf | 30.05.2017 | To be handed in on Tuesday, 30.05.2017, after the lecture. |
| 6 | sheet_6_2.pdf | 13.06.2017 | Exercise 1 updated. It can be resubmitted with Sheet 7. |
| 7 | sheet_7.pdf | 20.06.2017 | |
| 8 | sheet_8.pdf | 27.06.2017 | |
| 9 | sheet_9.pdf | 04.07.2017 | |
| 10 | sheet_10.pdf | 13.07.2017 | Exercise 2 updated. Assumption concerning global extreme points added. |
| 11 | sheet_11.pdf | 20.07.2017 | Exercise 1 updated. |

### Prerequisites:

The content of the two lectures on *Numerische Mathematik* from the second year of the bachelor studies is expected. In particular, knowledge of (nonlinear) optimization and numerical methods for ODEs is recommended; the (German) lecture notes from the course in 2015 are available on request. Furthermore, (Lagrange) interpolation is expected, and function discretization by finite elements is helpful, although for HJB equations one cannot use the mathematical ideas from the field of numerical solution of PDEs (e.g. Sobolev spaces or Galerkin methods do **not** play a role here). Parts of the prerequisites will be refreshed in the exercises. In the second half we might do some numerical exercises / experiments for reinforcement learning using existing Python-based frameworks.

### Exams:

The **oral exams** will take place between **01.08.17** and **04.08.17**. Admission to the oral exam is based on the homework assignments and requires **50%** of the **points** from the exercise sheets.

### Selected Literature:

- Falcone, M., & Ferretti, R. Semi-Lagrangian Approximation Schemes for Linear and Hamilton-Jacobi Equations, SIAM, 2014.
- Sutton, R., & Barto, A. Reinforcement Learning, MIT Press, 1998. Draft of the second edition.
- Bertsekas, D.P. Dynamic Programming and Optimal Control, Vol. II, 4th Edition: Approximate Dynamic Programming, 2012.