Practical Lab WS 18/19 Practical Lab Numerical Simulation
Algorithms in Machine Learning and Their Application
Material
- The worksheets can be found in the download tab on the left.
- The HowTo.txt file with the instructions on how to install everything in Linux.
- The requirements.txt file with the necessary Python 3 packages.
- A Python tutorial Jupyter notebook.
- The introduction slides from the first meeting.
- The supplementary material for sheet 3.
- The supplementary material for sheet 4.
- The Jupyter notebook template for sheet 5.
- The Rhine level data set (also available as a more convenient Pandas DataFrame, see below) with template.
- The car crash data set.
To read the dataset for the river levels task, you may also use the Pandas DataFrame and read it with
import pandas as pd
df = pd.read_pickle("riverlevels.pandas.pickle")
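Once the DataFrame is loaded, a quick inspection helps before starting the task. The following is only a minimal sketch: the column name `level_cm`, the date index and the file name `sample.pandas.pickle` are hypothetical stand-ins, since the layout of the actual river level file is not described here. It writes a small DataFrame to a temporary pickle file, reads it back with `pd.read_pickle`, and prints its shape, column types and first rows.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in data; the real file may use different columns and index.
sample = pd.DataFrame(
    {"level_cm": [312.0, 298.5, 305.2]},
    index=pd.date_range("2018-10-01", periods=3, freq="D"),
)

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name, written to a temporary directory.
    path = os.path.join(tmp, "sample.pandas.pickle")
    sample.to_pickle(path)      # write the DataFrame as a pickle file
    df = pd.read_pickle(path)   # read it back, as done for the river level data

# First look at the loaded data.
print(df.shape)    # (number of rows, number of columns)
print(df.dtypes)   # data type of each column
print(df.head())   # first rows
```

The same three inspection calls can be applied to the DataFrame read from "riverlevels.pandas.pickle". Note that pickle files can execute code when they are loaded, so `pd.read_pickle` should only be used on files from a trusted source such as the course download page.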
Submissions to the exercise sheets
Each working group should send its submissions for the exercise sheets to ed tod nnob-inu tod sni ta ballma tod b@foo tod de. The solutions to the first sheet will not be discussed in detail with each group separately, but if you have specific questions or would like to check whether you did things correctly, you can come to the tutorial dates and discuss your solutions with the tutor. The solutions to all other worksheets will be discussed with each group separately. Appointments will be made on short notice.
Content
In this practical lab, we teach the basic mathematical and technical tools needed to understand a range of basic data mining and machine learning methods. A strong emphasis is put on algorithms and efficient implementation.
Roughly every two weeks, a new practice sheet is given to the participants. The tasks are worked on in small groups. Depending on your technical proficiency, expect to spend about 6 hours a week.
Background
Nowadays, data mining and machine learning algorithms are the backbone of decision-making processes in all major enterprises. Their applicability seems almost endless and ranges from targeted advertising through prototype design to autonomous production chains. Due to the availability of very large datasets (“Big Data”), it has become crucial to understand the mechanics of the different types of learning methods and to be able to develop and implement efficient algorithms that meet the requirements of the task at hand.
Requirements
Basic experience in Python is required. In addition, the Python packages NumPy and Matplotlib will be used. The corresponding websites provide introductions which are sufficient for our purposes. All programming tasks are done using Jupyter notebooks. If you have no experience with these tools, we recommend spending a little time familiarizing yourself with them before the course starts.