% Introduction to Data Analysis and Machine Learning in Physics
% Jörg Marks, Klaus Reygers
% 11-14 April 2023 \newline 9:00 - 12:00 and 14:00 - 17:00
## Outline
* **Day 1**
    - Introduction, software and data fitting
* **Day 2**
    - Machine learning - basics
* **Day 3**
    - Machine learning - decision trees
* **Day 4**
    - Machine learning - convolutional networks and graph neural networks
* **Organization** and **Objective**
    - \textcolor{red}{2 ECTS: Compulsory attendance is required} \newline
      \textcolor{red}{Active participation in the exercises}
    - \textcolor{blue}{Course in CIP pool in a tutorial style}
    - \textcolor{blue}{Obtain basic knowledge for problem-oriented self-studies}
## Course Information (1)
* Course requirements
    - Python knowledge required (good C++ knowledge may suffice)
    - User ID for the CIP pool of the Faculty of Physics
* Course structure
    - \textcolor{red}{Course in CIP pool} using the \textcolor{red}{jupyter3 hub}
    - Lectures are interleaved with tutorial/exercise sessions in small groups
      (up to 5 persons per group)
* The course homepage provides all course material
\small
[https://www.physi.uni-heidelberg.de/~reygers/lectures/2023/ml/](https://www.physi.uni-heidelberg.de/~reygers/lectures/2023/ml/) \normalsize
/transparencies      \textcolor{blue}{Transparencies of the lectures}
/examples             \textcolor{blue}{iPython files shown in the lectures}
/exercises             \textcolor{blue}{Exercises to be solved during the course}
/solutions             \textcolor{blue}{Solutions of the exercises}
## Course Information (2)
`TensorFlow` and `Keras` are now also installed in the CIP Jupyter hub. In addition, with a Google account you can run Jupyter notebooks on Google Colab:
\vspace{3ex}
[https://colab.research.google.com/](https://colab.research.google.com/)
\vfill
Missing Python libraries can be installed by adding the following line to a notebook cell (here for the pypng library):
```
!pip install pypng
```
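Before installing, it can help to check whether a library is already available. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of the course material. Note that the pypng package is imported under the module name `png`.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module of this name can be imported here."""
    return importlib.util.find_spec(module_name) is not None


# pypng is installed as "pypng" but imported as "png".
if not is_installed("png"):
    print("Module 'png' not found: run `!pip install pypng` in a notebook cell")
```

The check only looks the module up and does not import it, so it is cheap to run at the top of a notebook.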
## Course Information (3)
* Your installation at home:
* \textcolor{blue}{Web Browser to access jupyter3}
* \textcolor{blue}{Access to the CIP pool via an ssh client on your home PC}
* No requirements for a special operating system
* Software:
* firefox or similar
* Cisco AnyConnect
* ssh client (MobaXterm on Windows, integrated in Linux/Mac)
* Local execution of python / iPython
* Install ``anaconda3`` and download / run the iPython notebooks (Python scripts are also available)
* \textcolor{red}{Hints for software installations and CIP pool access}
\small
[https://www.physi.uni-heidelberg.de/~marks/root_einfuehrung/Folien/CIPpoolAccess.PDF](https://www.physi.uni-heidelberg.de/~marks/root_einfuehrung/Folien/CIPpoolAccess.PDF) \normalsize
## Course Information (4)
Alternatively, you can install the libraries needed on your local computer.
\vfill
Here are the relevant instructions for macOS using `pip`:
\vfill
Assumption: `homebrew` is installed.
\vfill
Install python3 (see https://docs.python-guide.org/starting/install3/osx/)
\footnotesize
```
$ brew install python
$ python3 --version
Python 3.8.5
```
\normalsize
Make sure pip3 is up to date (alternative: conda, but do not mix conda and pip installations)
\footnotesize
```
$ pip3 install --upgrade pip
```
\normalsize
Install modules needed:
\footnotesize
```
$ pip3 install --upgrade jupyter matplotlib numpy pandas \
    scipy scikit-learn xgboost iminuit tensorflow tensorflow_datasets Keras
```
\normalsize
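To verify the installation, the installed versions can be listed from Python. This is a minimal sketch, assuming Python 3.8 or newer (for `importlib.metadata`); the names `PACKAGES` and `installed_versions` are hypothetical. Depending on the Python version, a distribution name that `pip` accepts may not be matched in exactly the same spelling, so a reported miss is worth double-checking with `pip3 list`.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as passed to pip3 in the install command.
PACKAGES = [
    "jupyter", "matplotlib", "numpy", "pandas", "scipy", "scikit-learn",
    "xgboost", "iminuit", "tensorflow", "tensorflow_datasets", "keras",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name:20s} {ver if ver is not None else 'NOT INSTALLED'}")
```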
## Topics and file name conventions
0. Introduction (this file) \hspace{0.1cm} \footnotesize (\textcolor{gray}{introduction.pdf}) \normalsize
1. Introduction to python \hspace{0.1cm} \footnotesize (\textcolor{gray}{01\_intro\_python\_*}) \normalsize
2. Data modeling and fitting \hspace{0.1cm} \footnotesize (\textcolor{gray}{02\_fit\_intro\_*}) \normalsize
3. Machine learning basics \hspace{0.1cm} \footnotesize (\textcolor{gray}{03\_ml\_basics\_*}) \normalsize
4. Decision trees \hspace{0.1cm} \footnotesize (\textcolor{gray}{04\_decision\_trees\_*}) \normalsize
5. Neural networks \hspace{0.1cm} \footnotesize (\textcolor{gray}{05\_neural\_networks\_*}) \normalsize
\vspace{3.5cm}
## Program Day 1
* Technicalities
* Summary of NumPy
* Plotting with matplotlib
* Input / output of data
* Summary of pandas
* Fitting with iminuit and PyROOT
* Transparencies with activated links, examples and exercises
* Software: [\textcolor{violet}{01\_intro\_python.pdf}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2023/ml/transparencies/01_intro_python.pdf)
* Fitting:
[\textcolor{violet}{02\_fit\_intro.pdf}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2023/ml/transparencies/02_fit_intro.pdf)
\vspace{2cm}
## Program Day 2
* Supervised learning
* Classification and regression
* Linear regression
* Logistic regression
* Softmax regression (multi-class classification)
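As a preview of multi-class classification, the softmax function turns a list of class scores into probabilities that sum to one. The following is a minimal sketch in plain Python, not the implementation used in the course; the function name `softmax` and the example scores are chosen only for illustration.

```python
import math


def softmax(scores):
    """Convert real-valued class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative scores for three classes; the largest score wins.
probs = softmax([2.0, 1.0, 0.1])
predicted_class = probs.index(max(probs))
print(probs, predicted_class)
```

Subtracting the maximum score does not change the result, because the common factor cancels in the ratio.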
\vspace{4cm}
## Program Day 3
* Decision trees
* Bagging and boosting
* Random forest
* XGBoost
\vspace{5cm}
## Program Day 4
* Neural networks
* Convolutional neural networks
* TensorFlow and Keras
* Hand-written digit recognition with Keras
\vspace{5cm}