% Introduction to Data Analysis and Machine Learning in Physics
% Jörg Marks, Klaus Reygers
% 11-14 April 2023 \newline 9:00 - 12:00 and 14:00 - 17:00

## Outline

* **Day 1**
    - Introduction, software and data fitting

* **Day 2**
    - Machine learning - basics

* **Day 3**
    - Machine learning - decision trees

* **Day 4**
    - Machine learning - convolutional networks and graph neural networks

* **Organization** and **Objective**
    - \textcolor{red}{2 ECTS: Attendance is compulsory} \newline
      \textcolor{red}{Active participation in the exercises}
    - \textcolor{blue}{Course in CIP pool in a tutorial style}
    - \textcolor{blue}{Obtain basic knowledge for problem-oriented self-studies}

## Course Information (1)

* Course requirements

    - Python knowledge is required; good C++ knowledge may suffice
    - User ID for the CIP pool of the Faculty of Physics

* Course structure
    - \textcolor{red}{Course in CIP pool} using the \textcolor{red}{jupyter3 hub}
    - Lectures are interleaved with tutorial/exercise sessions in small groups
      (up to 5 persons / group)

* Course homepage, which hosts all course material
    \small
    [https://www.physi.uni-heidelberg.de/~reygers/lectures/2023/ml/](https://www.physi.uni-heidelberg.de/~reygers/lectures/2023/ml/) \normalsize

    /transparencies \textcolor{blue}{Transparencies of the lectures}

    /examples \textcolor{blue}{iPython files shown in the lectures}

    /exercises \textcolor{blue}{Exercises to be solved during the course}

    /solutions \textcolor{blue}{Solutions of the exercises}

## Course Information (2)

`TensorFlow` and `Keras` are now also installed in the CIP jupyter hub. In addition, with a Google account you can run Jupyter notebooks on Google Colab:

\vspace{3ex}

[https://colab.research.google.com/](https://colab.research.google.com/)

\vfill

Missing Python libraries can be installed by running the following in a notebook cell (here for the pypng library):

```
!pip install pypng
```

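Before installing, it can help to check whether a library is already available in the current environment. The following is a minimal sketch using only the Python standard library; the helper name `is_installed` is hypothetical and not part of the course material. Note that the import name can differ from the pip package name: pypng is imported as `png`.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be imported in this environment."""
    # find_spec returns None when no module of that name is found.
    return importlib.util.find_spec(module_name) is not None


# pypng installs a module named `png`, so that is the name to look up.
if not is_installed("png"):
    print("pypng is missing; run `!pip install pypng` in a notebook cell")
```

After installing a package in a running notebook, a kernel restart may be needed before the import succeeds.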
## Course Information (3)

* Your installation at home:
    * \textcolor{blue}{Web browser to access jupyter3}
    * \textcolor{blue}{Access to the CIP pool via an ssh client on your home PC}

* No particular operating system is required

* Software:
    * Firefox or similar
    * Cisco AnyConnect
    * ssh client (MobaXterm on Windows, integrated in Linux/Mac)

* Local execution of python / iPython
    * Install ``anaconda3`` and download / run the iPython notebooks (Python scripts are also available)

* \textcolor{red}{Hints for software installations and CIP pool access}
    \small

    [https://www.physi.uni-heidelberg.de/~marks/root_einfuehrung/Folien/CIPpoolAccess.PDF](https://www.physi.uni-heidelberg.de/~marks/root_einfuehrung/Folien/CIPpoolAccess.PDF) \normalsize

## Course Information (4)

Alternatively, you can install the required libraries on your local computer.

\vfill

Here are the relevant instructions for macOS using `pip`:

\vfill

Assumption: `homebrew` is installed.

\vfill

Install python3 (see https://docs.python-guide.org/starting/install3/osx/)
\footnotesize
```
$ brew install python
$ python3 --version
Python 3.8.5
```
\normalsize

Make sure pip3 is up-to-date (alternative: conda $\rightarrow$ don't mix conda and pip installations)
\footnotesize
```
$ pip3 install --upgrade pip
```
\normalsize

Install the required modules:
\footnotesize
```
$ pip3 install --upgrade jupyter matplotlib numpy pandas \
    scipy scikit-learn xgboost iminuit tensorflow tensorflow_datasets Keras
```
\normalsize

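To confirm that a local installation worked, one option is to list the installed version of each package. This is a sketch under the assumption of Python 3.8 or newer (where `importlib.metadata` is available); the names `PACKAGES` and `installed_versions` are illustrative and not part of the course material.

```python
from importlib import metadata

# Distribution names as passed to pip3 in the install command.
PACKAGES = [
    "jupyter", "matplotlib", "numpy", "pandas", "scipy", "scikit-learn",
    "xgboost", "iminuit", "tensorflow", "tensorflow_datasets", "keras",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Print one line per package; missing packages are flagged explicitly.
for name, version in installed_versions(PACKAGES).items():
    print(f"{name:22s} {version or 'NOT INSTALLED'}")
```

Any package reported as `NOT INSTALLED` can be installed with `pip3 install` as shown above.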
## Topics and file name conventions

0. Introduction (this file) \hspace{0.1cm} \footnotesize (\textcolor{gray}{introduction.pdf}) \normalsize
1. Introduction to python \hspace{0.1cm} \footnotesize (\textcolor{gray}{01\_intro\_python\_*}) \normalsize
2. Data modeling and fitting \hspace{0.1cm} \footnotesize (\textcolor{gray}{02\_fit\_intro\_*}) \normalsize
3. Machine learning basics \hspace{0.1cm} \footnotesize (\textcolor{gray}{03\_ml\_basics\_*}) \normalsize
4. Decision trees \hspace{0.1cm} \footnotesize (\textcolor{gray}{04\_decision\_trees\_*}) \normalsize
5. Neural networks \hspace{0.1cm} \footnotesize (\textcolor{gray}{05\_neural\_networks\_*}) \normalsize

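The prefix convention makes it easy to sort downloaded files by topic. The following sketch matches a file name against the patterns listed above; the function `topic_of` and the example file names are hypothetical and chosen only for illustration.

```python
from fnmatch import fnmatch

# Prefix patterns taken from the topic list.
TOPIC_PATTERNS = {
    "Introduction to python": "01_intro_python_*",
    "Data modeling and fitting": "02_fit_intro_*",
    "Machine learning basics": "03_ml_basics_*",
    "Decision trees": "04_decision_trees_*",
    "Neural networks": "05_neural_networks_*",
}


def topic_of(filename):
    """Return the topic whose pattern matches `filename`, or None."""
    for topic, pattern in TOPIC_PATTERNS.items():
        if fnmatch(filename, pattern):
            return topic
    return None


# Hypothetical file names, for illustration only.
print(topic_of("03_ml_basics_example.ipynb"))
print(topic_of("notes.txt"))
```
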
\vspace{3.5cm}

## Program Day 1

* Technicalities

* Summary of NumPy

* Plotting with matplotlib

* Input / output of data

* Summary of pandas

* Fitting with iminuit and PyROOT

* Transparencies with active links, examples and exercises

    * Software: [\textcolor{violet}{01\_intro\_python.pdf}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2023/ml/transparencies/01_intro_python.pdf)

    * Fitting:
      [\textcolor{violet}{02\_fit\_intro.pdf}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2023/ml/transparencies/02_fit_intro.pdf)

\vspace{2cm}

## Program Day 2

* Supervised learning

* Classification and regression

* Linear regression

* Logistic regression

* Softmax regression (multi-class classification)

\vspace{4cm}

## Program Day 3

* Decision trees

* Bagging and boosting

* Random forest

* XGBoost

\vspace{5cm}

## Program Day 4

* Neural networks

* Convolutional neural networks

* TensorFlow and Keras

* Hand-written digit recognition with Keras

\vspace{5cm}