diff --git a/slides/CIPpoolAccess.PDF b/slides/CIPpoolAccess.PDF new file mode 100644 index 0000000..478d6b0 Binary files /dev/null and b/slides/CIPpoolAccess.PDF differ diff --git a/slides/introduction.md b/slides/introduction.md new file mode 100644 index 0000000..f1403fe --- /dev/null +++ b/slides/introduction.md @@ -0,0 +1,202 @@ +% Introduction to Data Analysis and Machine Learning in Physics +% Martino Borsato, Jörg Marks, Klaus Reygers +% 11-14 April 2023 \newline 9:00 - 12:00 and 14:00 - 17:00 + + +## Outline + + * **Day 1** + - Introduction, software and data fitting + + * **Day 2** + - Machine learning - basics + + * **Day 3** + - Machine learning - decision trees + + * **Day 4** + - Machine learning - convolutional networks + + * **Organization** and **Objective** + - \textcolor{red} {2 ECTS: Attendance is compulsory} \newline + \textcolor{red} {Active participation in the exercises} + - \textcolor{blue}{Course in CIP pool in a tutorial style} + - \textcolor{blue}{Obtain basic knowledge for problem-oriented self-studies} + +## Course Information (1) + + * Course requirements + + - Python knowledge is needed (good C++ knowledge might suffice) + - User ID for the CIP pool of the Faculty of Physics + + * Course structure + - \textcolor{red}{Course in CIP pool} using the \textcolor{red}{jupyter3 hub} + - Lectures are interleaved with tutorial/exercise sessions in small groups + (up to 5 persons / group) + + * Course homepage, which hosts all material + \small + [https://www.physi.uni-heidelberg.de/~reygers/lectures/2023/ml/](https://www.physi.uni-heidelberg.de/~reygers/lectures/2023/ml/) \normalsize + + /transparencies      \textcolor{blue}{Transparencies of the lectures} + + /examples             \textcolor{blue}{iPython files shown in the lectures} + + /exercises             \textcolor{blue}{Exercises to be solved during the course} + + /solutions             \textcolor{blue}{Solutions of the exercises} + + +## Course Information (2) + 
+`TensorFlow` and `Keras` are now also installed in the CIP jupyter hub. In addition, with a Google account you can run Jupyter notebooks on Google Colab: + +\vspace{3ex} +[https://colab.research.google.com/](https://colab.research.google.com/) + +\vfill + +Missing Python libraries can be installed by adding the following to a cell (here for the pypng library): + +``` +!pip install pypng +``` + + +## Course Information (3) + + * Your installation at home: + * \textcolor{blue}{Web browser to access jupyter3} + * \textcolor{blue}{Access to the CIP pool via an ssh client on your home PC} + + * No special operating system is required + + * Software: + * Firefox or similar + * Cisco AnyConnect + * ssh client (MobaXterm on Windows, integrated in Linux/Mac) + + * Local execution of Python / iPython + * Install ``anaconda3`` and download / run the iPython notebooks (Python scripts are also available) + + * \textcolor{red}{Hints for software installations and CIP pool access} + \small + + [https://www.physi.uni-heidelberg.de/~marks/root_einfuehrung/Folien/CIPpoolAccess.PDF](https://www.physi.uni-heidelberg.de/~marks/root_einfuehrung/Folien/CIPpoolAccess.PDF) \normalsize + +## Course Information (4) +Alternatively, you can install the libraries needed on your local computer. + +\vfill + +Here are the relevant instructions for macOS using `pip`: + +\vfill + +Assumption: `homebrew` is installed. 
+ +\vfill + +Install python3 (see https://docs.python-guide.org/starting/install3/osx/) +\footnotesize +``` +$ brew install python +$ python3 --version +Python 3.8.5 +``` +\normalsize + + +Make sure pip3 is up to date (alternative: conda $\rightarrow$ don't mix conda and pip installations) +\footnotesize +``` +$ pip3 install --upgrade pip +``` +\normalsize + + +Install the modules needed: +\footnotesize +``` +$ pip3 install --upgrade jupyter matplotlib numpy pandas \ +scipy scikit-learn xgboost iminuit tensorflow tensorflow_datasets Keras +``` +\normalsize + + + + +## Topics and file name conventions + +0. Introduction (this file) \hspace{0.1cm} \footnotesize (\textcolor{gray}{introduction.pdf}) \normalsize +1. Introduction to Python \hspace{0.1cm} \footnotesize (\textcolor{gray}{01\_intro\_python\_*}) \normalsize +2. Data modeling and fitting \hspace{0.1cm} \footnotesize (\textcolor{gray}{02\_fit\_intro\_*}) \normalsize +3. Machine learning basics \hspace{0.1cm} \footnotesize (\textcolor{gray}{03\_ml\_basics\_*}) \normalsize +4. Decision trees \hspace{0.1cm} \footnotesize (\textcolor{gray}{04\_decision\_trees\_*}) \normalsize +5. 
Neural networks \hspace{0.1cm} \footnotesize (\textcolor{gray}{05\_neural\_networks\_*}) \normalsize +\vspace{3.5cm} + +## Program Day 1 + + + +* Technicalities + +* Summary of NumPy + +* Plotting with matplotlib + +* Input / output of data + +* Summary of pandas + +* Fitting with iminuit and PyROOT + +* Transparencies with activated links, examples and exercises + + * Software: [\textcolor{violet}{01\_intro\_python.pdf}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2023/ml/transparencies/01_intro_python.pdf) + + * Fitting: + [\textcolor{violet}{02\_fit\_intro.pdf}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2023/ml/transparencies/02_fit_intro.pdf) + + \vspace{2cm} + + +## Program Day 2 + +* Supervised learning + +* Classification and regression + +* Linear regression + +* Logistic regression + +* Softmax regression (multi-class classification) + +\vspace{4cm} + +## Program Day 3 + +* Decision trees + +* Bagging and boosting + +* Random forest + +* XGBoost + +\vspace{5cm} + +## Program Day 4 + +* Neural networks + +* Convolutional neural networks + +* TensorFlow and Keras + +* Handwritten digit recognition with Keras + +\vspace{5cm}