% Introduction to Data Analysis and Machine Learning in Physics
% Martino Borsato, Jörg Marks, Klaus Reygers
% 11-14 April 2023 \newline 9:00 - 12:00 and 14:00 - 17:00
Outline
- Day 1
    - Introduction, software and data fitting
- Day 2
    - Machine learning - basics
- Day 3
    - Machine learning - decision trees
- Day 4
    - Machine learning - convolutional networks
Organization and Objective
- \textcolor{red}{2 ECTS: attendance is compulsory} \newline \textcolor{red}{Active participation in the exercises}
- \textcolor{blue}{Course in CIP pool in a tutorial style}
- \textcolor{blue}{Obtain basic knowledge for problem-oriented self-studies}
Course Information (1)
- Course requirements
    - Python knowledge needed / good C++ knowledge might work
    - User ID for the CIP pool of the Faculty of Physics
- Course structure
    - \textcolor{red}{Course in CIP pool} using the \textcolor{red}{jupyter3 hub}
    - Lectures are interleaved with tutorial/exercise sessions in small groups (up to 5 persons / group)
- Course homepage, which hosts and distributes all material: \small https://www.physi.uni-heidelberg.de/~reygers/lectures/2023/ml/ \normalsize
    - /transparencies \textcolor{blue}{Transparencies of the lectures}
    - /examples \textcolor{blue}{iPython files shown in the lectures}
    - /exercises \textcolor{blue}{Exercises to be solved during the course}
    - /solutions \textcolor{blue}{Solutions of the exercises}
Course Information (2)
TensorFlow and Keras are now also installed in the CIP jupyter hub. In addition, with a Google account you can run Jupyter notebooks on Google Colab:
\vspace{3ex} https://colab.research.google.com/
\vfill
Missing Python libraries can be installed by adding the following line to a notebook cell (here for the pypng library):
!pip install pypng
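Before installing, it can help to check whether a module is already importable. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical and not part of the course material. Note that the import name can differ from the pip package name (pypng is imported as `png`).

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# 'json' ships with Python; 'png' is the import name provided by the pypng package.
to_check = ["json", "png"]
print("Missing modules:", missing_modules(to_check))
```

Any name reported as missing can then be installed in a notebook cell with `!pip install <package>`.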
Course Information (3)
- Your installation at home:
    - \textcolor{blue}{Web browser to access jupyter3}
    - \textcolor{blue}{Access to the CIP pool via an ssh client on your home PC}
- No requirements for a special operating system
- Software:
    - Firefox or similar
    - Cisco AnyConnect
    - ssh client (MobaXterm on Windows, integrated in Linux/Mac)
- Local execution of python / iPython
    - Install anaconda3 and download / run the iPython notebooks (Python scripts are also available)
    - Install
- \textcolor{red}{Hints for software installations and CIP pool access} \small https://www.physi.uni-heidelberg.de/~marks/root_einfuehrung/Folien/CIPpoolAccess.PDF \normalsize
Course Information (4)
Alternatively, you can install the libraries needed on your local computer.
\vfill
Here are the relevant instructions for macOS using pip:
\vfill
Assumption: homebrew is installed.
\vfill
Install python3 (see https://docs.python-guide.org/starting/install3/osx/) \footnotesize
$ brew install python
$ python3 --version
Python 3.8.5
\normalsize
Make sure pip3 is up to date (alternative: conda $\rightarrow$ don't mix conda and pip installations)
\footnotesize
$ pip3 install --upgrade pip
\normalsize
Install modules needed: \footnotesize
$ pip3 install --upgrade jupyter matplotlib numpy pandas \
    scipy scikit-learn xgboost iminuit tensorflow tensorflow_datasets Keras
\normalsize
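To confirm that the installation succeeded, the installed versions can be queried from Python. This is a sketch under assumptions: the helper `installed_versions` is hypothetical, the package list simply mirrors the pip command above (using the PyPI spelling `tensorflow-datasets`), and it requires Python 3.8 or newer for `importlib.metadata`.

```python
from importlib import metadata

# Distribution names as used on PyPI (assumed to match the pip command above).
PACKAGES = [
    "jupyter", "matplotlib", "numpy", "pandas", "scipy", "scikit-learn",
    "xgboost", "iminuit", "tensorflow", "tensorflow-datasets", "keras",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name:22s} {version or 'NOT INSTALLED'}")
```

Packages reported as not installed can be added by re-running the pip command for those names only.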
Topics and file name conventions
- Introduction (this file) \hspace{0.1cm} \footnotesize (\textcolor{gray}{introduction.pdf}) \normalsize
- Introduction to python \hspace{0.1cm} \footnotesize (\textcolor{gray}{01_intro_python_*}) \normalsize
- Data modeling and fitting \hspace{0.1cm} \footnotesize (\textcolor{gray}{02_fit_intro_*}) \normalsize
- Machine learning basics \hspace{0.1cm} \footnotesize (\textcolor{gray}{03_ml_basics_*}) \normalsize
- Decision trees \hspace{0.1cm} \footnotesize (\textcolor{gray}{04_decision_trees_*}) \normalsize
- Neural networks \hspace{0.1cm} \footnotesize (\textcolor{gray}{05_neural_networks_*}) \normalsize \vspace{3.5cm}
Program Day 1
- Technicalities
- Summary of NumPy
- Plotting with matplotlib
- Input / output of data
- Summary of pandas
- Fitting with iminuit and PyROOT
- Transparencies with activated links, examples and exercises
- Software: \textcolor{violet}{01_intro_python.pdf}
- Fitting: \textcolor{violet}{02_fit_intro.pdf}

\vspace{2cm}
Program Day 2
- Supervised learning
- Classification and regression
- Linear regression
- Logistic regression
- Softmax regression (multi-class classification)
\vspace{4cm}
Program Day 3
- Decision trees
- Bagging and boosting
- Random forest
- XGBoost
\vspace{5cm}
Program Day 4
- Neural networks
- Convolutional neural networks
- TensorFlow and Keras
- Hand-written digit recognition with Keras
\vspace{5cm}