% Introduction to Data Analysis and Machine Learning in Physics
% Jörg Marks, Klaus Reygers
% 11-14 April 2023 \newline 9:00 - 12:00 and 14:00 - 17:00

Outline

  • Day 1

    • Introduction, software and data fitting
  • Day 2

    • Machine learning - basics
  • Day 3

    • Machine learning - decision trees
  • Day 4

    • Machine learning - convolutional networks and graph neural networks
  • Organization and Objective

    • \textcolor{red}{2 ECTS: attendance is compulsory} \newline \textcolor{red}{Active participation in the exercises}
    • \textcolor{blue}{Course in CIP pool in a tutorial style}
    • \textcolor{blue}{Obtain basic knowledge for problem-oriented self-studies}

Course Information (1)

  • Course requirements

    • Python knowledge is required; good C++ knowledge may suffice
    • A user ID for the CIP pool of the Faculty of Physics
  • Course structure

    • \textcolor{red}{Course in CIP pool} using the \textcolor{red}{jupyter3 hub}
    • Lectures are interleaved with tutorial/exercise sessions in small groups (up to 5 persons / group)
  • Course homepage which includes and distributes all material \small https://www.physi.uni-heidelberg.de/~reygers/lectures/2023/ml/ \normalsize

    /transparencies      \textcolor{blue}{Transparencies of the lectures}

    /examples             \textcolor{blue}{iPython files shown in the lectures}

    /exercises             \textcolor{blue}{Exercises to be solved during the course}

    /solutions             \textcolor{blue}{Solutions of the exercises}

Course Information (2)

TensorFlow and Keras are now also installed on the CIP Jupyter hub. In addition, with a Google account you can run Jupyter notebooks on Google Colab:

\vspace{3ex} https://colab.research.google.com/

\vfill

Missing Python libraries can be installed by running the following in a notebook cell (here for the pypng library):

!pip install pypng
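
Before installing, it can help to check from a notebook cell whether a library is already available. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical and not part of the course material. Note that the import name can differ from the pip package name (pypng is imported as `png`).

```python
import importlib.util


def missing_modules(names):
    """Return the top-level import names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# `json` ships with Python; `png` is the import name provided by pypng
print(missing_modules(["json", "png"]))
```

Any name printed in the list is a candidate for a `!pip install` cell like the one above.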

Course Information (3)

  • Your installation at home:

    • \textcolor{blue}{Web Browser to access jupyter3}
    • \textcolor{blue}{Access to the CIP pool via an ssh client on your home PC}
  • No requirements for a special operating system

  • Software:

    • firefox or similar
    • Cisco AnyConnect
    • ssh client (MobaXterm on Windows, integrated in Linux/Mac)
  • Local execution of python / iPython

    • Install anaconda3, then download and run the IPython notebooks (Python scripts are also available)
  • \textcolor{red}{Hints for software installations and CIP pool access} \small

    https://www.physi.uni-heidelberg.de/~marks/root_einfuehrung/Folien/CIPpoolAccess.PDF \normalsize

Course Information (4)

Alternatively, you can install the libraries needed on your local computer.

\vfill

Here are the relevant instructions for macOS using pip:

\vfill

Assumptions: homebrew is installed.

\vfill

Install python3 (see https://docs.python-guide.org/starting/install3/osx/) \footnotesize

$ brew install python
$ python3 --version
Python 3.8.5

\normalsize

Make sure pip3 is up-to-date (alternative: conda \rightarrow don't mix conda and pip installations) \footnotesize

$ pip3 install --upgrade pip

\normalsize

Install modules needed: \footnotesize

$ pip3 install --upgrade jupyter matplotlib numpy pandas \
    scipy scikit-learn xgboost iminuit tensorflow tensorflow_datasets Keras

\normalsize
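
After the installation, the result can be checked from Python. This is a minimal sketch, assuming Python 3.8 or newer (for `importlib.metadata`); the helper name `installed_versions` is hypothetical, and the package list simply mirrors the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


# Distribution names as passed to pip in the install command
course_packages = [
    "jupyter", "matplotlib", "numpy", "pandas", "scipy", "scikit-learn",
    "xgboost", "iminuit", "tensorflow", "tensorflow-datasets", "keras",
]

for name, ver in installed_versions(course_packages).items():
    print(f"{name:20s} {ver or 'NOT INSTALLED'}")
```

Packages reported as not installed can be added with `pip3 install` as shown above.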

Topics and file name conventions

  1. Introduction (this file) \hspace{0.1cm} \footnotesize (\textcolor{gray}{introduction.pdf}) \normalsize
  2. Introduction to python \hspace{0.1cm} \footnotesize (\textcolor{gray}{01_intro_python_*}) \normalsize
  3. Data modeling and fitting \hspace{0.1cm} \footnotesize (\textcolor{gray}{02_fit_intro_*}) \normalsize
  4. Machine learning basics \hspace{0.1cm} \footnotesize (\textcolor{gray}{03_ml_basics_*}) \normalsize
  5. Decision trees \hspace{0.1cm} \footnotesize (\textcolor{gray}{04_decision_trees_*}) \normalsize
  6. Neural networks \hspace{0.1cm} \footnotesize (\textcolor{gray}{05_neural_networks_*}) \normalsize \vspace{3.5cm}

Program Day 1

\vspace{2cm}

Program Day 2

  • Introduction to machine learning

    • TensorFlow / Keras, datasets
    • Supervised learning
    • Classification

\vspace{0.5cm}

  • Multivariate analysis

    • Regression
    • Linear regression
    • Logistic regression
    • Softmax regression (multi-class classification)

\vspace{4cm}

Program Day 3

  • Decision trees

  • Bagging and boosting

  • Random forest

  • XGBoost

\vspace{5cm}

Program Day 4

  • Neural networks

  • Convolutional neural networks

  • Hand-written digit recognition with Keras

\vspace{5cm}