Machine learning course held as part of the Studierendentage (student days) in the summer semester (SS) 2023

% Introduction to Data Analysis and Machine Learning in Physics
% Martino Borsato, Jörg Marks, Klaus Reygers
% 11-14 April 2023 \newline 9:00 - 12:00 and 14:00 - 17:00
## Outline

* **Day 1**
    - Introduction, software and data fitting
* **Day 2**
    - Machine learning - basics
* **Day 3**
    - Machine learning - decision trees
* **Day 4**
    - Machine learning - convolutional networks
* **Organization** and **Objective**
    - \textcolor{red}{2 ECTS: Compulsory attendance is required} \newline
      \textcolor{red}{Active participation in the exercises}
    - \textcolor{blue}{Course in CIP pool in a tutorial style}
    - \textcolor{blue}{Obtain basic knowledge for problem-oriented self-studies}
## Course Information (1)

* Course requirements
    - Python knowledge needed / good C++ knowledge might work
    - User ID for the CIP pool of the Faculty of Physics
* Course structure
    - \textcolor{red}{Course in CIP pool} using the \textcolor{red}{jupyter3 hub}
    - Lectures are interleaved with tutorial/exercise sessions in small groups
      (up to 5 persons / group)
* Course homepage which includes and distributes all material

    \small
    [https://www.physi.uni-heidelberg.de/~reygers/lectures/2023/ml/](https://www.physi.uni-heidelberg.de/~reygers/lectures/2023/ml/) \normalsize

    - /transparencies \textcolor{blue}{Transparencies of the lectures}
    - /examples \textcolor{blue}{iPython files shown in the lectures}
    - /exercises \textcolor{blue}{Exercises to be solved during the course}
    - /solutions \textcolor{blue}{Solutions of the exercises}
## Course Information (2)

`TensorFlow` and `Keras` are now also installed in the CIP jupyter hub. In addition, with a Google account you can run Jupyter notebooks on Google Colab:

\vspace{3ex}

[https://colab.research.google.com/](https://colab.research.google.com/)

\vfill

Missing Python libraries can be installed by adding the following to a notebook cell (here for the pypng library):

```
!pip install pypng
```
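Before installing, it can help to check whether a module is already available. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of the course material. Note that the import name can differ from the pip name (pypng is imported as `png`).

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found (without importing it)."""
    return importlib.util.find_spec(module_name) is not None


# A standard-library module is always found; a made-up name is not.
print(is_installed("json"))
print(is_installed("no_such_module_xyz"))

# In a notebook one could install only when needed, e.g.
# if not is_installed("png"), run `!pip install pypng` in a cell.
```

The check only works for top-level module names; it says nothing about which version is installed.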
## Course Information (3)

* Your installation at home:
    * \textcolor{blue}{Web browser to access jupyter3}
    * \textcolor{blue}{Access to the CIP pool via an ssh client on your home PC}
    * No special operating system is required
* Software:
    * Firefox or similar
    * Cisco AnyConnect
    * ssh client (MobaXterm on Windows, integrated in Linux/Mac)
* Local execution of python / iPython
    * Install ``anaconda3`` and download / run the iPython notebooks (python scripts are also available)
* \textcolor{red}{Hints for software installations and CIP pool access}

    \small
    [https://www.physi.uni-heidelberg.de/~marks/root_einfuehrung/Folien/CIPpoolAccess.PDF](https://www.physi.uni-heidelberg.de/~marks/root_einfuehrung/Folien/CIPpoolAccess.PDF) \normalsize
## Course Information (4)

Alternatively, you can install the libraries needed on your local computer.

\vfill

Here are the relevant instructions for macOS using `pip`:

\vfill

Assumption: `homebrew` is installed.

\vfill

Install python3 (see https://docs.python-guide.org/starting/install3/osx/)

\footnotesize

```
$ brew install python
$ python3 --version
Python 3.8.5
```

\normalsize

Make sure pip3 is up-to-date (alternative: conda $\rightarrow$ don't mix conda and pip installations)

\footnotesize

```
$ pip3 install --upgrade pip
```

\normalsize

Install modules needed:

\footnotesize

```
$ pip3 install --upgrade jupyter matplotlib numpy pandas \
    scipy scikit-learn xgboost iminuit tensorflow tensorflow_datasets Keras
```

\normalsize
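After the installation, one way to confirm that the packages were picked up is to query their installed versions. This is a sketch under the assumption that Python 3.8 or newer is used (for `importlib.metadata`); the names `PACKAGES` and `installed_versions` are hypothetical, and the list simply mirrors the pip command above.

```python
from importlib import metadata

# Distribution names as passed to pip in the install command above.
PACKAGES = [
    "jupyter", "matplotlib", "numpy", "pandas", "scipy", "scikit-learn",
    "xgboost", "iminuit", "tensorflow", "tensorflow_datasets", "keras",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name:22s} {version or 'NOT INSTALLED'}")
```

Run it with the same interpreter that `pip3` installed into; a package reported as missing may simply live in a different environment (for example a conda one).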
## Topics and file name conventions

0. Introduction (this file) \hspace{0.1cm} \footnotesize (\textcolor{gray}{introduction.pdf}) \normalsize
1. Introduction to python \hspace{0.1cm} \footnotesize (\textcolor{gray}{01\_intro\_python\_*}) \normalsize
2. Data modeling and fitting \hspace{0.1cm} \footnotesize (\textcolor{gray}{02\_fit\_intro\_*}) \normalsize
3. Machine learning basics \hspace{0.1cm} \footnotesize (\textcolor{gray}{03\_ml\_basics\_*}) \normalsize
4. Decision trees \hspace{0.1cm} \footnotesize (\textcolor{gray}{04\_decision\_trees\_*}) \normalsize
5. Neural networks \hspace{0.1cm} \footnotesize (\textcolor{gray}{05\_neural\_networks\_*}) \normalsize

\vspace{3.5cm}
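The prefix convention above makes it easy to sort downloaded material by topic. A minimal sketch, assuming the patterns are used as shell-style wildcards; the function `topic_of` and the example file names are hypothetical and only illustrate the convention.

```python
from fnmatch import fnmatch

# Prefix patterns copied from the list above (topic number -> pattern).
PATTERNS = {
    1: "01_intro_python_*",
    2: "02_fit_intro_*",
    3: "03_ml_basics_*",
    4: "04_decision_trees_*",
    5: "05_neural_networks_*",
}


def topic_of(filename):
    """Return the topic number whose pattern matches filename, else None."""
    for topic, pattern in PATTERNS.items():
        if fnmatch(filename, pattern):
            return topic
    return None


# Hypothetical file names, for illustration only.
print(topic_of("03_ml_basics_example.ipynb"))  # 3
print(topic_of("notes.txt"))                   # None
```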
## Program Day 1

* Technicalities
* Summary of NumPy
* Plotting with matplotlib
* Input / output of data
* Summary of pandas
* Fitting with iminuit and PyROOT
* Transparencies with activated links, examples and exercises
    * Software: [\textcolor{violet}{01\_intro\_python.pdf}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2023/ml/transparencies/01_intro_python.pdf)
    * Fitting: [\textcolor{violet}{02\_fit\_intro.pdf}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2023/ml/transparencies/02_fit_intro.pdf)

\vspace{2cm}
## Program Day 2

* Supervised learning
* Classification and regression
* Linear regression
* Logistic regression
* Softmax regression (multi-class classification)

\vspace{4cm}
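As a preview of the last item: softmax regression turns one score per class into probabilities, $p_k = e^{z_k} / \sum_j e^{z_j}$. The sketch below shows only this final step in plain Python; it is not taken from the course material, the scores are hypothetical, and the model that would produce them is omitted.

```python
import math


def softmax(scores):
    """Convert a list of real-valued class scores into probabilities."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])  # hypothetical scores for three classes
predicted_class = probs.index(max(probs))  # class with the highest probability
print(probs, predicted_class)
```

The probabilities sum to one, and the predicted class is the one with the largest score.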
## Program Day 3

* Decision trees
* Bagging and boosting
* Random forest
* XGBoost

\vspace{5cm}
## Program Day 4

* Neural networks
* Convolutional neural networks
* TensorFlow and Keras
* Hand-written digit recognition with Keras

\vspace{5cm}