initial set of slides corresponding to 2022 version

2023-03-10 14:19:06 +01:00 · 2023-03-10 14:19:06 +01:00 · 46fae93569
commit 46fae93569
parent 702e034438
104 changed files with 3988 additions and 0 deletions
--- a/slides/.gitignore
+++ b/slides/.gitignore
@ -0,0 +1 @@
 .DS_Store
--- a/slides/Makefile
+++ b/slides/Makefile
@ -0,0 +1,10 @@
 # make     creates pdf files of all newly edited .md files
 SRCS := $(wildcard *.md)
 PDF := $(SRCS:%.md=%.pdf)
 OPT := --pdf-engine=xelatex --variable mainfont="Helvetica" --variable sansfont="Helvetica" -t beamer -s -fmarkdown-implicit_figures --template=template.beamer --highlight-style=kate 
 all: ${PDF}
 %.pdf: %.md
 	pandoc $(OPT) --output=$@ $<
--- a/slides/README.md
+++ b/slides/README.md
@ -0,0 +1,2 @@
 Pandoc slides example following the style of [Stefan Wunsch's CERN IML workshop presentation](https://github.com/stwunsch/iml_keras_workshop) on [Keras](https://keras.io/) (see slides folder)
--- a/slides/copy_slides.sh
+++ b/slides/copy_slides.sh
@ -0,0 +1,6 @@
 # slides (do chgrp machlearn <file> later)
 # scp CIPpoolAccess.PDF reygers@rho0:public_html/lectures/2021/ml/transparencies/
 # scp 03_ml_basics.pdf reygers@rho0:public_html/lectures/2021/ml/transparencies/
 # scp 04_decision_trees.pdf reygers@rho0:public_html/lectures/2021/ml/transparencies/
 scp 05_neural_networks.pdf reygers@rho0:public_html/lectures/2021/ml/transparencies/
--- a/slides/decision_trees.md
+++ b/slides/decision_trees.md
@ -0,0 +1,347 @@
 ---
 title: |
  | Introduction to Data Analysis and Machine Learning in Physics:  
  | 4. Decision Trees  
 author: "Martino Borsato, Jörg Marks, Klaus Reygers"
 date: "Studierendentage, 11-14 April 2022"
 ---
 ## Exercises
 * Exercise 1: Compare different decision tree classifiers
 	* [`04_decision_trees_ex_1_compare_tree_classifiers.ipynb`](https://nbviewer.jupyter.org/urls/www.physi.uni-heidelberg.de/~reygers/lectures/2022/ml/exercises/04_decision_trees_ex_1_compare_tree_classifiers.ipynb)
 * Exercise 2: Apply XGBoost classifier to MAGIC data set
 	* [`04_decision_trees_ex_2_magic_xgboost_and_random_forest.ipynb`](https://nbviewer.jupyter.org/urls/www.physi.uni-heidelberg.de/~reygers/lectures/2022/ml/exercises/04_decision_trees_ex_2_magic_xgboost_and_random_forest.ipynb)
 * Exercise 3: Feature importance
 * Exercise 4: Interpret a classifier with SHAP values
 ## Decision trees
 \begin{figure}
 \centering
 \includegraphics[width=0.85\textwidth]{figures/mini_boone_decisions_tree.png}
 \end{figure}
 \begin{center}
 Leaf nodes classify events as either signal or background
 \end{center}
 ## Decision trees: Rectangular volumes in feature space
 \begin{figure}
 \centering
 \includegraphics[width=0.75\textwidth]{figures/decision_trees_feature_space.png}
 \end{figure}
 * Easy to interpret and visualize: Space of feature vectors split up into rectangular volumes (attributed to either signal or background)
 * How to build a decision tree in an optimal way?
 ## Finding optimal cuts
 Separation between signal and background is often measured with the Gini index (or Gini impurity):
 $$ G = p (1-p) $$
 Here $p$ is the purity:
 $$ p = \frac{\sum_\mathrm{signal} w_i}{\sum_\mathrm{signal} w_i + \sum_\mathrm{background} w_i}, \quad w_i = \text{weight of event}\; i$$
 \vfill
 \textcolor{gray}{Usefulness of weights will become apparent soon.}
 \vfill
 Improvement in signal/background separation after splitting a set A into two sets B and C:
 $$ \Delta = W_A G_A - W_B G_B - W_C G_C \quad \text{where} \quad W_X = \sum_{X} w_i $$
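The cut search can be sketched in a few lines of plain Python. This is a minimal illustration of the formulas above, not the implementation of any particular library; the function names (`gini`, `split_gain`) and the toy data are assumptions made for this example.

\footnotesize

```python
def gini(weights, is_signal):
    """Return (W, G): summed weight and Gini index G = p (1 - p) of a node."""
    total = sum(weights)
    signal = sum(w for w, s in zip(weights, is_signal) if s)
    purity = signal / total
    return total, purity * (1.0 - purity)


def split_gain(values, weights, is_signal, cut):
    """Delta = W_A G_A - W_B G_B - W_C G_C for the cut 'value < cut'."""
    w_a, g_a = gini(weights, is_signal)
    events = list(zip(values, weights, is_signal))
    node_b = [(w, s) for v, w, s in events if v < cut]
    node_c = [(w, s) for v, w, s in events if v >= cut]
    gain = w_a * g_a
    for node in (node_b, node_c):
        if node:  # an empty node carries no weight
            w_x, g_x = gini([w for w, _ in node], [s for _, s in node])
            gain -= w_x * g_x
    return gain


# toy data (hypothetical): one feature, unit event weights
values = [0.1, 0.4, 0.5, 0.8, 0.9, 1.2]
is_signal = [False, False, False, True, True, True]
weights = [1.0] * len(values)

# pick the candidate cut with the largest improvement Delta
candidates = [0.3, 0.65, 1.0]
best_cut = max(candidates, key=lambda c: split_gain(values, weights, is_signal, c))
print(best_cut, split_gain(values, weights, is_signal, best_cut))
```

\normalsize

For this toy sample the cut at 0.65 separates signal and background completely, so both daughter nodes have $G = 0$ and $\Delta$ equals $W_A G_A$ of the parent node.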
 ## Gini impurity and other purity measures
 \begin{figure}
 \centering
 \includegraphics[width=0.7\textwidth]{figures/signal_purity.png}
 \end{figure}
 ## Decision tree pruning
 ::: columns
 :::: {.column width=50%}
 When to stop growing a tree?
 * When all nodes are essentially pure?
 * Well, that's overfitting!
 \vspace{3ex}
 Pruning
 * Cut back fully grown tree to avoid overtraining, i.e., replace nodes and subtrees by leaves
 ::::
 :::: {.column width=50%}
 \begin{figure}
 \centering
 \includegraphics[width=0.85\textwidth]{figures/tree_pruning_slides.png}
 \end{figure}
 ::::
 :::
 ## Single decision trees: Pros and cons
 \textcolor{green}{Pros:}
 * Requires little data preparation (unlike neural networks)
 * Can use continuous and categorical inputs
 \vfill
 \textcolor{red}{Cons:}
 * Danger of overfitting training data
 * Sensitive to fluctuations in the training data
 * Hard to find global optimum
 * When to stop splitting?
 ## Ensemble methods: Combine weak learners
 ::: columns
 :::: {.column width=70%}
 * Bootstrap Aggregating (Bagging)
 	* Sample training data (with replacement) and train a separate model on each of the derived training sets
 	* Classify example with majority vote, or compute average output from each tree as model output
 ::::
 :::: {.column width=30%}
 $$ y(\vec x) = \frac{1}{N_\mathrm{trees}} \sum_{i=1}^{N_\mathrm{trees}} y_i(\vec x) $$ 
 ::::
 :::
 \vfill
 ::: columns
 :::: {.column width=70%}
 * Boosting
 	* Train $N$ models in sequence, giving more weight to examples not correctly classified by previous model
 	* Take weighted average to classify examples
 ::::
 :::: {.column width=30%}
 $$ y(\vec x) = \frac{\sum_{i=1}^{N_\mathrm{trees}} \alpha_i y_i(\vec x)}{\sum_{i=1}^{N_\mathrm{trees}} \alpha_i} $$ 
 ::::
 :::
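The two combination rules can be written out directly. The following is a minimal sketch in plain Python; the helper names (`bootstrap_sample`, `bagging_output`, `boosting_output`) and the toy tree outputs are assumptions for illustration and do not come from a library.

\footnotesize

```python
import random


def bootstrap_sample(data, rng):
    """Draw len(data) examples with replacement (one bagging training set)."""
    return [rng.choice(data) for _ in data]


def bagging_output(tree_outputs):
    """Plain average of the individual tree outputs y_i(x)."""
    return sum(tree_outputs) / len(tree_outputs)


def boosting_output(tree_outputs, alphas):
    """Average of the tree outputs y_i(x), weighted with the scores alpha_i."""
    weighted = sum(a * y for a, y in zip(alphas, tree_outputs))
    return weighted / sum(alphas)


rng = random.Random(1)
events = list(range(10))                 # stand-in for training examples
sample = bootstrap_sample(events, rng)   # some events repeat, some are left out

outputs = [+1, -1, +1]   # hypothetical outputs of three trees for one event
print(bagging_output(outputs))                     # (1 - 1 + 1) / 3
print(boosting_output(outputs, [0.5, 2.0, 0.5]))   # (0.5 - 2.0 + 0.5) / 3.0
```

\normalsize

With equal votes the majority wins, whereas a large score $\alpha_i$ for the second tree flips the sign of the boosted output.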
 ## Random forests
 * "One of the most widely used and versatile algorithms in data science and machine learning" 
 \tiny \textcolor{gray}{arXiv:1803.08823v3} \normalsize
 \vfill
 * Use bagging to select random example subset
 \vfill
 * Train a tree, but only use random subset of features at each split
 	* this reduces the correlation between different trees
 	* makes the decision more robust to missing data
 ## Boosted decision trees: Idea
 \begin{figure}
 \centering
 \includegraphics[width=0.75\textwidth]{figures/bdt.png}
 \end{figure}
 ## AdaBoost (short for Adaptive Boosting)
 Initial training sample
 \begin{center}
 \begin{tabular}{l l}
 $\vec x_1, ..., \vec x_n$: & multivariate event data \\
 $y_1, ..., y_n$: & true class labels, $+1$ or $-1$ \\
 $w_1^{(1)}, ..., w_n^{(1)}$ & event weights
 \end{tabular}
 \end{center}
 with equal weights normalized as
 $$ \sum_{i=1}^n w_i^{(1)} = 1 $$
 Train first classifier $f_1$:
 \begin{center}
 \begin{tabular}{l l}
 $f_1(\vec x_i) > 0$ & classify as signal \\
 $f_1(\vec x_i) < 0$ & classify as background
 \end{tabular}
 \end{center}
 ## AdaBoost: Updating event weights
 Define training sample $k+1$ from training sample $k$ by updating weights:
 $$ w_i^{(k+1)} = w_i^{(k)} \frac{e^{- \alpha_k f_k(\vec x_i) y_i/2}}{Z_k} $$
 \footnotesize
 \textcolor{gray}{$$ i = \text{event index}, \quad Z_k:\; \text{normalization factor so that } \sum_{i=1}^n w_i^{(k+1)} = 1$$}
 \normalsize
 Weight is increased if event was misclassified by the previous classifier
 $\to$ "Next classifier should pay more attention to misclassified events"
 \vfill
 At each step the classifier $f_k$ minimizes the error rate:
 $$ \varepsilon_k = \sum_{i=1}^n w_i^{(k)} I(y_i f_k( \vec x_i) \le 0), 
 \quad I(X) = 1 \; \text{if} \; X \; \text{is true, 0 otherwise}  $$
 ## AdaBoost: Assigning the classifier score
 Assign score to each classifier according to its error rate:
 $$ \alpha_k = \ln \frac{1 - \varepsilon_k}{\varepsilon_k} $$
 \vfill
 Combined classifier (weighted average):
 $$ f(\vec x) = \sum_{k=1}^K \alpha_k f_k(\vec x) $$
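One boosting iteration (error rate, classifier score, weight update) can be sketched as follows. This is a plain-Python illustration of the formulas on the AdaBoost slides, with a hypothetical function name `adaboost_round` and toy labels; the classifier outputs $f_k(\vec x_i)$ are assumed to be $+1$ or $-1$.

\footnotesize

```python
import math


def adaboost_round(weights, labels, predictions):
    """One AdaBoost step; labels y_i and predictions f_k(x_i) are +1 or -1."""
    # weighted error rate: summed weight of the misclassified events
    eps = sum(w for w, y, f in zip(weights, labels, predictions) if y * f <= 0)
    # classifier score
    alpha = math.log((1.0 - eps) / eps)
    # w_i -> w_i * exp(-alpha * f_k(x_i) * y_i / 2), then normalize with Z_k
    updated = [w * math.exp(-alpha * f * y / 2.0)
               for w, y, f in zip(weights, labels, predictions)]
    z_k = sum(updated)
    return eps, alpha, [w / z_k for w in updated]


labels      = [+1, +1, -1, -1, +1]
predictions = [+1, -1, -1, -1, +1]               # the second event is misclassified
weights = [1.0 / len(labels)] * len(labels)      # equal weights, sum = 1

eps, alpha, weights = adaboost_round(weights, labels, predictions)
print(eps, alpha, weights)
```

\normalsize

In this toy case $\varepsilon_k = 0.2$ and $\alpha_k = \ln 4$; after the update the single misclassified event carries half of the total weight.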
 ## Gradient boosting
 Basic idea:
 * Train a first decision tree
 * Then train a second one on the residual errors made by the first tree
 * And so on
 \vfill
 In slightly more detail:
 * \color{gray} Consider labeled training data: $\{\vec x_i, y_i\}$
 * Model prediction at iteration $m$: $F_m(\vec x_i)$
 * New model: $F_{m+1}(\vec x) = F_m(\vec x) + h_m(\vec x)$
 * Find $h_m(\vec x)$ by fitting it to 
 $\{(\vec x_1, y_1 - F_m(\vec x_1)), \; (\vec x_2, y_2 - F_m(\vec x_2)), \; ... \; (\vec x_n, y_n - F_m(\vec x_n)) \}$
 \color{black}
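The residual-fitting loop can be illustrated with one-cut regression trees ("stumps"). This is a minimal sketch under simplifying assumptions: one input feature, squared-error loss, $F_0 = 0$ and no learning rate. The names `fit_stump`, `gradient_boost` and `mse` as well as the toy data are made up for this example; libraries such as XGBoost are considerably more elaborate.

\footnotesize

```python
def fit_stump(x, r):
    """Least-squares fit of a one-cut regression tree h(x) to the targets r."""
    best = None
    for cut in sorted(set(x))[1:]:          # candidate cuts: x < cut vs. x >= cut
        left = [ri for xi, ri in zip(x, r) if xi < cut]
        right = [ri for xi, ri in zip(x, r) if xi >= cut]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        sse = (sum((ri - mean_l) ** 2 for ri in left)
               + sum((ri - mean_r) ** 2 for ri in right))
        if best is None or sse < best[0]:
            best = (sse, cut, mean_l, mean_r)
    _, cut, mean_l, mean_r = best
    return lambda xi: mean_l if xi < cut else mean_r


def gradient_boost(x, y, n_trees):
    """Build F(x) as a sum of h_m(x), each fitted to the residuals y - F_m(x)."""
    trees = []
    for _ in range(n_trees):
        residuals = [yi - sum(h(xi) for h in trees) for xi, yi in zip(x, y)]
        trees.append(fit_stump(x, residuals))
    return lambda xi: sum(h(xi) for h in trees)


def mse(model, x, y):
    """Mean squared error of the model on the sample."""
    return sum((yi - model(xi)) ** 2 for xi, yi in zip(x, y)) / len(x)


# toy regression data (hypothetical), roughly y = x
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.1, 0.9, 2.1, 2.9, 4.2, 4.8]

for n_trees in (1, 5, 20):
    print(n_trees, mse(gradient_boost(x, y, n_trees), x, y))
```

\normalsize

The training error shrinks as trees are added; whether this also helps on independent test data is a separate question (overfitting).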
 ## Example 1: Predict critical temperature for superconductivity (Regression with XGBoost) (1)
 \small
 [\textcolor{gray}{04\_decision\_trees\_critical\_temp\_regression.ipynb}](https://nbviewer.jupyter.org/urls/www.physi.uni-heidelberg.de/~reygers/lectures/2022/ml/examples/04_decision_trees_critical_temp_regression.ipynb)
 \normalsize
 \vfill
 Superconductivity data set: 
 Predict the critical temperature based on 81 material features.
 \footnotesize
 [\textcolor{gray}{https://archive.ics.uci.edu/ml/datasets/Superconductivty+Data}](https://archive.ics.uci.edu/ml/datasets/Superconductivty+Data)
 \normalsize
 \vfill
 From the abstract:
 We estimate a statistical model to predict the superconducting critical temperature based on the features extracted from the superconductor’s chemical formula. The statistical model gives reasonable out-of-sample predictions: ±9.5 K based on root-mean-squared-error. Features extracted based on thermal conductivity, atomic radius, valence, electron affinity, and atomic mass contribute the most to the model’s predictive accuracy.
 \vfill
 \tiny 
 [\textcolor{gray}{https://doi.org/10.1016/j.commatsci.2018.07.052}](https://doi.org/10.1016/j.commatsci.2018.07.052)
 \normalsize
 ## Example 1: Predict critical temperature for superconductivity (Regression with XGBoost) (2)
 ::: columns
 :::: {.column width=60%}
 \footnotesize
 ```python
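 import numpy as np
 import xgboost as xgb
 from sklearn.metrics import mean_squared_error
 # X_train, y_train, X_test, y_test: train/test split of the data set
 XGBreg = xgb.sklearn.XGBRegressor()
 XGBreg.fit(X_train, y_train)
 y_pred = XGBreg.predict(X_test)
 # root mean square error on the test sample
 rms = np.sqrt(mean_squared_error(y_test, y_pred))
 print(f"root mean square error {rms:.2f}")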
 ```
 \textcolor{gray}{This gives:}
 `root mean square error 9.68`
 ::::
 :::: {.column width=40%}
 \vspace{6ex}
 ![](figures/critical_temperature.pdf)
 ::::
 :::
 ## Exercise 1: Compare different decision tree classifiers
 \small
 [\textcolor{gray}{04\_decision\_trees\_ex\_1\_compare\_tree\_classifiers.ipynb}](https://nbviewer.jupyter.org/urls/www.physi.uni-heidelberg.de/~reygers/lectures/2022/ml/exercises/04_decision_trees_ex_1_compare_tree_classifiers.ipynb)
 \vspace{5ex}
 Compare scikit-learn's `AdaBoostClassifier`, `RandomForestClassifier`, and `GradientBoostingClassifier` by plotting their ROC curves for the heart disease data set. \newline
 \vspace{2ex}
 Is there a classifier that clearly performs best?
 ## Exercise 2: Apply XGBoost classifier to MAGIC data set
 \small
 [\textcolor{gray}{04\_decision\_trees\_ex\_2\_magic\_xgboost\_and\_random\_forest.ipynb}](https://nbviewer.jupyter.org/urls/www.physi.uni-heidelberg.de/~reygers/lectures/2022/ml/exercises/04_decision_trees_ex_2_magic_xgboost_and_random_forest.ipynb)
 \normalsize
 \footnotesize
 ```python
 # train XGBoost boosted decision tree
 import xgboost as xgb
 XGBclassifier = xgb.sklearn.XGBClassifier(n_jobs=-1, random_state=1, n_estimators=1000)
 ```
 \normalsize
 \small
 a) Plot predicted probabilities for the test sample for signal and background events (\texttt{plt.hist})
 b) Which is the most important feature for discriminating signal and background according to XGBoost? \ 
 Hint: use `plot_importance` from XGBoost (see [XGBoost plotting API](https://xgboost.readthedocs.io/en/latest/python/python_api.html)). Do you get the same answer for all three importance measures provided by XGBoost (“weight”, “gain”, or “cover”)?
 c) Visualize one decision tree from the ensemble (let's say tree number 10). For this you need the graphviz package (`pip3 install graphviz`)
 d) Compare the performance of XGBoost with the [**random forest classifier**](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html) from [**scikit-learn**](https://scikit-learn.org/stable/index.html). Plot signal and background efficiency for both classifiers in one plot. Which classifier performs better?
 \normalsize
 ## Exercise 3: Feature importance
 \small
 [\textcolor{gray}{04\_decision\_trees\_ex\_3\_magic\_feature\_importance.ipynb}](https://nbviewer.jupyter.org/urls/www.physi.uni-heidelberg.de/~reygers/lectures/2022/ml/exercises/04_decision_trees_ex_3_magic_feature_importance.ipynb)
 \normalsize
 \vspace{3ex}
 Evaluate the importance of each of the $n$ features in the training of the XGBoost classifier for the MAGIC data set by dropping one of the features. This gives $n$ different classifiers. Compare the performance of these classifiers using the AUC score. 
 ## Exercise 4: Interpret a classifier with SHAP values
 SHAP (SHapley Additive exPlanations) values are a means to explain the output of any machine learning model. [Shapley values](https://en.wikipedia.org/wiki/Shapley_value) are a concept from cooperative game theory. They are named after Lloyd Shapley, who won the Nobel Prize in Economics in 2012.
 \vfill
 Use the Python library [`SHAP`](https://shap.readthedocs.io/en/latest/index.html) to quantify the feature importance.
 a) Study the documentation at [https://shap.readthedocs.io/en/latest/tabular_examples.html](https://shap.readthedocs.io/en/latest/tabular_examples.html)
 b) Create a summary plot of the feature importance in the MAGIC data set with `shap.summary_plot` for the XGBoost classifier of exercise 2. What are the three most important features?
 c) Do the same for the superconductivity data set. What are the three most important features? 
--- a/slides/figures/03_ml_basics_galton_linear_regression_iminuit.pdf
+++ b/slides/figures/03_ml_basics_galton_linear_regression_iminuit.pdf
--- a/slides/figures/03_ml_basics_log_regr_heart_disease.pdf
+++ b/slides/figures/03_ml_basics_log_regr_heart_disease.pdf
--- a/slides/figures/03_ml_basics_logistic_regression.pdf
+++ b/slides/figures/03_ml_basics_logistic_regression.pdf
--- a/slides/figures/L1vsL2.pdf
+++ b/slides/figures/L1vsL2.pdf
--- a/slides/figures/activation_functions.png
+++ b/slides/figures/activation_functions.png
--- a/slides/figures/adversarial_attack.png
+++ b/slides/figures/adversarial_attack.png
--- a/slides/figures/ai_history.png
+++ b/slides/figures/ai_history.png
--- a/slides/figures/ai_ml_dl.pdf
+++ b/slides/figures/ai_ml_dl.pdf
--- a/slides/figures/ann.png
+++ b/slides/figures/ann.png
--- a/slides/figures/anomaly_detection.png
+++ b/slides/figures/anomaly_detection.png
--- a/slides/figures/autoencoder_example.pdf
+++ b/slides/figures/autoencoder_example.pdf
--- a/slides/figures/bdt.png
+++ b/slides/figures/bdt.png
--- a/slides/figures/book-murphy.png
+++ b/slides/figures/book-murphy.png
--- a/slides/figures/book_deep_learning_for_physics_research.png
+++ b/slides/figures/book_deep_learning_for_physics_research.png
--- a/slides/figures/boston_house_prices.pdf
+++ b/slides/figures/boston_house_prices.pdf
--- a/slides/figures/cnn.png
+++ b/slides/figures/cnn.png
--- a/slides/figures/cnn_conv_layer.png
+++ b/slides/figures/cnn_conv_layer.png
--- a/slides/figures/cnn_fully_connected.png
+++ b/slides/figures/cnn_fully_connected.png
--- a/slides/figures/cnn_pooling.png
+++ b/slides/figures/cnn_pooling.png
--- a/slides/figures/cnn_sliding_filter.png
+++ b/slides/figures/cnn_sliding_filter.png
--- a/slides/figures/critical_temperature.pdf
+++ b/slides/figures/critical_temperature.pdf
--- a/slides/figures/cross_val.png
+++ b/slides/figures/cross_val.png
--- a/slides/figures/decision_boundaries.png
+++ b/slides/figures/decision_boundaries.png
--- a/slides/figures/decision_trees_feature_space.png
+++ b/slides/figures/decision_trees_feature_space.png
--- a/slides/figures/deep_learning_book.png
+++ b/slides/figures/deep_learning_book.png
--- a/slides/figures/deep_learning_with_python.png
+++ b/slides/figures/deep_learning_with_python.png
--- a/slides/figures/deepl.png
+++ b/slides/figures/deepl.png
--- a/slides/figures/dnn.png
+++ b/slides/figures/dnn.png
--- a/slides/figures/dropout.png
+++ b/slides/figures/dropout.png
--- a/slides/figures/example_overtraining.png
+++ b/slides/figures/example_overtraining.png
--- a/slides/figures/feature_transformation.png
+++ b/slides/figures/feature_transformation.png
--- a/slides/figures/fisher.png
+++ b/slides/figures/fisher.png
--- a/slides/figures/fisher_linear_decision_boundary.png
+++ b/slides/figures/fisher_linear_decision_boundary.png
--- a/slides/figures/gan.png
+++ b/slides/figures/gan.png
--- a/slides/figures/gradient_descent.png
+++ b/slides/figures/gradient_descent.png
--- a/slides/figures/gradient_descent_cmp.png
+++ b/slides/figures/gradient_descent_cmp.png
--- a/slides/figures/hands_on_machine_learning.png
+++ b/slides/figures/hands_on_machine_learning.png
--- a/slides/figures/handwritten_digits.png
+++ b/slides/figures/handwritten_digits.png
--- a/slides/figures/heart_table.png
+++ b/slides/figures/heart_table.png
--- a/slides/figures/imagenet.png
+++ b/slides/figures/imagenet.png
--- a/slides/figures/imagenet_challenge.png
+++ b/slides/figures/imagenet_challenge.png
--- a/slides/figures/iminuit_minos_scan-1.png
+++ b/slides/figures/iminuit_minos_scan-1.png
--- a/slides/figures/iminuit_minos_scan-2.png
+++ b/slides/figures/iminuit_minos_scan-2.png
--- a/slides/figures/iris_dataset.png
+++ b/slides/figures/iris_dataset.png
--- a/slides/figures/keras.png
+++ b/slides/figures/keras.png
--- a/slides/figures/knn.png
+++ b/slides/figures/knn.png
--- a/slides/figures/logistic_fct.png
+++ b/slides/figures/logistic_fct.png
--- a/slides/figures/loss_fct.png
+++ b/slides/figures/loss_fct.png
--- a/slides/figures/magic_photo.png
+++ b/slides/figures/magic_photo.png
--- a/slides/figures/magic_photo_small.png
+++ b/slides/figures/magic_photo_small.png
--- a/slides/figures/magic_shower_em_had.png
+++ b/slides/figures/magic_shower_em_had.png
--- a/slides/figures/magic_shower_em_had_small.png
+++ b/slides/figures/magic_shower_em_had_small.png
--- a/slides/figures/magic_shower_parameters.png
+++ b/slides/figures/magic_shower_parameters.png
--- a/slides/figures/magic_sketch.png
+++ b/slides/figures/magic_sketch.png
--- a/slides/figures/matplotlib_Figure_1.png
+++ b/slides/figures/matplotlib_Figure_1.png
--- a/slides/figures/matplotlib_Figure_2.png
+++ b/slides/figures/matplotlib_Figure_2.png
--- a/slides/figures/matplotlib_Figure_3.png
+++ b/slides/figures/matplotlib_Figure_3.png
--- a/slides/figures/matplotlib_Figure_4.png
+++ b/slides/figures/matplotlib_Figure_4.png
--- a/slides/figures/mini_boone_decisions_tree.png
+++ b/slides/figures/mini_boone_decisions_tree.png
--- a/slides/figures/ml_example_spam.png
+++ b/slides/figures/ml_example_spam.png
--- a/slides/figures/mlp.png
+++ b/slides/figures/mlp.png
--- a/slides/figures/mnist.png
+++ b/slides/figures/mnist.png
--- a/slides/figures/monitoring_overtraining.png
+++ b/slides/figures/monitoring_overtraining.png
--- a/slides/figures/mva.png
+++ b/slides/figures/mva.png
--- a/slides/figures/mva_nn.png
+++ b/slides/figures/mva_nn.png
--- a/slides/figures/neuron.png
+++ b/slides/figures/neuron.png
--- a/slides/figures/nn_decision_boundary.png
+++ b/slides/figures/nn_decision_boundary.png
--- a/slides/figures/pandas_crosstabplot.png
+++ b/slides/figures/pandas_crosstabplot.png
--- a/slides/figures/pandas_histogramm.png
+++ b/slides/figures/pandas_histogramm.png
--- a/slides/figures/pandas_scatterplot.png
+++ b/slides/figures/pandas_scatterplot.png
--- a/slides/figures/pdf_from_2d_histogram.png
+++ b/slides/figures/pdf_from_2d_histogram.png
--- a/slides/figures/perceptron_photo.png
+++ b/slides/figures/perceptron_photo.png
--- a/slides/figures/perceptron_retina.png
+++ b/slides/figures/perceptron_retina.png
--- a/slides/figures/perceptron_weighted_sum.png
+++ b/slides/figures/perceptron_weighted_sum.png
--- a/slides/figures/perceptron_with_threshold.png
+++ b/slides/figures/perceptron_with_threshold.png
--- a/slides/figures/regularization.png
+++ b/slides/figures/regularization.png
--- a/slides/figures/relu.png
+++ b/slides/figures/relu.png
--- a/slides/figures/rootOptions.png
+++ b/slides/figures/rootOptions.png
--- a/slides/figures/scikit-learn.png
+++ b/slides/figures/scikit-learn.png
--- a/slides/figures/sigmoid.png
+++ b/slides/figures/sigmoid.png
--- a/slides/figures/signal_background_distr.png
+++ b/slides/figures/signal_background_distr.png
--- a/slides/figures/signal_purity.png
+++ b/slides/figures/signal_purity.png
--- a/slides/figures/stochastic_gradient_descent.png
+++ b/slides/figures/stochastic_gradient_descent.png
--- a/slides/figures/supervised_learning_car_plane.png
+++ b/slides/figures/supervised_learning_car_plane.png
--- a/slides/figures/supervised_nutshell.png
+++ b/slides/figures/supervised_nutshell.png
--- a/slides/figures/tensorflow.png
+++ b/slides/figures/tensorflow.png
--- a/slides/figures/tf_playground.png
+++ b/slides/figures/tf_playground.png
--- a/slides/figures/tree_pruning_slides.png
+++ b/slides/figures/tree_pruning_slides.png
--- a/slides/figures/underfitting_overfitting.pdf
+++ b/slides/figures/underfitting_overfitting.pdf
--- a/slides/figures/underfitting_overfitting_001.png
+++ b/slides/figures/underfitting_overfitting_001.png
--- a/slides/figures/videogame.png
+++ b/slides/figures/videogame.png
--- a/slides/figures/xor.png
+++ b/slides/figures/xor.png
--- a/slides/figures/xor_like_data.pdf
+++ b/slides/figures/xor_like_data.pdf
--- a/slides/fit_intro.md
+++ b/slides/fit_intro.md
@ -0,0 +1,563 @@
 ---
 title: |
  | Introduction to Data Analysis and Machine Learning in Physics:  
  | 2. Data modeling and fitting  
 author: "Martino Borsato, Jörg Marks, Klaus Reygers"
 date: "Studierendentage, 11-14 April 2022"
 ---
 ## Data modeling and fitting  - introduction
 Data analysis is the process of understanding and modeling measured
 data. The goal is to find patterns in the data and to draw inferences
 that reveal the underlying structure.
 * There are 2 approaches to statistical data modeling
   * Hypothesis testing: is our data compatible with a certain model?
   * Determination of model parameters: use the data to determine the parameters
     of a (theoretical) model
 * For the determination of model parameters
   * Analysis of data distributions $\rightarrow$ mean, variance,
     median, FWHM, .... \newline
     allows for an approximate determination of model parameters
   * Data fitting with the least-squares method $\rightarrow$ an iterative
     process which minimizes the deviation of a model described by parameters
     from data. This determines the optimal values and uncertainties
     of the parameters.
   * Maximum likelihood fitting $\rightarrow$ find the set of model parameters
     which most likely describes the data by maximizing the likelihood
     function.
 Parameter determination by minimization is an integral part of machine
 learning approaches, in which a system learns patterns and predicts
 related ones. This is the focus of the upcoming days.
 ## Data modeling and fitting  - introduction
 Data analysis is the process of understanding and modeling measured
 data. The goal is to find patterns in the data and to draw inferences
 that reveal the underlying structure.
 * There are 2 approaches to statistical data modeling
   * Hypothesis testing: is our data compatible with a certain model?
   * Determination of model parameters: use the data to determine the parameters
     of a (theoretical) model
 * For the determination of model parameters
   * Analysis of data distributions $\rightarrow$ mean, variance,
     median, FWHM, .... \newline
     allows for an approximate determination of model parameters
    \setbeamertemplate{itemize subitem}{\color{red}\tiny$\blacksquare$}
   * \textcolor{blue}{Data fitting with the least-squares method
     $\rightarrow$ an iterative
     process which minimizes the deviation of a model described by parameters
     from data. This determines the optimal values and uncertainties
     of the parameters.}
    \setbeamertemplate{itemize subitem}{\color{blue}\tiny$\blacktriangleright$} 
   * Maximum likelihood fitting $\rightarrow$ find the set of model parameters
     which most likely describes the data by maximizing the likelihood
     function.
 Parameter determination by minimization is an integral part of machine
 learning approaches, in which a system learns patterns and predicts
 related ones. This is the focus of the upcoming days.
 ## Least Square (LS) Method (1)
 The method determines the \textcolor{blue}{optimal parameters of functions
     fitted to Gaussian-distributed measurements}.
 Let's consider a sample of $n$ measurements $y_{i}$ and a parametrized
 description of the measurement $\eta_{i} = f(x_{i} | \theta)$ 
 with a parameter set $\theta = \theta_{1}, \theta_{2} ,.... \theta_{k}$,
 values $x_{i}$ of the independent variable and measurement errors $\sigma_{i}$.
 The parameter set should be determined such that
 \begin{equation*}
 \color{blue}{S = \sum \limits_{i=1}^{n} \frac{(y_i-\eta_i)^2}{\sigma_i^2}  = \sum \limits_{i=1}^{n} \frac{(y_i- f(x_i|\theta))^2}{\sigma_i^2}    \longrightarrow \, \text{minimal} }
 \end{equation*}
 In case of correlated measurements the covariance matrix of the $y_{i}$ has to
 be taken into account. This is accomplished by defining a weight matrix from
 the covariance matrix of the input data. A decorrelation of the input data
 should be considered.
 \vspace{0.2cm}
 At the minimum, $S$ follows a $\chi^{2}$-distribution with $(n-k)$ degrees of freedom.
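For correlated measurements the sum can be written with the covariance matrix of the $y_{i}$, called $V$ here (a symbol introduced only for this note); its inverse plays the role of the weight matrix:

$$ S = \sum_{i=1}^{n} \sum_{j=1}^{n} (y_i - \eta_i) \, (V^{-1})_{ij} \, (y_j - \eta_j) $$

For uncorrelated measurements, $V_{ij} = \sigma_i^2 \delta_{ij}$, and this reduces to the sum over $(y_i-\eta_i)^2 / \sigma_i^2$ given above.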
 ## Least Square (LS) Method (2)
 \setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}
 * Example LS-method
   \vspace{0.2cm}
  Often the fit function $f(x | \theta)$ is linear in
  $\theta = \theta_{1}, \theta_{2} ,.... \theta_{k}$
  \vspace{0.2cm}
  $f(x | \theta) = \theta_{1} f_{1}(x) + .... + \theta_{k} f_{k}(x)$
  \vspace{0.2cm}
  If the model is a straight line and our parameters are $\theta_{1}$ and
  $\theta_{2}$ $(f_{1}(x) = 1,$  $f_{2}(x) = x)$ we have
  $f(x | \theta) =  \theta_{1} + \theta_{2} x$
  \vspace{0.2cm}
  The LS equation is
  \vspace{0.2cm}
  $\color{blue}{S = \sum \limits_{i=1}^{n} \frac{(y_i-\eta_i)^2}{\sigma_i^2} } \color{black} {= \sum
  \limits_{i=1}^{n}  \frac{(y_{i} -  \theta_{1} -  x_{i}
  \theta_{2})^2}{\sigma_i^2 }}$   \hspace{0.4cm} and with
  \vspace{0.2cm}
  $\frac{\partial S}{\partial \theta_1} =   \sum\limits_{i=1}^{n} \frac{-2
  (y_i - \theta_1 -  x_i \theta_2)}{\sigma_i^2} = 0$  \hspace{0.4cm}  and  \hspace{0.4cm} 
   $\frac{\partial S}{\partial \theta_2} =   \sum\limits_{i=1}^{n} \frac{-2 x_i (y_i - \theta_1 -  x_i \theta_2)}{\sigma_i^2} = 0$
   \vspace{0.2cm}
   the parameters $\theta_{1}$ and $\theta_{2}$ can be determined.
   \vspace{0.2cm}
   \textcolor{olive}{In case of linear fit functions solutions can be found by matrix inversion}
   \vfill
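For the straight line the two conditions $\partial S / \partial \theta_1 = 0$ and $\partial S / \partial \theta_2 = 0$ form a linear $2 \times 2$ system that can be solved directly. The sketch below does this in plain Python; the function names (`fit_line`, `chi2`) and the data points are assumptions for illustration.

\footnotesize

```python
def fit_line(x, y, sigma):
    """Weighted least-squares fit of f(x | theta) = theta1 + theta2 * x."""
    w = [1.0 / s ** 2 for s in sigma]
    s0 = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    # normal equations:  theta1 * s0 + theta2 * sx  = sy
    #                    theta1 * sx + theta2 * sxx = sxy
    det = s0 * sxx - sx * sx
    theta1 = (sxx * sy - sx * sxy) / det
    theta2 = (s0 * sxy - sx * sy) / det
    return theta1, theta2


def chi2(x, y, sigma, theta1, theta2):
    """S = sum of squared residuals, each divided by sigma_i^2."""
    return sum(((yi - theta1 - theta2 * xi) / si) ** 2
               for xi, yi, si in zip(x, y, sigma))


# hypothetical measurements
x = [1.0, 2.0, 3.0, 4.0]
y = [3.1, 4.9, 7.2, 8.8]
sigma = [0.2, 0.2, 0.2, 0.2]

theta1, theta2 = fit_line(x, y, sigma)
s_min = chi2(x, y, sigma, theta1, theta2)
ndf = len(x) - 2                      # n - k degrees of freedom
print(theta1, theta2, s_min, ndf)
```

\normalsize

For these numbers the fit gives $\theta_1 \approx 1.15$ and $\theta_2 \approx 1.94$ with $S \approx 2.05$ for 2 degrees of freedom.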
 ## Least Square (LS) Method (3)
   \setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}
 * Use of a nonlinear fit function $f(x | \theta)$ like \hspace{0.4cm}
  $f(x | \theta) =  \theta_{1} \cdot e^{-\theta_{2} x}$
  \vspace{0.2cm}
  results in the LS equation 
  \vspace{0.2cm}
  $\color{blue}{S = \sum \limits_{i=1}^{n} \frac{(y_i-\eta_i)^2}{\sigma_i^2} } \color{black} {= \sum \limits_{i=1}^{n}  \frac{(y_{i} - \theta_{1} \cdot  e^{-\theta_{2} x_{i}})^2}{\sigma_i^2 }}$   \hspace{0.4cm} 
  \vspace{0.2cm}
  which we have to minimize
  \vspace{0.2cm}
  $\frac{\partial S}{\partial \theta_1} =   \sum\limits_{i=1}^{n} \frac{ 2 e^{-2 \theta_2 x_i} ( \theta_1 - y_i e^{\theta_2 x_i} )} {\sigma_i^2 } = 0$   \hspace{0.4cm}  and  \hspace{0.4cm}
  $\frac{\partial S}{\partial \theta_2} =   \sum\limits_{i=1}^{n} \frac{ 2  \theta_1 x_i e^{-2 \theta_2 x_i} (y_i e^{\theta_2 x_i} - \theta_1)} {\sigma_i^2 } = 0$
  \vspace{0.4cm}
  In a nonlinear system, the LS Ansatz leads to derivatives which are
  functions of the independent variable and the parameters $\color{red}\rightarrow$ \textcolor{olive}{no closed-form solutions}
  \vspace{0.4cm}
  In general, the gradient equations have no closed-form solutions.
  Several methods, some of them based on approximations, find a minimum
  numerically: the Gauss–Newton algorithm, the Levenberg–Marquardt algorithm,
  gradient descent methods and direct search methods.
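As an illustration of one of these methods, the following sketch applies plain Gauss–Newton steps to the exponential model $f(x | \theta) = \theta_{1} \cdot e^{-\theta_{2} x}$. It is a minimal version without step-size control, so it relies on reasonable start values; the function name `gauss_newton`, the noise-free toy data and the start values are assumptions for this example.

\footnotesize

```python
import math


def gauss_newton(x, y, sigma, theta1, theta2, n_iter=10):
    """Minimize S for f(x) = theta1 * exp(-theta2 * x) with Gauss-Newton steps."""
    for _ in range(n_iter):
        a11 = a12 = a22 = b1 = b2 = 0.0
        for xi, yi, si in zip(x, y, sigma):
            e = math.exp(-theta2 * xi)
            j1 = e                       # df/dtheta1
            j2 = -theta1 * xi * e        # df/dtheta2
            r = yi - theta1 * e          # residual y_i - f(x_i)
            w = 1.0 / si ** 2
            a11 += w * j1 * j1
            a12 += w * j1 * j2
            a22 += w * j2 * j2
            b1 += w * j1 * r
            b2 += w * j2 * r
        # solve the linearized 2x2 system for the parameter step
        det = a11 * a22 - a12 * a12
        step1 = (a22 * b1 - a12 * b2) / det
        step2 = (a11 * b2 - a12 * b1) / det
        theta1 += step1
        theta2 += step2
    return theta1, theta2


# noise-free toy data generated with theta1 = 2.0, theta2 = 0.5
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0 * math.exp(-0.5 * xi) for xi in x]
sigma = [0.1] * len(x)

theta1, theta2 = gauss_newton(x, y, sigma, theta1=1.5, theta2=0.3)
print(theta1, theta2)
```

\normalsize

Each iteration linearizes the model around the current parameters and solves the resulting linear least-squares problem, as in the straight-line case.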
 ## Minuit - a program package for minimization (1)
 In general, data fitting and also the training of machine learning models lead
 to a function minimization problem. Between
 1975 and 1980, F. James (CERN) developed
 a FORTRAN-based package, [\textcolor{violet}{MINUIT}](http://seal.web.cern.ch/seal/documents/minuit/mntutorial.pdf), which is a framework to handle
 multiparameter minimization and compute the best-fit parameter values and
 uncertainties, including correlations between the parameters.
 \vspace{0.2cm}
 The user provides a minimization function
 $F(X,P)$ with the parameter space $P=(p_1,....p_k)$ and
 variable space $X$ (also multi-dimensional). An interface of
 functions allows the user to steer
 the minimization process. MINUIT provides
 [\textcolor{violet}{error calculations}](http://seal.web.cern.ch/seal/documents/minuit/mnerror.pdf) including correlations for the parameter space by evaluating the shape of the function in some neighbourhood of the minimum.
 \vspace{0.2cm}
 The package
 now has a new object-oriented implementation, the [\textcolor{violet}{Minuit2 library}](https://root.cern.ch/doc/master/Minuit2Page.html), written
 in C++.
 \vspace{0.2cm}
 During the minimization, $F(X,P)$ is evaluated for various $P$. Different
 methods are used to choose $P=(p_1,....p_k)$:
 ## Minuit - a program package for minimization (2)
 \vspace{0.4cm}
 \textcolor{olive}{SEEK}: Search for the minimum with Monte Carlo methods, mostly used at the start
  of the minimization with unknown starting values. It is not a converging
  algorithm.
  \vspace{0.2cm}
 \textcolor{olive}{SIMPLX}:
  Uses the simplex method of Nelder and Mead. Function values are compared
  in the parameter space. Via step size control the minimum is approached.
  Parameter errors are only approximate, no covariance matrix is calculated.
 \vspace{0.2cm}
 <!---
 A simplex is the smallest n dimensional figure with n+1 corners. By reflecting
 one point in the hyperplane of the other point and adopts itself to the
 function plane.
 -->
 \textcolor{olive}{MIGRAD}:
  Uses an algorithm of R. Fletcher, which takes the function and the gradient
  to approach the minimum with a variable metric method. An error matrix and
  correlation coefficients are available.
 \vspace{0.2cm}
 \textcolor{olive}{HESSE}:
  Calculates the Hessian matrix of second derivatives and determines the
  covariance matrix.
 \vspace{0.2cm}
 \textcolor{olive}{MINOS}:
  Calculates (asymmetric) errors using likelihood profiles.
  The algorithm for finding the positive and negative MINOS errors for parameter
  $n$ consists of varying parameter $n$, each time minimizing $F(X,P)$ with respect to
  all the other parameters.
   \vspace{0.2cm}
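The ideas behind HESSE and MINOS can be illustrated with a one-parameter least-squares problem. This is only a toy sketch in plain Python and not what MINUIT does internally: the error from the curvature uses $\sigma_\theta^2 = 2 / (\partial^2 S / \partial \theta^2)$, and the profile error is the distance at which $S$ has risen by 1 above its minimum. With more than one parameter, MINOS would in addition re-minimize with respect to all other parameters at each scan point. The data values are hypothetical.

\footnotesize

```python
import math

y = [4.8, 5.1, 5.3, 4.8]   # hypothetical measurements of one quantity
sigma = 0.2                # common measurement error


def S(theta):
    """Least-squares sum for the constant model f(x | theta) = theta."""
    return sum((yi - theta) ** 2 for yi in y) / sigma ** 2


theta_hat = sum(y) / len(y)      # minimum of S for this model
s_min = S(theta_hat)

# HESSE-like: parameter error from the curvature of S at the minimum
h = 1e-3
second = (S(theta_hat + h) - 2.0 * s_min + S(theta_hat - h)) / h ** 2
err_hesse = math.sqrt(2.0 / second)

# MINOS-like: find by bisection where S has risen by 1 above its minimum
lo, hi = theta_hat, theta_hat + 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if S(mid) - s_min < 1.0:
        lo = mid
    else:
        hi = mid
err_minos = lo - theta_hat

print(theta_hat, err_hesse, err_minos)
```

\normalsize

For this parabolic $S$ both estimates agree with $\sigma / \sqrt{n} = 0.1$; for non-parabolic functions the two can differ and the MINOS errors become asymmetric.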
 ## Minuit - a program package for minimization (3)
 \vspace{0.4cm}
 Fit process with the minuit package
 \vspace{0.2cm}
 \setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}
 * The individual steps described above can be called several times and in a different order during the minimization process.
 * Each of the parameters $p_i$ of  $P=(p_1,....p_k)$ can be set constant and
  released during the minimization steps.
 * Problems are expected in models with strong correlation between
  parameters $\rightarrow$ change model to uncorrelated definitions
 * Local minima, edges/steps or undefined ranges in $F(X,P)$ are problematic
  $\rightarrow$ simplify your model
 \vspace{3cm}
 ## Minuit2 - The iminuit package
 \vspace{0.4cm}
 [\textcolor{violet}{iminuit}](https://iminuit.readthedocs.io/en/stable/)  is
 a Jupyter-friendly Python interface for the Minuit2 C++ library.
 \vspace{0.2cm}
 \setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}
 * The class `iminuit.Minuit` instantiates the minuit object. The minimizer
   function is given as argument together with the start parameters.
   Basic steering of the fit, like error definition and print level, is set
   via attributes of the object.
 \footnotesize
 ```python
     from iminuit import Minuit
     def fcn(x, y, z):                    # definition of the minimizer function
         return (x - 2) ** 2 + (y - x) ** 2 + (z - 4) ** 2
     m = Minuit(fcn, x=0, y=0, z=0)       # start values (iminuit >= 2.0 API)
     m.errordef = Minuit.LEAST_SQUARES    # error definition (= 1)
     m.print_level = 1                    # print level
 ```
 \normalsize
 * Several methods determine the interaction with the fitting process, calls
   to `migrad` , `hesse` or  printing of parameters and errors
 \footnotesize
 ```python
     ......
     m.migrad()                     # run optimiser
     print(m.values , m.errors)     # print results
     m.hesse()                      # run covariance estimator
 ```
 \normalsize
 ## Minuit2 - iminuit example
 \vspace{0.2cm}
 \setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}
 * The function `fcn` describes the model with parameters to be determined by
   data. `fcn` is minimal when the model parameters agree best with data.
   `fcn` has positional arguments, one for each fit parameter. `iminuit`
   example fit:
   [\textcolor{violet}{02\_fit\_exp\_fit\_iMinuit.py}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/examples/02_fit_exp_fit_iMinuit.py)
 \footnotesize
 ```python
     ......
     x =  np.array([....],dtype='d') # measurements x
     y =  np.array([....],dtype='d') # measurements y
     dy = np.array([....],dtype='d') # error in y
     def xp(a, b , c):
         return a * np.exp(b*x) + c
     # least-squares function = sum of data residuals squared
     def fcn(a,b,c):
        return np.sum((y - xp(a,b,c)) ** 2 / dy ** 2)
     m = Minuit(fcn, a=1, b=-0.7, c=1)   # start values (iminuit 2.x interface)
     m.limits["b"] = (-1, 0.1)       # limit the range of b
     m.fixed["c"] = True             # fix parameter c
     m.migrad()                      # run minimizer
     m.fixed["c"] = False            # release  parameter c
     m.migrad()                      # rerun minimizer
 ```
 \normalsize
 * Fixing parameters or limiting their range can be useful for some applications
 ## Minuit2 - iminuit (3)
 \vspace{0.2cm}
 \setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}
 * Results and control information of the fit can be printed and accessed
  in the program.
 \footnotesize
 ```python
     ......
     m = Minuit(fcn,....)               # create the minimizer object
     m.print_level = 1                  # set print level (iminuit 2.x)
     m.migrad()                         # run minimizer
     a_fit = m.values['a']              # get parameter value a
     a_fit_error =  m.errors['a']       # get parameter error of a
     print (m.values,m.errors)          # print results
 ```
 \normalsize      
 * After processing Hesse, covariance and correlation information of the
   fit is available
 \footnotesize
 ```python
     ......
     m.hesse()                           # run covariance estimator
     cov = m.covariance                  # get covariance matrix (iminuit 2.x)
     cor = cov.correlation()             # get full correlation matrix
     cov_np = np.array(cov)              # save matrices to numpy
     cor_np = np.array(cor)
     print(cor[0, 1])      # print correlation between parameter 1 and 2
 ```
 \normalsize      
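 The relation between the two matrices can be sketched with plain `NumPy`. The
 covariance matrix below is made up for illustration and does not come from
 the example fit: the correlation coefficients follow from
 $\rho_{ij} = V_{ij} / (\sigma_i \sigma_j)$ with $\sigma_i = \sqrt{V_{ii}}$.
 \footnotesize
 ```python
import numpy as np

# made-up 2x2 covariance matrix of two fit parameters (illustration only)
cov = np.array([[4.0, 1.2],
                [1.2, 1.0]])

sigma = np.sqrt(np.diag(cov))        # parameter errors from the diagonal
cor = cov / np.outer(sigma, sigma)   # rho_ij = V_ij / (sigma_i * sigma_j)

print(sigma)
print(cor)
 ```
 \normalsize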
 ## Minuit2 - iminuit (4)
 \setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}
 * Minos provides asymmetric uncertainty intervals and parameter contours by
   scanning one parameter and minimizing the function with respect to all other
   parameters for each scan point. Results are displayed with `matplotlib`.
 \footnotesize
 ```python
     ......
     m.minos()
     print (m.merrors['a'])                # MINOS errors of a (iminuit 2.x)
     m.draw_mnprofile('b')
     m.draw_mncontour('a', 'b', cl=(0.68, 0.9, 0.99))
 ```
 ::: columns
 :::: {.column width=40%}
 ![](figures/iminuit_minos_scan-1.png)
 ::::
 :::: {.column width=40%}
 ![](figures/iminuit_minos_scan-2.png)
 ::::
 :::
 ## Exercise 3
 Plot the following data with matplotlib as in the iminuit example:
 \footnotesize
 ```
   x:   0.2,0.4,0.6,0.8,1.,1.2,1.4,1.6,1.8,2.,2.2,2.4,2.6,2.8,3.,3.2,
        3.4,3.6, 3.8,4.
   y:   0.04,0.021,0.035,0.03,0.029,0.019,0.024,0.018,0.019,0.022,0.02,
        0.025,0.018,0.024,0.019,0.021,0.03,0.019,0.03,0.024
   dy:  1.792,1.695,1.541,1.514,1.427,1.399,1.388,1.270,1.262,1.228,1.189,
        1.182,1.121,1.129,1.124,1.089,1.092,1.084,1.058,1.057
 ```
 \normalsize
 \setbeamertemplate{itemize item}{\color{red}$\square$}
 *  Replace the exponential function in the example iminuit fit
   `02_fit_exp_fit_iMinuit.ipynb` with a 3rd order polynomial and perform the fit
 *  Compare the correlation of the parameters of the exponential and
   the polynomial fit
 *  What defines the fit quality? Give an estimate.
 \small
  Solution: [\textcolor{violet}{02\_fit\_ex\_3\_sol.py}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/solutions/02_fit_ex_3_sol.py) \normalsize
 ## Exercise 4
 Plot the following data with matplotlib:
 \footnotesize
 ```
   x:   1, 2, 3, 4, 5, 6, 7, 8, 9, 10
   dx:  0.1,0.1,0.5,0.1,0.5,0.1,0.5,0.1,0.5,0.1
   y:   1.1,2.3,2.7,3.2,3.1,2.4,1.7,1.5,1.5,1.7
   dy:  0.15,0.22,0.29,0.39,0.31,0.21,0.13,0.15,0.19,0.13
 ```
 \normalsize
 \setbeamertemplate{itemize item}{\color{red}$\square$}
  * Perform a fit with iminuit. Which model do you use?
  * Plot the resulting fit function in the graph with the data
  * Print the covariance matrix. Can we improve the errors?
  * Can you draw a contour plot of 2 of the fit parameters?
  \small
   Solution: [\textcolor{violet}{02\_fit\_ex\_4\_sol.py}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/solutions/02_fit_ex_4_sol.py) \normalsize
 ## PyROOT 
 [\textcolor{violet}{PyROOT}](https://root.cern/manual/python/) is the Python binding for the C++ data analysis toolkit [\textcolor{violet}{ROOT}](https://root.cern/) developed with and for the LHC community. You can access the full
 ROOT functionality from Python while
 benefiting from the performance of the ROOT C++ libraries. The PyROOT bindings
 are automatic and dynamic and are able to interoperate with widely-used Python
 data-science libraries such as `NumPy`, `pandas`, `SciPy`, `scikit-learn` and `tensorflow`.
 * ROOT/PyROOT can be installed easily within anaconda3 (ROOT version 6.22.02
  or later ) or is available in the
  [\textcolor{violet}{CIP jupyter2 Hub}](https://jupyter2.kip.uni-heidelberg.de/)
 * Tools for statistical analysis, a math library with optimized algorithms,
  multivariate analysis, visualization and simulation of data.
 * Storing data including objects and classes with compression in files is a
  very powerful aspect for any data analysis project 
 * Within PyROOT Minuit2 can be accessed easily either with predefined functions
  or your own function definition
 * For advanced statistical analyses and data modeling  likelihood fitting with
  the packages **RooFit** and **RooStats** is available.
 ## 
 * Example: reading the invariant mass measurements of a $D^0$ from a text file
  and determining $\mu$ and $\sigma$  \hspace{1.0cm}  \small
  [\textcolor{violet}{02\_fit\_histFit.py}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/examples/02_fit_histFit.py)
  \normalsize  
 \footnotesize
 ```python
     import numpy as np
     import math
     from ROOT import TCanvas, TFile, TH1D, TF1, TMinuit, TFitResult
     data = np.genfromtxt('D0Mass.txt', dtype='d') # read data from text file
     c = TCanvas('c','D0 Mass',200,10,700,500)     # instantiate output canvas
     d0 = TH1D('d0','D0 Mass',200,1700.,2000.)     # instantiate histogram
     for x in data :                               # fill data into histogram d0
          d0.Fill(x)
     def pyf_tf1_params(x, p):                     # define fit function
          return p[0] * math.exp (-0.5 * ((x[0] - p[1])**2 / p[2]**2))
     func = TF1("func",pyf_tf1_params,1840.,1880.,3)
     # func = TF1("func",'gaus',1840.,1880.)  # use predefined function   
     func.SetParameters(500.,1860.,5.5)    # set start parameters
     myfit = d0.Fit(func,"S")              # fit function to the histogram data
     print ("Fit results: mean=",myfit.Parameter(1)," +/- ",myfit.ParError(1))
     c.Draw()                                      # draw canvas
     myfile = TFile('myOutFile.root','RECREATE')   # Open a ROOT file for output
     c.Write()                                     # Write canvas
     d0.Write()                                    # Write histogram
     myfile.Close()                                # close file
 ```
 \normalsize
 ## 
 * Fit Options 
 \vspace{0.1cm}
 ::: columns
 :::: {.column width=2%}
 ::::
 :::: {.column width=98%}
 ![](figures/rootOptions.png)
 ::::
 :::
 ## Exercise 5
 Read text file [\textcolor{violet}{FitTestData.txt}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/exercises/FitTestData.txt) and draw a histogram using PyROOT.
 \setbeamertemplate{itemize item}{\color{red}$\square$}
 * Determine the mean and sigma of the signal distribution. Which function do
  you use for fitting?
 * The option S fills the result object.
 * Try to improve the errors of the fit values with MINOS using the option E
  and also try the option M to scan for a new minimum. Option V provides more
  output.
 * Fit the background outside the signal region. Use the option R+ to add the
  function to your fit.
  \small
   Solution: [\textcolor{violet}{02\_fit\_ex\_5\_sol.py}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/solutions/02_fit_ex_5_sol.py) \normalsize
 ## iPython Examples for Fitting
 The different python packages are used in
 \textcolor{blue}{example iPython notebooks}
 to demonstrate the fitting of a third order polynomial to the same data
 available as numpy arrays.
 \setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}
  * LSQ fit of a polynomial to data using Minuit2 with
  \textcolor{blue}{iminuit} and \textcolor{blue}{matplotlib} plot:
    \small
    [\textcolor{violet}{02\_fit\_iminuitFit.ipynb}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/examples/02_fit_iminuitFit.ipynb)
    \normalsize
  * Graph fitting with \textcolor{blue}{pyROOT} with options using a python
    function including confidence level plot:
    \small
    [\textcolor{violet}{02\_fit\_fitGraph.ipynb}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/examples/02_fit_fitGraph.ipynb)
    \normalsize
  * Graph fitting with  \textcolor{blue}{numpy} and confidence level
    plotting with  \textcolor{blue}{matplotlib}:
    \small
    [\textcolor{violet}{02\_fit\_numpyFit.ipynb}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/examples/02_fit_numpyFit.ipynb)   
    \normalsize
  * Graph fitting with a polynomial fit of  \textcolor{blue}{scikit-learn} and
    plotting with  \textcolor{blue}{matplotlib}:
    \normalsize
    \small
    [\textcolor{violet}{02\_fit\_scikitFit.ipynb}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/examples/02_fit_scikitFit.ipynb)
    \normalsize
--- a/slides/intro_python.md
+++ b/slides/intro_python.md
@ -0,0 +1,830 @@
 ---
 title: |
  | Introduction to Data Analysis and Machine Learning in Physics:  
  | 1. Introduction to python  
 author: "Martino Borsato, Jörg Marks, Klaus Reygers"
 date: "Studierendentage, 11-14 April 2022"
 ---
 ## Outline of the $1^{st}$ day
 * Technical instructions for your interactions with the CIP pool, like
   * using the jupyter hub
   * using python locally in your own linux environment (anaconda)
   * access the CIP pool from your own windows or linux system
   * transfer data from and to the CIP pool
  These instructions can be found in [\textcolor{violet}{CIPpoolAccess.PDF}](https://www.physi.uni-heidelberg.de/~marks/root_einfuehrung/Folien/CIPpoolAccess.pdf)\normalsize
 * Summary of NumPy
 * Plotting with matplotlib
 * Input / output of data
 * Summary of pandas
 * Fitting with iminuit and pyROOT
 ## A glimpse into python packages
 The following python packages are important for data analysis and machine
 learning and will be used during the course
 * [\textcolor{violet}{NumPy}](https://numpy.org/doc/stable/user/basics.html) - python library adding support for large,
   multi-dimensional arrays and matrices, along with high-level
   mathematical functions to operate on these arrays
 * [\textcolor{violet}{matplotlib}](https://matplotlib.org/stable/tutorials/index.html) - a python plotting library
 * [\textcolor{violet}{SciPy}](https://docs.scipy.org/doc/scipy/reference/tutorial/index.html) - extension of NumPy by a collection of
   mathematical algorithms for minimization, regression, 
   Fourier transformation, linear algebra and image processing
 * [\textcolor{violet}{iminuit}](https://iminuit.readthedocs.io/en/stable/) -
   python wrapper to the data fitting toolkit
   [\textcolor{violet}{Minuit2}](https://root.cern.ch/doc/master/Minuit2Page.html)
   developed at CERN by F. James in the 1970s 
 * [\textcolor{violet}{pyROOT}](https://root.cern/manual/python/) - python wrapper to the C++ data analysis toolkit
   ROOT used at the LHC
 * [\textcolor{violet}{scikit-learn}](https://scikit-learn.org/stable/) - machine learning library written in
   python, which makes extensive use of NumPy for high-performance
   linear algebra algorithms
 ## NumPy
   \textcolor{blue}{NumPy} (Numerical Python) is an open source Python library,
   which contains multidimensional array and matrix data structures and methods
   to efficiently operate on these. The core object is
   a homogeneous n-dimensional array object,  \textcolor{blue}{ndarray}, which
   allows for a wide variety of \textcolor{blue}{fast operations and mathematical calculations
   with arrays and matrices} due to the extensive usage of compiled code.  
   * It is heavily used in numerous scientific python packages
   * `ndarray`s have a fixed size at creation $\rightarrow$ changing the size
     creates a new array
   * Array elements are all required to be of the same data type
   * Facilitates advanced mathematical operations on large datasets
   * See for a summary, e.g. &nbsp;&nbsp;  
 \small
 [\textcolor{violet}{https://cs231n.github.io/python-numpy-tutorial/\#numpy}](https://cs231n.github.io/python-numpy-tutorial/#numpy) \normalsize
 \vfill
 ::: columns
 :::: {.column width=30%}
 ::::
 :::
 ::: columns
 :::: {.column width=35%}
 `c = []`
 `for i in range(len(a)):`
 &nbsp;&nbsp;&nbsp; `c.append(a[i]*b[i])`
 ::::
 :::: {.column width=35%}
 with NumPy
 `c = a * b`
 ::::
 :::
 <!---
 It seem we need to indent by hand.
 I don't manage to align under the bullet text
 If we do it with column the vertical space is with code sections not good
 If we do it without code section the vertical space is ok, but there is no
 code high lightning.
 See the different versions of the same page in the following
 -->
 ## NumPy - array basics
 * numpy arrays are grids of values of the \textcolor{blue}{same type}, which are indexed.
  The *rank* is the number of dimensions of the array.
  There are methods to create  and preset arrays.
 \footnotesize
 ```python
 	 import numpy as np
 	 myA = np.array([2, 5 , 11])             # create rank 1 array (vector like)
 	 type(myA)                               # <class 'numpy.ndarray'>
 	 myA.shape                               # (3,)
 	 print(myA[2])                           # 11   access 3rd element
 	 myA[0] = 12                             # set 1st element to 12
 	 myB = np.array([[1,5],[7,9]])           # create  rank 2 array
 	 myB.shape                               # (2,2)
 	 print(myB[0,0],myB[0,1],myB[1,1])       # 1 5 9
 	 myC = np.arange(6)                      # create rank 1 array set to 0 - 5
 	 myC = myC.reshape(2,3)                  # reshape to rank 2, shape (2,3)
 	 zero = np.zeros((2,5))                  # 2 rows, 5 columns, set to 0
 	 one = np.ones((2,2))                    # 2 rows, 2 columns, set to 1
 	 five = np.full((2,2), 5)                # 2 rows, 2 columns, set to 5
 	 e = np.eye(2)                           # create 2x2 identity matrix
 ```
 \normalsize
 ##  NumPy - array indexing (1)
 * select slices of a numpy array
 \footnotesize
 ```python
     a = np.array([[1,2,3,4],
                   [5,6,7,8],                # 3 rows 4 columns array
                   [9,10,11,12]])
     b = a[:2, 1:3]                          # subarray of rows 0 and 1,
                                             # columns 1 and 2:
                                             #   [[2 3]
                                             #    [6 7]]
 ```		    
 \normalsize
 * a slice of an array is a view into the same data, *modifying* it changes the original array!
 \footnotesize
 ```python
     b[0, 0] = 77	                         # b[0,0] and a[0,1] are 77
     r1_row = a[1, :]                        # get 2nd row ->  rank 1
     r1_row.shape	                         # (4,)
     r2_row = a[1:2, :]                      # get 2nd row -> rank 2
     r2_row.shape                            # (1,4)
     a=np.array([[1,2],[3,4],[5,6]])         # set a , 3 rows 2 cols
     d=a[[0, 1, 2], [0, 1, 1]]               # d contains [1 4 6]
     e=a[[1, 2], [1, 1]]                     # e contains [4 6]
     np.array([a[0,0],a[1,1],a[2,0]])        # address elements explicitly: [1 4 5]
 ```
 \normalsize
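 The difference between a view and a copy can be checked directly. This is a
 minimal sketch with made-up array contents: basic slicing shares memory with
 the original array, while `.copy()` and integer array indexing return
 independent data.
 \footnotesize
 ```python
import numpy as np

a = np.arange(12).reshape(3, 4)
view = a[:2, 1:3]           # basic slicing returns a view
copy = a[:2, 1:3].copy()    # explicit copy owns its data
fancy = a[[0, 1], [1, 2]]   # integer array indexing returns a copy

view[0, 0] = 77             # also changes a[0, 1]
copy[0, 1] = -1             # a is unchanged
fancy[0] = -5               # a is unchanged

print(a[0, 1], a[0, 2])
print(np.shares_memory(a, view), np.shares_memory(a, copy))
 ```
 \normalsize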
 ##  NumPy - array indexing (2)
 * integer array indexing by setting an array of indices $\rightarrow$ selecting/changing elements
 \footnotesize
 ```python
     a = np.array([[1,2,3,4],
                   [5,6,7,8],                # 3 rows 4 columns array
                   [9,10,11,12]])
     p_a = np.array([0,2,0])                 # Create an array of indices
     s = a[np.arange(3), p_a]                # number the rows, p_a points to cols
     print (s)                               # s contains [1 7 9]
     a[np.arange(3),p_a] += 10               # add 10 to corresponding elements
     x=np.array([[8,2],[7,4]])               # create 2x2 array
     mask = (x > 5)                          # mask : array of booleans
                                             #   [[True False]
                                             #    [True False]]
     print(x[x>5])                           # select elements, prints [8 7]
 ```		    
 \normalsize
 * data type in numpy - create according to input numbers or set explicitly
 \footnotesize
 ```python
     x = np.array([1.1, 2.1])                # create float array 
     print(x.dtype)                          # print  float64
     y=np.array([1.1,2.9],dtype=np.int64)    # create integer array [1 2]
 ```
 \normalsize
 ## NumPy - functions
 * math functions operate elementwise either as operator overload or as methods
 \footnotesize
 ```python
     x=np.array([[1,2],[3,4]],dtype=np.float64)    # define 2x2 float array
     y=np.array([[3,1],[5,1]],dtype=np.float64)    # define 2x2 float array
     s = x + y                                     # elementwise sum 
     s = np.add(x,y)
     s = np.subtract(x,y)
     s = np.multiply(x,y)                          # no matrix multiplication!
     s = np.divide(x,y)
     s = np.sqrt(x)                                # also np.exp(x), ...
     s = x @ y                                     # matrix product, same as np.dot(x, y)
     np.sum(x, axis=0)                             # sum of each column
     np.sum(x, axis=1)                             # sum of each row
     xT = x.T                                      # transpose of x
     x = np.linspace(0,2*np.pi,100)                # get equal spaced points in x
     r = np.random.default_rng(seed=42)            # constructor random number class
     b = r.random((2,3))                           # random 2x3 matrix
 ```
 \normalsize
 ##
 *  broadcasting in  numpy
  \vspace{0.4cm}
   The term broadcasting describes how numpy treats arrays with different
   shapes during arithmetic operations
   * add a scalar $b$ to a 1D array $a = [a_1,a_2,a_3]$ $\rightarrow$ expand $b$ to
     $[b,b,b]$
     \vspace{0.2cm}
   * add a  scalar $b$ to a 2D [2,3] array  $a =[[a_{11},a_{12},a_{13}],[a_{21},a_{22},a_{23}]]$
     $\rightarrow$ expand $b$ to $b =[[b,b,b],[b,b,b]]$ and add element wise
     \vspace{0.2cm}
   * add 1D array $b = [b_1,b_2,b_3]$ to a 2D [2,3] array $a=[[a_{11},a_{12},a_{13}],[a_{21},a_{22},a_{23}]]$   $\rightarrow$  1D array is broadcast
     across each row of the 2D array $b =[[b_1,b_2,b_3],[b_1,b_2,b_3]]$ and added  element wise 
 \vspace{0.2cm}
   Arithmetic operations can only be performed when, for each dimension, the
   sizes of the two arrays are equal or one of them is 1. Look
   [\textcolor{violet}{here}](https://numpy.org/doc/stable/user/basics.broadcasting.html) for more details 
 \footnotesize
 ```python
     # Add a vector to each row of a matrix
     x = np.array([[1,2,3], [4,5,6]]) # x has shape (2, 3)
     v = np.array([1,2,3])            # v has shape (3,)
     x + v     # [[2 4 6]
               #  [5 7 9]]    
 ```
 \normalsize
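 The shape rule can be tried out in a short sketch (the arrays are made up for
 illustration): adding a vector of length 2 to a (2, 3) array fails, while the
 same vector with an added axis, shape (2, 1), is broadcast across the columns.
 \footnotesize
 ```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6]])           # shape (2, 3)
w = np.array([10, 20])              # shape (2,)

# shapes (2, 3) and (2,) do not match: the trailing sizes are 3 and 2
try:
    x + w
    failed = False
except ValueError:
    failed = True

# an added axis gives shape (2, 1), which is broadcast across the columns
col = x + w[:, np.newaxis]
print(failed)
print(col)
 ```
 \normalsize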
 ## Plot data
 A popular library to present data is the `pyplot` module of `matplotlib`.
 * Drawing a function in one plot
 \footnotesize
 ::: columns
 :::: {.column width=35%}
 ```python
 import numpy as np
 import matplotlib.pyplot as plt
 # generate 100 points from 0 to 10 pi
 x = np.linspace( 0, 10*np.pi, 100 )
 f = np.sin(x)**2
 # plot function
 plt.plot(x,f,'blueviolet',label='sine')
 plt.xlabel('x [radian]')
 plt.ylabel('f(x)')
 plt.title('Plot sin^2')
 plt.legend(loc='upper right')
 plt.axis([0,30,-0.1,1.2]) # limit the plot range
 # show the plot
 plt.show()
 ```
 ::::
 :::: {.column width=40%}
 ![](figures/matplotlib_Figure_1.png)
 ::::
 :::
 \normalsize
 ##
 * Drawing subplots in one canvas
 \footnotesize
 ::: columns
 :::: {.column width=35%}
 ```python
 ...
 g = np.exp(-0.2*x)
 # create figure
 plt.figure(num=2,figsize=(10.0,7.5),dpi=150,facecolor='lightgrey')
 plt.suptitle('1 x 2 Plot')
 # create subplot and plot first one
 plt.subplot(1,2,1)
 # plot first one
 plt.title('exp(x)')
 plt.xlabel('x')
 plt.ylabel('g(x)')
 plt.plot(x,g,'blueviolet')
 # create subplot and plot second one 
 plt.subplot(1,2,2)
 plt.plot(x,f,'orange')
 plt.plot(x,f*g,'red')
 plt.legend(['sine^2','exp*sine'])
 # show the plot
 plt.show()
 ```
 ::::
 :::: {.column width=40%}
 \vspace{3cm}
 ![](figures/matplotlib_Figure_2.png)
 ::::
 :::
 \normalsize
 ## Image data 
 The `image` module of the `matplotlib` library can be used to load an image
 into a numpy array and to render it.
 * There are 3 common formats for the numpy array  
  * (M, N) scalar data used for greyscale images
  * (M, N, 3) for RGB images (each pixel has an array with RGB color attached) 
  * (M, N, 4) for RGBA images (each pixel has an array with RGB color
    and transparency attached)
  The method `imread` loads the image into an `ndarray`, which can be
  manipulated.
  The method `imshow` renders the image data
 \vspace {2cm}
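 The three array formats can be built by hand. The following is a minimal
 sketch with a made-up 4 x 6 pixel image; the plain average over the color
 channels is only an assumption for illustration, not a perceptual greyscale
 conversion.
 \footnotesize
 ```python
import numpy as np

M, N = 4, 6                                    # small made-up image size
rgb = np.zeros((M, N, 3), dtype=np.uint8)      # (M, N, 3) RGB image
rgb[:, :, 2] = 255                             # set the blue channel

grey = rgb.mean(axis=2)                        # (M, N) scalar data
alpha = np.full((M, N, 1), 128, dtype=np.uint8)
rgba = np.concatenate([rgb, alpha], axis=2)    # (M, N, 4) RGBA image

print(grey.shape, rgb.shape, rgba.shape)
 ```
 \normalsize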
 ##
 * Drawing pixel data and images
 \footnotesize
 ::: columns
 :::: {.column width=50%}
 ```python
 ....
 # create data array with pixel position and RGB color code
 width, height = 400, 400
 data = np.zeros((height, width, 3), dtype=np.uint8)
 # red patch in the center
 data[175:225, 175:225] = [255, 0, 0] 
 x = np.random.randint(0,width,100)
 y = np.random.randint(0,height,100)
 data[y,x]= [0,255,0] # random green pixels (row index first)
 plt.imshow(data)
 plt.show()
 ....
 import matplotlib.image as mpimg
 #read image into numpy array
 pic = mpimg.imread('picture.jpg')
 mod_pic = pic[:,:,0] # grab slice 0 of the colors
 plt.imshow(mod_pic)  # rendered with the default colormap
 plt.colorbar()       # try cmap='hot' 
 plt.show()
 ```
 ::::
 :::: {.column width=25%} 
 ![](figures/matplotlib_Figure_3.png)
 \vspace{1cm}
 ![](figures/matplotlib_Figure_4.png)
 ::::
 ::: 
 \normalsize
 ## Input / output
 For the analysis of measured data efficient input \/ output plays an
 important role. In numpy, `ndarrays` can be saved and read in from files.
 `load()` and `save()` functions handle numpy binary files (.npy extension)
 which contain  data, shape, dtype and other information required to
 reconstruct the `ndarray` from the disk file.
 \footnotesize
 ```python
   r = np.random.default_rng()       # instantiate random number generator
   a = r.random((4,3))               # random 4x3 array
   np.save('myBinary.npy', a)        # write array a to binary file myBinary.npy
   b = np.arange(12)                 
   np.savez_compressed('myComp.npz', a=a, b=b)  # write a and b in compressed binary file
   ......
   b = np.load('myBinary.npy')       # read content of myBinary.npy into b
 ```
 \normalsize
 The storage and retrieval of array data in text file format is done
 with `savetxt()` and `loadtxt()` methods. Parameters controlling delimiter,
 line separators, file header and footer can be specified.
 \footnotesize
 ```python
   x = np.array([1,2,3,4,5,6,7])       # create ndarray 
   np.savetxt('myText.txt',x,fmt='%d') # write array x to text file myText.txt
   .....
   y = np.loadtxt('myText.txt',dtype=int)  # read content of myText.txt in y
 ```
 \normalsize
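 Reading back an archive with several arrays works by name. This is a minimal
 sketch, assuming a temporary directory and a hypothetical file name so that
 nothing is left on disk: the keyword names used when writing become the keys
 of the loaded archive.
 \footnotesize
 ```python
import os
import tempfile
import numpy as np

a = np.arange(6, dtype=float).reshape(2, 3)
b = np.arange(4)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "myComp.npz")     # hypothetical file name
    np.savez_compressed(path, a=a, b=b)        # one archive, two named arrays
    with np.load(path) as archive:
        names = sorted(archive.files)          # names given when writing
        a_back = archive["a"]
        b_back = archive["b"]

print(names, a_back.shape, b_back.dtype)
 ```
 \normalsize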
 ## Exercise 1
 i) Display a numpy array as figure of a blue cross. The size should be 200
   by 200 pixels. Use as array format (M, N, 3), where the first 2 specify
   the pixel positions and the last 3 the RGB color from 0 to 255.
   - Draw in addition a red square of arbitrary position into the figure.
   - Draw a circle in the center of the figure. Try to create a mask which
     selects the inner part of the circle using the indexing.
   \small
   [Solution:  01_intro_ex_1a_sol.py](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/solutions/01_intro_ex_1a_sol.py) \normalsize
 ii) Read data which contains pixels from the binary file horse.py into a
    numpy array. Display the data and the following transformations in 4
    subplots: scaling and translation, compression in x and y, rotation
    and mirroring.
    \small
    [Solution: 01_intro_ex_1b_sol.py](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/solutions/01_intro_ex_1b_sol.py) \normalsize 
 ## Pandas
 [\textcolor{violet}{pandas}](https://pandas.pydata.org/pandas-docs/stable/getting_started/index.html) is a software library written in Python for
 \textcolor{blue}{data manipulation and analysis}. 
 \vspace{0.4cm}
 \setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}
 * Offers data structures and operations for manipulating numerical tables with
  integrated indexing
 * Imports data from various file formats, e.g. comma-separated values, JSON,
  SQL or Excel
 * Tools for reading and writing data structures, allows analyzing, filtering,
  splitting, merging and joining 
 * Built on top of `NumPy`
 * Visualize the data with `matplotlib`
 * Most machine learning tools support `pandas` $\rightarrow$ 
  it is widely used to preprocess data sets for machine learning
 ## Pandas micro introduction
 Goal: Exploring, cleaning, transforming, and visualization of data.
 The basic indexable objects are
 \setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}
 * `Series` -> vector (list) of data elements of arbitrary  type
 * `DataFrame` -> tabular arrangement of data elements of column wise
                 arbitrary type
   Both allow cleaning data by removing `empty` or `nan` data entries
 \footnotesize
 ```python
     import numpy as np
     import pandas as pd                    # use together with numpy
     s = pd.Series([1, 3, 5, np.nan, 6, 8]) # create a Series of float64
     r = pd.Series(np.random.randn(4))      # Series of random numbers float64 
     dates = pd.date_range("20130101", periods=3) # index according to dates
     df = pd.DataFrame(np.random.randn(3,4),index=dates,columns=list("ABCD"))
     print (df)                             # print the DataFrame
     #                  A         B         C         D
     #    2013-01-01  1.618395  1.210263 -1.276586 -0.775545
     #    2013-01-02  0.676783 -0.754161 -1.148029 -0.244821
     #    2013-01-03 -0.359081  0.296019  1.541571  0.235337
     new_s = s.dropna() # return a new Series without NaN entries
 ```
 \normalsize
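 Cleaning works the same way for a `DataFrame`. The following is a minimal
 sketch with a made-up table: `dropna` removes rows or columns containing
 `NaN`, `fillna` replaces the missing entries by a chosen value.
 \footnotesize
 ```python
import numpy as np
import pandas as pd

# made-up table with two missing entries
df = pd.DataFrame({"A": [1.0, np.nan, 3.0],
                   "B": [4.0, 5.0, np.nan],
                   "C": [7.0, 8.0, 9.0]})

rows = df.dropna()            # keep only rows without any NaN
cols = df.dropna(axis=1)      # keep only columns without any NaN
filled = df.fillna(0.0)       # replace NaN by a chosen value

print(rows)
print(cols.columns.tolist())
print(int(df.isna().sum().sum()))
 ```
 \normalsize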
 ##
 \setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}
 * pandas data can be saved in different file formats (CSV, JSON, html, XML,
  Excel, OpenDocument, HDF5 format, .....). `NaN` entries are kept
  in the output file.
   * csv file
     \footnotesize
     ```python
     df.to_csv("myFile.csv")  # Write the DataFrame df to a csv file 
     ```
      \normalsize
   * HDF5 output
     \footnotesize
     ```python  
     df.to_hdf("myFile.h5",key='df',mode='w') # Write the DataFrame df to HDF5
     s.to_hdf("myFile.h5", key='s',mode='a')	  
     ```
     \normalsize
   * Writing to an excel file
     \footnotesize
     ```python  
     df.to_excel("myFile.xlsx", sheet_name="Sheet1")
     ```
     \normalsize
 * Deleting file with data in python
 \footnotesize
 ```python  
     import os
     os.remove('myFile.h5')
 ```
 \normalsize
 ##
 \setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}
 * read in data from various formats
   * csv file
     \footnotesize
     ```python
      .......
      df = pd.read_csv('heart.csv')  # read csv data table
      print(df.info())
      #  <class 'pandas.core.frame.DataFrame'>
      #  RangeIndex: 303 entries, 0 to 302
      #  Data columns (total 14 columns):
      #  #   Column    Non-Null Count  Dtype  
      #  ---  ------    --------------  -----  
      #  0   age       303 non-null    int64  
      #  1   sex       303 non-null    int64  
      #  2   cp        303 non-null    int64
      print(df.head(5))       # prints the first 5 rows of the data table 
      print(df.describe())    # shows a quick statistic summary of your data
     ```
 \normalsize
   * Reading an excel file
     \footnotesize
     ```python  
     df = pd.read_excel("myFile.xlsx","Sheet1", na_values=["NA"])
     ```
     \normalsize
     \textcolor{olive}{There are many options specifying details for IO.}
 ##
 \setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}
 * Various functions exist to select and view data from pandas objects
  * Display column and index
    \footnotesize
     ```python
     df.index                    # show datetime index of df
     # DatetimeIndex(['2013-01-01','2013-01-02','2013-01-03'],
     #               dtype='datetime64[ns]',freq='D')
     df.columns                  # show columns info
     # Index(['A', 'B', 'C', 'D'], dtype='object')
     ```
     \normalsize
  * `DataFrame.to_numpy()` gives a `NumPy` representation of the underlying data
    \footnotesize
     ```python
     df.to_numpy()       # one dtype for the entire array, not per column!
     # [[-0.62660101 -0.67330526  0.23269168 -0.67403546]
     #  [-0.53033339  0.32872063 -0.09893568  0.44814084]
     #  [-0.60289996 -0.22352548 -0.43393248  0.47531456]]
     ```
     \normalsize
     Does not include the index or column labels in the output
  * more on viewing 
    \footnotesize
    ```python
    df.T                                   # transpose the DataFrame df
    df.sort_values(by="B")                 # Sorting by values of a column of df
    df.sort_index(axis=0,ascending=False)  # Sorting by index descending values
    df.sort_index(axis=1,ascending=False)  # Display columns in inverse order
    ```
    \normalsize
 ##
 \setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}
 * Selecting data of pandas objects $\rightarrow$ keep or reduce dimensions
  * get a named column as a Series
    \footnotesize
     ```python
     df["A"]           # selects column A from df, similar to df.A
     df.iloc[:, 0:1]   # slices column A as a DataFrame, like df.loc[:, ["A"]]
     ```
     \normalsize
  * select rows of a DataFrame 
    \footnotesize
     ```python
     df[0:2]                   # selects row 0 and 1 from df
     df["20130102":"20130103"] # slice by index labels, endpoints are included!
     df.iloc[2]                # select the row at integer position 2
     df.iloc[1:3, :]           # selects row 1 and 2 from df
     ```
     \normalsize
  * select by label
     \footnotesize
     ```python
     df.loc["20130102":"20130103",["C","D"]] # selects row 1 and 2 and only C and D
     df.loc[dates[0], "A"]                   # selects a single value (scalar)
     ```
     \normalsize
  *  select by lists of integer position (as in `NumPy`)
     \footnotesize
     ```python
     df.iloc[[0, 2], [1, 3]] # select rows 0 and 2 and columns B and D
     df.iloc[1, 1]           # get a value explicitly
     ```
     \normalsize
  *  select according to expressions
     \footnotesize
     ```python
     df.query('B<C')         # select rows where B < C
     df1=df[(df["B"]==0)&(df["D"]==0)] # conditions on rows
     ```
     \normalsize
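 The difference between label-based and position-based selection can be
 checked in a short sketch. The table contents are made up for illustration:
 a label slice with `loc` includes its end point, an integer slice with `iloc`
 does not, and a single column name returns a `Series` while a list of names
 returns a `DataFrame`.
 \footnotesize
 ```python
import numpy as np
import pandas as pd

dates = pd.date_range("20130101", periods=3)
df = pd.DataFrame(np.arange(12).reshape(3, 4), index=dates, columns=list("ABCD"))

by_label = df.loc["20130102":"20130103", ["C", "D"]]   # label slice, end included
by_position = df.iloc[1:3, 2:4]                        # integer slice, end excluded
series = df["A"]                                       # one column -> Series
frame = df[["A"]]                                      # list of columns -> DataFrame

print(by_label.equals(by_position))
print(type(series).__name__, type(frame).__name__)
 ```
 \normalsize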
 ##
 \setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}
 * Selecting data of pandas objects continued
  * Boolean indexing
    \footnotesize
     ```python
     df[df["A"] > 0]           # select the rows of df where column A is > 0
     df[df > 0]                # keep values > 0 in the entire DataFrame, others become NaN
     ```
     \normalsize
     more complex example
     \footnotesize
     ```python
     df2 = df.copy()                     # copy df
     df2["E"] = ["eight","one","four"]   # add column E (needs one entry per row of df)
     df2[df2["E"].isin(["two", "four"])] # select rows whose entry in column E
                                         # is "two" or "four"
     ```
     \normalsize
  * Operations (in general exclude missing data)
    \footnotesize
     ```python
     df2 = df.copy()       # fresh numeric copy (without the string column E)
     df2[df2 > 0] = -df2   # all elements > 0 change sign
     df.mean(0)            # mean of each column (argument = axis)
     df.mean(1)            # mean of each row
     df.std(0)             # standard deviation along the given axis
     df.cumsum()           # cumulative sum of each column
     df.apply(np.sin)      # apply function to each element of df
     df.apply(lambda x: x.max() - x.min()) # apply lambda function column wise
     df + 10               # add scalar 10
     df - [1, 2, 10 , 100] # subtract one value per column from every row
     df.corr()             # Compute pairwise correlation of columns
     ```
     \normalsize
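A short sketch of the Boolean indexing and operations listed above. The two-column frame and all variable names (`masked`, `picked`, `flipped`, ...) are assumptions made for this illustration.

```python
import pandas as pd

# Hypothetical small frame (illustration only)
df = pd.DataFrame({"A": [1.0, -2.0, 3.0], "B": [-4.0, 5.0, -6.0]})

masked = df[df > 0]      # same shape, entries failing the test become NaN
rows = df[df["A"] > 0]   # only the rows with A > 0

df2 = df.copy()
df2["E"] = ["one", "two", "four"]             # one label per row
picked = df2[df2["E"].isin(["two", "four"])]  # rows with E in the given list

flipped = df.copy()
flipped[flipped > 0] = -flipped               # positive entries change sign

col_range = df.apply(lambda x: x.max() - x.min())  # evaluated per column
shifted = df - [1, 10]                             # one value per column

print(int(masked.isna().sum().sum()))  # 3
print(list(picked.index))              # [1, 2]
print(col_range.tolist())              # [5.0, 11.0]
print(df.mean(1).tolist())             # [-1.5, 1.5, -1.5]
```

The sign flip is done on a purely numeric copy, since the comparison `> 0` is not defined for a string column such as `E`.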
 ##  Pandas - plotting data
 [\textcolor{violet}{Visualization}](https://pandas.pydata.org/pandas-docs/stable/user_guide/visualization.html) is integrated in pandas using matplotlib. Only two examples are shown here.
 * Plot random data in a histogram and a scatter plot
 \footnotesize
 ```python
     # create DataFrame with normally distributed random data
     df = pd.DataFrame(np.random.randn(1000,4),columns=["a","b","c","d"])
     df = df + [1, 3, 8 , 10]  # shift mean to  1, 3, 8 , 10
     plt.figure()
     df.plot.hist(bins=20)     # histogram all 4 columns
     g1 = df.plot.scatter(x="a",y="c",color="DarkBlue",label="Group 1")
     df.plot.scatter(x="b",y="d",color="DarkGreen",label="Group 2",ax=g1)
 ```
 \normalsize
 ::: columns
 :::: {.column width=35%}
 ![](figures/pandas_histogramm.png)
 ::::
 :::: {.column width=35%}
 ![](figures/pandas_scatterplot.png)
 ::::
 :::
 ##  Pandas - plotting data
 The function `crosstab()` takes one or more array-like objects as index or
 columns and constructs a new DataFrame that counts how often each combination of values occurs.
 \footnotesize
 ```python
   df = pd.DataFrame(           # create DataFrame of 2 categories
      {"sex":   np.array([0,0,0,0,1,1,1,1,0,0,0]),
       "heart": np.array([1,1,1,0,1,1,1,0,0,0,1])
      }  )
   pd.crosstab(df.sex,df.heart)    # create cross table of the counts per combination
   pd.crosstab(df.sex,df.heart).plot(kind="bar",color=['red','blue']) # plot counts
 ```
 \normalsize
 ::: columns
 :::: {.column width=42%}
 ![](figures/pandas_crosstabplot.png)
 ::::
 :::
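To make the counting explicit, here is a sketch that builds the cross table for the eleven entries of the example above and compares it with a manual count. The names `table` and `manual` are assumptions for this illustration.

```python
import numpy as np
import pandas as pd

# Same two categories as in the example above
df = pd.DataFrame(
    {"sex":   np.array([0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0]),
     "heart": np.array([1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1])})

table = pd.crosstab(df["sex"], df["heart"])  # rows: sex, columns: heart

# The same counts via groupby: size of each (sex, heart) group
manual = df.groupby(["sex", "heart"]).size().unstack(fill_value=0)

print(table.to_numpy().tolist())   # [[3, 4], [1, 3]]
print(int(table.to_numpy().sum())) # 11, one count per input row
```

Each cell counts the rows with that combination, for example 4 rows with `sex` = 0 and `heart` = 1.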
 ## Exercise 2
 Read the file [\textcolor{violet}{heart.csv}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/exercises/heart.csv) into a DataFrame.
 [\textcolor{violet}{Information on the dataset}](https://archive.ics.uci.edu/ml/datasets/heart+Disease)
 \setbeamertemplate{itemize item}{\color{red}$\square$}
  * Which columns do we have?
  * Print the first 3 rows
  * Print the statistics summary and the correlations
  * Print mean values for each column with and without disease
  * Select the data according to `sex` and `target` (heart disease 0=no 1=yes). 
  * Plot the `age` distribution of male and female in one histogram
  * Plot the heart disease distribution according to chest pain type `cp`
  * Plot `thalach` according to `target` in one histogram
  * Plot `sex` and `target` in a histogram figure
  * Correlate `age` and `max heart rate` according to `target`
  * Correlate `age` and `cholesterol` according to `target`
  \small
   [Solution: 01_intro_ex_2_sol.py](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/solutions/01_intro_ex_2_sol.py) \normalsize