---
title: |
  | Introduction to Data Analysis and Machine Learning in Physics:
  | 2. Data modeling and fitting
author: "Martino Borsato, Jörg Marks, Klaus Reygers"
date: "Studierendentage, 11-14 April 2022"
---

## Data modeling and fitting - introduction

Data analysis is a process of understanding and modeling measured
data. The goal is to find patterns and to obtain inferences about the
underlying processes.

* There are 2 approaches to statistical data modeling
  * Hypothesis testing: is our data compatible with a certain model?
  * Determination of model parameters: use the data to determine the parameters
    of a (theoretical) model
* For the determination of model parameters
  * Analysis of data distributions $\rightarrow$ mean, variance,
    median, FWHM, .... \newline
    allows for an approximate determination of model parameters
  * Data fitting with the least square method $\rightarrow$ an iterative
    process which minimizes the deviation of a model described by parameters
    from the data. This determines the optimal values and uncertainties
    of the parameters.
  * Maximum likelihood fitting $\rightarrow$ find the set of model parameters
    which most likely describes the data by maximizing the likelihood.

The parameter determination by minimization is an integral part of machine
learning approaches, where a system learns patterns and predicts
related ones. This is the focus in the upcoming days.

## Data modeling and fitting - introduction

Data analysis is a process of understanding and modeling measured
data. The goal is to find patterns and to obtain inferences about the
underlying processes.

* There are 2 approaches to statistical data modeling
  * Hypothesis testing: is our data compatible with a certain model?
  * Determination of model parameters: use the data to determine the parameters
    of a (theoretical) model
* For the determination of model parameters
  * Analysis of data distributions $\rightarrow$ mean, variance,
    median, FWHM, .... \newline
    allows for an approximate determination of model parameters
  \setbeamertemplate{itemize subitem}{\color{red}\tiny$\blacksquare$}
  * \textcolor{blue}{Data fitting with the least square method
    $\rightarrow$ an iterative
    process which minimizes the deviation of a model described by parameters
    from the data. This determines the optimal values and uncertainties
    of the parameters.}
  \setbeamertemplate{itemize subitem}{\color{blue}\tiny$\blacktriangleright$}
  * Maximum likelihood fitting $\rightarrow$ find the set of model parameters
    which most likely describes the data by maximizing the likelihood.

The parameter determination by minimization is an integral part of machine
learning approaches, where a system learns patterns and predicts
related ones. This is the focus in the upcoming days.

## Least Square (LS) Method (1)

The method determines the \textcolor{blue}{optimal parameters of a function
fitted to Gaussian distributed measurements}.

Let's consider a sample of $n$ measurements $y_{i}$ and a parametrized
description of the measurement $\eta_{i} = f(x_{i} | \theta)$
with a parameter set $\theta = \theta_{1}, \theta_{2},.... \theta_{k}$,
independent values $x_{i}$ and measurement errors $\sigma_{i}$.

The parameter set should be determined such that

\begin{equation*}
\color{blue}{S = \sum \limits_{i=1}^{n} \frac{(y_i-\eta_i)^2}{\sigma_i^2} = \sum \limits_{i=1}^{n} \frac{(y_i- f(x_i|\theta))^2}{\sigma_i^2} \longrightarrow \, minimal }
\end{equation*}

In case of correlated measurements the covariance matrix of the $y_{i}$ has to
be taken into account. This is accomplished by defining a weight matrix from
the covariance matrix of the input data. A decorrelation of the input data
should be considered.

\vspace{0.2cm}

$S$ follows a $\chi^{2}$-distribution with $(n-k)$ degrees of freedom.

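Since $S$ at the minimum follows a $\chi^{2}$-distribution, its value provides
a goodness-of-fit measure. A minimal sketch using `scipy.stats` (not part of
the course examples; the numbers are made up for illustration):

\footnotesize

```python
from scipy.stats import chi2

# illustrative values: S at the minimum, n measurements, k fit parameters
S_min, n, k = 12.3, 15, 2
ndf = n - k                    # degrees of freedom
p_value = chi2.sf(S_min, ndf)  # probability to observe S >= S_min by chance
print(S_min / ndf, p_value)    # chi2/ndf close to 1 indicates a good fit
```
\normalsize
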
## Least Square (LS) Method (2)

\setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}

* Example LS-method

\vspace{0.2cm}

Often the fit function $f(x, \theta)$ is linear in
$\theta = \theta_{1}, \theta_{2},.... \theta_{k}$

\vspace{0.2cm}

$f(x | \theta) = \theta_{1} f_{1}(x) + .... + \theta_{k} f_{k}(x)$

\vspace{0.2cm}

If the model is a straight line and our parameters are $\theta_{1}$ and
$\theta_{2}$ $(f_{1}(x) = 1,$ $f_{2}(x) = x)$ we have
$f(x | \theta) = \theta_{1} + \theta_{2} x$

\vspace{0.2cm}

The LS equation is

\vspace{0.2cm}

$\color{blue}{S = \sum \limits_{i=1}^{n} \frac{(y_i-\eta_i)^2}{\sigma_i^2} } \color{black} {= \sum
\limits_{i=1}^{n} \frac{(y_{i} - \theta_{1} - x_{i}
\theta_{2})^2}{\sigma_i^2 }}$ \hspace{0.4cm} and with

\vspace{0.2cm}

$\frac{\partial S}{\partial \theta_1} = \sum\limits_{i=1}^{n} \frac{-2
(y_i - \theta_1 - x_i \theta_2)}{\sigma_i^2} = 0$ \hspace{0.4cm} and \hspace{0.4cm}
$\frac{\partial S}{\partial \theta_2} = \sum\limits_{i=1}^{n} \frac{-2 x_i (y_i - \theta_1 - x_i \theta_2)}{\sigma_i^2} = 0$

\vspace{0.2cm}

the parameters $\theta_{1}$ and $\theta_{2}$ can be determined.

\vspace{0.2cm}

\textcolor{olive}{In case of linear fit functions, solutions can be found by matrix inversion}

\vfill

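A minimal sketch of this matrix-inversion solution with NumPy, via the
weighted normal equations (the data points are made up for illustration):

\footnotesize

```python
import numpy as np

# illustrative measurements y with errors sigma at points x
x = np.array([1., 2., 3., 4., 5.])
y = np.array([1.1, 2.9, 2.8, 4.2, 5.1])
sigma = np.array([0.2, 0.2, 0.3, 0.3, 0.2])

A = np.column_stack([np.ones_like(x), x])  # design matrix for theta_1 + theta_2 * x
W = np.diag(1. / sigma**2)                 # weight matrix from the (diagonal) covariance

# normal equations: (A^T W A) theta = A^T W y
cov = np.linalg.inv(A.T @ W @ A)           # covariance matrix of the parameters
theta = cov @ (A.T @ W @ y)                # best-fit parameter values
print(theta, np.sqrt(np.diag(cov)))        # values and their uncertainties
```
\normalsize
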
## Least Square (LS) Method (3)

\setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}

* Use of a nonlinear fit function $f(x, \theta)$ like \hspace{0.4cm}
$f(x | \theta) = \theta_{1} \cdot e^{-\theta_{2} x}$

\vspace{0.2cm}

results in the LS equation

\vspace{0.2cm}

$\color{blue}{S = \sum \limits_{i=1}^{n} \frac{(y_i-\eta_i)^2}{\sigma_i^2} } \color{black} {= \sum \limits_{i=1}^{n} \frac{(y_{i} - \theta_{1} \cdot e^{-\theta_{2} x_{i}})^2}{\sigma_i^2 }}$ \hspace{0.4cm}

\vspace{0.2cm}

which we have to minimize

\vspace{0.2cm}

$\frac{\partial S}{\partial \theta_1} = \sum\limits_{i=1}^{n} \frac{ 2 e^{-2 \theta_2 x_i} ( \theta_1 - y_i e^{\theta_2 x_i} )} {\sigma_i^2 } = 0$ \hspace{0.4cm} and \hspace{0.4cm}
$\frac{\partial S}{\partial \theta_2} = \sum\limits_{i=1}^{n} \frac{ 2 \theta_1 x_i e^{-2 \theta_2 x_i} (y_i e^{\theta_2 x_i} - \theta_1)} {\sigma_i^2 } = 0$

\vspace{0.4cm}

In a nonlinear system, the LS Ansatz leads to derivatives which are
functions of the independent variable and the parameters $\color{red}\rightarrow$ \textcolor{olive}{no closed solutions}

\vspace{0.4cm}

In general, we have gradient equations which don't have closed solutions.
There are a couple of methods, including approximations, which together
with numerical methods allow finding a global minimum: the Gauss–Newton
algorithm, the Levenberg–Marquardt algorithm, gradient descent methods
and also direct search methods.

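As an illustration, such a nonlinear fit can be done numerically with the
Levenberg–Marquardt method, here sketched with `scipy.optimize.curve_fit`
(SciPy is not used elsewhere in these examples; the data are made up):

\footnotesize

```python
import numpy as np
from scipy.optimize import curve_fit

# illustrative data for the exponential model
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0])
y = np.array([8.2, 6.4, 5.0, 4.1, 3.2, 2.6])
dy = np.array([0.4, 0.3, 0.3, 0.2, 0.2, 0.2])

def f(x, theta1, theta2):            # model f(x|theta) = theta_1 * exp(-theta_2 * x)
    return theta1 * np.exp(-theta2 * x)

# minimizes S iteratively (Levenberg-Marquardt), starting from p0
popt, pcov = curve_fit(f, x, y, sigma=dy, absolute_sigma=True, p0=(10., 0.5))
print(popt, np.sqrt(np.diag(pcov)))  # best-fit values and uncertainties
```
\normalsize
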
## Minuit - a program package for minimization (1)

In general, data fitting and also solving machine learning algorithms lead
to a minimization problem of functions. Between
1975 and 1980 F. James (CERN) developed
a FORTRAN-based package, [\textcolor{violet}{MINUIT}](http://seal.web.cern.ch/seal/documents/minuit/mntutorial.pdf), which is a framework to handle
multiparameter minimization and compute the best-fit parameter values and
uncertainties, including correlations between the parameters.

\vspace{0.2cm}

The user provides a minimization function
$F(X,P)$ with the parameter space $P=(p_1,....p_k)$ and
variable space $X$ (also multi-dimensional). There is an interface via
functions which influences the minimization process. MINUIT provides
[\textcolor{violet}{error calculations}](http://seal.web.cern.ch/seal/documents/minuit/mnerror.pdf) including correlations for the parameter space by evaluating the shape of the function in some neighbourhood of the minimum.

\vspace{0.2cm}

The package now has a new object-oriented implementation as the [\textcolor{violet}{Minuit2 library}](https://root.cern.ch/doc/master/Minuit2Page.html), written
in C++.

\vspace{0.2cm}

During the minimization $F(X,P)$ is evaluated for various $X$. For the
choice of $P=(p_1,....p_k)$ different methods are used:

## Minuit - a program package for minimization (2)

\vspace{0.4cm}

\textcolor{olive}{SEEK}: Search for the minimum with Monte Carlo methods, mostly used at the start
of the minimization with unknown starting values. It is not a converging
algorithm.

\vspace{0.2cm}

\textcolor{olive}{SIMPLEX}:
Uses the simplex method of Nelder and Mead. Function values are compared
in the parameter space. Via step size control the minimum is approached.
Parameter errors are only approximate, no covariance matrix is calculated.

\vspace{0.2cm}

<!---
A simplex is the smallest n-dimensional figure with n+1 corners. By reflecting
one point in the hyperplane of the other points, the simplex adapts itself
to the function surface.
-->

\textcolor{olive}{MIGRAD}:
Uses an algorithm of R. Fletcher, which takes the function and the gradient
to approach the minimum with a variable metric method. An error matrix and
correlation coefficients are available.

\vspace{0.2cm}

\textcolor{olive}{HESSE}:
Calculates the Hessian matrix of second derivatives and determines the
covariance matrix.

\vspace{0.2cm}

\textcolor{olive}{MINOS}:
Calculates (asymmetric) errors using likelihood profiles.
The algorithm for finding the positive and negative MINOS errors for parameter
$n$ consists of varying parameter $n$, each time minimizing $F(X,P)$ with
respect to all the other parameters.

\vspace{0.2cm}

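With the `iminuit` interface introduced on the following slides, a typical
sequence of these steps might look like this (a sketch with a placeholder
minimization function):

\footnotesize

```python
from iminuit import Minuit

def fcn(a, b):  # placeholder minimization function
    return (a - 1) ** 2 + 2 * (b - 0.5) ** 2

m = Minuit(fcn, a=0, b=0, errordef=1)
m.migrad()  # variable metric minimization
m.hesse()   # covariance matrix from second derivatives
m.minos()   # asymmetric errors from likelihood profiles
```
\normalsize
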
## Minuit - a program package for minimization (3)

\vspace{0.4cm}

Fit process with the Minuit package

\vspace{0.2cm}

\setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}

* The individual steps described above can be called several times and in different order during the minimization process.
* Each of the parameters $p_i$ of $P=(p_1,....p_k)$ can be set constant and
released during the minimization steps.
* Problems are expected in models with strong correlation between
parameters $\rightarrow$ change the model to uncorrelated definitions
* Local minima, edges/steps or undefined ranges in $F(X,P)$ are problematic
$\rightarrow$ simplify your model

\vspace{3cm}

## Minuit2 - The iminuit package

\vspace{0.4cm}

[\textcolor{violet}{iminuit}](https://iminuit.readthedocs.io/en/stable/) is
a Jupyter-friendly Python interface for the Minuit2 C++ library.

\vspace{0.2cm}

\setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}

* The class `iminuit.Minuit` instantiates the Minuit object. The minimizer
function is given as argument. Basic steering of the fit,
like setting start parameters, error definition and print level, is also
done here.

\footnotesize

```python
from iminuit import Minuit

def fcn(x, y, z):  # definition of the minimizer function
    return (x - 2) ** 2 + (y - x) ** 2 + (z - 4) ** 2

m = Minuit(fcn, x=0, y=0, z=0, errordef=1, print_level=1)
```
\normalsize

* Several methods determine the interaction with the fitting process, calls
to `migrad`, `hesse` or printing of parameters and errors

\footnotesize

```python
......
m.migrad()                 # run optimiser
print(m.values, m.errors)  # print results
m.hesse()                  # run covariance estimator
```
\normalsize

## Minuit2 - iminuit example

\vspace{0.2cm}

\setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}

* The function `fcn` describes the model with parameters to be determined by
the data. `fcn` is minimal when the model parameters agree best with the data.
`fcn` has positional arguments, one for each fit parameter. `iminuit`
example fit:
[\textcolor{violet}{02\_fit\_exp\_fit\_iMinuit.py}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/examples/02_fit_exp_fit_iMinuit.py)

\footnotesize

```python
......
x = np.array([....], dtype='d')   # measurements x
y = np.array([....], dtype='d')   # measurements y
dy = np.array([....], dtype='d')  # error in y

def xp(a, b, c):
    return a * np.exp(b*x) + c

# least-squares function = sum of data residuals squared
def fcn(a, b, c):
    return np.sum((y - xp(a, b, c)) ** 2 / dy ** 2)

# limit the range of b and fix parameter c
m = Minuit(fcn, a=1, b=-0.7, c=1, limit_b=(-1, 0.1), fix_c=True)
m.migrad()             # run minimizer
m.fixed["c"] = False   # release parameter c
m.migrad()             # rerun minimizer
```
\normalsize

* It might be useful to fix parameters or limit their range for some applications

## Minuit2 - iminuit (3)

\vspace{0.2cm}

\setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}

* Results and control information of the fit can be printed and accessed
in the program.

\footnotesize

```python
......
m = Minuit(fcn, ...., print_level=1)  # set flag in the initializer
m.migrad()                   # run minimizer
a_fit = m.values['a']        # get parameter value a
a_fit_error = m.errors['a']  # get parameter error of a
print(m.values, m.errors)    # print results
```
\normalsize

* After processing Hesse, covariance and correlation information of the
fit is available

\footnotesize

```python
......
m.hesse()                    # run covariance estimator
m.matrix()                   # get covariance matrix
m.matrix(correlation=True)   # get full correlation matrix
cov = m.np_matrix()          # save matrix to numpy
cor = m.np_matrix(correlation=True)
print(cor[0, 1])             # print correlation between parameter 1 and 2
```
\normalsize

## Minuit2 - iminuit (4)

\setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}

* Minos provides asymmetric uncertainty intervals and parameter contours by
scanning one parameter and minimizing the function with respect to all other
parameters for each scan point. Results are displayed with `matplotlib`.

\footnotesize

```python
......
m.minos()
print(m.get_merrors()['a'])
m.draw_mnprofile('b')
m.draw_mncontour('a', 'b', nsigma=4)
```
\normalsize

::: columns
:::: {.column width=40%}
![](figures/iminuit_minos_scan-1.png)
::::
:::: {.column width=40%}
![](figures/iminuit_minos_scan-2.png)
::::
:::

## Exercise 3

Plot the following data with matplotlib as in the iminuit example:

\footnotesize

```
x: 0.2,0.4,0.6,0.8,1.,1.2,1.4,1.6,1.8,2.,2.2,2.4,2.6,2.8,3.,3.2,
   3.4,3.6,3.8,4.
y: 0.04,0.021,0.035,0.03,0.029,0.019,0.024,0.018,0.019,0.022,0.02,
   0.025,0.018,0.024,0.019,0.021,0.03,0.019,0.03,0.024
dy: 1.792,1.695,1.541,1.514,1.427,1.399,1.388,1.270,1.262,1.228,1.189,
    1.182,1.121,1.129,1.124,1.089,1.092,1.084,1.058,1.057
```
\normalsize

\setbeamertemplate{itemize item}{\color{red}$\square$}

* Exchange in the example iminuit fit `02_fit_exp_fit_iMinuit.ipynb` the
exponential function by a 3rd order polynomial and perform the fit
* Compare the correlation of the parameters of the exponential and
the polynomial fit
* What defines the fit quality? Give an estimate.

\small
Solution: [\textcolor{violet}{02\_fit\_ex\_3\_sol.py}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/solutions/02_fit_ex_3_sol.py) \normalsize

## Exercise 4

Plot the following data with matplotlib:

\footnotesize

```
x:  1, 2, 3, 4, 5, 6, 7, 8, 9, 10
dx: 0.1,0.1,0.5,0.1,0.5,0.1,0.5,0.1,0.5,0.1
y:  1.1,2.3,2.7,3.2,3.1,2.4,1.7,1.5,1.5,1.7
dy: 0.15,0.22,0.29,0.39,0.31,0.21,0.13,0.15,0.19,0.13
```
\normalsize

\setbeamertemplate{itemize item}{\color{red}$\square$}

* Perform a fit with iminuit. Which model do you use?
* Plot the resulting fit function in the graph with the data
* Print the covariance matrix. Can we improve the errors?
* Can you draw a contour plot of 2 of the fit parameters?

\small
Solution: [\textcolor{violet}{02\_fit\_ex\_4\_sol.py}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/solutions/02_fit_ex_4_sol.py) \normalsize

## PyROOT

[\textcolor{violet}{PyROOT}](https://root.cern/manual/python/) is the Python binding for the C++ data analysis toolkit [\textcolor{violet}{ROOT}](https://root.cern/) developed with and for the LHC community. You can access the full
ROOT functionality from Python while
benefiting from the performance of the ROOT C++ libraries. The PyROOT bindings
are automatic and dynamic and are able to interoperate with widely-used Python
data-science libraries such as `NumPy`, `pandas`, `SciPy`, `scikit-learn` and `TensorFlow`.

* ROOT/PyROOT can be installed easily within anaconda3 (ROOT version 6.22.02
or later) or is available in the
[\textcolor{violet}{CIP jupyter2 Hub}](https://jupyter2.kip.uni-heidelberg.de/)
* Tools for statistical analysis, a math library with optimized algorithms,
multivariate analysis, visualization and simulation of data.
* Storing data including objects and classes with compression in files is a
very powerful aspect for any data analysis project
* Within PyROOT Minuit2 can be accessed easily either with predefined functions
or your own function definition
* For advanced statistical analyses and data modeling, likelihood fitting with
the packages **RooFit** and **RooStats** is available.

##

* Example reading the invariant mass measurements of a $D^0$ from a text file
and determining $\mu$ and $\sigma$ \hspace{1.0cm} \small
[\textcolor{violet}{02\_fit\_histFit.py}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/examples/02_fit_histFit.py)
\normalsize

\footnotesize

```python
import numpy as np
import math
from ROOT import TCanvas, TFile, TH1D, TF1, TMinuit, TFitResult

data = np.genfromtxt('D0Mass.txt', dtype='d')  # read data from text file
c = TCanvas('c', 'D0 Mass', 200, 10, 700, 500) # instantiate output canvas
d0 = TH1D('d0', 'D0 Mass', 200, 1700., 2000.)  # instantiate histogram
for x in data:                                 # fill data into histogram d0
    d0.Fill(x)

def pyf_tf1_params(x, p):  # define fit function: Gaussian with 3 parameters
    return p[0] * math.exp(-0.5 * ((x[0] - p[1])**2 / p[2]**2))
func = TF1("func", pyf_tf1_params, 1840., 1880., 3)
# func = TF1("func",'gaus',1840.,1880.)  # use predefined function

func.SetParameters(500., 1860., 5.5)  # set start parameters
myfit = d0.Fit(func, "S")             # fit function to the histogram data
print("Fit results: mean=", myfit.Parameter(1), " +/- ", myfit.ParError(1))
c.Draw()                              # draw canvas

myfile = TFile('myOutFile.root', 'RECREATE')  # open a ROOT file for output
c.Write()        # write canvas
d0.Write()       # write histogram
myfile.Close()   # close file
```
\normalsize

##

* Fit Options

\vspace{0.1cm}

::: columns
:::: {.column width=2%}
::::
:::: {.column width=98%}
![](figures/rootOptions.png)
::::
:::

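Several options can be combined in a single option string; a sketch, assuming
the histogram `d0` and function `func` from the example above:

\footnotesize

```python
# S: return the fit result object, E: run Minos errors, M: improve the minimum
myfit = d0.Fit(func, "SME")
```
\normalsize
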
## Exercise 5

Read the text file [\textcolor{violet}{FitTestData.txt}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/exercises/FitTestData.txt) and draw a histogram using PyROOT.

\setbeamertemplate{itemize item}{\color{red}$\square$}

* Determine the mean and sigma of the signal distribution. Which function do
you use for fitting?
* The option S fills the result object.
* Try to improve the errors of the fit values with Minos using the option E,
and also try the option M to scan for a new minimum; option V provides more
output.
* Fit the background outside the signal region; use the option R+ to add the
function to your fit

\small
Solution: [\textcolor{violet}{02\_fit\_ex\_5\_sol.py}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/solutions/02_fit_ex_5_sol.py) \normalsize

## iPython Examples for Fitting

The different Python packages are used in
\textcolor{blue}{example iPython notebooks}
to demonstrate the fitting of a third order polynomial to the same data,
available as numpy arrays.

\setbeamertemplate{itemize item}{\color{red}\tiny$\blacksquare$}

* LSQ fit of a polynomial to data using Minuit2 with
\textcolor{blue}{iminuit} and a \textcolor{blue}{matplotlib} plot:
\small
[\textcolor{violet}{02\_fit\_iminuitFit.ipynb}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/examples/02_fit_iminuitFit.ipynb)
\normalsize
* Graph fitting with \textcolor{blue}{pyROOT} with options using a Python
function, including a confidence level plot:
\small
[\textcolor{violet}{02\_fit\_fitGraph.ipynb}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/examples/02_fit_fitGraph.ipynb)
\normalsize
* Graph fitting with \textcolor{blue}{numpy} and confidence level
plotting with \textcolor{blue}{matplotlib}:
\small
[\textcolor{violet}{02\_fit\_numpyFit.ipynb}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/examples/02_fit_numpyFit.ipynb)
\normalsize
* Graph fitting with a polynomial fit of \textcolor{blue}{scikit-learn} and
plotting with \textcolor{blue}{matplotlib}:
\small
[\textcolor{violet}{02\_fit\_scikitFit.ipynb}](https://www.physi.uni-heidelberg.de/~reygers/lectures/2021/ml/examples/02_fit_scikitFit.ipynb)
\normalsize