


								\subsection{Treatment of multiple candidates}\label{sec:sel-MultipleCandidates}


								In most \lhcb analyses, multiple candidates are not a significant source of contamination thanks to the relatively precise charged-track selection. In the case of \piz reconstruction, especially of resolved \piz mesons, multiple candidates are however abundant. The term \emph{multiple candidates} refers to an event that is reconstructed with several signal candidates. This can happen \eg when a \piz meson is reconstructed using a random photon, especially in the case of very soft pions.


								As shown in \refFig{sel-Multiple}, the fraction of events containing multiple candidates decreases with an increasing cut on the MLP response. This reflects the fact that the MLP removes background events.


								\begin{figure}[hbt!]

								    \centering

								    \includegraphics[width=0.47\textwidth]{./Data/MultipleCandidates/Run_1_tagged.png} \hspace{10pt}

								    \includegraphics[width=0.47\textwidth]{./Data/MultipleCandidates/Run_2_tagged.png}

								    \captionof{figure}[Fraction of multiple candidates in data and simulation.]{Fraction of multiple candidates in data and simulation as a function of the cut on the MLP response. The fraction is defined as the number of all multiple candidates divided by the number of all events: if, \eg, a sample of ten events contains one event with two candidates, the fraction is 0.2. This represents what is actually excluded from the sample, as the \emph{fake} candidates are indistinguishable from the \emph{true} candidates. The blue points represent data, the orange points the simulation sample and the green points the truth-matched simulation sample.} \label{fig:sel-Multiple}

								\end{figure}


								Removing all multiple candidates, regardless of whether they correspond to signal, could negatively affect the significance $\mathcal{S}$, defined in \refEq{significance}, where $S$ is the number of signal candidates and $B$ is the number of background candidates:
								\begin{equation}\label{eq:significance}
									\mathcal{S} = \frac{S}{\sqrt{S+B}}\,.
								\end{equation}

								However, as shown in \refFig{sel-Multiple}, the final fraction of multiple candidates in the sample is about 10\%. In the worst possible case, this means that 5\% of true candidates have exactly one fake partner, and the significance is worsened by a factor of 0.97. The possible gain in significance from removing only the fake candidates is therefore negligible. As only a small fraction of candidates (about 1\%) have more than one fake partner, removing all events with \emph{at least} one fake partner does not worsen the significance further. As disentangling the \emph{true} candidate from the \emph{fake} candidate is almost impossible and the possible loss of significance is negligible, all multiple candidates are removed.
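
								The size of this effect can be checked with a rough estimate. As a simplifying assumption (not a measured property of the sample), take $S \approx B$ and let a fraction $f$ of the true candidates have exactly one fake partner, so that $F = fS$ fake candidates are counted in $B$. Removing both members of each such pair changes $S \to S-F$ and $B \to B-F$, and the significance from \refEq{significance} becomes
								\begin{equation}
									\mathcal{S}' = \frac{S-F}{\sqrt{S+B-2F}}\,, \qquad \frac{\mathcal{S}'}{\mathcal{S}} = \frac{S-F}{S}\sqrt{\frac{S+B}{S+B-2F}} = \sqrt{1-f} \quad \mathrm{for}\ S = B\,.
								\end{equation}
								For $f = 0.05$ this gives $\sqrt{0.95} \approx 0.97$, in line with the factor quoted above. The symbols $f$, $F$ and $\mathcal{S}'$ are introduced here for illustration only.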


								%As most events have two multiple candidates (about 8.6\% of all events is fake\footnote{Which makes up in total 17.2\% of all events.}), the effective significance is then

								%\begin{equation}

								%\mathcal{S'} = \frac{S-F}{\sqrt{S-F+B-F}}\,,

								%\end{equation}

								%where $F$ denotes the number of fake candidates.  The ratio of ideal significance $\mathcal{S}$ over achievable significance $\mathcal{S'}$ can be rewritten as

								%\begin{equation}

								% \frac{S \sqrt{1 - \frac{2 F}{B + S}}}{S -F}% \sim \frac{1}{\sqrt{(1-F/S)}}

								%\end{equation}

								%and assuming $F/(S+B) = 8.6\%$ and $S\sim B$, the significance is worsened by a factor of 0.98. On the other hand, as another 5.8\% have three multiple candidates\footnote{Including both fake and real events.} or more\footnote{Up to 8!}, therefore removal of all fake candidates together with the real candidates does not worsen the overall significance.


								Moreover, multiple candidates affect more than the shape of the background. As shown in \refFig{sel-MultipleResolution}, they typically worsen the momentum resolution, as they are background. As the \piz momentum is tied to \thetak (see \refFig{anglesB+}: \thetak is proportional to the asymmetry between the \Kp and \piz momenta), it is important to keep the \piz resolution as good as possible. Removing multiple candidates is therefore a crucial step in this analysis, even though it is not possible to distinguish a true candidate from a fake candidate.


								\begin{figure}[hbt!]

									\centering

									\includegraphics[width=0.325\textwidth]{Data/Resolution/MC/2016/new/measure_vs_true_MC_Run2_2016_TM_IDTM_rndGamma.eps}

									\includegraphics[width=0.325\textwidth]{Data/Resolution/MC/2016/new/measure_vs_true_MC_Run2_2016_TM_IDTM_rndGamma_TMVA0.990000.eps}

									\includegraphics[width=0.325\textwidth]{Data/Resolution/MC/2016/new/measure_vs_true_MC_Run2_2016_TM_IDTM_rndGamma_TMVA0.990000_removedMultiple.eps} \\


									\raggedright

									   \captionof{figure}[Neutral pion momentum resolution, 2016 simulation sample.]{\piz momentum resolution in the 2016 truth-matched simulation sample. The x-axis shows the \emph{true} \piz momentum and the y-axis the \emph{measured} \piz momentum. On the left, all events are shown. In the middle, events passing a cut on the MLP response of 0.99 are shown\protect\footnotemark. On the right, multiple candidates are removed in addition to the cut on the MLP response at 0.99. Removing multiple candidates clearly discards candidates with worse momentum resolution, especially for soft pions.} \label{fig:sel-MultipleResolution}

								\end{figure}


								\footnotetext{This value is chosen arbitrarily for illustration, as it is clear from the MLP training that the optimal cut on the MLP response will be very close to one.}