Update 'Ganga'

Renata Kopecná 2022-02-10 13:24:49 +01:00
parent 63d1823f60
commit 1ff263f7e1

@ -1,10 +1,20 @@
## Table of Contents
* [Contents of the folder](#contents-of-the-folder)
* [How to get the condDB tag and more info](#how-to-get-the-conddb-tag-and-more-info)
* [xml Files](#xml-files)
* [Getting lfns](#getting-lfns)
* [Option Files](#option-files)
* [Running ganga](#running-ganga)
* [Smart scripts](#smart-scripts)
* [CondDB info](#conddb-info)
# Contents of the folder
* `decFiles`: This folder contains the decFiles for `Bu_JpsiKst`, `Bu_Kstmumu` and `Bu_KstPhi` decays.
* `xmlFiles`: This folder contains the catalogs with the corresponding locations of the `.MDST` and `.DST` files on the grid.
* `lfnFiles`: This folder contains the python script with the lfn addresses of the files on the grid. [Click here for more details on lfns and downloading files](https://lhcb.github.io/starterkit-lessons/first-analysis-steps/files-from-grid.html).
* `CondDB_info`: Information about the simulation samples. The folder contains one file per decay channel, named after its decay number. Each file stores the bookkeeping path together with the database tag, the simulation version, the number of files, the number of events and the lfn tag.
* `OptionFiles`: Files used by [ganga](https://lhcb.github.io/starterkit-lessons/second-analysis-steps/ganga-scripting.html) as [option files](https://lhcb.github.io/starterkit-lessons/first-analysis-steps/minimal-dv-job.html). Each channel, year and polarity has its own option file. One option file, `BasicOptfile.py`, is common to all variants of the data (MC/Data, polarity, K+/KS, ...); it specifies the TupleTools and LoKi functors used. There are also two files with detailed settings that can be used to test the production locally. A dedicated python script, `ScriptForScripts.py`, creates the specifications for each year, data type, and so on.
* `RunningGanga`: Scripts used to actually run ganga.
* `LocationList`: Scripts used to get the location of the produced tuples from Ganga on the grid and download them locally to a folder at the Heidelberg server. Includes a `.txt` file with a list of the jobs and their numbers assigned by Ganga.
* `SimulationDetails`: Bookkeeping details about the simulation samples in `.txt` files. Includes all the simulation steps.
@ -25,23 +35,35 @@ get_bookkeeping_info 12113100
### xml Files
The catalogs stored in the `xmlFiles` folder contain the information needed to download the corresponding files from the grid. They might be obsolete, as the data is not stored there forever. A manual on how to generate and use the xml files [can be found here](https://lhcb.github.io/starterkit-lessons/first-analysis-steps/files-from-grid.html).
### Getting lfns
See the [Starterkit lesson: downloading files](https://lhcb.github.io/starterkit-lessons/first-analysis-steps/files-from-grid.html). Following this tutorial, the handy script `Code/Ganga/lfnFiles/get_LFNs.py` is included; it downloads the tags and lfns.
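
Once an lfn file has been downloaded, it can be handy to pull the plain lfn paths out of it, for example to count the files. The snippet below is only a rough sketch and is not taken from `get_LFNs.py`: it assumes the file lists its entries as quoted `'LFN:/lhcb/...'` strings, and the function name `extract_lfns` and the example paths are made up for illustration.

```python
import re

# Matches quoted 'LFN:/lhcb/...' entries.
# Assumption: the lfn file lists its entries in this form.
LFN_PATTERN = re.compile(r"""['"]LFN:(/lhcb/[^'"]+)['"]""")


def extract_lfns(text):
    """Return the lfn paths found in the text, without the 'LFN:' prefix."""
    return LFN_PATTERN.findall(text)


if __name__ == "__main__":
    # Placeholder content: these paths are invented, not real grid locations.
    example = """
IOHelper('ROOT').inputFiles([
    'LFN:/lhcb/MC/placeholder/0000/file_1.dst',
    'LFN:/lhcb/MC/placeholder/0000/file_2.dst',
], clear=True)
"""
    for lfn in extract_lfns(example):
        print(lfn)
```

The number of extracted paths can then be compared with the number of files listed for the same sample in `CondDB_info`.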
### Option Files
The data was produced using DaVinci v41r2 (mostly). To verify which DaVinci version was used for which sample, check the ganga submission scripts. For each year, the `DV.directory` with the corresponding version should be listed. The option files are generated automatically for each sample using `ScriptForScripts.py`. The script loops over samples (data, MC, PHSP, background MC), years, polarities, rare/reference and channels, and prints the corresponding settings into the option file. Then the content of `BasicOptfile.py` is copied into the option file. This way there is only one file with all the settings, and one doesn't need to modify 200 files when something changes.
Note that the option files were written and used when Ganga was still using python2. They would have to be updated to python3 before being used now.
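
The generation scheme described above (loop over all combinations, write the per-sample settings, then append the common file) can be sketched in python3 roughly as follows. This is a minimal illustration, not the content of `ScriptForScripts.py`: the lists of samples, years and polarities, the variable names written into the header, the output file names and the function `write_option_files` are all assumptions.

```python
from itertools import product
from pathlib import Path
import tempfile

# Hypothetical placeholder values, not the lists used in ScriptForScripts.py.
SAMPLES = ["data", "MC"]
YEARS = [2016, 2017]
POLARITIES = ["MagDown", "MagUp"]


def write_option_files(basic_optfile, out_dir):
    """Write one option file per (sample, year, polarity) combination.

    Each file starts with the settings specific to that combination,
    followed by the unchanged content of the common option file.
    """
    common = Path(basic_optfile).read_text()
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for sample, year, polarity in product(SAMPLES, YEARS, POLARITIES):
        # Per-combination settings (the variable names are assumptions).
        header = (
            f"# Settings for {sample} {year} {polarity}\n"
            f"SAMPLE = {sample!r}\n"
            f"YEAR = {year}\n"
            f"POLARITY = {polarity!r}\n\n"
        )
        path = out_dir / f"optfile_{sample}_{year}_{polarity}.py"
        path.write_text(header + common)
        written.append(path)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for BasicOptfile.py so the sketch runs on its own.
        basic = Path(tmp) / "BasicOptfile.py"
        basic.write_text("# common TupleTools and LoKi functors go here\n")
        files = write_option_files(basic, Path(tmp) / "generated")
        print(f"wrote {len(files)} option files")
```

Because the common part lives only in `BasicOptfile.py`, rerunning the generator after a change regenerates every option file consistently.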
# Running ganga
`run_local_tests_all.sh`: A script used to produce tuples locally. This is a useful check before sending the whole thing to the grid, where it can just fail.
The submission python scripts can be found in `Code/Ganga/RunningGanga/`.
How to use and run ganga is nicely summarised [here](https://lhcb.github.io/starterkit-lessons/first-analysis-steps/davinci-grid.html)
# Smart scripts
There are scripts to retrieve the generator efficiency, see `get_MC_eff.py`, `get_MCref_eff.py` and `get_PHSP_eff.py`.
The scripts used to download the data from the grid to Heidelberg are saved in `/Code/Ganga/LocationList/`.
### CondDB info
To get the information about the available samples with decay ID 10000027, do:
```
export PATH=$PATH:/afs/cern.ch/user/i/ibelyaev/public/scripts
lhcb-proxy-init
get_bookkeeping_info 10000027
```