diff --git a/computing_htcondor.md b/computing_htcondor.md
new file mode 100644
index 0000000..27a428e
--- /dev/null
+++ b/computing_htcondor.md
@@ -0,0 +1,36 @@
+*Email from Alexey:*
+
+Dear colleagues,
+
+I have achieved what I planned as the default behavior of our batch system (HTCondor) when submission is done from the CentOS7 container (currently only from lhcba1, port 30).
+
+`ssh -p 30 lhcba1.physi.uni-heidelberg.de`
+
+Without any extra flags in the configuration, jobs should run under
+CentOS7 (local container), after the login scripts have been applied, and in the directory that was current at the time of submission. Jobs will also run on "LHCb software compatible"
+servers only.
+
+There are currently 3 servers with 90 slots in total that support this model. There will be one more with 16 slots.
+4 other servers are interactive at the moment: lhcba1, d0new, lhcbi1 and the not yet updated d0bar-new. They can be added for batch processing (also time-limited, e.g. at night and
+on weekends), but there are no such plans at the moment.
+
+All other servers (many...) are "old". They will be updated to support the submission described above, but they may fail to run particular versions of LHCb software.
+
+A simple test can be started from lhcba1, port 30, with the command
+
+`condor_submit -interactive`
+
+An example submission file is in
+/auto/work/zhelezov/singularity/batch_centos7. Do not forget to submit jobs from a directory you can write to; otherwise the log files cannot be written and your jobs will stay in the "on hold" state forever.
+
+I still propose using the Singularity-based approach when possible, as demonstrated in /auto/work/zhelezov/singularity/FCNCfitter.
+That allows the use of SLC6 / CentOS8 / etc. without local installation on all servers.
+
+Although I have not really checked, I believe the environment closely mimics the current CERN/DESY HTCondor. Note that the defaults for reserved resources are conservative (everywhere): 1 core, 512MB RAM.
+It is better to specify the required resources explicitly (as documented in the general HTCondor manual).
+
+For the moment there are no multi-core slots, and there is up to 8GB RAM per slot.
+Jobs with higher requirements will find no working nodes. Please let me know if you hit this problem.
+
+Regards,
+Alexey.
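
A minimal sketch of a submit description file that states the resources explicitly, as the email recommends. The executable name `run_job.sh` and the output file names are hypothetical placeholders, and the values are only chosen to fit the limits quoted in the email (single-core slots, at most 8GB RAM per slot). The actual example in /auto/work/zhelezov/singularity/batch_centos7 may look different.

```
# Hypothetical job script; replace with your own executable.
executable     = run_job.sh

# Output, error and log files are written relative to the submission
# directory, so submit from a directory you can write to.
output         = job.$(Cluster).$(Process).out
error          = job.$(Cluster).$(Process).err
log            = job.$(Cluster).log

# Explicit resource requests instead of the conservative defaults
# (1 core, 512MB RAM). There are no multi-core slots at the moment.
request_cpus   = 1
# A plain number is interpreted as MiB; stay within the 8GB per-slot limit.
request_memory = 2048

queue 1
```

Saved under a name such as `job.sub` (also hypothetical) in a writable directory, this would be submitted with `condor_submit job.sub` from lhcba1, port 30.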