Add 'computing_htcondor'
parent
718be34255
commit
7c14429048
36
computing_htcondor.md
Normal file
@ -0,0 +1,36 @@
*Email from Alexey:*
Dear colleagues,
I have set up what I had planned as the default behavior of our batch system (HTCondor) when jobs are submitted from the CentOS7 container (currently only from lhcba1, port 30).
`ssh -p 30 lhcba1.physi.uni-heidelberg.de`
Without any extra flags in the configuration, jobs should run under CentOS7 (local container), after the login scripts have been applied, and in the directory that was current at the time of submission. Jobs will also run only on "LHCb software compatible" servers.
There are currently 3 servers with 90 slots in total that support this model. One more server with 16 slots will be added.
4 other servers are interactive at the moment: lhcba1, d0new, lhcbi1 and the not yet updated d0bar-new. They could be added for batch processing (possibly time-limited, e.g. at night and on weekends), but there are no such plans at the moment.
All other servers (many...) are "old". They will be updated to support the submission model described above, but they may fail to run particular versions of LHCb software.
A simple test can be started from lhcba1, port 30, with the command
`condor_submit -interactive`
An example submission file is in /auto/work/zhelezov/singularity/batch_centos7. Do not forget to submit jobs from a directory you can write to; otherwise the log files cannot be written and your jobs will stay in the "on hold" state forever.
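
The write-permission pitfall can be caught before submission. A minimal sketch, assuming a POSIX shell; the function name `submit_dir_is_writable` and the file name `job.sub` are hypothetical and do not come from the example in /auto/work/zhelezov/singularity/batch_centos7:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given directory (default: the
# current one) exists and is writable, so that log files can be created there.
submit_dir_is_writable() {
    dir="${1:-.}"
    [ -d "$dir" ] && [ -w "$dir" ]
}

# Usage (job.sub is a placeholder name for your own submit file):
#   submit_dir_is_writable && condor_submit job.sub
```

The check only covers the submission directory itself; if the submit file points its log files elsewhere, that location would need the same test.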
I still propose using the Singularity-based approach when possible, as demonstrated in /auto/work/zhelezov/singularity/FCNCfitter.
It allows the use of SLC6 / CentOS8 / etc. without a local installation on all servers.
Although I have not really checked, I believe the environment closely mimics the current CERN/DESY HTCondor. Note that the defaults for reserved resources are conservative (everywhere): 1 core, 512MB RAM.
It is better to specify the required resources explicitly (as documented in the general HTCondor manual).
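
Explicit resource requests go into the submit description file. The sketch below writes one from the shell. `executable`, `output`, `error`, `log`, `request_cpus`, `request_memory` and `queue` are standard HTCondor submit commands; the function name, the file name `job.sub`, the script `run.sh` and the 2048 MB figure are placeholders, not values taken from the example file on this cluster:

```shell
#!/bin/sh
# Hypothetical helper: write a submit description that requests resources
# explicitly instead of relying on the 1 core / 512MB defaults.
# $1 = submit file to create
# $2 = requested memory in MB (the unit HTCondor assumes for a bare number)
write_submit_file() {
    cat > "$1" <<EOF
executable     = run.sh
output         = job.\$(ClusterId).\$(ProcId).out
error          = job.\$(ClusterId).\$(ProcId).err
log            = job.\$(ClusterId).log
request_cpus   = 1
request_memory = $2
queue
EOF
}

# Usage (placeholder names):
#   write_submit_file job.sub 2048 && condor_submit job.sub
```

The escaped `\$(...)` keeps the HTCondor macros literal in the generated file, while `$2` is filled in by the shell.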
For the moment there are no multi-core slots, and each slot has up to 8GB RAM.
Jobs with higher requirements will find no worker nodes. Please let me know if you hit this problem.
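
A request can be compared against these limits before submitting. This is a sketch under the assumption that "8GB" means 8192 MB and that only one core per slot is available; the function name and its arguments are hypothetical:

```shell
#!/bin/sh
# Hypothetical pre-submission check against the limits stated in the email:
# single-core slots only, at most 8GB (taken here as 8192 MB) RAM per slot.
# $1 = requested cores, $2 = requested memory in MB
fits_current_slots() {
    [ "$1" -le 1 ] && [ "$2" -le 8192 ]
}

# Usage:
#   fits_current_slots 1 4096 || echo "no slot can satisfy this request" >&2
```

The numbers are hard-coded from the email and would need updating once multi-core or larger slots are offered.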
Regards,
Alexey.