Grid – Guide on ARC Submission
The Grid has several CE submission types (ARC, Condor, CREAM). At Durham we use the ARC CE software.
For information about the systems available please see the following pages:
Simple ARC Submission Script
Before submitting, you need a Grid certificate, and it must be set up correctly.
Save the following as mfj.sh
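#!/bin/bash
# Minimal test job: record the worker node's hostname and the input file contents.
echo "Grid Job has Started"
touch output.txt
hostname > output.txt
# Append (>>) so the hostname written above is not overwritten.
cat mfj-input.txt >> output.txt
rm -rf mfj-input.txt
echo "Grid Job has finished"
exit 0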
Save the following as mfj-input.txt
mfj1
mfj2
mfj3
Save the following example as mfj.xrsl.
&
(executable = "mfj.sh")
(arguments = "")
(jobName = "MyFirstJob")
(inputFiles = ("mfj-input.txt" ""))
(outputFiles = ("output.txt" ""))
(stdout = "stdout")
(stderr = "stderr")
(walltime = "60")
(count = "1")
(countpernode = "1")
We then make the script executable:
chmod 755 mfj.sh
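Before submitting, it can be worth running the job script locally to check that it produces the files the xRSL expects. The following is a minimal sketch, not part of the Durham setup: it recreates the example files in a scratch directory (an assumption standing in for the job's working directory on the worker node) and runs the script with stdout and stderr redirected to files of those names. The script body here appends the input with >> so that the hostname line is kept.

```shell
#!/bin/bash
# Local dry run of the example job in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Copy of the example input file.
printf 'mfj1\nmfj2\nmfj3\n' > mfj-input.txt

# Copy of the example job script.
cat > mfj.sh <<'EOF'
#!/bin/bash
echo "Grid Job has Started"
hostname > output.txt
cat mfj-input.txt >> output.txt
rm -rf mfj-input.txt
echo "Grid Job has finished"
exit 0
EOF

# Mimic the xRSL: stdout and stderr are captured in files named "stdout" and "stderr".
bash ./mfj.sh > stdout 2> stderr
status=$?

echo "exit status: $status"
echo "output.txt:"
cat output.txt
```

If the exit status is 0 and output.txt holds the hostname followed by the three input lines, the script itself is fine and any later failure is more likely to be in the submission step.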
We then need to ensure we have a valid proxy and submit:
user@gridui~: arcproxy -S pheno -c validityPeriod=24h -c vomsACvalidityPeriod=24h
Your identity: /C=UK/O=eScience/OU=Durham/CN=fake name
Contacting VOMS server (named pheno)
Proxy generation succeeded
Your proxy is valid until: yyyy-mm-dd hh:mm:ss
user@gridui~: arcsub --direct -c ce-test.dur.scotgrid.ac.uk mfj.xrsl
Job submitted with jobid: gsiftp://ce-test.dur.scotgrid.ac.uk:2811/jobs/8SnNDmgzwjxnXk5IKnL2N00mABFKDmABFKDmvpMKDmABFKDmYacvkm
The above example writes "Grid Job has Started" and "Grid Job has finished" to stdout, and writes the node hostname followed by a copy of the input file to output.txt. Once the job has run, its output is held until you collect it.
We have several CE systems; please ensure you submit to the correct one for your job.
To retrieve your jobs you can do
By default your job database is ~/.arc/jobs.dat, but you can select another location with the -j flag of the ARC client tools.
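Retrieval is normally done with the arcget client tool. The sketch below is one hedged way to wrap it: the helper names run and fetch_jobs, the DRY_RUN variable and the results directory are all illustrative assumptions, and the flags used are the standard client options -j (job database), -D (download directory) and -a (all jobs). With DRY_RUN=1 the command is only printed, so it can be checked without contacting a CE.

```shell
#!/bin/bash
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Fetch all jobs recorded in a job database into a download directory.
# Both arguments are optional; the defaults are assumptions for illustration.
fetch_jobs() {
    local jobdb="${1:-$HOME/.arc/jobs.dat}"
    local dest="${2:-results}"
    run arcget -j "$jobdb" -D "$dest" -a
}

# Preview the command that would be run:
DRY_RUN=1 fetch_jobs jobs.dat results
```

Dropping DRY_RUN=1 runs the real arcget. Note that arcget normally removes retrieved jobs from the CE, so the preview is a cheap way to confirm the job database path first.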
If you lose your job database, you can recreate it with arcsync.
arcsync -j jobs.xml -c ce-test.dur.scotgrid.ac.uk
Check your currently queued jobs
Get your job details/status with
arcstat -j <jobdatabase>
Please be aware that ARC is a large-scale system and changes can take up to 5 minutes to appear in its information system, so a newly submitted job won't always show in the client tools until up to 5 minutes later.
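Because of this delay, scripts that query a job straight after submission may want to retry rather than fail on the first attempt. A minimal sketch of a generic retry helper follows; retry_until is a hypothetical name, and the commented usage assumes that arcstat prints a "State:" line once the job is known, which should be checked against your client version.

```shell
#!/bin/bash
# Retry a command until it succeeds or the time budget (seconds) is used up.
# Usage: retry_until <budget_seconds> <interval_seconds> <command> [args...]
retry_until() {
    local budget="$1" interval="$2"
    shift 2
    local waited=0
    until "$@"; do
        if [ "$waited" -ge "$budget" ]; then
            return 1    # gave up: budget exhausted
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
}

# Example (assumption: arcstat prints a "State:" line once the job is visible):
#   job_visible() { arcstat -j jobs.dat "$1" 2>/dev/null | grep -q "State:"; }
#   retry_until 300 30 job_visible "$jobid" || echo "job still not visible after 5 minutes"
```

A 300 second budget matches the 5 minute propagation window described above; the 30 second interval is an arbitrary choice to avoid querying too often.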
Cancel a Job
Kill/cancel your arcjobs with
arckill -j <jobdatabase>
Proxy Renewal and Running Jobs
If your jobs run for longer than your proxy lifetime (~24 hours), you will need to renew your proxy and update the jobs with it. Please renew only on the CE you are using.
arcproxy -S pheno
arcrenew -c ce0.dur.scotgrid.ac.uk -a
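To decide when a renewal is due, the remaining proxy lifetime can be compared against a threshold. The sketch below keeps that decision in a small function; proxy_needs_renewal and the one hour default are illustrative assumptions, and the commented usage assumes that arcproxy -i validityLeft prints the remaining lifetime in seconds.

```shell
#!/bin/bash
# Succeed (exit 0) when the remaining proxy lifetime is below the threshold.
# Usage: proxy_needs_renewal <seconds_left> [threshold_seconds]
proxy_needs_renewal() {
    local left="$1"
    local threshold="${2:-3600}"    # default: renew when under one hour remains
    [ "$left" -lt "$threshold" ]
}

# Example (assumption: "arcproxy -i validityLeft" prints seconds remaining):
#   left=$(arcproxy -i validityLeft)
#   if proxy_needs_renewal "$left" 7200; then
#       arcproxy -S pheno
#       arcrenew -c ce0.dur.scotgrid.ac.uk -a
#   fi
```

Running such a check from cron or a loop on the UI is one option for long jobs, provided the proxy can be regenerated non-interactively in your setup.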
High-Throughput Mass Job Submission Example
This simple script allows quick mass job submission. It alternates between CE1 and CE2 to distribute the load and submits in near-parallel, pausing briefly so as not to trigger an ARC random number bug that occurs when too many jobs are submitted too quickly (especially noticeable when using GNU Parallel to submit).
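#!/bin/bash
COUNT=100        # number of jobs to submit
JOBDB=jobs.db    # job database to record the submissions in
CE=1             # CE to use for the next submission (1 or 2)
i=0
while [[ $COUNT -gt $i ]]; do
    # Submit in the background so submissions overlap.
    arcsub --direct -c "ce$CE.dur.scotgrid.ac.uk" -j "$JOBDB" job.xrsl >/dev/null &
    i=$((i+1))
    # Alternate between the two CEs; pause briefly after each pair.
    if [ $((i%2)) -ne 0 ]; then
        CE=2
    else
        CE=1
        sleep 0.2
    fi
done
wait    # let the remaining background submissions finish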
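The alternation can be checked without submitting anything. As a small sketch, the hypothetical helper ce_for_job below computes which CE the n-th submission (counting from 0) is sent to under the scheme described above, with even submissions going to ce1 and odd ones to ce2.

```shell
#!/bin/bash
# Hostname of the CE used for the i-th submission (0-based), alternating ce1/ce2.
ce_for_job() {
    local i="$1"
    echo "ce$(( i % 2 + 1 )).dur.scotgrid.ac.uk"
}

# Show the target CE for the first few submissions.
for i in 0 1 2 3; do
    echo "job $i -> $(ce_for_job "$i")"
done
```

Printing the targets this way is a quick sanity check that the load really is split evenly before launching a large batch.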
Visit the Nordugrid ARC help for more details and help on how to submit – http://www.nordugrid.org/arc/arc6/users/client_tools.html