Grid – Guide on ARC Submission

The Grid supports several Computing Element (CE) submission types (ARC, Condor, CREAM). At Durham we use the ARC CE software.

Available Resources

For information about the systems available please see the following pages:

Simple ARC Submission Script

You first need a Grid Certificate, and it must be set up correctly before you can submit jobs.

Save the following as mfj.sh

#!/bin/bash
echo "Grid Job has Started"
# Record the worker node's hostname, then append a copy of the input file
hostname > output.txt
cat mfj-input.txt >> output.txt
rm -f mfj-input.txt
echo "Grid Job has finished"
exit 0

Save the following as mfj-input.txt

mfj1
mfj2
mfj3

Save the following example as mfj.xrsl (walltime is the job's time limit in minutes).

& (executable = "mfj.sh")
(arguments = "")
(jobName="MyFirstJob")
(inputFiles = ("mfj-input.txt" "") )
(outputFiles = ("output.txt" "") )
(stdout = "stdout")
(stderr = "stderr")
(walltime="10")
(count="1")
(countpernode="1")
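
If you prepare many similar jobs, the job description can be generated rather than edited by hand. The following is a minimal sketch, not part of ARC: write_xrsl is a hypothetical helper name, and it only reuses the attributes shown in the mfj.xrsl example above.

```shell
#!/bin/bash
# Hypothetical helper (not part of ARC): write an xRSL job description
# using the same attributes as the mfj.xrsl example.
# Usage: write_xrsl <script> <input file> <walltime in minutes> <output xrsl>
write_xrsl() {
  local script="$1" input="$2" walltime="$3" out="$4"
  cat > "$out" <<EOF
& (executable = "$script")
(arguments = "")
(jobName = "MyFirstJob")
(inputFiles = ("$input" "") )
(outputFiles = ("output.txt" "") )
(stdout = "stdout")
(stderr = "stderr")
(walltime = "$walltime")
(count = "1")
(countpernode = "1")
EOF
}

# Example: recreate the job description from this guide in a scratch directory.
workdir=$(mktemp -d)
write_xrsl mfj.sh mfj-input.txt 10 "$workdir/mfj.xrsl"
cat "$workdir/mfj.xrsl"
```

The generated file is written to a temporary directory so that the sketch does not overwrite a real mfj.xrsl.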

We then make the script executable like so:

chmod 755 mfj.sh

We then need to ensure we have a valid proxy and submit:

user@gridui~: arcproxy -S pheno -c validityPeriod=24h -c vomsACvalidityPeriod=24h
Your identity: /C=UK/O=eScience/OU=Durham/CN=fake name
Contacting VOMS server (named pheno)
Proxy generation succeeded
Your proxy is valid until: yyyy-mm-dd hh:mm:ss
user@gridui~: arcsub --direct -c ce-test.dur.scotgrid.ac.uk mfj.xrsl
Job submitted with jobid: gsiftp://ce-test.dur.scotgrid.ac.uk:2811/jobs/8SnNDmgzwjxnXk5IKnL2N00mABFKDmABFKDmvpMKDmABFKDmYacvkm
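
When submitting from a script it is often handy to keep the job ID for later use with the other client tools. A minimal sketch, assuming arcsub prints the "Job submitted with jobid:" line exactly as shown above; extract_jobid is a hypothetical helper name and the demo uses the sample line from this guide, so it needs no Grid access.

```shell
#!/bin/bash
# Hypothetical helper: pull the job ID out of a
# "Job submitted with jobid: <id>" line read from standard input.
extract_jobid() {
  sed -n 's/^Job submitted with jobid: //p'
}

# Demo using the output line shown in this guide.
sample='Job submitted with jobid: gsiftp://ce-test.dur.scotgrid.ac.uk:2811/jobs/8SnNDmgzwjxnXk5IKnL2N00mABFKDmABFKDmvpMKDmABFKDmYacvkm'
jobid=$(printf '%s\n' "$sample" | extract_jobid)
echo "$jobid"
```

In a real session the input would come from the submission itself, for example by piping the arcsub command above into extract_jobid.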

The above example writes "Grid Job has Started" and "Grid Job has finished" to stdout, and writes the node hostname and a copy of the input file to output.txt. The job runs and its output is then held until you collect it.

We have several CE systems; please ensure you submit to the correct one for your job.

Job Retrieval

To retrieve the output of your jobs you can do

arcget <jobid>

for example

arcget gsiftp://ce-test.dur.scotgrid.ac.uk:2811/jobs/8SnNDmgzwjxnXk5IKnL2N00mABFKDmABFKDmvpMKDmABFKDmYacvkm

Job Database

By default your job database is in ~/.arc/jobs.dat, but you can select another location with the -j flag of the ARC client tools.

If you lose your job database you can recreate it with arcsync.

arcsync -j jobs.xml -c ce-test.dur.scotgrid.ac.uk

Check your currently queued jobs

Get your job details/status with

arcstat <jobid>

or

arcstat -j <jobdatabase>

Please be aware that ARC is a large-scale system and changes can take up to 5 minutes to propagate, so a job may have been submitted but won't always show in the client tools until 5 minutes later.
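
Because of this delay, a script that checks on a job straight after submission may need to retry. The following is a minimal sketch of a generic retry loop; retry_until is a hypothetical helper name, and the demo uses a stand-in command rather than a real ARC tool.

```shell
#!/bin/bash
# Hypothetical helper: rerun a command every <interval> seconds until it
# succeeds or at least <timeout> seconds have been spent waiting.
# Usage: retry_until <timeout> <interval> <command> [args...]
retry_until() {
  local timeout="$1" interval="$2" waited=0
  shift 2
  until "$@"; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  return 0
}

# Demo with a stand-in command that only succeeds on its third call.
attempts_file=$(mktemp)
flaky() {
  echo x >> "$attempts_file"
  [ "$(grep -c x "$attempts_file")" -ge 3 ]
}
retry_until 10 1 flaky && echo "succeeded after $(grep -c x "$attempts_file") attempts"
```

To cover the 5 minute window, something like retry_until 300 30 arcstat <jobid> could be used. This assumes arcstat returns a non-zero exit status while the job is not yet visible, which you should check against your client version.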

Cancel a Job

Kill/cancel your ARC jobs with

arckill <jobid>

or

arckill -j <jobdatabase>

Proxy Renewal and Running Jobs

If your jobs run for longer than your proxy lifetime (~24 hours) then you will need to renew your proxy and update the jobs. Please update only the CE you are using.

arcproxy -S pheno
arcrenew -c ce0.dur.scotgrid.ac.uk -a
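
For long-running jobs you may want a script to decide when renewal is due. A minimal sketch of that check follows; proxy_needs_renewal is a hypothetical helper name, the one hour default threshold is an arbitrary choice, and the demo uses fixed numbers rather than querying a real proxy.

```shell
#!/bin/bash
# Hypothetical helper: succeed (exit status 0) when the remaining proxy
# lifetime in seconds is below a threshold (default: 3600 seconds).
# Usage: proxy_needs_renewal <seconds left> [threshold in seconds]
proxy_needs_renewal() {
  local seconds_left="$1" threshold="${2:-3600}"
  [ "$seconds_left" -lt "$threshold" ]
}

# Demo with a fixed value. In a real session the first argument could come
# from something like `arcproxy -i validityLeft` (assumption: check that
# your arcproxy version supports this information item).
if proxy_needs_renewal 1800; then
  echo "renew proxy"
else
  echo "proxy ok"
fi
```

When the check succeeds, the arcproxy and arcrenew commands above are the ones to run.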

High-Throughput Mass Job Submission Example

This is a simple script for quick mass job submission. It alternates between CE1 and CE2 to help distribute the load, and submits in near-parallel without hitting an ARC random number bug that occurs if you submit too many jobs too quickly (especially noticeable when using GNU Parallel to submit).

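COUNT=100
JOBDB=jobs.db

CE=1
i=0
while [[ $COUNT -gt $i ]]; do
  # Submit in the background so that submissions overlap
  arcsub --direct -c "ce$CE.dur.scotgrid.ac.uk" -j "$JOBDB" job.xrsl >/dev/null &
  i=$((i+1))
  # Alternate between CE1 and CE2 (shell variable names are case-sensitive)
  if [ $((i%2)) -ne 0 ]; then
    CE=2
  else
    CE=1
    sleep 0.2  # short pause after every second job to avoid submitting too fast
  fi
done
wait  # let the background submissions finish before the script exits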
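
Before submitting a large batch it can help to do a dry run and see which CE each job would go to. A minimal sketch of the alternation only: pick_ce is a hypothetical helper name, and the demo prints the arcsub commands instead of running them.

```shell
#!/bin/bash
# Hypothetical helper: map a job index to a CE hostname, alternating
# between ce1 and ce2 as the script above is intended to do.
pick_ce() {
  local index="$1"
  echo "ce$(( index % 2 + 1 )).dur.scotgrid.ac.uk"
}

# Dry run: print the commands that would be issued for four jobs.
for i in 0 1 2 3; do
  echo "arcsub --direct -c $(pick_ce "$i") -j jobs.db job.xrsl"
done
```

Even-numbered jobs go to ce1 and odd-numbered jobs to ce2, so the load is split evenly for any job count.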

More Help?

Visit the NorduGrid ARC documentation for more details and help on how to submit – http://www.nordugrid.org/arc/arc6/users/client_tools.html