Friday, August 7, 2015

GPUs and adding new resources types to the HTCondor-CE

In the past, it has been difficult to add new resource types to the OSG CE, whether it was a Globus GRAM CE or a HTCondor-CE.  But, it has just gotten a little bit easier.  Today I added HCC's GPU resources to the OSG with this new method.

With a (yet unapproved) pull request, the HTCondor-CE is able to add new resource types by modifying only 2 files, the routes table and scheduler attributes customization script.  Previously, it required editing a third python script which had very tricky syntax (python, which spit out ClassAds...).  In the following examples, I will demonstrate how to use this new feature with GPUs.

The Routes

Each job submitted to a HTCondor-CE must follow a route from the original job, to the final job submitted to the local batch system.  The HTCondor JobRouter is in charge of translating the original job to the final job, according to rules specified in the router configuration.  Crane's GPU route is:

The route submit the job to the local PBS (actually Slurm) scheduler to the grid_gpu partition.  Further, it adds a special new attribute:
default_remote_cerequirements = "RequestGpus == 1"
This attribute is used in the next section, the local submit attributes script.

Local Submit Attributes Script

The local submit attributes script translates the remote_cerequirements to the actual scheduler language used at the site.  For Crane's GPU configuration, the snippet added for GPUs is:

This snippet checks for the existence of the RequestGpus attribute from the environment, and if detected, will insert several lines into the submit script.  It will first add the SLURM line to request a GPU, then it will source the module setup script and load the cuda module.

Next Steps

The next steps for using GPUs on the OSG is to use one of the many frontends that are capable of submitting glideins to the GPU resources at HCC.  Currently, the HCC, OSG, OSGConnect, and GLOW frontends are capable of submitting to the GPU resources.