Friday, June 29, 2012

CMS with the Campus Factory


The Campus Factory is usually used by small research groups to expand their available resources to those on the campus.  Of course, that's not always easy for larger VOs, which tend to have more complicated software setups.  This is where the combination of Parrot and CernVM-FS comes in.
Source: http://cernvm.cern.ch/portal/filesystem

CernVM-FS is an HTTP-based file system that serves the software repositories of many CERN-based VOs.  In our case, we used a CernVM-FS server hosted at the University of Wisconsin - Madison (Docs).

Parrot is a program that will capture reads and writes from arbitrary executables and redirect them to remote resources.  For our use, we will redirect reads from the local file system to reads from the CernVM-FS server at UW.
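
To make the redirection idea concrete, here is a toy sketch in Python.  It is not how Parrot or CernVM-FS are implemented (the real client fetches catalogs and content-addressed objects over HTTP, and Parrot intercepts system calls); it only shows the decision being made for each read.  The mount point, the server URL, and the function name are all made up for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical values, for illustration only (not the real UW server).
CVMFS_MOUNT = PurePosixPath("/cvmfs/cms.example.org")
REPO_URL = "http://cvmfs.example.org/cvmfs/cms.example.org"


def redirect_read(path: str) -> str:
    """Return where a read of `path` would be served from.

    Paths under the (hypothetical) CernVM-FS mount point are mapped to the
    remote repository; every other path is left on the local file system.
    """
    try:
        relative = PurePosixPath(path).relative_to(CVMFS_MOUNT)
    except ValueError:
        # Not under the mount point: an ordinary local read.
        return path
    if relative == PurePosixPath("."):
        # The mount point itself maps to the repository root.
        return REPO_URL
    return f"{REPO_URL}/{relative}"


if __name__ == "__main__":
    print(redirect_read("/cvmfs/cms.example.org/sw/setup.sh"))
    print(redirect_read("/home/user/data.txt"))
```

The point is that the job itself never changes: it keeps opening the same paths, and only the layer underneath decides whether the bytes come from local disk or from the remote server.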

Our T3, as usual, is oversubscribed.  Sending our T3 jobs out onto the grid, much like overflowing Tier 2 jobs, would significantly decrease the time to completion for our CMS users.  However, our campus grid does not have CMS software available everywhere, so we must export the software to the jobs.  For this, we use Parrot and CernVM-FS.

Pilot submission of BOSCO


The BOSCO system is depicted in the above graphic.  First, the user submits their jobs to their local Condor.  This instance of Condor could be tied to local resources that can also run their jobs, but for this picture, we only show the BOSCO resources.  The Factory periodically queries the user's Condor and submits pilot jobs to run the user's jobs.  Once the pilots start on the remote system, they begin executing the user's jobs.  The user does not have to specify any special requirements, nor use any special commands, for this system to work.
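
The Factory's periodic decision can be sketched roughly as follows.  This is a simplified model written for this post, not the actual Campus Factory code: the function name, the two limits, and their default values are assumptions chosen for illustration.

```python
def pilots_to_submit(idle_user_jobs: int,
                     idle_pilots: int,
                     running_pilots: int,
                     max_idle_pilots: int = 5,
                     max_pilots: int = 20) -> int:
    """Decide how many new pilots to submit in one factory cycle.

    All parameter names and default limits are hypothetical.
    """
    if idle_user_jobs <= 0:
        # Nothing is waiting in the user's queue, so no pilots are needed.
        return 0
    # Keep at most `max_idle_pilots` pilots waiting in the remote queue,
    # and never more than there are idle user jobs to fill them.
    wanted = min(idle_user_jobs, max_idle_pilots) - idle_pilots
    # Respect an overall cap on pilots (idle plus running).
    headroom = max_pilots - (idle_pilots + running_pilots)
    return max(0, min(wanted, headroom))


if __name__ == "__main__":
    # Ten idle user jobs, no pilots yet: submit up to the idle-pilot limit.
    print(pilots_to_submit(idle_user_jobs=10, idle_pilots=0, running_pilots=0))
```

Capping the number of idle pilots keeps the factory from flooding the remote queue when many user jobs arrive at once, while still growing the pool as pilots start running.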

We used BOSCO to flock jobs from our T3 to our other campus resources.  This process required no user interaction.  As a matter of fact, the user had no idea that her jobs were not running on the T3.  This transparent interaction with the user is the primary goal of the Campus Factory design, and it was clearly achieved in this experiment.
Tier-3 Connection to the UNL Campus Grid

We hope to make this a production service in the future.  In the meantime, it serves as a prototype for what other campuses can do with BOSCO.

Acknowledgments: Dan Bradley and the ccTools team for the CernVM-FS integration with Parrot; the AAA project for the file infrastructure that enables transparent data access; and Helena Malbouisson for allowing me to play with her jobs by sending them to other resources.
Modifications to the Campus Factory configs can be found on GitHub.
