Friday, May 31, 2013

Bosco at GPN Conference


This week, I presented the Bosco poster at the Great Plains Network annual meeting.  It was very well received.  Many people were interested in what Bosco could do for them.  Most of the audience was HPC center admins or directors, so they were looking at how they could use Bosco to help their users make use of their campus clusters.

At the poster presentation
At the conference we heard a lot about networks, as GPN is mostly a network collaboration.  But we also heard about the Condo of Condos proposal, in which the OSG is well represented (by Miron).  There is a very good webcast describing Condo of Condos on the I2 website.  It feels like the proposal is still very early in the planning phase, but I am curious how the OSG will integrate with the Condos.

The weather has been terrible, but the meeting has been great.


Monday, May 20, 2013

Submitting R jobs with Bosco

The Bosco team has been working on integrating with the R statistics processing language.  We have chosen to modify the GridR package in order to integrate with R.

How will the R user see Bosco?

The goal of the integration is to make it simpler to submit processing written in the R language to remote clusters and grids.  The expected steps for the user are:
  1. Install Bosco
  2. Install the Bosco-enabled GridR package into your local R environment.
After installing these two pieces of software, the user creates an R script that includes the function to be executed on the remote cluster.  The user can send any data as input: lists, tables, or an entire CSV file (already read into an R variable).  The function's output is automatically imported into the local R environment when the remote job has completed.

Below is a demo of the GridR package working with Bosco to submit to a campus cluster here at Nebraska.

RStudio IDE showing demo of Bosco + GridR integration
The steps in the demo are:
  1. Load the GridR library
  2. Create the function, in this case named simply 'a' that doubles the value of the argument.
  3. Initialize the GridR integration to talk to Bosco
  4. "Apply" the function.  Run the function 'a', with the input 14, and write the result to the variable "x".  Also, wait for the remote job to complete.
  5. Finally, I printed out the value of x, which is 28, double the 14. 
This is a very simple demo.  You could imagine the function sent to the remote machine parsing a CSV file or performing more complex operations.
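
For readers who cannot make out the screenshot, the demo steps above can be sketched roughly as follows.  This is a minimal sketch, not the exact code from the demo: the grid.init and grid.apply calls follow GridR's interface, but the argument values shown (the "bosco.direct" service name and the "tmp" directory) are assumptions about the Bosco-enabled build and may differ in the released package.  The run_remote flag is a hypothetical switch added only so the sketch also runs on a machine without Bosco.

```r
# Step 2: the function to run remotely; it doubles its argument.
a <- function(s) {
  return(2 * s)
}

# Hypothetical switch: set to TRUE only where Bosco and the Bosco-enabled
# GridR package are installed and a remote cluster has been added to Bosco.
run_remote <- FALSE

if (run_remote) {
  # Step 1: load the GridR library.
  library("GridR")
  # Step 3: initialize GridR to talk to Bosco.
  # Assumption: the service name and temporary directory are illustrative.
  grid.init(service = "bosco.direct", localTmpDir = "tmp")
  # Step 4: run a(14) remotely, write the result to the variable "x",
  # and wait for the remote job to complete.
  grid.apply("x", a, 14, wait = TRUE)
} else {
  # Local stand-in so the sketch can be run without a cluster.
  x <- a(14)
}

# Step 5: print the result, which is double the input.
print(x)
```

With run_remote left as FALSE the function is simply called locally, which is a quick way to check it before sending it to a cluster.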

The Bosco team expects to have this integration finished and in production by mid-July, in time for the R users meeting.

Bosco Download