Thursday, November 1, 2012

BOSCO v1.1 Features: Single Port Usage

Welcome to part 2 of my ongoing series on v1.1 features for BOSCO.  Part 1 was on SSH File Transfer.

This time, I'll talk about a new feature that we didn't plan on implementing at first: using only a single port for all communication.  After a small investigation, we found that using a single port is very simple and does not disrupt other components.  I talked briefly about it in a previous post.

What is it?

In 1.0 of BOSCO, the submit host needed a lot of ports open for connections originating from the remote clusters.  This was caused by 2 mechanisms:
  1. File transfer from the BOSCO submit host to the cluster login node before issuing the local submit call (qsub, condor_submit...).  This opens ports on the submit host because the cluster would call out to the submit host to initiate transfers.
  2. Connections for control, status, and workflow management between the cluster worker nodes and BOSCO submit host.  This is the Campus Factory, which gives BOSCO the traditional Condor look and feel.
As a result, the submit host needed a large swath of open ports in order to operate correctly, and as you scaled up, you needed even more ports open.

File transfers from the submit host to the login node now go over SSH; see my previous post.

With the new single port feature, all control, status, and workflow management connections are routed through HTCondor's shared port daemon (condor_shared_port) on port 11000.  The port number is hardcoded; I picked it at random.

Why should I care?

Limiting BOSCO to only 1 incoming port is very useful for users who do not manage the system they are running on.  The node only needs 1 port open in order to run BOSCO: 11000.  If the system has a firewall, you only have to request that port 11000 be opened, rather than a huge range of ports.  If you manage the system, then you will be happy that only 1 port needs to be opened in order to allow BOSCO submissions.
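
Once the port has been opened, it can be handy to confirm that the submit host is actually reachable on it.  Here is a minimal sketch in Python, not part of BOSCO itself: the function name port_is_open and the default timeout are my own choices for illustration, and the only value taken from this post is port 11000.  It simply tries a TCP connection, so it tells you whether something is listening and the firewall lets you through, not whether that something is BOSCO.

```python
import socket

# Port named in this post; everything else here is illustrative.
BOSCO_PORT = 11000


def port_is_open(host: str, port: int = BOSCO_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run it from a machine outside the firewall, for example port_is_open("submit.example.edu"), where the hostname is a placeholder for your own submit host.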

Administrators will like this feature because it is more in line with other applications that they may run.  For example, httpd by default only requires 1 port, 80.  Now BOSCO is in the same realm, only requiring 1 port, 11000.
