This guide is heavily based on the administrator's guide for Gluster.
Installation
All of the gluster packages are in EPEL, so first we need to install that repo on our nodes.
$ rpm -Uvh http://download.fedoraproject.org/pub/epel/6/i386/epel-release-6-7.noarch.rpm
Then install the glusterfs server:
$ yum install glusterfs-server -y
Then start the server:
$ /etc/init.d/glusterd start
For demo purposes only, flush the firewall:
$ iptables -F
Configuration
And now add the nodes to the gluster system:
$ gluster peer probe i-0000011a
Probe successful
$ gluster peer probe i-0000011c
Probe successful
Now you can check for the nodes with the status command:
$ gluster peer status
Number of Peers: 2
Hostname: i-0000011a
Uuid: 5bdc4f02-4e08-4794-af03-fd624be2d2e0
State: Peer in Cluster (Connected)
Hostname: i-0000011c
Uuid: 248be1ba-c5aa-40d1-90e9-ca95a7e31697
State: Peer in Cluster (Connected)
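If you want to script that check instead of eyeballing it, here is a minimal sketch. The function name all_peers_connected is my own invention, not part of the gluster CLI, and it assumes the output looks like the sample above (a "Number of Peers:" line plus one "State:" line per peer).

```shell
# Hypothetical helper, not part of gluster: reads `gluster peer status`
# output on stdin and succeeds only if every listed peer is connected.
all_peers_connected() {
    awk '
        # "Number of Peers: 2" -> the count is the fourth field
        /^Number of Peers:/ { expected = $4 }
        # Count only peers that are in the cluster and connected
        /^State: Peer in Cluster \(Connected\)/ { connected++ }
        # Exit 0 when at least one peer exists and all are connected
        END { exit !(expected + 0 > 0 && connected + 0 == expected + 0) }
    '
}
```

On a node you would pipe the real command into it: gluster peer status | all_peers_connected && echo "all peers connected".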
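In this demo, I set out to make a Distributed Replicated volume. There are many volume types, but this looked like the best fit. Note that with replica 3 and exactly three bricks, as in the command below, the result is a plain Replicated volume; the distributed part only appears once the brick count is a larger multiple of the replica count (six bricks, for example).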
To create the volume:
$ gluster volume create test-volume replica 3 transport tcp i-00000119:/exp1 i-0000011a:/exp2 i-0000011c:/exp3
Note that I didn't create the /expX directories on any of the nodes; they are made for you automatically.
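The replica count and the number of bricks together decide what kind of volume you get: the brick count has to be a multiple of the replica count, and only a larger multiple spreads files across several replica sets. As a rough sketch of that rule, here is a small helper. The name volume_layout is hypothetical and it only does arithmetic on its arguments; it never talks to gluster.

```shell
# Hypothetical helper, not part of the gluster CLI.
# Usage: volume_layout REPLICA_COUNT BRICK...
# Prints the layout the given replica count and brick list would produce.
volume_layout() {
    replica=$1
    shift
    bricks=$#
    # Replication needs at least 2 copies, and bricks must fill whole replica sets
    if [ "$replica" -lt 2 ] || [ "$bricks" -eq 0 ] || [ $((bricks % replica)) -ne 0 ]; then
        echo "invalid: need replica >= 2 and a brick count that is a multiple of it" >&2
        return 1
    fi
    if [ "$bricks" -eq "$replica" ]; then
        # One replica set only: every brick holds every file
        echo "replicated"
    else
        # Several replica sets: files are distributed across the sets
        echo "distributed-replicated ($((bricks / replica)) replica sets)"
    fi
}
```

For the three bricks used here, volume_layout 3 i-00000119:/exp1 i-0000011a:/exp2 i-0000011c:/exp3 prints "replicated".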
To start the volume:
$ gluster volume start test-volume
To mount the volume, we don't have to modprobe fuse since it's built into the 2.6.32 kernel that comes with EL6. You can also use NFS to mount gluster volumes, but I decided to use fuse.
$ mkdir -p /mnt/glusterfs
$ mount -t glusterfs i-0000011a:/test-volume /mnt/glusterfs
YAY! working glusterfs. To confirm that it is working, I copied in a test file, mounted the test-volume on another node in the test cluster as well, and there was my file!
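The same kind of check can be scripted. This is only a sketch: same_file_on_both is a made-up helper name, and it simply compares one file as seen through two local directories, for example the mount point and a brick directory on the same node.

```shell
# Hypothetical helper: succeed if NAME exists in both directories
# and has the same checksum and size in each.
# Usage: same_file_on_both NAME DIR_A DIR_B
same_file_on_both() {
    name=$1
    dir_a=$2
    dir_b=$3
    # Both copies must exist as regular files
    [ -f "$dir_a/$name" ] && [ -f "$dir_b/$name" ] || return 1
    # cksum on stdin prints "checksum size", so file paths don't affect the comparison
    sum_a=$(cksum < "$dir_a/$name")
    sum_b=$(cksum < "$dir_b/$name")
    [ "$sum_a" = "$sum_b" ]
}
```

On i-0000011a, for instance, that could be same_file_on_both test.txt /mnt/glusterfs /exp2, assuming test.txt is the file you copied in.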
Summary
GlusterFS doesn't seem too advanced compared to Hadoop or Ceph. If I look in the /expX directories, I just see the whole file in there. In the current release, I believe the closest volume configuration to Hadoop or Ceph is a Striped Replicated volume, but that volume type is only supported for use as a MapReduce backend.
I think GlusterFS would be really cool as an OpenStack back end, especially since it's so darn simple. It should also be easy to recover from, since the files sit on the bricks as plain, whole files. Of course, you would probably want striping for the large image files involved.
Overall, I feel this was the easiest of the file systems I have tried out. Ceph was a little scary with all the configuration needed. With GlusterFS, adding another server was as simple as issuing a single command. Of course, does this mean it'll rebalance the files if a server goes away? I don't really know how that would work.