Tuesday, April 5, 2011

Gluster Setup in the Cloud: Simple, Easy

Here, I'll spin up two instances and configure distributed, replicated storage between them. I'll do two instances but I've actually done it for up to six instances. I have no idea how far it would scale but I'm guessing 30 or more would work fine using this method.

First I fired up two CentOS 5 instances which will be talking to each other over the private network. I'll name them node-1 and node-2 and add entries for both to /etc/hosts, like so:             node-1             node-2

They are in the same security group, so they can see each other. After you add the entries, ping the other side, like this:

[root@node-1 ~]# ping node-2
PING node-2 ( 56(84) bytes of data.
64 bytes from node-2 ( icmp_seq=1 ttl=64 time=0.140 ms
64 bytes from node-2 ( icmp_seq=2 ttl=64 time=0.138 ms

--- node-2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.138/0.139/0.140/0.001 ms

Next, grab and install the software:
wget http://download.gluster.com/pub/gluster/glusterfs/LATEST/RHEL/glusterfs-core-3.1.3-1.x86_64.rpm
wget http://download.gluster.com/pub/gluster/glusterfs/LATEST/RHEL/glusterfs-fuse-3.1.3-1.x86_64.rpm

rpm -Uvh gluster*.rpm
rm gluster*.rpm

Then, load the fuse module:
modprobe fuse

Then, start glusterd
/etc/init.d/glusterd start

Pick some directories to use, in this case, gluster will use /export/queue-data and you and your apps will use /queue. So, don't every access files in /export/queue-data, gluster owns that directory:

mkdir /queue /export/queue-data

Setup the clients so they can see/talk to each other, run this on each system. From node-1:
[root@node-1 ~]# gluster peer probe node-2
Probe successful

[root@node-2 ~]# gluster peer probe node-1
Probe successful

Next, create your directories on both systems:
[root@node-1 ~]# mkdir -p /queue /export/queue-data/
[root@node-2 ~]# mkdir -p /queue /export/queue-data/

Now, create your volume:
[root@node-1 ~]# gluster volume create queue-data replica 2 node-1:/export/queue-data node-2:/export/queue-data
Creation of volume queue-data has been successful. Please start the volume to access data.

Ready to start the volume export:
[root@node-1 ~]# gluster volume start queue-data
Starting volume queue-data has been successful

To manually mount the volume, run:
[root@node-1 ~]# mount -t glusterfs /queue

You can see that it's been mounted here:
[root@node-1 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             2.0G  1.5G  445M  77% /
/dev/sda2             7.9G  152M  7.4G   2% /opt
none                  256M     0  256M   0% /dev/shm
                      2.0G  1.5G  445M  77% /queue

To mount it automatically on boot, run:
[root@node-2 ~]# echo "    /queue     glusterfs defaults,_netdev 0 0" >> /etc/fstab
[root@node-2 ~]# mount -a

If you're doing something different and want to be able to run VM's off your glusterfs, add this to fstab:          /queue                   glusterfs direct-io-mode=disable,_netdev 0 0

And you can see that on node-2 it's also been mounted:
[root@node-2 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             2.0G  1.5G  445M  77% /
/dev/sda2             7.9G  152M  7.4G   2% /opt
none                  256M     0  256M   0% /dev/shm
                      2.0G  1.5G  445M  77% /queue

Now, let's make sure it works. I'll create a file on node-1 and then make sure it exists on node-2:

[root@node-1 ~]# cd /queue && dd if=/dev/zero of=output.dat bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.572936 seconds, 18.3 MB/s

And, here I see it on node-2:
[root@node-2 ~]# ls -al /queue
total 10276
drwxr-xr-x  2 root root     4096 Apr  5 12:49 .
drwxr-xr-x 25 root root     4096 Apr  5 12:43 ..
-rw-r--r--  1 root root 10485760 Apr  5 12:49 output.dat

And, there you have it. If you require iptables rules, which I don't because I'm already behind ec2's ACL's, add something like this to node-1 and changing the IP for node-2:
# /etc/sysconfig/iptables
-A INPUT -m state --state NEW -p tcp --dport 24007 --source -j ACCEPT
-A INPUT -m state --state NEW -p tcp --dport 24008 --source -j ACCEPT
-A INPUT -m state --state NEW -p tcp --dport 24009 --source -j ACCEPT
-A INPUT -m state --state NEW -p tcp --dport 24010 --source -j ACCEPT

You could also just do something like this:
A INPUT -m state --state NEW -p tcp --dport 24007:24010 --source -j ACCEPT


  1. Thanks for a great writeup. I have a question related to your article, perhaps you could help?

    I need to scale up a couple apache servers but share the docroot using a synchonized (common source). I am however a little unclear how to "link" /var/www to the glusterfs client mount point (e.g. /queue). After googling, there seem to be a few options, but not sure about "best practices".

    1) Could one simply make the glusterfs client mount point /var/www.

    Not sure it is that simple?

    2) Would one use symblic links between /var/www and the glusterfs client mount point? Not exactly sure how this would be done. Am not a linux systems administrator.

    3) Would one use mount --bind between /var/www and the glusterfs client mount point? Again, not exactly sure how this would be done.

    4) Or perhaps there is another best practice?

    Any help would be appreciated, since I am a bit of a newbie in this area.

  2. Good article and it really works. I have a small question. I am wondering whether you have considered other distributed file systems such hadoop, luster, mogilefs,xtreefs. Which one is easy to manage and scale for replicated volume setup? I know everyone has its own merits and demerits still what is your opinion on that?