SoftLayer Technologies (tm)
 
GFS howto
References and Links

RedHat/CentOS 5.2 Documentation
Cluster Suite Overview
Global File System

Oracle Real Application Cluster GFS
GFS Project Page
Linux-cluster Mailing List

Test Environment

· 3 CentOS 5 servers, fully updated
· 1 iSCSI target
· fanout and SSH keys for repetitive tasks

Before you Start

All of the servers that will belong to the GFS cluster will need to be located in the same VLAN. Contact support if you need assistance regarding this.
If you are configuring only two servers in the cluster, you will need to manually edit the /etc/cluster/cluster.conf file on each server. After the <cluster …> tag, add the following text:

<cman expected_votes="1" two_node="1"></cman>

If you do not make this change, the servers will not be able to establish a quorum and will refuse to cluster by design.
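
The manual edit above can also be scripted. The following is a minimal sketch, not part of the original procedure: the helper name add_two_node is hypothetical, and it assumes the opening <cluster ...> tag sits on a single line of the file. It prints the modified file to standard output so you can review the result before replacing anything.

```shell
# Print cluster.conf with the two-node <cman> line inserted directly
# after the opening <cluster ...> tag. The input file is not modified.
add_two_node() {
    conf="$1"
    # If the setting is already present, pass the file through unchanged.
    if grep -q 'two_node="1"' "$conf"; then
        cat "$conf"
        return 0
    fi
    awk '
        { print }
        # Matches "<cluster " or "<cluster>", but not "<clusternodes>".
        /<cluster[ >]/ && !done {
            print "  <cman expected_votes=\"1\" two_node=\"1\"></cman>"
            done = 1
        }
    ' "$conf"
}

# Example: preview the change before installing it.
#   add_two_node /etc/cluster/cluster.conf > /tmp/cluster.conf.new
#   diff /etc/cluster/cluster.conf /tmp/cluster.conf.new
```

Writing to a temporary file and diffing first keeps a typo from leaving a node with an unparseable cluster.conf.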

Terms to Know
Fencing – Fencing isolates failed nodes in the GFS cluster. When the cluster determines that a node is no longer responding, it attempts to fence that server off, so that the failed node cannot corrupt data on the storage device. Once the server is fenced, the other servers in the cluster replay the failed server's journal and the cluster continues operating without the failed node.
It is important to configure fencing properly, to prevent a server from corrupting the data on the iSCSI device because it believes it is operating correctly when in fact it is not.
LVM (Logical Volume Manager) – The software that maps the physical disks to the logical disks.
CLVM (Clustered Logical Volume Manager) – A modified software package that Red Hat provides to allow for cluster-aware logical volumes. CLVM is required to use the iSCSI volume on multiple machines at the same time while maintaining data integrity. LVM by itself is not cluster-aware.
Physical Volume/Physical Disk – The physical volume here refers to the iSCSI volume. Other examples of physical volumes could be other hard drives, CD-ROMs, etc. We will only be using a single physical volume.
Logical Volume/Logical Disk – A logical volume is a section of storage sliced out of a Volume Group. Logical volumes simplify disk management by allowing dynamic grow/shrink functionality without changing the underlying physical disk configuration.
Volume Group – The Volume Group is the highest level of grouping, and maps the logical volumes to the physical volumes. A volume group can consist of multiple physical volumes and multiple logical volumes. For our purposes, the only applicable physical volumes are iSCSI disks, because they need to be shared across all the servers. More complex configurations can use GNBD to share local disks as a cheap alternative for a GFS cluster, but this substantially increases the complexity of the configuration and is beyond the scope of this document.
Journals – Each server needs its own journal on the GFS file system. Account for any servers you may add later when choosing the number of journals: adding journals to an existing GFS file system (with gfs_jadd) requires unallocated space on the underlying volume, so it is much easier to create spare journals up front.

Working with lots of servers

You can always log into each server and run commands manually, but we will use fanout to run a single command across all the servers at the same time. First, we will set up our server list:

export SERVERS="root@10.0.0.1 root@10.0.0.2 root@10.0.0.3"

You will need to upload your SSH key to each server to be able to execute commands correctly.
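
One way to upload the key is a small loop over the same server list. This is only a sketch: the function name push_key and the DRY_RUN switch are hypothetical, the key path in the usage comment is an example, and it assumes ssh-copy-id is installed and that password logins still work on the servers.

```shell
# Same list as the SERVERS variable defined above.
SERVERS="root@10.0.0.1 root@10.0.0.2 root@10.0.0.3"

push_key() {
    # $1 = public key file to install on every server.
    # With DRY_RUN=1 the commands are only printed, not executed.
    for target in $SERVERS; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ssh-copy-id -i $1 $target"
        else
            ssh-copy-id -i "$1" "$target"
        fi
    done
}

# Preview first, then run for real:
#   DRY_RUN=1 push_key ~/.ssh/id_rsa.pub
#   push_key ~/.ssh/id_rsa.pub
```

Each ssh-copy-id call prompts for that server's password once; after that, fanout can connect without prompting.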

Package Installation

First, we will need to install all the software on the machines:

fanout "$SERVERS" 'yum install -y cman gfs-utils kmod-gfs kmod-dlm modcluster ricci luci cluster-snmp iscsi-initiator-utils lvm2-cluster openais oddjob rgmanager'

Configuring iSCSI

We won't spend much time on the iSCSI configuration. You will probably want to make additional optimizations (such as MTU modification) for better performance, but this is beyond the scope of this document. Replace ISCSI_USER and ISCSI_PASS with your username and password. The command below overwrites /etc/iscsi/iscsid.conf with a fairly generic configuration:

fanout "$SERVERS" 'echo -e "node.startup = automatic\nnode.session.auth.username = ISCSI_USER\nnode.session.auth.password = ISCSI_PASS\ndiscovery.sendtargets.auth.username = ISCSI_USER\ndiscovery.sendtargets.auth.password = ISCSI_PASS\nnode.session.timeo.replacement_timeout = 120\nnode.conn[0].timeo.login_timeout = 15\nnode.conn[0].timeo.logout_timeout = 15\nnode.conn[0].timeo.noop_out_interval = 10\nnode.conn[0].timeo.noop_out_timeout = 15\nnode.session.iscsi.InitialR2T = No\nnode.session.iscsi.ImmediateData = Yes\nnode.session.iscsi.FirstBurstLength = 262144\nnode.session.iscsi.MaxBurstLength = 16776192\nnode.conn[0].iscsi.MaxRecvDataSegmentLength = 65536">/etc/iscsi/iscsid.conf'
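
The one-line echo above is hard to read and easy to mistype. As an alternative sketch, the same settings can be written from a here-document, one setting per line. The function name write_iscsid_conf is hypothetical, and so is the idea of generating the file locally and copying it out afterwards; the setting names and values are the ones used above.

```shell
write_iscsid_conf() {
    # $1 = iSCSI username, $2 = iSCSI password, $3 = output file
    cat > "$3" <<EOF
node.startup = automatic
node.session.auth.username = $1
node.session.auth.password = $2
discovery.sendtargets.auth.username = $1
discovery.sendtargets.auth.password = $2
node.session.timeo.replacement_timeout = 120
node.conn[0].timeo.login_timeout = 15
node.conn[0].timeo.logout_timeout = 15
node.conn[0].timeo.noop_out_interval = 10
node.conn[0].timeo.noop_out_timeout = 15
node.session.iscsi.InitialR2T = No
node.session.iscsi.ImmediateData = Yes
node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 16776192
node.conn[0].iscsi.MaxRecvDataSegmentLength = 65536
EOF
}

# Example: build the file locally, review it, then copy it to each server.
#   write_iscsid_conf ISCSI_USER ISCSI_PASS /tmp/iscsid.conf
#   for s in $SERVERS; do scp /tmp/iscsid.conf "$s:/etc/iscsi/iscsid.conf"; done
```

Because the file contains the password, restrict its permissions and remove the local copy when you are done.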

Then run a discovery on all the machines:

fanout "$SERVERS" 'iscsiadm -m discovery -t sendtargets -p 10.0.81.10'

Restart the iSCSI service on all the machines so we pick up the newly configured LUN:

fanout "$SERVERS" 'service iscsi restart'

We should now see the iSCSI LUN on all the servers when we run 'fdisk -l':

fanout "$SERVERS" 'fdisk -l'

The iSCSI volume will probably be the device “sdb” if you only have 1 hard drive in the server, but there are no guarantees as to what it will come up as.
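
Rather than guessing the device name, you can ask the initiator which disk belongs to the iSCSI session. The filter below is a sketch that assumes your iscsiadm supports the detailed session listing (iscsiadm -m session -P 3) and reports disks on lines of the form "Attached scsi disk sdb"; the function name iscsi_disks is hypothetical.

```shell
iscsi_disks() {
    # Reads `iscsiadm -m session -P 3` output on stdin and prints
    # one /dev path for every attached disk it reports.
    awk '/Attached scsi disk/ {
        for (i = 1; i < NF; i++)
            if ($i == "disk") print "/dev/" $(i + 1)
    }'
}

# Usage on a node:
#   iscsiadm -m session -P 3 | iscsi_disks
```

If the command prints something other than /dev/sdb on any server, substitute that device name wherever /dev/sdb appears in the rest of this document.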

Configuring RedHat Clustering Services

Now we need to create the cluster configuration file. The cluster configuration file will refer to the nodes by their hostnames. By default, the cluster would try to use the public network – we will explicitly configure it to use the private network. Note that the GFS documentation recommends keeping a synchronized /etc/hosts file across all the machines, rather than using DNS, for reliability reasons.

We will use /etc/hosts to append this mapping for our example:

10.0.0.1 gfs1.softlayer.local
10.0.0.2 gfs2.softlayer.local
10.0.0.3 gfs3.softlayer.local

We will copy this new /etc/hosts to each machine in the cluster:

fanout "$SERVERS" 'echo -e "\n10.0.0.1 gfs1.softlayer.local\n10.0.0.2 gfs2.softlayer.local\n10.0.0.3 gfs3.softlayer.local" >>/etc/hosts'
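
Note that the append above adds the lines again every time it is run. A minimal sketch of a repeatable variant follows; the helper name add_host_entry is hypothetical, and it only skips an entry when an identical line already exists.

```shell
add_host_entry() {
    # $1 = IP address, $2 = hostname, $3 = hosts file (defaults to /etc/hosts)
    hosts_file="${3:-/etc/hosts}"
    # -F: fixed string, -x: whole-line match, so only an exact duplicate is skipped.
    if ! grep -qFx "$1 $2" "$hosts_file" 2>/dev/null; then
        echo "$1 $2" >> "$hosts_file"
    fi
}

# Example, run on each server:
#   add_host_entry 10.0.0.1 gfs1.softlayer.local
#   add_host_entry 10.0.0.2 gfs2.softlayer.local
#   add_host_entry 10.0.0.3 gfs3.softlayer.local
```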

Now we will run the following commands to configure the cluster configuration file on each server. The IPMI Fence lines are of particular importance. Be sure to place the correct IPMI information for each server – fencing will not work properly otherwise, and a failure to properly fence a server can lead to data corruption:

ccs_tool create MyGFSCluster
ccs_tool addfence -C node1_ipmi fence_ipmilan ipaddr=10.1.0.1 login=root passwd=1TqBzEEgy5
ccs_tool addfence -C node2_ipmi fence_ipmilan ipaddr=10.1.0.2 login=root passwd=1TqBzEEgy5
ccs_tool addfence -C node3_ipmi fence_ipmilan ipaddr=10.1.0.3 login=root passwd=1TqBzEEgy5

ccs_tool addnode -C gfs1.softlayer.local -n 1 -v 1 -f node1_ipmi
ccs_tool addnode -C gfs2.softlayer.local -n 2 -v 1 -f node2_ipmi
ccs_tool addnode -C gfs3.softlayer.local -n 3 -v 1 -f node3_ipmi

The same series of commands, run on all the servers at once:

fanout "$SERVERS" 'ccs_tool create MyGFSCluster ; ccs_tool addfence -C node1_ipmi fence_ipmilan ipaddr=10.1.0.1 login=root passwd=1TqBzEEgy5 ; ccs_tool addfence -C node2_ipmi fence_ipmilan ipaddr=10.1.0.2 login=root passwd=1TqBzEEgy5 ; ccs_tool addfence -C node3_ipmi fence_ipmilan ipaddr=10.1.0.3 login=root passwd=1TqBzEEgy5 ; ccs_tool addnode -C gfs1.softlayer.local -n 1 -v 1 -f node1_ipmi ; ccs_tool addnode -C gfs2.softlayer.local -n 2 -v 1 -f node2_ipmi ; ccs_tool addnode -C gfs3.softlayer.local -n 3 -v 1 -f node3_ipmi'
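
Because the fence and node commands follow a fixed pattern, they can be generated instead of typed, which reduces the chance of pairing a node with the wrong IPMI address. The sketch below only prints the commands; the function name gen_cluster_cmds is hypothetical, the 192.0.2.x addresses in the usage comment are placeholders for your real IPMI addresses, and it assumes the nodes are named gfs1, gfs2, ... as in this document.

```shell
CLUSTER_NAME="MyGFSCluster"
DOMAIN="softlayer.local"

gen_cluster_cmds() {
    # $1 = IPMI login, $2 = IPMI password,
    # remaining arguments = one IPMI address per node, in node order.
    login="$1"; passwd="$2"; shift 2
    echo "ccs_tool create $CLUSTER_NAME"
    n=1
    for ipmi_ip in "$@"; do
        echo "ccs_tool addfence -C node${n}_ipmi fence_ipmilan ipaddr=$ipmi_ip login=$login passwd=$passwd"
        n=$((n + 1))
    done
    n=1
    for ipmi_ip in "$@"; do
        echo "ccs_tool addnode -C gfs${n}.$DOMAIN -n $n -v 1 -f node${n}_ipmi"
        n=$((n + 1))
    done
}

# Example: write the commands to a file, review them, then run the file on each server.
#   gen_cluster_cmds root "$IPMI_PASS" 192.0.2.1 192.0.2.2 192.0.2.3 > /tmp/cluster-setup.sh
```

Reviewing the generated file before running it gives you one place to confirm that every node is matched to its own fence device.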

The next step is to start the cluster. Make sure you start all the machines together, or the servers will start fencing each other.

fanout "$SERVERS" 'service cman start'

You should now be able to see the status of all the servers in the cluster:

fanout "$SERVERS" 'cman_tool nodes'
going...
10.0.0.1
Node  Sts   Inc   Joined               Name
   1   M     16   2008-11-06 22:35:21  gfs1.softlayer.local
   2   M     20   2008-11-06 22:35:32  gfs2.softlayer.local
   3   M     24   2008-11-06 22:36:11  gfs3.softlayer.local
10.0.0.2
Node  Sts   Inc   Joined               Name
   1   M     20   2008-11-06 22:35:32  gfs1.softlayer.local
   2   M      8   2008-11-06 22:35:32  gfs2.softlayer.local
   3   M     24   2008-11-06 22:36:11  gfs3.softlayer.local
10.0.0.3
Node  Sts   Inc   Joined               Name
   1   M     24   2008-11-06 22:36:11  gfs1.softlayer.local
   2   M     24   2008-11-06 22:36:11  gfs2.softlayer.local
   3   M     12   2008-11-06 22:36:11  gfs3.softlayer.local
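
In the output above, a status of M means the node is a cluster member. With more servers it is easier to count members than to read the table. This is a small sketch that assumes the column layout shown above; the function name count_members is hypothetical.

```shell
count_members() {
    # Reads `cman_tool nodes` output on stdin and prints how many
    # rows have a numeric node ID and a status of "M" (member).
    awk '$1 ~ /^[0-9]+$/ && $2 == "M" { n++ } END { print n + 0 }'
}

# Usage on a node (3 is the number of servers in this example):
#   [ "$(cman_tool nodes | count_members)" -eq 3 ] && echo "all nodes joined"
```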


Configuring the GFS Volume

Now that the cluster is up and running, we can set up the GFS volume and activate it. We will start by launching all the GFS services on each machine:

service gfs start ; service gfs2 start ; service clvmd start

The same command to run on all the servers at once:

fanout "$SERVERS" 'service gfs start ; service gfs2 start ; service clvmd start'

Then, enable the LVM clustering on each machine:

lvmconf --enable-cluster

Or, for all the machines:

fanout "$SERVERS" 'lvmconf --enable-cluster'

We will run the next few commands on only one of the nodes, because the cluster propagates the configuration automatically. They initialize the disk /dev/sdb as an LVM physical volume, create a volume group called 'vg_iscsi', and create a single 9GB logical volume for our GFS disk:

pvcreate /dev/sdb
vgcreate vg_iscsi /dev/sdb
lvcreate -n GFSVolume -L 9G vg_iscsi

You might need to run a 'vgscan' if you have issues creating logical volumes after creating the volume group.

Now we format the new volume. The "-j" flag sets the number of journals to create. Make sure you define the correct number of journals! It is much easier to allocate additional journals at creation time than to add them later. You need 1 journal per machine in the GFS cluster; the command below creates 4 journals for our 3 servers, leaving one spare for future expansion.

gfs_mkfs -j 4 -p lock_dlm -t MyGFSCluster:FirstGFSVolume /dev/vg_iscsi/GFSVolume
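
The journal count also affects how much space is left for data. As a rough sketch, assuming the default GFS journal size of 128 MB (we did not pass a journal size to gfs_mkfs) and ignoring other file system metadata, the arithmetic looks like this. Both function names are hypothetical.

```shell
journal_count() {
    # $1 = servers in the cluster, $2 = spare journals for future servers
    echo $(( $1 + $2 ))
}

usable_mb() {
    # $1 = logical volume size in MB, $2 = number of journals,
    # $3 = size of each journal in MB
    echo $(( $1 - $2 * $3 ))
}

# For this document: 3 servers plus 1 spare on a 9 GB (9216 MB) volume.
#   journal_count 3 1       # value to pass to -j
#   usable_mb 9216 4 128    # approximate space left for data, in MB
```

With these numbers the result is 8704 MB, or 8.5 GB, which lines up with the 8.5G size that df reports for the mounted volume later in this document.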

We need to configure all the servers to start the clustering services at boot time, configure the GFS volume to mount at boot, and mount everything. It is necessary to turn off the ‘acpid’ service in order for fencing to work properly. We will run the following commands on all the servers:

chkconfig gfs on
chkconfig gfs2 on
chkconfig clvmd on
chkconfig cman on
chkconfig iscsi on
chkconfig acpid off
echo "/dev/vg_iscsi/GFSVolume /mnt gfs defaults 0 0" >>/etc/fstab
mount /mnt

Or, using fanout:

fanout "$SERVERS" 'chkconfig gfs on ; chkconfig gfs2 on ; chkconfig clvmd on ; chkconfig cman on ; chkconfig iscsi on ; chkconfig acpid off ; echo "/dev/vg_iscsi/GFSVolume /mnt gfs defaults 0 0" >>/etc/fstab ; mount /mnt'
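
For scripted checks, for example after a reboot, it can be handy to test for the mount directly. This sketch reads the kernel's mount table; the function name is_gfs_mounted is hypothetical, and the optional second argument exists only so the check can be tried against a sample file.

```shell
is_gfs_mounted() {
    # $1 = mount point, $2 = mounts table (defaults to /proc/mounts).
    # Succeeds only if the mount point is present with file system type "gfs".
    awk -v mp="$1" '$2 == mp && $3 == "gfs" { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Usage on a node:
#   is_gfs_mounted /mnt && echo "GFS volume is mounted" || echo "GFS volume is NOT mounted"
```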

We can now see the volume is mounted on all the servers:

fanout "$SERVERS" 'df -h'
going...
10.0.0.1
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3             224G  2.0G  211G   1% /
/dev/sda1             996M   39M  906M   5% /boot
tmpfs                1014M     0 1014M   0% /dev/shm
/dev/mapper/vg_iscsi-GFSVolume
                      8.5G   20K  8.5G   1% /mnt
10.0.0.2
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3             224G  2.0G  211G   1% /
/dev/sda1             996M   39M  906M   5% /boot
tmpfs                1014M     0 1014M   0% /dev/shm
/dev/mapper/vg_iscsi-GFSVolume
                      8.5G   20K  8.5G   1% /mnt
10.0.0.3
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3             224G  2.0G  211G   1% /
/dev/sda1             996M   39M  906M   5% /boot
tmpfs                1014M     0 1014M   0% /dev/shm
/dev/mapper/vg_iscsi-GFSVolume
                      8.5G   20K  8.5G   1% /mnt

You should now be able to create files on one of the nodes in the cluster, and have the files appear right away on all the other nodes in the cluster.

©2010 SoftLayer Technologies, Inc.