References and Links
RedHat/CentOS 5.2 Documentation Cluster Suite Overview Global File System
Oracle Real Application Cluster GFS GFS Project Page Linux-cluster Mailing List
Test Environment
·3 CentOS 5 servers, fully updated
·1 iSCSI target
·fanout and SSH keys for repetitive tasks
Before you Start
All of the servers that will belong to the GFS cluster need to be located in the same VLAN. Contact support if you need assistance with this. If you are only configuring two servers in the cluster, you will need to manually edit the /etc/cluster/cluster.conf file on each server. After the <cluster …> tag, add the following text:
<cman expected_votes="1" two_node="1"></cman>
If you do not make this change, the servers will not be able to establish a quorum and will refuse to cluster by design.
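As a rough illustration of where that line goes, the sketch below inserts it right after the opening <cluster ...> tag of a scratch file. The file content is a made-up minimal skeleton rather than a real configuration, and the variable names are ours; on a real server you would edit /etc/cluster/cluster.conf itself, ideally after taking a backup.

```shell
# Scratch copy with a made-up, minimal skeleton (assumption: a real
# cluster.conf has more content between the <cluster> tags).
conf=$(mktemp)
cat > "$conf" <<'EOF'
<?xml version="1.0"?>
<cluster name="MyGFSCluster" config_version="1">
<clusternodes/>
</cluster>
EOF

# Print every line; right after the first opening <cluster ...> tag,
# also print the two-node <cman> line.
patched=$(awk '
  { print }
  /<cluster / && !done {
    print "<cman expected_votes=\"1\" two_node=\"1\"></cman>"
    done = 1
  }' "$conf")

printf '%s\n' "$patched"
rm -f "$conf"
```

The pattern includes the trailing space so that it matches the <cluster ...> tag but not <clusternodes>.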
Terms to Know
Fencing – Fencing isolates failed nodes in the GFS cluster. When the cluster determines that a node is no longer responding, it attempts to fence off that server to prevent the failed node from corrupting data on the storage device. Once the server is fenced, the other servers in the cluster replay the failed server's journal and the cluster continues operating without the failed node. Proper fencing configuration is important: it prevents a server that wrongly believes it is operating correctly from corrupting the data on the iSCSI device.
LVM – Logical Volume Manager – The software which maps the physical disks to the logical disks.
CLVM – Clustered Logical Volume Manager – A modified software package which Red Hat provides to allow for cluster-aware logical volumes. CLVM is required to use the iSCSI volume on multiple machines at the same time while maintaining data integrity. LVM by itself is not cluster-aware.
Physical Volume/Physical Disk – In this document, the physical volume is the iSCSI volume. Other examples of physical volumes could be other hard drives or disk partitions. We will only be using a single physical volume.
Logical Volume/Logical Disk – A logical volume is a section of storage sliced out of a Volume Group. Logical volumes simplify disk management by allowing dynamic grow/shrink functionality without changing the underlying physical disk configuration.
Volume Group – The Volume Group is the highest level of grouping, which maps the logical volumes to the physical volumes. A volume group can consist of multiple physical volumes and multiple logical volumes. For our purposes, the only applicable physical volumes are iSCSI disks, because they need to be shared across all the servers. More complex configurations can use GNBD to share local disks as a cheap alternative for a GFS cluster, but this substantially increases the complexity of the configuration and is beyond the scope of this document.
Journals – Each server needs its own journal in the GFS file system. Account for any additional journals you may need for future expansion when you create the file system, as adding journals later is considerably more difficult.
Working with lots of servers
You can always log into each server and run commands manually, but we will use fanout to run a single command across all the servers at the same time. First, we will set up our server list:
export SERVERS="root@10.0.0.1 root@10.0.0.2 root@10.0.0.3"
You will need to upload your SSH key to each server to be able to execute commands correctly.
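If fanout is not available, a plain loop over the server list does roughly the same job, one host at a time. This is a minimal sketch that assumes key-based SSH logins already work; run_on_all and the RUNNER variable are hypothetical names introduced here, and setting RUNNER=echo prints what would be run instead of connecting anywhere.

```shell
# Run one command on every host in a space-separated list.
# RUNNER defaults to ssh; set RUNNER=echo for a dry run.
run_on_all() {
  hosts=$1
  cmd=$2
  for host in $hosts; do
    "${RUNNER:-ssh}" "$host" "$cmd"
  done
}

SERVERS="root@10.0.0.1 root@10.0.0.2 root@10.0.0.3"

# Dry run: show the host and command pairs without opening any connection.
RUNNER=echo
run_on_all "$SERVERS" "uptime"
```

Unset RUNNER (or set it to ssh) to execute the command for real.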
Package Installation
First, we will need to install all the software on the machines:
fanout "$SERVERS" 'yum install -y cman gfs-utils kmod-gfs kmod-dlm modcluster ricci luci cluster-snmp iscsi-initiator-utils lvm2-cluster openais oddjob rgmanager'
Configuring iSCSI
We won't spend much time on the iSCSI configuration. You will probably want to make additional optimizations (such as MTU modification) for better performance, but this is beyond the scope of this document. Replace ISCSI_USER and ISCSI_PASS with your username and password. The following command overwrites /etc/iscsi/iscsid.conf with a fairly generic configuration:
fanout "$SERVERS" 'echo -e "node.startup = automatic\nnode.session.auth.username = ISCSI_USER\nnode.session.auth.password = ISCSI_PASS\ndiscovery.sendtargets.auth.username = ISCSI_USER\ndiscovery.sendtargets.auth.password = ISCSI_PASS\nnode.session.timeo.replacement_timeout = 120\nnode.conn[0].timeo.login_timeout = 15\nnode.conn[0].timeo.logout_timeout = 15\nnode.conn[0].timeo.noop_out_interval = 10\nnode.conn[0].timeo.noop_out_timeout = 15\nnode.session.iscsi.InitialR2T = No\nnode.session.iscsi.ImmediateData = Yes\nnode.session.iscsi.FirstBurstLength = 262144\nnode.session.iscsi.MaxBurstLength = 16776192\nnode.conn[0].iscsi.MaxRecvDataSegmentLength = 65536">/etc/iscsi/iscsid.conf'
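The long echo string above is hard to review. As a sketch, the same settings can be written with a here-document, which keeps one setting per line. The function name render_iscsid_conf is hypothetical, and the example writes to a scratch file rather than to /etc/iscsi/iscsid.conf so that it can be inspected first.

```shell
# Print an iscsid.conf with the same settings as the echo command above.
# $1 = CHAP username, $2 = CHAP password.
render_iscsid_conf() {
  cat <<EOF
node.startup = automatic
node.session.auth.username = $1
node.session.auth.password = $2
discovery.sendtargets.auth.username = $1
discovery.sendtargets.auth.password = $2
node.session.timeo.replacement_timeout = 120
node.conn[0].timeo.login_timeout = 15
node.conn[0].timeo.logout_timeout = 15
node.conn[0].timeo.noop_out_interval = 10
node.conn[0].timeo.noop_out_timeout = 15
node.session.iscsi.InitialR2T = No
node.session.iscsi.ImmediateData = Yes
node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 16776192
node.conn[0].iscsi.MaxRecvDataSegmentLength = 65536
EOF
}

# Write to a scratch file for review; ISCSI_USER and ISCSI_PASS are
# placeholders, exactly as in the text.
out=$(mktemp)
render_iscsid_conf ISCSI_USER ISCSI_PASS > "$out"
cat "$out"
```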
Then run a discovery on all the machines:
fanout "$SERVERS" 'iscsiadm -m discovery -t sendtargets -p 10.0.81.10'
Restart the iSCSI service on all the machines so we pick up the newly configured LUN:
fanout "$SERVERS" 'service iscsi restart'
We should now see the iSCSI LUN on all the servers when we run 'fdisk -l':
fanout "$SERVERS" 'fdisk -l'
The iSCSI volume will probably be the device “sdb” if you only have 1 hard drive in the server, but there are no guarantees as to what it will come up as.
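Because the device name is not guaranteed, it can help to look the disk up rather than assume sdb. The sketch below assumes that udev on your system creates symlinks under /dev/disk/by-path with "-iscsi-" in their names, which is common but worth verifying on your servers; list_iscsi_disks is a hypothetical helper name.

```shell
# List the block devices behind iSCSI sessions by resolving the
# by-path symlinks. $1 = directory to scan (default /dev/disk/by-path).
# Links for partitions (names containing -partN) are skipped.
list_iscsi_disks() {
  dir=${1:-/dev/disk/by-path}
  for link in "$dir"/*-iscsi-*; do
    [ -L "$link" ] || continue          # also skips an unmatched glob
    case $link in *-part[0-9]*) continue ;; esac
    readlink -f "$link"
  done
}

# Prints nothing if no matching links exist on this machine.
list_iscsi_disks
```

If this prints a single device, that is the one to use in place of /dev/sdb in the commands that follow.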
Configuring RedHat Clustering Services
Now we need to create the cluster configuration file. The cluster configuration file will refer to the nodes by their hostnames. By default, the cluster would try to use the public network – we will explicitly configure it to use the private network. Note that the GFS documentation recommends keeping a synchronized /etc/hosts file across all the machines, rather than using DNS, for reliability reasons.
For our example, we will append the following mappings to /etc/hosts:
10.0.0.1 gfs1.softlayer.local
10.0.0.2 gfs2.softlayer.local
10.0.0.3 gfs3.softlayer.local
We will append these entries to /etc/hosts on each machine in the cluster:
fanout "$SERVERS" 'echo -e "\n10.0.0.1 gfs1.softlayer.local\n10.0.0.2 gfs2.softlayer.local\n10.0.0.3 gfs3.softlayer.local" >>/etc/hosts'
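Note that the command above appends the entries every time it is run, so running it twice leaves duplicates. A minimal sketch of an append that is safe to repeat, using a hypothetical helper add_host_entry and a scratch file in place of /etc/hosts:

```shell
# Append "ip name" to a hosts file unless that name is already listed.
# $1 = hosts file, $2 = IP address, $3 = hostname.
add_host_entry() {
  if ! awk -v h="$3" '
         { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
         END { exit !found }' "$1" 2>/dev/null; then
    printf '%s %s\n' "$2" "$3" >> "$1"
  fi
}

hosts=$(mktemp)   # stand-in for /etc/hosts
for n in 1 2 3; do
  add_host_entry "$hosts" "10.0.0.$n" "gfs$n.softlayer.local"
done
# Repeating a call does not add a second copy of the entry.
add_host_entry "$hosts" 10.0.0.1 gfs1.softlayer.local
cat "$hosts"
```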
Now we will run the following commands to configure the cluster configuration file on each server. The IPMI Fence lines are of particular importance. Be sure to place the correct IPMI information for each server – fencing will not work properly otherwise, and a failure to properly fence a server can lead to data corruption:
ccs_tool create MyGFSCluster
ccs_tool addfence -C node1_ipmi fence_ipmilan ipaddr=10.1.0.0.1 login=root passwd=1TqBzEEgy5
ccs_tool addfence -C node2_ipmi fence_ipmilan ipaddr=10.1.0.0.2 login=root passwd=1TqBzEEgy5
ccs_tool addfence -C node3_ipmi fence_ipmilan ipaddr=10.1.0.0.3 login=root passwd=1TqBzEEgy5
ccs_tool addnode -C gfs1.softlayer.local -n 1 -v 1 -f node1_ipmi
ccs_tool addnode -C gfs2.softlayer.local -n 2 -v 1 -f node2_ipmi
ccs_tool addnode -C gfs3.softlayer.local -n 3 -v 1 -f node3_ipmi
The same series of commands, run on all the servers at once:
fanout "$SERVERS" 'ccs_tool create MyGFSCluster ; ccs_tool addfence -C node1_ipmi fence_ipmilan ipaddr=10.1.0.0.1 login=root passwd=1TqBzEEgy5 ; ccs_tool addfence -C node2_ipmi fence_ipmilan ipaddr=10.1.0.0.2 login=root passwd=1TqBzEEgy5 ; ccs_tool addfence -C node3_ipmi fence_ipmilan ipaddr=10.1.0.0.3 login=root passwd=1TqBzEEgy5 ; ccs_tool addnode -C gfs1.softlayer.local -n 1 -v 1 -f node1_ipmi ; ccs_tool addnode -C gfs2.softlayer.local -n 2 -v 1 -f node2_ipmi ; ccs_tool addnode -C gfs3.softlayer.local -n 3 -v 1 -f node3_ipmi'
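Typing the fence and node lines by hand invites copy-and-paste mistakes, particularly a node paired with the wrong fence device. As a sketch, the command list can be generated from a list of hostname and IPMI address pairs and reviewed before it is run. The function print_cluster_commands and the placeholders IPMI_PASS and IPMI_ADDR_1 to IPMI_ADDR_3 are hypothetical; only the ccs_tool options already shown above are used, and nothing is executed.

```shell
# Print the ccs_tool commands for a cluster; nothing is executed.
# $1 = cluster name, $2 = IPMI login, $3 = IPMI password,
# remaining args = "hostname=ipmi_address" pairs, in node-ID order.
print_cluster_commands() {
  cluster=$1
  login=$2
  passwd=$3
  shift 3
  echo "ccs_tool create $cluster"
  id=0
  for pair in "$@"; do
    id=$((id + 1))
    # ${pair#*=} is the part after "=", i.e. the IPMI address.
    echo "ccs_tool addfence -C node${id}_ipmi fence_ipmilan ipaddr=${pair#*=} login=$login passwd=$passwd"
  done
  id=0
  for pair in "$@"; do
    id=$((id + 1))
    # ${pair%%=*} is the part before "=", i.e. the hostname.
    echo "ccs_tool addnode -C ${pair%%=*} -n $id -v 1 -f node${id}_ipmi"
  done
}

print_cluster_commands MyGFSCluster root IPMI_PASS \
  gfs1.softlayer.local=IPMI_ADDR_1 \
  gfs2.softlayer.local=IPMI_ADDR_2 \
  gfs3.softlayer.local=IPMI_ADDR_3
```

Each node ID is used for both the fence device name and the -n value, so node N is always paired with nodeN_ipmi.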
The next step is to start the cluster. Make sure you start all the machines together, or the servers will start fencing each other.
fanout "$SERVERS" 'service cman start'
You should now be able to see the status of all the servers in the cluster:
fanout "$SERVERS" 'cman_tool nodes'
going...
10.0.0.1
Node  Sts  Inc  Joined               Name
   1   M    16  2008-11-06 22:35:21  gfs1.softlayer.local
   2   M    20  2008-11-06 22:35:32  gfs2.softlayer.local
   3   M    24  2008-11-06 22:36:11  gfs3.softlayer.local
10.0.0.2
Node  Sts  Inc  Joined               Name
   1   M    20  2008-11-06 22:35:32  gfs1.softlayer.local
   2   M     8  2008-11-06 22:35:32  gfs2.softlayer.local
   3   M    24  2008-11-06 22:36:11  gfs3.softlayer.local
10.0.0.3
Node  Sts  Inc  Joined               Name
   1   M    24  2008-11-06 22:36:11  gfs1.softlayer.local
   2   M    24  2008-11-06 22:36:11  gfs2.softlayer.local
   3   M    12  2008-11-06 22:36:11  gfs3.softlayer.local
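In the output above, every node shows "M" in the Sts column. A small sketch for checking this automatically, assuming the column layout shown here (node ID first, status second); all_members is a hypothetical helper that reads the output of a single server on standard input.

```shell
# Read `cman_tool nodes` output on stdin and succeed only if exactly
# $1 nodes are listed with status "M".
all_members() {
  awk -v want="$1" '
    $1 ~ /^[0-9]+$/ && $2 == "M" { members++ }
    END { if (members + 0 == want + 0) exit 0; exit 1 }
  '
}

# Sample taken from the first server in the output above.
sample='Node  Sts  Inc  Joined               Name
   1   M    16  2008-11-06 22:35:21  gfs1.softlayer.local
   2   M    20  2008-11-06 22:35:32  gfs2.softlayer.local
   3   M    24  2008-11-06 22:36:11  gfs3.softlayer.local'

if printf '%s\n' "$sample" | all_members 3; then
  echo "all 3 nodes are members"
else
  echo "cluster membership is incomplete"
fi
```

On a live server the equivalent check would be `cman_tool nodes | all_members 3`.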
Configuring the GFS Volume
Now that the cluster is up and running, we can set up the GFS volume and activate it. We will start by launching all the GFS services on each machine:
service gfs start ; service gfs2 start ; service clvmd start
To run the same command on all the servers at once:
fanout "$SERVERS" 'service gfs start ; service gfs2 start ; service clvmd start'
Then, enable the LVM clustering on each machine:
lvmconf --enable-cluster
Or, for all the machines:
fanout "$SERVERS" 'lvmconf --enable-cluster'
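To confirm the change took effect, you can look at the locking_type setting in /etc/lvm/lvm.conf, which should read 3 (clustered locking) after lvmconf --enable-cluster. The following is a minimal sketch of that check; cluster_locking_enabled is a hypothetical helper name, and the path is the usual default and may need adjusting.

```shell
# Succeed if the given lvm.conf has an uncommented "locking_type = 3".
# $1 = path to lvm.conf (default /etc/lvm/lvm.conf).
cluster_locking_enabled() {
  grep -Eq '^[[:space:]]*locking_type[[:space:]]*=[[:space:]]*3([[:space:]]|$)' \
    "${1:-/etc/lvm/lvm.conf}" 2>/dev/null
}

if cluster_locking_enabled; then
  echo "clustered locking is enabled"
else
  echo "clustered locking is NOT enabled (or lvm.conf was not found)"
fi
```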
The next few commands only need to be run on one of the nodes, because the cluster will propagate the configuration automatically. They initialize the disk /dev/sdb as an LVM physical volume, create a volume group called 'vg_iscsi', and create a single 9GB logical volume for our GFS disk:
pvcreate /dev/sdb
vgcreate vg_iscsi /dev/sdb
lvcreate -n GFSVolume -L 9G vg_iscsi
You might need to run a 'vgscan' if you have issues creating logical volumes after creating the volume group.
Now we format the new volume. The -j flag defines the number of journals to create. Make sure you define the correct number of journals: it is much easier to allocate additional journals at creation time than to add them later. You need one journal per machine in the GFS cluster; the command below creates four, one for each of our three servers plus one spare for future expansion.
gfs_mkfs -j 4 -p lock_dlm -t MyGFSCluster:FirstGFSVolume /dev/vg_iscsi/GFSVolume
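The journal count is simply the number of servers plus however many spares you want to reserve. A trivial sketch of that arithmetic, using a hypothetical helper journal_count; it only prints the resulting gfs_mkfs command and does not format anything.

```shell
# Journals to create = current nodes + spares reserved for future nodes.
# $1 = number of nodes, $2 = number of spare journals (default 0).
journal_count() {
  echo $(( $1 + ${2:-0} ))
}

nodes=3
spares=1
# Print the command for review rather than running it.
echo "gfs_mkfs -j $(journal_count "$nodes" "$spares") -p lock_dlm -t MyGFSCluster:FirstGFSVolume /dev/vg_iscsi/GFSVolume"
```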
We need to configure all the servers to start the clustering services at boot time, configure the GFS volume to mount at boot, and mount everything. It is necessary to turn off the ‘acpid’ service in order for fencing to work properly. We will run the following commands on all the servers:
chkconfig gfs on
chkconfig gfs2 on
chkconfig clvmd on
chkconfig cman on
chkconfig iscsi on
chkconfig acpid off
echo "/dev/vg_iscsi/GFSVolume /mnt gfs defaults 0 0" >>/etc/fstab
mount /mnt
Or, using fanout:
fanout "$SERVERS" 'chkconfig gfs on ; chkconfig gfs2 on ; chkconfig clvmd on ; chkconfig cman on ; chkconfig iscsi on ; chkconfig acpid off ; echo "/dev/vg_iscsi/GFSVolume /mnt gfs defaults 0 0" >>/etc/fstab ; mount /mnt'
We can now see the volume is mounted on all the servers:
fanout "$SERVERS" 'df -h'
going...
10.0.0.1
Filesystem                      Size  Used Avail Use% Mounted on
/dev/sda3                       224G  2.0G  211G   1% /
/dev/sda1                       996M   39M  906M   5% /boot
tmpfs                          1014M     0 1014M   0% /dev/shm
/dev/mapper/vg_iscsi-GFSVolume  8.5G   20K  8.5G   1% /mnt
10.0.0.2
Filesystem                      Size  Used Avail Use% Mounted on
/dev/sda3                       224G  2.0G  211G   1% /
/dev/sda1                       996M   39M  906M   5% /boot
tmpfs                          1014M     0 1014M   0% /dev/shm
/dev/mapper/vg_iscsi-GFSVolume  8.5G   20K  8.5G   1% /mnt
10.0.0.3
Filesystem                      Size  Used Avail Use% Mounted on
/dev/sda3                       224G  2.0G  211G   1% /
/dev/sda1                       996M   39M  906M   5% /boot
tmpfs                          1014M     0 1014M   0% /dev/shm
/dev/mapper/vg_iscsi-GFSVolume  8.5G   20K  8.5G   1% /mnt
You should now be able to create files on one of the nodes in the cluster, and have the files appear right away on all the other nodes in the cluster.
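For scripted checks, reading the kernel's mount table is more reliable than parsing df output. The sketch below assumes the standard /proc/mounts layout (device, mount point, file system type, options); is_mounted_as is a hypothetical helper name.

```shell
# Succeed if mount point $1 is mounted with file system type $2.
# $3 = mounts table to read (default /proc/mounts).
is_mounted_as() {
  awk -v mp="$1" -v fs="$2" '
    $2 == mp && $3 == fs { found = 1 }
    END { exit !found }
  ' "${3:-/proc/mounts}" 2>/dev/null
}

if is_mounted_as /mnt gfs; then
  echo "GFS volume is mounted on /mnt"
else
  echo "GFS volume is not mounted on /mnt"
fi
```

Run through fanout or a loop, this gives a quick yes or no answer per server.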