
Clustered LVM on DRBD resource in Fedora Linux

Oleg Gelbukh - June 09, 2011

As Florian Haas pointed out in a comment on my previous post, our shared storage configuration requires special precautions to avoid data corruption when two hosts connected via DRBD try to manage LVM volumes simultaneously. These precautions mainly concern locking of LVM metadata operations while DRBD runs in 'dual-primary' mode.

Let's examine this in detail. The LVM locking mechanism is configured in the global section of
/etc/lvm/lvm.conf. The most important parameter here is 'locking_type', which defines the locking mechanism LVM
uses while changing metadata (a quick way to check the current setting follows the list below). It can be set to:

  • '0': disables locking completely; dangerous to use;
  • '1': the default, local file-based locking. It knows nothing about the cluster or possible conflicting
    metadata changes;
  • '2': uses an external shared library, defined by the 'locking_library' parameter;
  • '3': uses built-in LVM clustered locking;
  • '4': read-only locking, which forbids any changes to metadata.
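
To double-check which locking type is currently in effect, you can ask LVM to dump its active configuration; a minimal sketch (the exact output format of 'lvm dumpconfig' may vary slightly between LVM2 versions):

(drbd-node1)# lvm dumpconfig | grep locking_type
locking_type=1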

The simplest approach is to use local locking on one of the DRBD peers and to disable metadata operations on the other.
This has a serious drawback, though: Volume Groups and Logical Volumes created on the first peer will not be activated
automatically on the other, 'passive' peer. That makes it a poor fit for a production environment and hard to automate.
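
For reference, that simple scheme boils down to two different lvm.conf settings; a minimal sketch, assuming drbd-node1 is the only peer allowed to change metadata (locking type 4 is the read-only mode from the list above):

# /etc/lvm/lvm.conf on drbd-node1 (the peer allowed to change metadata)
global {
    locking_type = 1    # default local file-based locking
}

# /etc/lvm/lvm.conf on drbd-node2 (the 'passive' peer)
global {
    locking_type = 4    # read-only locking: no metadata changes allowed
}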

There is another, more sophisticated way: Linux-HA (Heartbeat) coupled with the
LVM resource agent. It automates activation of newly created LVM resources on the shared storage, but still provides no locking
mechanism suitable for 'dual-primary' DRBD operation.
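
For illustration only, handing a volume group over to the cluster manager might look roughly like this in a Pacemaker CRM shell; 'p_lvm_shared' and 'vg_shared' are illustrative names, and the exact syntax depends on the cluster stack in use:

# Define a resource that activates the vg_shared volume group via the
# ocf:heartbeat:LVM resource agent (the VG name is passed as 'volgrpname')
crm configure primitive p_lvm_shared ocf:heartbeat:LVM \
    params volgrpname="vg_shared" \
    op monitor interval="30s"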

It should be noted that full support of clustered locking for LVM can be achieved with the lvm2-cluster RPM package from the Fedora repository. It contains the clvmd service, which runs on every host in the cluster and coordinates LVM locking on the shared storage. In our case, the cluster consists of only two DRBD peers.

clvmd requires a cluster engine in order to function properly. It is provided by the
cman service, which is installed as a dependency of lvm2-cluster (other dependencies may
vary from installation to installation):



(drbd-node1)# yum install lvm2-cluster
...
Dependencies Resolved
================================================================================
 Package            Arch      Version           Repository   Size
================================================================================
Installing:
 lvm2-cluster       x86_64    2.02.84-1.fc15    fedora      331 k
Installing for dependencies:
 clusterlib         x86_64    3.1.1-1.fc15      fedora       70 k
 cman               x86_64    3.1.1-1.fc15      fedora      364 k
 fence-agents       x86_64    3.1.4-1.fc15      updates     182 k
 fence-virt         x86_64    0.2.1-4.fc15      fedora       33 k
 ipmitool           x86_64    1.8.11-6.fc15     fedora      273 k
 lm_sensors-libs    x86_64    3.3.0-2.fc15      fedora       36 k
 modcluster         x86_64    0.18.7-1.fc15     fedora      187 k
 net-snmp-libs      x86_64    1:5.6.1-7.fc15    fedora      1.6 M
 net-snmp-utils     x86_64    1:5.6.1-7.fc15    fedora      180 k
 oddjob             x86_64    0.31-2.fc15       fedora       61 k
 openais            x86_64    1.1.4-2.fc15      fedora      190 k
 openaislib         x86_64    1.1.4-2.fc15      fedora       88 k
 perl-Net-Telnet    noarch    3.03-12.fc15      fedora       55 k
 pexpect            noarch    2.3-6.fc15        fedora      141 k
 python-suds        noarch    0.3.9-3.fc15      fedora      195 k
 ricci              x86_64    0.18.7-1.fc15     fedora      584 k
 sg3_utils          x86_64    1.29-3.fc15       fedora      465 k
 sg3_utils-libs     x86_64    1.29-3.fc15       fedora       54 k

Transaction Summary
================================================================================
Install      19 Package(s)

The only thing we need the cluster for is clvmd, so the configuration of the cluster itself is pretty basic.
Since we don't need advanced features like automated
fencing yet, we specify manual fencing. As we have only two nodes in the cluster, we tell cman about that as well.
The cman configuration resides in the /etc/cluster/cluster.conf file:



<?xml version="1.0"?>
<cluster name="cluster" config_version="1">
  <!-- post_join_delay: number of seconds the daemon will wait before
       fencing any victims after a node joins the domain
       post_fail_delay: number of seconds the daemon will wait before
       fencing any victims after a domain member fails
       clean_start: prevent any startup fencing the daemon might do.
       It indicates that the daemon should assume all nodes
       are in a clean state to start. -->
  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="drbd-node1" votes="1" nodeid="1">
      <fence>
        <!-- Handle fencing manually -->
        <method name="human">
          <device name="human" nodename="drbd-node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="drbd-node2" votes="1" nodeid="2">
      <fence>
        <!-- Handle fencing manually -->
        <method name="human">
          <device name="human" nodename="drbd-node2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <!-- cman two nodes specification -->
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <!-- Define manual fencing -->
    <fencedevice name="human" agent="fence_manual"/>
  </fencedevices>
</cluster>


The clusternode name should be a fully qualified domain name, resolvable by DNS or present
in /etc/hosts. The number of votes is used to determine the quorum of the
cluster. In this case, we have two nodes with one vote each, and a single vote is enough for the cluster to run (to have
quorum), as configured by the expected_votes attribute of the cman element.
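
For example, the name resolution requirement can be covered with static /etc/hosts entries on both nodes; the addresses below are illustrative and should match your back-to-back link:

# /etc/hosts (identical on both nodes)
10.0.0.1    drbd-node1
10.0.0.2    drbd-node2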

The second thing we need to configure is the cluster engine (corosync). Its configuration goes to
/etc/corosync/corosync.conf:


compatibility: whitetank

totem {
    version: 2
    secauth: off
    threads: 0
    # fail_recv_const: 5000
    interface {
        ringnumber: 0
        bindnetaddr: 10.0.0.0
        mcastaddr: 226.94.1.1
        mcastport: 5405
    }
}

logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    to_syslog: yes
    # the pathname of the log file
    logfile: /var/log/cluster/corosync.log
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
    }
}

amf {
    mode: disabled
}



The bindnetaddr parameter must contain a network address, not a host address. We configure corosync to use the eth1 interfaces, which connect our nodes back-to-back over a 1 Gbps network. We also need to configure iptables on both hosts to accept the multicast traffic.
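
A minimal sketch of such iptables rules, assuming the multicast address and port from the corosync.conf above (corosync also uses mcastport - 1, so the rule covers both ports); run the same commands on both nodes:

# Allow corosync UDP traffic and multicast on the cluster interface
(drbd-node1)# iptables -A INPUT -i eth1 -p udp -m udp --dport 5404:5405 -j ACCEPT
(drbd-node1)# iptables -A INPUT -i eth1 -d 226.94.1.1 -j ACCEPT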

Note that these configuration files must be identical on both cluster nodes.
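
One simple way to keep them in sync is to push the files from one node to the other after every change, for example:

(drbd-node1)# scp /etc/cluster/cluster.conf drbd-node2:/etc/cluster/
(drbd-node1)# scp /etc/corosync/corosync.conf drbd-node2:/etc/corosync/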

After the cluster has been prepared, we can change the LVM locking type in /etc/lvm/lvm.conf on both DRBD-connected nodes:




global {
    ...
    locking_type = 3
    ...
}


Now start the cman and clvmd services on both DRBD peers to get the cluster ready for
action:



(drbd-node1)# service cman start
Starting cluster:
Checking if cluster has been disabled at boot... [ OK ]
Checking Network Manager... [ OK ]
Global setup... [ OK ]
Loading kernel modules... [ OK ]
Mounting configfs... [ OK ]
Starting cman... [ OK ]
Waiting for quorum... [ OK ]
Starting fenced... [ OK ]
Starting dlm_controld... [ OK ]
Unfencing self... [ OK ]
Joining fence domain... [ OK ]
(drbd-node1)# service clvmd start
Starting clvmd:
Activating VG(s): 2 logical volume(s) in volume group "vg_sys" now active
2 logical volume(s) in volume group "vg_shared" now active
[ OK ]
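
If the services should also come up after a reboot, enabling them with chkconfig is one option (assuming the SysV-style init scripts used above); repeat on both nodes:

(drbd-node1)# chkconfig cman on
(drbd-node1)# chkconfig clvmd on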


Now, as we already have a Volume Group on the shared storage, we can easily make it cluster-aware:



(drbd-node1)# vgchange -c y vg_shared


Now we can see the 'c' flag in the VG attributes:



(drbd-node1)# vgs
  VG        #PV #LV #SN Attr   VSize  VFree
  vg_shared   1   3   0 wz--nc  1.29t 1.04t
  vg_sys      1   2   0 wz--n- 19.97g  5.97g


As a result, Logical Volumes created in the vg_shared volume group will be active on both nodes, and
clustered locking is enabled for operations on volumes in this group. LVM commands can be issued on either host, and
clvmd takes care of possible concurrent metadata changes.
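
To illustrate, creating a volume on one node makes it appear, already active, on the other without any extra steps; 'lv_test' and the size are arbitrary examples:

# Create a volume on the first node...
(drbd-node1)# lvcreate -n lv_test -L 10G vg_shared
# ...and check it from the second one
(drbd-node2)# lvs vg_shared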
