Deep Learning for HW Engineers: stonith

This post describes how to install a Red Hat HA cluster on Red Hat 7 running on IBM POWER9 (ppc64le).

First, stop firewalld.

[root@ha1 ~]# systemctl stop firewalld

[root@ha1 ~]# systemctl disable firewalld

Install the packages below. They are not on the Red Hat OS DVD; they come from a separate yum repository. For ppc64le, that repo is rhel-ha-for-rhel-7-server-for-power-le-rpms.

[root@ha1 ~]# yum install pcs fence-agents-all

Installation automatically creates a user named hacluster, and you need to set a password for it.

[root@ha1 ~]# passwd hacluster

Next, start the pcsd daemon, and enable it so that it also starts automatically after a reboot.

[root@ha1 ~]# systemctl start pcsd.service

[root@ha1 ~]# systemctl enable pcsd.service

Authenticate the nodes that will join the cluster as follows.

[root@ha1 ~]# pcs cluster auth ha1 ha2

Username: hacluster

Password:

ha1: Authorized

ha2: Authorized
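The node names passed to pcs must resolve on every node; in this setup they come from /etc/hosts. As a minimal sketch, the helper below checks whether a name appears in a hosts-format file. The function name hosts_has is hypothetical and is not part of pcs.

```shell
# hosts_has FILE NAME: succeed if NAME appears as a hostname or alias
# on a non-comment line of FILE (hosts(5) format: address name [alias...]).
hosts_has() {
    awk -v name="$2" '
        { sub(/#.*/, "") }    # drop comments before looking at the fields
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Possible use on a cluster node (ha1 and ha2 are the names used in this post):
# for n in ha1 ha2; do
#     hosts_has /etc/hosts "$n" || echo "$n is missing from /etc/hosts"
# done
```

Only fields after the address are compared, so an IP address by itself never counts as a match.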

Create a simple corosync.conf file as shown below. The node names ha1 and ha2 are, of course, registered in /etc/hosts with their IP addresses.

[root@ha1 ~]# vi /etc/corosync/corosync.conf

totem {
    version: 2
    secauth: off
    cluster_name: tibero_cluster
    transport: udpu
}

nodelist {
    node {
        ring0_addr: ha1
        nodeid: 1
    }
    node {
        ring0_addr: ha2
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}

logging {
    to_syslog: yes
}
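A file like the one above can also be generated instead of typed by hand, which makes it harder to lose a closing brace. The sketch below is one way to do that; the function name gen_corosync_conf is hypothetical (it is not part of corosync or pcs), and it emits only the handful of settings used in this post.

```shell
# gen_corosync_conf CLUSTER_NAME NODE...
# Print a minimal udpu corosync.conf for the given nodes to stdout.
# The body runs in a subshell so its variables do not leak to the caller.
gen_corosync_conf() (
    cluster=$1
    shift
    printf 'totem {\n    version: 2\n    secauth: off\n'
    printf '    cluster_name: %s\n    transport: udpu\n}\n\n' "$cluster"
    printf 'nodelist {\n'
    id=1
    for node in "$@"; do
        printf '    node {\n        ring0_addr: %s\n        nodeid: %d\n    }\n' "$node" "$id"
        id=$((id + 1))
    done
    printf '}\n\n'
    printf 'quorum {\n    provider: corosync_votequorum\n'
    # two_node is only meaningful for a cluster of exactly two nodes
    if [ "$#" -eq 2 ]; then
        printf '    two_node: 1\n'
    fi
    printf '}\n\nlogging {\n    to_syslog: yes\n}\n'
)

# Possible use:
# gen_corosync_conf tibero_cluster ha1 ha2 > /etc/corosync/corosync.conf
```

Node IDs are simply assigned in argument order starting at 1, which matches the file shown above.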

Enable the cluster on all nodes.

[root@ha1 ~]# pcs cluster enable --all

ha1: Cluster Enabled

ha2: Cluster Enabled

Start the cluster as follows.

[root@ha1 ~]# pcs cluster start --all

ha1: Starting Cluster (corosync)...

ha2: Starting Cluster (corosync)...

ha1: Starting Cluster (pacemaker)...

ha2: Starting Cluster (pacemaker)...

Check the status.

[root@ha1 ~]# pcs cluster status

Cluster Status:

Stack: unknown

Current DC: NONE

Last updated: Mon Oct 26 10:14:59 2020

Last change: Mon Oct 26 10:14:55 2020 by hacluster via crmd on ha1

2 nodes configured

0 resource instances configured

PCSD Status:

ha1: Online

ha2: Online

At this point, the corosync.conf file does not yet exist on the ha2 node.

[root@ha2 ~]# ls -l /etc/corosync/

total 12

-rw-r--r--. 1 root root 2881 Jun 5 23:10 corosync.conf.example

-rw-r--r--. 1 root root 767 Jun 5 23:10 corosync.conf.example.udpu

-rw-r--r--. 1 root root 3278 Jun 5 23:10 corosync.xml.example

drwxr-xr-x. 2 root root 6 Jun 5 23:10 uidgid.d

To have it created on ha2, sync the cluster configuration.

[root@ha1 ~]# pcs cluster sync

ha1: Succeeded

ha2: Succeeded

You can confirm that it has been created.

[root@ha2 ~]# ls -l /etc/corosync/

total 16

-rw-r--r--. 1 root root 295 Oct 26 10:17 corosync.conf

-rw-r--r--. 1 root root 2881 Jun 5 23:10 corosync.conf.example

-rw-r--r--. 1 root root 767 Jun 5 23:10 corosync.conf.example.udpu

-rw-r--r--. 1 root root 3278 Jun 5 23:10 corosync.xml.example

drwxr-xr-x. 2 root root 6 Jun 5 23:10 uidgid.d

Now check the cluster resources. Naturally, none are defined yet.

[root@ha1 ~]# pcs resource show

NO resources configured

Register the cluster's virtual IP, which will fail over between the two nodes, under the resource ID VirtualIP as shown below. For reference, ha1 is 10.1.1.1 and ha2 is 10.1.1.2, both assigned on eth1.

[root@ha1 ~]# pcs resource create VirtualIP ocf:heartbeat:IPaddr2 ip=10.1.1.11 cidr_netmask=24 nic=eth1 op monitor interval=30s

[root@ha1 ~]# pcs resource enable VirtualIP
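The virtual IP here sits in the same /24 network as the node addresses on eth1. As a small illustration of that arithmetic, the sketch below checks whether two IPv4 addresses share a network for a given prefix length. The function names ip_to_int and same_subnet are hypothetical helpers, not part of pcs or the IPaddr2 agent, and the code assumes plain dotted-decimal input without leading zeros.

```shell
# ip_to_int A.B.C.D: print the IPv4 address as a 32-bit integer.
# Runs in a subshell so the IFS change stays local.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# same_subnet IP_A IP_B PREFIX: succeed if both addresses fall in the
# same /PREFIX network.
same_subnet() (
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    [ $(( a & mask )) -eq $(( b & mask )) ]
)

# Possible use with the addresses from this post:
# same_subnet 10.1.1.1 10.1.1.11 24 && echo "VIP is on the eth1 network"
```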

Look at the resources again.

[root@ha1 ~]# pcs resource show

VirtualIP (ocf::heartbeat:IPaddr2): Stopped

VirtualIP is still in the Stopped state because STONITH is enabled by default and no fence device has been configured yet. STONITH is a mechanism for preventing split-brain; for now, we will disable it.

[root@ha1 ~]# pcs property set stonith-enabled=false

Run a verification. No output means it passed.

[root@ha1 ~]# crm_verify -L

Now check the status again, and you can see that VirtualIP is up.

[root@ha1 ~]# pcs status

Cluster name: tibero_cluster

Stack: corosync

Current DC: ha1 (version 1.1.23-1.el7-9acf116022) - partition with quorum

Last updated: Mon Oct 26 11:18:31 2020

Last change: Mon Oct 26 11:17:31 2020 by root via cibadmin on ha1

2 nodes configured

1 resource instance configured

Online: [ ha1 ha2 ]

Full list of resources:

VirtualIP (ocf::heartbeat:IPaddr2): Started ha1

Daemon Status:

corosync: active/enabled

pacemaker: active/enabled

pcsd: active/enabled
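When scripting checks around output like the above, it can help to pull out just the node a resource is running on. The following is a rough sketch that parses the text format shown in this post; the function name resource_node is hypothetical, and the parsing assumes the "NAME (agent): Started NODE" line layout printed by this pcs version.

```shell
# resource_node NAME: read `pcs status` text on stdin and print the node
# on which resource NAME is reported as Started (prints nothing otherwise).
resource_node() {
    awk -v res="$1" 'NF >= 3 && $1 == res && $(NF - 1) == "Started" { print $NF }'
}

# Possible use:
# pcs status | resource_node VirtualIP
```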

The IP addresses also show that 10.1.1.11 has been added to eth1.

[root@ha1 ~]# ip a | grep 10.1.1

inet 10.1.1.1/24 brd 10.1.1.255 scope global noprefixroute eth1

inet 10.1.1.11/24 brd 10.1.1.255 scope global secondary eth1
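Note that grep 10.1.1 treats each dot as "any character" and matches substrings, so it would also hit addresses such as 10.1.10.5. For a scripted check, an exact comparison is safer. The sketch below reads ip output on stdin and matches the address field exactly; has_ipv4 is a hypothetical helper name.

```shell
# has_ipv4 ADDR: read `ip -o -4 addr show` (or `ip a`) output on stdin and
# succeed only if ADDR is configured exactly, whatever the prefix length.
has_ipv4() {
    awk -v addr="$1" '
        {
            for (i = 1; i < NF; i++)
                if ($i == "inet") {
                    split($(i + 1), parts, "/")   # "10.1.1.11/24" -> "10.1.1.11"
                    if (parts[1] == addr) found = 1
                }
        }
        END { exit found ? 0 : 1 }
    '
}

# Possible use:
# ip -o -4 addr show dev eth1 | has_ipv4 10.1.1.11 && echo "VIP is here"
```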

Pinging 10.1.1.11 (havip) from another node works fine.

[root@gw ~]# ping havip

PING havip (10.1.1.11) 56(84) bytes of data.

64 bytes from havip (10.1.1.11): icmp_seq=1 ttl=64 time=0.112 ms

64 bytes from havip (10.1.1.11): icmp_seq=2 ttl=64 time=0.040 ms

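After a failover, to keep this resource on the node it moved to instead of failing back to the original node once that node comes back up, set a default resource-stickiness as follows.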

[root@ha1 ~]# pcs resource defaults resource-stickiness=100

Warning: Defaults do not apply to resources which override them with their own defined values

[root@ha1 ~]# pcs resource defaults

resource-stickiness=100

Next, set up LVM on the sdb disk. The goal is for /data and /backup to be mounted on ha1 together with VirtualIP and to fail over to ha2 when needed.

[root@ha1 ~]# pvcreate /dev/sdb

[root@ha1 ~]# vgcreate datavg /dev/sdb

[root@ha1 ~]# lvcreate -L210000 -n datalv datavg

[root@ha1 ~]# lvcreate -L150000 -n backuplv datavg

[root@ha1 ~]# vgs

VG #PV #LV #SN Attr VSize VFree

datavg 1 2 0 wz--n- <400.00g 48.43g

[root@ha1 ~]# lvs

LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert

backuplv datavg -wi-a----- 146.48g

datalv datavg -wi-a----- <205.08g

[root@ha1 ~]# mkfs.ext4 /dev/datavg/datalv

[root@ha1 ~]# mkfs.ext4 /dev/datavg/backuplv

[root@ha1 ~]# mkdir /data

[root@ha1 ~]# mkdir /backup

[root@ha1 ~]# ssh ha2 mkdir /data

[root@ha1 ~]# ssh ha2 mkdir /backup
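The -L values passed to lvcreate above carry no unit suffix, so LVM reads them as megabytes (MiB), which is why 210000 and 150000 show up as roughly 205.08g and 146.48g in lvs. The conversion is just a division by 1024, sketched here with a hypothetical helper named mib_to_gib.

```shell
# mib_to_gib MIB: print MIB expressed in GiB, rounded to two decimals.
mib_to_gib() {
    awk -v mib="$1" 'BEGIN { printf "%.2f\n", mib / 1024 }'
}

# Possible use:
# mib_to_gib 210000
# mib_to_gib 150000
```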

Now configure things so that this VG and its filesystems are mounted on only one node, and only by the HA cluster (pacemaker) rather than by the OS.

[root@ha1 ~]# grep use_lvmetad /etc/lvm/lvm.conf

use_lvmetad = 1

[root@ha1 ~]# lvmconf --enable-halvm --services --startstopservices

[root@ha1 ~]# grep use_lvmetad /etc/lvm/lvm.conf

use_lvmetad = 0

[root@ha1 ~]# vgs --noheadings -o vg_name

datavg

Then register the VG, LVs, and filesystems with pcs.

[root@ha1 ~]# pcs resource create tibero_vg LVM volgrpname=datavg exclusive=true --group tiberogroup

Assumed agent name 'ocf:heartbeat:LVM' (deduced from 'LVM')

[root@ha1 ~]# pcs resource create tibero_data Filesystem device="/dev/datavg/datalv" directory="/data" fstype="ext4" --group tiberogroup

Assumed agent name 'ocf:heartbeat:Filesystem' (deduced from 'Filesystem')

[root@ha1 ~]# pcs resource create tibero_backup Filesystem device="/dev/datavg/backuplv" directory="/backup" fstype="ext4" --group tiberogroup

Assumed agent name 'ocf:heartbeat:Filesystem' (deduced from 'Filesystem')

[root@ha1 ~]# pcs resource update VirtualIP --group tiberogroup

Then add a colocation constraint so that VirtualIP always moves together with these filesystems.

[root@ha1 ~]# pcs constraint colocation add tiberogroup with VirtualIP INFINITY

Perform the following steps on both nodes. ha2 also has to be rebooted before datavg and the LVs in it are recognized there.

[root@ha1 ~]# vi /etc/lvm/lvm.conf

...

volume_list = [ ]

...

[root@ha1 ~]# dracut -H -f /boot/initramfs-$(uname -r).img $(uname -r)

[root@ha1 ~]# shutdown -r now
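The volume_list above is empty because datavg is the only VG on these nodes and it is the one managed by the cluster. On a system that also has local VGs (for example a root VG), those would normally be listed there so the OS can still activate them. The sketch below builds such a line from vgs output; build_volume_list is a hypothetical helper, and the VG names other than datavg in the usage comments are made up for illustration.

```shell
# build_volume_list CLUSTER_VG...: read VG names on stdin (one per line, as
# printed by `vgs --noheadings -o vg_name`) and print an lvm.conf
# volume_list line naming every VG that is NOT cluster-managed.
build_volume_list() {
    awk -v skip="$*" '
        BEGIN {
            n = split(skip, s, " ")
            for (i = 1; i <= n; i++) managed[s[i]] = 1
        }
        NF && !($1 in managed) {
            list = list (list ? ", " : "") "\"" $1 "\""
        }
        END { print "volume_list = [ " list (list ? " " : "") "]" }
    '
}

# Possible use:
# vgs --noheadings -o vg_name | build_volume_list datavg
```

With only datavg present this prints the same empty list used in this post.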

After the reboot, you can see that VirtualIP is up and the /data and /backup filesystems are mounted, all on ha1.

[root@ha1 ~]# pcs status

Cluster name: tibero_cluster

Stack: corosync

Current DC: ha1 (version 1.1.23-1.el7-9acf116022) - partition with quorum

Last updated: Mon Oct 26 13:44:36 2020

Last change: Mon Oct 26 13:44:14 2020 by root via cibadmin on ha1

2 nodes configured

4 resource instances configured

Online: [ ha1 ha2 ]

Full list of resources:

VirtualIP (ocf::heartbeat:IPaddr2): Started ha1

Resource Group: tiberogroup

tibero_vg (ocf::heartbeat:LVM): Started ha1

tibero_data (ocf::heartbeat:Filesystem): Started ha1

tibero_backup (ocf::heartbeat:Filesystem): Started ha1

Daemon Status:

corosync: active/enabled

pacemaker: active/enabled

pcsd: active/enabled

If you kill the ha1 node, you can confirm that VirtualIP and the filesystems soon fail over to ha2 automatically.

[root@ha1 ~]# halt -f

Halting.

------------

[root@ha2 ~]# df -h

Filesystem Size Used Avail Use% Mounted on

devtmpfs 28G 0 28G 0% /dev

tmpfs 28G 58M 28G 1% /dev/shm

tmpfs 28G 14M 28G 1% /run

tmpfs 28G 0 28G 0% /sys/fs/cgroup

/dev/sda5 50G 2.6G 48G 6% /

/dev/sda6 321G 8.5G 313G 3% /home

/dev/sda2 1014M 231M 784M 23% /boot

tmpfs 5.5G 0 5.5G 0% /run/user/0

/dev/mapper/datavg-datalv 202G 61M 192G 1% /data

/dev/mapper/datavg-backuplv 145G 61M 137G 1% /backup

[root@ha2 ~]# ip a | grep 10.1.1

inet 10.1.1.2/24 brd 10.1.1.255 scope global noprefixroute eth1

inet 10.1.1.11/24 brd 10.1.1.255 scope global secondary eth1

With ha1 down, the status looks like this.

[root@ha2 ~]# pcs status

Cluster name: tibero_cluster

Stack: corosync

Current DC: ha2 (version 1.1.23-1.el7-9acf116022) - partition with quorum

Last updated: Mon Oct 26 13:48:49 2020

Last change: Mon Oct 26 13:47:18 2020 by root via cibadmin on ha1

2 nodes configured

4 resource instances configured

Online: [ ha2 ]

OFFLINE: [ ha1 ]

Full list of resources:

VirtualIP (ocf::heartbeat:IPaddr2): Started ha2

Resource Group: tiberogroup

tibero_vg (ocf::heartbeat:LVM): Started ha2

tibero_data (ocf::heartbeat:Filesystem): Started ha2

tibero_backup (ocf::heartbeat:Filesystem): Started ha2

Daemon Status:

corosync: active/enabled

pacemaker: active/enabled

pcsd: active/enabled
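For a failover test like this one, a script may want the node membership rather than the full report. As a rough sketch, the helper below extracts the names from the "Online: [ ... ]" or "OFFLINE: [ ... ]" line of the status text shown above; nodes_in is a hypothetical name, and the parsing assumes the bracketed layout printed by this pcs version.

```shell
# nodes_in STATE: read `pcs status` text on stdin and print the node names
# from the "STATE: [ ... ]" line, one per line.
nodes_in() {
    awk -v state="$1:" '$1 == state { for (i = 3; i < NF; i++) print $i }'
}

# Possible use:
# pcs status | nodes_in Online
# pcs status | nodes_in OFFLINE
```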

If you stop the pcs cluster in this state, VirtualIP and the filesystems all go down.

[root@ha2 ~]# pcs cluster stop --force

Stopping Cluster (pacemaker)...

Stopping Cluster (corosync)...

[root@ha2 ~]# df -h

Filesystem Size Used Avail Use% Mounted on

devtmpfs 28G 0 28G 0% /dev

tmpfs 28G 0 28G 0% /dev/shm

tmpfs 28G 14M 28G 1% /run

tmpfs 28G 0 28G 0% /sys/fs/cgroup

/dev/sda5 50G 2.6G 48G 6% /

/dev/sda6 321G 8.5G 313G 3% /home

/dev/sda2 1014M 231M 784M 23% /boot

tmpfs 5.5G 0 5.5G 0% /run/user/0

[root@ha2 ~]# ip a | grep 10.1.1

inet 10.1.1.2/24 brd 10.1.1.255 scope global noprefixroute eth1

Tuesday, October 27, 2020

How to configure a Red Hat HA Cluster on Red Hat 7 on IBM POWER9