The old Ceph cluster

IP          hostname  Role
10.99.2.155 pool1     ceph-deploy, mon, mgr, osd
10.99.2.156 pool2     mon, mgr, osd
10.99.2.157 pool3     mon, mgr, osd

The new Ceph nodes

Prepare each new node first: set up the yum repo, open (or stop) the firewall, disable SELinux, and synchronize the clock with NTP; a sketch follows the table below.

IP          hostname  Role
10.99.2.158 pool4     install ceph (ceph-deploy not needed)
10.99.2.159 pool5     install ceph
10.99.2.160 pool6     install ceph
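
A minimal prep sketch, run on each of pool4/pool5/pool6 (the NTP server address matches the one used in the fix at the end of this post; the repo-copy line is an assumed convenience, adapt it to your own mirror):

systemctl stop firewalld && systemctl disable firewalld        # or open the Ceph ports instead
setenforce 0 && sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
ntpdate 10.99.2.2                                              # one-off sync; keep ntpd/chronyd running for good
scp pool1:/etc/yum.repos.d/ceph.repo /etc/yum.repos.d/         # reuse the old cluster's ceph repo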

Requirement: add the new nodes' OSDs to the old cluster, merging the two clusters into one while keeping the data synchronized and consistent.

The original Ceph environment

[root@pool1 ceph]# ceph -s
  services:
    mon: 3 daemons, quorum pool1,pool2,pool3
    mgr: pool1(active), standbys: pool2, pool3
    osd: 3 osds: 3 up, 3 in; 79 remapped pgs

[root@pool1 ceph]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       0.11151 root default
-3       0.01859     host pool1
 0   hdd 0.01859         osd.0      up  1.00000 1.00000
-5       0.01859     host pool2
 1   hdd 0.01859         osd.1      up  1.00000 1.00000
-7       0.01859     host pool3
 2   hdd 0.01859         osd.2      up  1.00000 1.00000

Set up passwordless SSH to the new nodes

Run on pool1

[root@pool1 ceph]# ssh-copy-id pool4
[root@pool1 ceph]# ssh-copy-id pool5
[root@pool1 ceph]# ssh-copy-id pool6
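
ssh-copy-id assumes pool1 already has a key pair and that the new hostnames resolve; if either is missing, a sketch of the two prerequisite steps (the /etc/hosts entries mirror the node table above):

[root@pool1 ceph]# ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa    # skip if a key already exists
[root@pool1 ceph]# cat >> /etc/hosts <<EOF
10.99.2.158 pool4
10.99.2.159 pool5
10.99.2.160 pool6
EOF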

Install Ceph on the new nodes

Run on pool1

[root@pool1 ceph]# ceph-deploy install --no-adjust-repos pool4 pool5 pool6
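
A quick sanity check that the packages landed on every new node (the exact version string depends on your repo):

[root@pool1 ceph]# for h in pool4 pool5 pool6; do ssh $h ceph --version; done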

Run on the new nodes

[root@pool4 ~]# parted /dev/sdb mklabel gpt -s
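
The same GPT label is needed on every new node; it is worth confirming with lsblk first that /dev/sdb really is the blank data disk (this assumes the disk layout is identical on all three nodes):

[root@pool4 ~]# lsblk                                # confirm /dev/sdb is empty before labeling
[root@pool5 ~]# parted /dev/sdb mklabel gpt -s
[root@pool6 ~]# parted /dev/sdb mklabel gpt -s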

Add the new nodes' OSDs to the cluster

Run on pool1

[root@pool1 ceph]# ceph-deploy disk zap pool4 /dev/sdb
[root@pool1 ceph]# ceph-deploy disk zap pool5 /dev/sdb
[root@pool1 ceph]# ceph-deploy disk zap pool6 /dev/sdb
[root@pool1 ceph]# ceph-deploy osd create pool4 --data /dev/sdb
[root@pool1 ceph]# ceph-deploy osd create pool5 --data /dev/sdb
[root@pool1 ceph]# ceph-deploy osd create pool6 --data /dev/sdb
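
ceph-deploy drives ceph-volume under the hood here; one way to verify that each OSD came up (a sketch, run from pool1):

[root@pool1 ceph]# ssh pool4 ceph-volume lvm list    # shows the LV backing the new OSD
[root@pool1 ceph]# ceph osd tree                     # osd.3/4/5 should be listed as up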

Edit the Ceph configuration file

Run on pool1

> The configuration file only needs editing if you are adding monitor nodes; if you are not adding any, skip this step.

# Add the new nodes' hostnames and IPs to mon_initial_members and mon_host, then append the public network setting
[root@pool1 ceph]# echo "public network=10.99.2.0/24" >> ceph.conf
[root@pool1 ceph]# cat ceph.conf
[global]
fsid = d35002f6-34f0-4097-b6db-2dc40e66764e
mon_initial_members = pool1, pool2, pool3, pool4, pool5, pool6
mon_host = 10.99.2.155,10.99.2.156,10.99.2.157,10.99.2.158,10.99.2.159,10.99.2.160
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public network=10.99.2.0/24
[mon]
mon allow pool delete = true

Distribute the files

Run on pool1

Both the old nodes and the new nodes need the synchronized configuration file.

[root@pool1 ceph]# ceph-deploy --overwrite-conf admin pool1 pool2 pool3 pool4 pool5 pool6
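
This pushes ceph.conf plus the admin keyring to /etc/ceph on every node. If ceph commands on the new nodes then fail with a permission error, the keyring likely landed root-readable only; a common (assumed) remedy:

[root@pool4 ~]# ls /etc/ceph                         # ceph.conf and ceph.client.admin.keyring should be present
[root@pool4 ~]# chmod +r /etc/ceph/ceph.client.admin.keyring   # only if a permission error appears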

Add the new nodes to the monitors and managers

Run on pool1

[root@pool1 ceph]# ceph-deploy mon create pool4 pool5 pool6
[root@pool1 ceph]# ceph-deploy mgr create pool4 pool5 pool6
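
Quorum can be confirmed immediately, without waiting for a full ceph -s; the quorum list should name all six hosts:

[root@pool1 ceph]# ceph mon stat
[root@pool1 ceph]# ceph quorum_status --format json-pretty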

After the new nodes have successfully joined the cluster

[root@pool1 ceph]# ceph -s
  cluster:
    id:     d35002f6-34f0-4097-b6db-2dc40e66764e
    health: HEALTH_WARN
            26/5439 objects misplaced (0.478%)
            Degraded data redundancy: 1782/5439 objects degraded (32.763%), 105 pgs degraded, 80 pgs undersized
            application not enabled on 3 pool(s)
            clock skew detected on mon.pool4, mon.pool5, mon.pool6

  services:
    mon: 6 daemons, quorum pool1,pool2,pool3,pool4,pool5,pool6
    mgr: pool1(active), standbys: pool2, pool3, pool4, pool6, pool5
    osd: 6 osds: 6 up, 6 in; 79 remapped pgs

[root@pool1 ceph]# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF
 -1       0.11151 root default
 -3       0.01859     host pool1
  0   hdd 0.01859         osd.0      up  1.00000 1.00000
 -5       0.01859     host pool2
  1   hdd 0.01859         osd.1      up  1.00000 1.00000
 -7       0.01859     host pool3
  2   hdd 0.01859         osd.2      up  1.00000 1.00000
 -9       0.01859     host pool4
  3   hdd 0.01859         osd.3      up  1.00000 1.00000
-11       0.01859     host pool5
  4   hdd 0.01859         osd.4      up  1.00000 1.00000
-13       0.01859     host pool6
  5   hdd 0.01859         osd.5      up  1.00000 1.00000

Check the cluster status

[root@pool1 ceph]# ceph mon dump
dumped monmap epoch 4
epoch 4
fsid d35002f6-34f0-4097-b6db-2dc40e66764e
last_changed 2021-03-11 12:34:48.877201
created 2021-03-09 21:59:45.148467
0: 10.99.2.155:6789/0 mon.pool1
1: 10.99.2.156:6789/0 mon.pool2
2: 10.99.2.157:6789/0 mon.pool3
3: 10.99.2.158:6789/0 mon.pool4
4: 10.99.2.159:6789/0 mon.pool5
5: 10.99.2.160:6789/0 mon.pool6
# All three new nodes (pool4/5/6) have now joined the cluster
[root@pool4 ceph]# ceph mgr dump

The cluster reports HEALTH_WARN after the expansion

[root@pool1 ceph]# ceph -s
  cluster:
    id:     d35002f6-34f0-4097-b6db-2dc40e66764e
    health: HEALTH_WARN
            26/5439 objects misplaced (0.478%)
            Degraded data redundancy: 1782/5439 objects degraded (32.763%), 105 pgs degraded, 80 pgs undersized
            application not enabled on 3 pool(s)
            clock skew detected on mon.pool4, mon.pool5, mon.pool6
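
The misplaced and degraded counts are expected at this point: CRUSH is rebalancing placement groups onto the three new OSDs, and those numbers drain to zero on their own as backfill completes. Recovery progress can be watched live:

[root@pool1 ceph]# ceph -w                           # stream health and recovery events
[root@pool1 ceph]# watch -n 5 'ceph -s'              # or poll the summary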

Solution

# Re-run ntpdate on every node to synchronize the clocks
ntpdate 10.99.2.2
# Restart the mon, mgr, and osd daemons
systemctl restart ceph-mon.target
systemctl restart ceph-mgr.target
systemctl restart ceph-osd.target

[root@pool1 ceph]# ceph health
HEALTH_OK
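
For reference, if the "application not enabled on 3 pool(s)" warning persists after the restart, it is cleared by tagging each pool with the application that uses it (the pool name below is a placeholder; list the real ones first):

[root@pool1 ceph]# ceph osd lspools                  # identify the untagged pools
[root@pool1 ceph]# ceph osd pool application enable mypool rbd   # mypool is hypothetical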