ceph-deploy mon create-initial reports an error

[ceph_deploy.mds][INFO  ] Distro info: CentOS Linux 7.3.1611 Core
[ceph_deploy.mds][DEBUG ] remote host will use systemd
[ceph_deploy.mds][DEBUG ] deploying mds bootstrap to k8s-node1
[k8s-node1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.mds][ERROR ] RuntimeError: config file /etc/ceph/ceph.conf exists with different content; use --overwrite-conf to overwrite
[ceph_deploy] [ERROR ] GenericError: Failed to create 1 MDSs
# Solution:
[root@k8s-node1 ceph]# ceph-deploy --overwrite-conf config push k8s-node1
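Once the corrected config has been pushed, re-run the step that failed. A minimal sketch, assuming the mon/mds create step from the log above is what needs repeating (the --overwrite-conf flag can also be passed to the create command directly):
# Re-run whichever create step originally hit the error
[root@k8s-node1 ceph]# ceph-deploy --overwrite-conf mon create-initial
[root@k8s-node1 ceph]# ceph-deploy --overwrite-conf mds create k8s-node1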

Ceph WARN caused by having fewer than three OSDs

If you do not specify a replica count when creating a pool, it defaults to 3.
The default can be changed with the osd_pool_default_size parameter, or per pool with the following command:
ceph osd pool set {pool-name} size {number-of-replicas}
The osd_pool_default_min_size parameter sets the minimum number of replicas an object needs in order to stay available; it defaults to 2.
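For example (a sketch using a hypothetical pool named mypool), the replica counts can be changed and verified per pool, or the defaults can be set in ceph.conf before pools are created:
# Hypothetical pool "mypool": set and verify the replica count
ceph osd pool set mypool size 2
ceph osd pool get mypool size
ceph osd pool set mypool min_size 1
# Or as cluster-wide defaults in the [global] section of ceph.conf:
# osd_pool_default_size = 2
# osd_pool_default_min_size = 1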

Ceph cluster WARN: clock skew detected on mon.host-node3

[root@host-node2 ceph]# ceph -s
  cluster:
    id:     5a28f45d-b1f1-4f75-a735-80d5021763f0
    health: HEALTH_WARN
            clock skew detected on mon.host-node3
  services:
    mon: 3 daemons, quorum host-node2,host-node3,host-node8
    mgr: host-node2(active)
    osd: 3 osds: 3 up, 3 in
# Solution:
[root@host-node3 ceph]# ntpdate cn.ntp.org.cn && systemctl restart ntpd
[root@host-node2 ceph]# systemctl restart ceph-mon.target
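To verify the fix, the standard NTP and Ceph status commands can be used (these are not part of the original output):
# On host-node3: the selected NTP peer should be marked with '*'
[root@host-node3 ceph]# ntpq -p
# On a mon node: ask the monitors for their view of time synchronization
[root@host-node2 ceph]# ceph time-sync-status
# The clock-skew warning should clear within a few minutes
[root@host-node2 ceph]# ceph -s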

The following problem appears while configuring OSDs

# The error below appears because the disk is not empty (it is still part of an old LVM volume group)
[yz-node1][WARNIN] Running command: /bin/ceph-authtool --gen-print-key
[yz-node1][WARNIN] Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 8d59ceec-a885-4e9c-b947-8c607f18873e
[yz-node1][WARNIN] Running command: /usr/sbin/vgcreate -s 1G --force --yes ceph-4660e614-8006-458c-b32c-31be3f45b595 /dev/sdb
[yz-node1][WARNIN] stderr: Physical volume '/dev/sdb' is already in volume group 'ceph-616760d9-7c3b-4b18-b747-b4c251125a0c'
[yz-node1][WARNIN] Unable to add physical volume '/dev/sdb' to volume group 'ceph-616760d9-7c3b-4b18-b747-b4c251125a0c'
[yz-node1][WARNIN] /dev/sdb: physical volume not initialized.
[yz-node1][WARNIN] --> Was unable to complete a new OSD, will rollback changes
[yz-node1][WARNIN] Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.0 --yes-i-really-mean-it
[yz-node1][WARNIN] stderr: purged osd.0
[yz-node1][WARNIN] --> RuntimeError: command returned non-zero exit status: 5
[yz-node1][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy.osd][ERROR ] Failed to execute command: /usr/sbin/ceph-volume --cluster ceph lvm create --filestore --data /dev/sdb --journal /dev/sda4
[ceph_deploy][ERROR ] GenericError: Failed to create 1 OSDs

# Solution:
[root@yz-node1 ceph]# lvm
lvm> pvs
PV VG Fmt Attr PSize PFree
/dev/sda3 centos lvm2 a-- 206.00g 4.00m
/dev/sdb ceph-616760d9-7c3b-4b18-b747-b4c251125a0c lvm2 a-- <7.28t 0
/dev/sdc ceph-b2a9aa97-e37d-4f79-8ae7-6f5867c62335 lvm2 a-- <7.28t 0
/dev/sdd ceph-9312a3e7-07bb-4f38-80cb-c8ea7418b5c1 lvm2 a-- <7.28t 0
/dev/sdf ceph-afba9740-8e9c-4f19-abe5-0815893b51ca lvm2 a-- <7.28t 0
/dev/sdg ceph-c887e4c8-5480-4934-ae7a-2e30b9816f27 lvm2 a-- <7.28t 0
/dev/sdh ceph-3751ad8f-2c52-4d5b-a0fb-f4f660951749 lvm2 a-- <7.28t 0
/dev/sdi ceph-9527d47a-84e9-457d-b346-832adbf4ac88 lvm2 a-- <7.28t 0
/dev/sdj ceph-085416d0-c77e-4c64-96ea-d915794f881c lvm2 a-- <7.28t 0
/dev/sdk ceph-befdb236-4571-426f-99a1-2529fcca1955 lvm2 a-- <7.28t 0
/dev/sdl ceph-23355e81-4674-4cd4-a2a4-bd48015b31d0 lvm2 a-- <7.28t 0
/dev/sdm ceph-40e03345-b6bf-4eba-af27-5baab0d1e40b lvm2 a-- <7.28t 0
lvm> pvcreate -ff /dev/sdb # force re-initialization of the physical volume
...
lvm> pvcreate -ff /dev/sdm
lvm> pvremove /dev/sdb # remove the physical volume
...
lvm> pvremove /dev/sdm
lvm> pvs
PV VG Fmt Attr PSize PFree
/dev/sda3 centos lvm2 a-- 206.00g 4.00m
# The problem is resolved once the sdX devices no longer show up as physical volumes in pvs
[root@yz-node1 ~]# lsblk
sdb 8:16 0 7.3T 0 disk
sdc 8:32 0 7.3T 0 disk
sdd 8:48 0 7.3T 0 disk
sde 8:64 0 7.3T 0 disk
sdf 8:80 0 7.3T 0 disk
sdg 8:96 0 7.3T 0 disk
sdh 8:112 0 7.3T 0 disk
sdi 8:128 0 7.3T 0 disk
sdj 8:144 0 7.3T 0 disk
sdk 8:160 0 7.3T 0 disk
sdl 8:176 0 7.3T 0 disk
sdm 8:192 0 7.3T 0 disk
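Instead of clearing the stale LVM metadata by hand as above, ceph-volume (or ceph-deploy 2.x) can wipe the device in one step. A sketch assuming the same device and host names; note that this destroys all data on the disk:
[root@yz-node1 ceph]# ceph-volume lvm zap --destroy /dev/sdb
# Or, from the deploy node (ceph-deploy 2.x syntax):
[root@yz-node1 ceph]# ceph-deploy disk zap yz-node1 /dev/sdb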

Ceph cluster WARN: Long heartbeat ping times on back interface seen, longest is 4810.821 msec

# Run on the deploy node
[root@yz-node1 ceph]# ceph -s
  cluster:
    id:     b185cae9-a5b6-45d9-9342-5e737fdf38d9
    health: HEALTH_WARN
            Long heartbeat ping times on back interface seen, longest is 4810.821 msec
            Long heartbeat ping times on front interface seen, longest is 4811.218 msec
            clock skew detected on mon.yz-node2, mon.yz-node3
[root@yz-node1 ceph]# vi ceph.conf
# Edit ceph.conf and add the following two lines to the [global] section
mon clock drift allowed = 2
mon clock drift warn backoff = 30
[root@yz-node1 ceph]# ceph-deploy --overwrite-conf config push yz-node1 yz-node2 yz-node3
[root@yz-node1 ceph]# systemctl daemon-reload
[root@yz-node1 ceph]# systemctl restart ceph-mon.target

# Run on mon.yz-node2 and mon.yz-node3
[root@yz-node1 ceph]# ntpdate 220.250.70.60
If the skew still does not clear, uninstall ntpdate, reinstall it, and synchronize the time again.
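After the config push and time synchronization, the remaining warnings can be checked in detail (standard commands, not part of the original output):
[root@yz-node1 ceph]# ceph health detail
[root@yz-node1 ceph]# ceph -s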

Ceph cluster WARN: application not enabled on 1 pool(s)

# Run on the deploy node
[root@yz-node1 data]# ceph -s
  cluster:
    id:     b185cae9-a5b6-45d9-9342-5e737fdf38d9
    health: HEALTH_WARN
            application not enabled on 1 pool(s)
[root@yz-node1 data]# ceph health detail
HEALTH_WARN application not enabled on 1 pool(s)
POOL_APP_NOT_ENABLED application not enabled on 1 pool(s)
    application not enabled on pool 'images'

[root@yz-node1 data]# ceph osd pool application enable {poolname} rbd
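For the pool flagged in the health detail output above, the concrete form of the command would be, for example:
[root@yz-node1 data]# ceph osd pool application enable images rbd
# Verify which application is now enabled on the pool
[root@yz-node1 data]# ceph osd pool application get images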

[root@yz-node1 data]# ceph -s
  cluster:
    id:     b185cae9-a5b6-45d9-9342-5e737fdf38d9
    health: HEALTH_OK