
Bug 1510470

Summary: Containerized OSDs don't start - fail to find the Journal device
Product: Red Hat Ceph Storage
Reporter: Daniel Messer <dmesser>
Component: Ceph-Ansible
Assignee: Guillaume Abrioux <gabrioux>
Status: CLOSED ERRATA
QA Contact: ceph-qe-bugs <ceph-qe-bugs>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.0
CC: adeza, aschoen, ceph-eng-bugs, dmesser, gabrioux, gmeno, hnallurv, nthomas, sankarshan, shan, tserlin, vakulkar, vashastr
Target Milestone: rc
Target Release: 3.0
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: RHEL: ceph-ansible-3.0.10-2.el7cp; Ubuntu: ceph-ansible_3.0.10-2redhat1
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-12-05 23:49:35 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Attachments:
  all.yml (flags: none)
  osds.yml (flags: none)

Description Daniel Messer 2017-11-07 13:57:02 UTC
Created attachment 1348966 [details]
all.yml

Description of problem:

With ceph-ansible, a containerized setup does not result in a functional cluster. In a non-collocated filestore scenario the installation playbook runs through successfully, but the OSD containers are constantly restarting.

Version-Release number of selected component (if applicable):

ceph-ansible-3.0.9-1.el7cp
ceph-3.0-rhel-7-docker-candidate-61072-20171104225422


How reproducible:

Steps to Reproduce:
1. Set up ceph-ansible according to https://doc-stage.usersys.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html-single/container_guide/#additional_resources_7
2. Choose non-collocated as the osd_scenario and populate devices and dedicated_devices with the OSD and journal devices (a sketch of such a configuration follows below)
3. Run the installation playbook site-docker.yml
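
A minimal sketch of what the relevant part of group_vars/osds.yml might look like for this scenario, inferred from the blkid output further down; the exact device names depend on the environment and the attached osds.yml is authoritative:

osd_scenario: non-collocated
osd_objectstore: filestore
devices:
  - /dev/xvdb
  - /dev/xvdc
  - /dev/xvdd
  - /dev/xvde
  - /dev/xvdf
  - /dev/xvdg
  - /dev/xvdh
  - /dev/xvdi
dedicated_devices:
  - /dev/xvdj
  - /dev/xvdj
  - /dev/xvdj
  - /dev/xvdj
  - /dev/xvdj
  - /dev/xvdj
  - /dev/xvdj
  - /dev/xvdj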

Actual results:

The cluster deployed, but the OSD containers keep restarting. Looking at the log output of those containers, they seem to be waiting on a device with a PARTUUID that does not exist:

Waiting for /dev/disk/by-partuuid/f98ac6cf-bcf8-4276-995a-7cdb7e0ae5d0 to show up

Looking at the blkid output on this host, the correct partitioning is present, but the PARTUUIDs are different:

blkid
/dev/xvda1: PARTUUID="3c387322-88aa-42ba-8c46-ddb0e76f1054"
/dev/xvda2: UUID="de4def96-ff72-4eb9-ad5e-0847257d1866" TYPE="xfs" PARTUUID="a34cf35b-104d-49b0-ae11-f664a286af07"
/dev/xvdg1: UUID="0c7b0218-be01-452f-bc53-e4c0a1599f6c" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="8c871e7e-2b97-4829-9e67-a27fb0e3c208"
/dev/xvdf1: UUID="5653689a-b654-4308-b3e7-d2400bad1054" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="6fa2c6a0-4d04-4d13-bdfe-8c9f3ade661e"
/dev/xvdi1: UUID="68f241e9-5054-4b63-af01-37e491c81eff" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="d650c775-c2be-421d-b69a-37cb69cfdbe2"
/dev/xvdd1: UUID="8e2e99fa-5678-4b86-ac76-e1fb2afb8124" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="23641025-a6f9-4515-aa83-47e7cc387d7b"
/dev/xvdc1: UUID="874dd414-73cd-484f-ab55-fd63a3b1425e" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="ef474883-14df-420e-b09d-20d3760de8a9"
/dev/xvdb1: UUID="7ae5ba65-3805-4b0b-ba46-b0402c38bd29" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="1c1947d6-0eb8-4f8e-9561-bee41df7358b"
/dev/xvdh1: UUID="7840d2c1-c6a5-4c1c-a6c6-b84246304786" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="9c8ea86d-6c1f-48e3-84f0-c80d3de7e577"
/dev/xvde1: UUID="a86090ca-c60a-405a-a851-b379aa5093ce" TYPE="xfs" PARTLABEL="ceph data" PARTUUID="ba3a233b-e6bb-442e-b0d7-79f2d7450dbc"
/dev/xvdj1: PARTLABEL="ceph journal" PARTUUID="029b1cc0-f5e0-47a1-bc5e-1c94b8661a8f"
/dev/xvdj2: PARTLABEL="ceph journal" PARTUUID="40b70e9b-b774-46c5-89fd-d800c50cef94"
/dev/xvdj3: PARTLABEL="ceph journal" PARTUUID="9f701674-6c62-461d-85fa-93e94b68b094"
/dev/xvdj4: PARTLABEL="ceph journal" PARTUUID="17ff2492-4475-4fe8-b529-55abdde206ec"
/dev/xvdj5: PARTLABEL="ceph journal" PARTUUID="351839f8-e70f-4cac-a752-a071b2f36db2"
/dev/xvdj6: PARTLABEL="ceph journal" PARTUUID="7a8d8a3a-7507-46e5-ad58-c9c6d8fb1fe2"
/dev/xvdj7: PARTLABEL="ceph journal" PARTUUID="90d14afd-164b-4861-86f6-b7208b84f3e9"
/dev/xvdj8: PARTLABEL="ceph journal" PARTUUID="9fcddc08-2cce-4ffb-8c63-d5d1a2427fab"

The PARTUUID the container is referring to comes from its env variable OSD_JOURNAL. That ID does not exist on any node/device in the entire cluster.
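
One way to confirm the mismatch from the OSD host (a sketch; the container name is a placeholder and depends on host and device):

# show the journal device the OSD container is configured to wait for
docker inspect --format '{{ .Config.Env }}' ceph-osd-<host>-xvdb | tr ' ' '\n' | grep OSD_JOURNAL
# list the PARTUUIDs that actually exist on the host
ls -l /dev/disk/by-partuuid/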

Expected results:

The installation finishes and the OSD containers are up and running. OSD_JOURNAL points either to an existing block device or to a symbolic link below /dev/disk/by-partuuid/ that resolves to an existing device.

Additional info:

- current RHEL 7.4
- current nightly build of RHCS 3.0 beta
- ceph-ansible from nightly builds
- group_vars/all.yml attached
- group_vars/osds.yml attached

Comment 3 Daniel Messer 2017-11-07 13:59:31 UTC
Created attachment 1348967 [details]
osds.yml

Comment 4 Daniel Messer 2017-11-07 14:00:41 UTC
I should add that this environment had several prior install attempts and was cleaned with purge-site-docker.yml. The PARTUUID that the OSDs are looking for, however, stays the same. The fetch directory is cleaned between runs.
The problem persists even when changing to collocated setups.
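
For reference, the redeploy cycle described above was roughly the following (a sketch; the inventory name and fetch path are placeholders):

ansible-playbook -i hosts purge-site-docker.yml    # purge the previous attempt
rm -rf fetch/*                                     # clear the fetch directory between runs
ansible-playbook -i hosts site-docker.yml          # redeploy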

Comment 5 leseb 2017-11-08 04:43:41 UTC
This is not something we see in our CI.

Can we access this env?
Thanks!

Comment 6 Guillaume Abrioux 2017-11-08 09:59:33 UTC
Could you provide the full playbook run log?

Thanks

Comment 7 Guillaume Abrioux 2017-11-08 12:12:14 UTC
I tried to reproduce your issue with ceph-ansible v3.0.9 and the ceph-3.0-rhel-7-docker-candidate-61072-20171104225422 container image; the deployment worked fine.

OSDs are UP:

[root@osd0 ~]# docker ps -a
CONTAINER ID        IMAGE                                                                                                               COMMAND             CREATED             STATUS                      PORTS               NAMES
ff6266f26745        brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhceph:ceph-3.0-rhel-7-docker-candidate-61072-20171104225422   "/entrypoint.sh"    28 minutes ago      Up 28 minutes                                   ceph-osd-osd0-sdb
cea920b57eca        brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhceph:ceph-3.0-rhel-7-docker-candidate-61072-20171104225422   "/entrypoint.sh"    28 minutes ago      Up 28 minutes                                   ceph-osd-osd0-sda
299226e51fd4        brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhceph:ceph-3.0-rhel-7-docker-candidate-61072-20171104225422   "/entrypoint.sh"    28 minutes ago      Exited (0) 28 minutes ago                       ceph-osd-prepare-osd0-sdb
0a103838a516        brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhceph:ceph-3.0-rhel-7-docker-candidate-61072-20171104225422   "/entrypoint.sh"    28 minutes ago      Exited (0) 28 minutes ago                       ceph-osd-prepare-osd0-sda
[root@osd0 ~]#


[root@mon0 ~]# docker exec -ti ceph-mon-mon0 ceph -s
  cluster:
    id:     915ba53a-1288-4062-aa6d-45b5db0019b2
    health: HEALTH_WARN
            too few PGs per OSD (8 < min 30)

  services:
    mon: 3 daemons, quorum mon0,mon1,mon2
    mgr: mon0(active)
    mds: cephfs-1/1/1 up  {0=mds0=up:active}
    osd: 2 osds: 2 up, 2 in

  data:
    pools:   2 pools, 16 pgs
    objects: 21 objects, 2246 bytes
    usage:   214 MB used, 102133 MB / 102347 MB avail
    pgs:     16 active+clean

[root@mon0 ~]#


I think your multiple deployment attempts have probably broken something.
I couldn't reproduce your issue, and neither CI nor QE caught it. Could you retry deploying from scratch? As Sebastien asked, is there any chance we can access your env?

Thanks!

Comment 8 Daniel Messer 2017-11-08 12:38:18 UTC
@Guillaume - this might well be the case. I will send you and leseb the credentials for the environment; it's AWS-based. I could retry deploying from scratch too, but honestly I don't see what could cause this. I suggest we work in parallel: I will try to re-deploy from scratch, and you can try to re-deploy in my environment and see where it's choking up. This behavior will likely affect others that run into https://bugzilla.redhat.com/show_bug.cgi?id=1510555 - which is the reason I had to re-deploy so many times.

Comment 9 Guillaume Abrioux 2017-11-09 10:13:52 UTC
Hi Daniel,

The issue here is in the purge-docker-cluster.yml playbook.
You tried several times to deploy your cluster. On the first attempt, the OSD disk prepare process produced logs that ceph-ansible later uses to retrieve the journal partition UUID. These logs are supposed to be generated [1] only at initial deployment because they come from the prepare containers' logs; if we lose these containers for any reason (a reboot or anything else), we can't generate them again.

upstream PR: https://github.com/ceph/ceph-ansible/pull/2152

[1] https://github.com/ceph/ceph-ansible/blob/master/roles/ceph-osd/templates/ceph-osd-run.sh.j2#L17-L35
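
For context, the script linked at [1] recovers the journal partition UUID from the prepare container's logs, roughly along these lines (a simplified, hypothetical sketch; the container name and the parsing are assumptions, the template holds the real logic):

# the prepare container printed the journal PARTUUID when it partitioned the disks
journal_uuid=$(docker logs ceph-osd-prepare-<host>-<device> 2>&1 | grep -Eo '[0-9a-f-]{36}' | tail -1)
OSD_JOURNAL=/dev/disk/by-partuuid/${journal_uuid}
# if the prepare container is lost (purge, reboot, ...), the UUID can no longer be
# recovered this way and the OSD container waits for a device that never appears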

Comment 15 Vasishta 2017-11-15 17:59:01 UTC
Hi Daniel, 

Tried setting up OSDs with dedicated journals (both dmcrypt and non-dmcrypt) using the latest builds:

Container image - ceph-3.0-rhel-7-docker-candidate-36461-20171114235412
Ceph-ansible - ceph-ansible-3.0.11-1.el7cp.noarch

(Because of a test environment constraint, only two journals (for 2 OSDs) were on a dedicated disk.)

Initialization and purging were tried three times back to back on the same set of nodes, with a node reboot after initializing the cluster each time. Each time (both at initialization and after the reboot) the OSDs came up and were running as expected, so it looks good to me.

Can you please let me know your views on the steps followed as part of the bug fix verification?

Regards,
Vasishta

Comment 16 Vasishta 2017-11-17 13:53:19 UTC
Hi,

I'm moving the BZ to VERIFIED per the suggestions I received based on Comment 15.

Please feel free to let me know if there are any concerns.

Regards,
Vasishta

Comment 19 errata-xmlrpc 2017-12-05 23:49:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3387