Bug 1561456 - [ceph-ansible] [ceph-container] : shrink OSD with NVMe disks - failing as OSD services are not stopped
Summary: [ceph-ansible] [ceph-container] : shrink OSD with NVMe disks - failing as OSD services are not stopped
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: Ceph-Ansible
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: z3
Target Release: 3.0
Assignee: leseb
QA Contact: Vasishta
Docs Contact: Erin Donnelly
URL:
Whiteboard:
Duplicates: 1555793 (view as bug list)
Depends On:
Blocks: 1553254 1557269 1572368 1600697
 
Reported: 2018-03-28 11:36 UTC by Vasishta
Modified: 2018-07-12 19:43 UTC
CC List: 13 users

Fixed In Version: RHEL: ceph-ansible-3.0.32-1.el7cp Ubuntu: ceph-ansible_3.0.32-2redhat1
Doc Type: Bug Fix
Doc Text:
.The `shrink-osd` playbook supports NVMe drives
Previously, the `shrink-osd` Ansible playbook did not support shrinking OSDs backed by an NVMe drive. NVMe drive support has been added in this release.
Clone Of:
Environment:
Last Closed: 2018-05-15 18:20:31 UTC
Target Upstream Version:


Attachments
ansible-playbook log (deleted)
2018-03-28 11:36 UTC, Vasishta


Links
GitHub ceph/ceph-ansible pull 2537 - last updated 2018-04-20 09:15:52 UTC
Red Hat Product Errata RHBA-2018:1563 - last updated 2018-05-15 18:21:25 UTC

Description Vasishta 2018-03-28 11:36:46 UTC
Created attachment 1414140: ansible-playbook log

Description of problem:
Shrinking an OSD backed by an NVMe disk fails in the task [deallocate osd(s) id when ceph-disk destroy fail] with "Error EBUSY: osd.<id> is still up; must be down before removal."

Version-Release number of selected component (if applicable):
ceph-ansible-3.0.28-1.el7cp.noarch

How reproducible:
Always (3/3)

Steps to Reproduce:
1. Configure containerized cluster with NVMe disks for OSDs
2. Try to shrink an OSD.


Actual results:
TASK [deallocate osd(s) id when ceph-disk destroy fail]
--------------------------------
"stderr_lines": [
        "Error EBUSY: osd.7 is still up; must be down before removal. "
    ], 

Expected results:
The OSD should be removed successfully.

Additional info:

The task [stop osd services (container)] completed with status 'ok'.

Comment 6 leseb 2018-04-12 12:14:43 UTC
Can you check if the container is still running?
Perhaps we tried to stop the wrong service.

Thanks.

Comment 7 Vasishta 2018-04-12 13:12:32 UTC
Hi Sebastien,

As I remember, the container was still running. (Unfortunately I don't have the environment as of now.)

I think we must have tried to stop the wrong service, as I see "name": "ceph-osd@nvme0n1p" in the log (attachment). Following the naming convention, the service name should have been "ceph-osd@nvme0n1".

The logic we have in shrink-osd.yml [1] to find out the service name does not seem to work for NVMe disks.


- name: stop osd services (container)
  service:
    name: "ceph-osd@{{ item.0.stdout[:-1] | regex_replace('/dev/', '') }}"

I think it would work if we used "item.0.stdout[:-2]", but only for NVMe disks.

[1] https://github.com/ceph/ceph-ansible/blob/37117071ebb7ab3cf68b607b6760077a2b46a00d/infrastructure-playbooks/shrink-osd.yml#L119-L121
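
For illustration only, a device-agnostic variant (a sketch under my assumptions, not necessarily the change that was merged upstream via the pull request linked above) could strip the trailing partition number together with the 'p' separator that NVMe partition names carry, instead of always dropping one character:

- name: stop osd services (container)
  service:
    name: "ceph-osd@{{ item.0.stdout | regex_replace('/dev/', '') | regex_replace('p?[0-9]+$', '') }}"
    state: stopped
  # "/dev/sdb1" and "/dev/nvme0n1p1" both reduce to the whole-device name (sdb, nvme0n1)

That would keep a single expression for both SCSI-style and NVMe device names.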


Regards,
Vasishta Shastry
AQE, Ceph

Comment 9 leseb 2018-04-20 09:44:40 UTC
*** Bug 1555793 has been marked as a duplicate of this bug. ***

Comment 10 leseb 2018-04-23 21:02:22 UTC
Will be in the next release v3.0.32

Comment 14 Vasishta 2018-05-09 07:15:44 UTC
working fine with ceph-ansible-3.0.32-1.el7cp

Comment 17 errata-xmlrpc 2018-05-15 18:20:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1563

