Bug 1357482 - Manual deployment of RHCS container 2.0: Preparation of disks fail with layered installation of ceph containers
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: Documentation
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: 2.2
Assignee: Bara Ancincova
QA Contact: krishnaram Karthick
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-07-18 10:22 UTC by krishnaram Karthick
Modified: 2017-03-21 23:48 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-03-21 23:48:30 UTC



Description krishnaram Karthick 2016-07-18 10:22:29 UTC
Description of problem:
On RHEL7.2 nodes, when docker is installed and RHCS container 2.0 is deployed manually using [1], preparation of disks doesn't create partitions as expected [2].

[1] - https://docs.google.com/document/d/1Ef5a_-Yjozy5Ue3C0M7mMQNn6zWZe0-514bhxKwFHI8/edit?ts=576a3d95 

[2]
[root@dhcp37-58 ~]# docker run -d --net=host --pid=host --privileged=true -v /var/lib/ceph:/var/lib/ceph:z -v /etc/ceph:/etc/ceph:z -v /dev/:/dev/ -e OSD_DEVICE=/dev/vdb -e OSD_FORCE_ZAP=1 -e CEPH_DAEMON=OSD_CEPH_DISK_PREPARE hchen/rhceph2
4acd1bb021152640162a1c888a121776e0ef5da332e347284b510e1d6830a256
[root@dhcp37-58 ~]# 
[root@dhcp37-58 ~]# docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
4acd1bb02115        hchen/rhceph2       "/entrypoint.sh"    8 seconds ago       Up 7 seconds                            admiring_jepsen
[root@dhcp37-58 ~]# 
[root@dhcp37-58 ~]# fdisk -l /dev/vdb 

Disk /dev/vdb: 21.5 GB, 21474836480 bytes, 41943040 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

[root@dhcp37-58 ~]# 
[root@dhcp37-58 ~]# docker logs 4acd1bb02115
static: does not generate config
2016-07-18 04:42:28.529349 7f1938248700  0 -- :/1258136722 >> 10.70.37.55:6789/0 pipe(0x7f193405c930 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f193405fb00).fault
2016-07-18 04:42:31.529045 7f1938147700  0 -- :/1258136722 >> 10.70.37.55:6789/0 pipe(0x7f1928000c80 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f1928001f90).fault
2016-07-18 04:42:34.529503 7f1938248700  0 -- :/1258136722 >> 10.70.37.55:6789/0 pipe(0x7f1928005270 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f1928006530).fault
2016-07-18 04:42:37.529661 7f1938147700  0 -- :/1258136722 >> 10.70.37.55:6789/0 pipe(0x7f1928000c80 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f1928002410).fault

Version-Release number of selected component (if applicable):
 - Docker version 1.10.3, build 7ffc8ee-unsupported
 - ceph-common-0.80.7-3.el7.x86_64
 - RHEL 7.2 (3.10.0-327.el7.x86_64)

How reproducible:
100%

Steps to Reproduce:
1. Create four RHEL 7.2 machines and install Docker.
2. Install the ceph-common packages.
3. Follow this document to install the RHCS container manually:

https://docs.google.com/document/d/1Ef5a_-Yjozy5Ue3C0M7mMQNn6zWZe0-514bhxKwFHI8/edit?ts=576a3d95

4. Prepare the disks using the following command:

docker run -d --net=host --pid=host --privileged=true -v /var/lib/ceph:/var/lib/ceph:z -v /etc/ceph:/etc/ceph:z -v /dev/:/dev/ -e OSD_DEVICE=[device_osd_path] -e OSD_FORCE_ZAP=1 -e CEPH_DAEMON=/dev/vdb hchen/rhceph2

Actual results:

Partitions aren't created

docker logs 8294e79419ae
static: does not generate config
2016-07-18 05:54:11.475744 7f7960549700  0 -- :/1304377178 >> 10.70.37.55:6789/0 pipe(0x7f795c05c930 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f795c05fb00).fault
2016-07-18 05:54:14.475458 7f7960448700  0 -- :/1304377178 >> 10.70.37.55:6789/0 pipe(0x7f7950000c80 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f7950001f90).fault
2016-07-18 05:54:17.475679 7f7960549700  0 -- :/1304377178 >> 10.70.37.55:6789/0 pipe(0x7f7950005270 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f7950006530).fault
2016-07-18 05:54:20.475886 7f7960448700  0 -- :/1304377178 >> 10.70.37.55:6789/0 pipe(0x7f7950000c80 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f7950002410).fault


Expected results:
Partitions for the journal and data should be created.

Additional info:
The above test works with RHEL Atomic host.

Comment 3 Daniel Gryniewicz 2016-07-18 14:24:45 UTC
Is this a typo, or did you really put CEPH_DAEMON=/dev/vdb ?  The doc lists CEPH_DAEMON=OSD_CEPH_DISK_PREPARE

Comment 4 krishnaram Karthick 2016-07-18 14:37:12 UTC
(In reply to Daniel Gryniewicz from comment #3)
> Is this a typo, or did you really put CEPH_DAEMON=/dev/vdb ?  The doc lists
> CEPH_DAEMON=OSD_CEPH_DISK_PREPARE

Sorry, that was a typo. This is the actual command I used, as shown in [2]:

docker run -d --net=host --pid=host --privileged=true -v /var/lib/ceph:/var/lib/ceph:z -v /etc/ceph:/etc/ceph:z -v /dev/:/dev/ -e OSD_DEVICE=/dev/vdb -e OSD_FORCE_ZAP=1 -e CEPH_DAEMON=OSD_CEPH_DISK_PREPARE hchen/rhceph2

Comment 5 Daniel Gryniewicz 2016-07-18 15:05:34 UTC
Okay, I misread.  The reason the prepare step fails is that the OSD container cannot contact the monitor.  What is the ceph.conf that you are using on the OSD host?  Is the monitor address in that conf reachable from the OSD container host?

Comment 6 krishnaram Karthick 2016-07-20 06:05:01 UTC
This is the ceph.conf file. The monitor address is reachable from the OSD container host.

cat /etc/ceph/ceph.conf 
[global]
fsid = 947445de-c58c-4f2f-9cff-7bcf6ab5d19a
mon initial members = dhcp37-55.lab.eng.blr.redhat.com
mon host = 10.70.37.55
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
public network = 10.70.36.0/23
cluster network = 10.70.36.0/23
osd journal size = 100


[root@dhcp37-58 ceph]# ping dhcp37-55.lab.eng.blr.redhat.com
PING dhcp37-55.lab.eng.blr.redhat.com (10.70.37.55) 56(84) bytes of data.
64 bytes from dhcp37-55.lab.eng.blr.redhat.com (10.70.37.55): icmp_seq=1 ttl=64 time=0.393 ms
64 bytes from dhcp37-55.lab.eng.blr.redhat.com (10.70.37.55): icmp_seq=2 ttl=64 time=0.344 ms

As I mentioned earlier, the same steps deployed the RHCS container successfully on a RHEL Atomic host. Is any additional package required for a layered deployment? On the RHEL hosts, I've installed the following Ceph package:
 - ceph-common-0.80.7-3.el7.x86_64

Comment 7 Daniel Gryniewicz 2016-07-20 12:15:41 UTC
Can you check the firewall on the host?  Atomic has a more relaxed firewall by default than RHEL, CentOS, or Fedora.
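
The ping in comment 6 only shows ICMP reachability, so it cannot rule out a firewall blocking the monitor port. A minimal sketch for probing the TCP port directly, assuming bash and the coreutils `timeout` command are available on the OSD host; `check_tcp` is a hypothetical helper name, and the address and port are the ones from the ceph.conf and logs in this bug:

```shell
#!/bin/bash
# check_tcp HOST PORT
# Succeeds only if a TCP connection to HOST:PORT opens within 3 seconds.
# Uses bash's built-in /dev/tcp redirection, so no extra tools are needed.
check_tcp() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Monitor address and port as they appear in this bug report.
if check_tcp 10.70.37.55 6789; then
    echo "monitor port 6789 is reachable"
else
    echo "monitor port 6789 is blocked, closed, or timed out"
fi
```

A failure here while ping succeeds would be consistent with a host firewall dropping traffic to the monitor.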

Comment 8 krishnaram Karthick 2016-07-22 07:43:29 UTC
Yes, zapping the disks worked after disabling the firewall on the MON and OSD hosts. We now need to retry this with the firewall enabled and only the appropriate services changed, and then document it. Do we know which services need to be turned on or off to get this going?

Comment 9 Daniel Gryniewicz 2016-07-22 13:06:02 UTC
This is actually checked in the ansible playbook.

For basic ceph:
6789, 6800 - 7300

For RGW:
Your rgw port (8800 by default)

For NFS:
111 and 2049
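
As a sketch of how the basic Ceph ports above could be opened, assuming firewalld is the active firewall on the MON and OSD hosts and the default zone is the right one: `run` and `open_ports` are hypothetical helper names, and the script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Dry run by default: print each command instead of executing it.
# Set DRY_RUN=0 and run as root to actually change the firewall.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# open_ports PORT/PROTO...
# Adds each port (or port range) permanently, then reloads firewalld
# so the permanent rules take effect.
open_ports() {
    for port in "$@"; do
        run firewall-cmd --permanent --add-port="$port"
    done
    run firewall-cmd --reload
}

# Basic Ceph: monitor port 6789 and the 6800-7300 range from this comment.
open_ports 6789/tcp 6800-7300/tcp
```

The RGW and NFS ports listed above could be passed to `open_ports` in the same way on the hosts that run those services.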

Comment 10 Harish NV Rao 2016-09-29 14:49:44 UTC
@Karthick, Please ack this bug if you plan to verify it in 2.1.

Comment 13 Daniel Gryniewicz 2016-11-30 12:46:24 UTC
The docs need to say that the firewall must allow access to those ports (in comment #9) or that nmap must be installed so that ceph-ansible can check the firewall.

Comment 15 krishnaram Karthick 2017-02-20 02:43:25 UTC
Looks good to me. Moving the bug to verified.

