Bug 1356005

Summary: [Ubuntu] calamari-lite is not running on any monitor
Product: Red Hat Ceph Storage
Reporter: Daniel Horák <dahorak>
Component: Calamari
Assignee: Gregory Meno <gmeno>
Calamari sub component: Back-end
QA Contact: Daniel Horák <dahorak>
Status: CLOSED ERRATA
Docs Contact:
Severity: unspecified
Priority: unspecified
CC: ceph-eng-bugs, hnallurv, kdreyer, mkudlej, nlevine, nthomas, vsarmila
Version: 2.0
Keywords: TestBlocker
Target Milestone: rc
Target Release: 2.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: RHEL: calamari-server-1.4.6-1.el7cp Ubuntu: calamari-server_1.4.6-2redhat1
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-08-23 19:44:15 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On:
Bug Blocks: 1344195

Description Daniel Horák 2016-07-13 09:04:38 UTC
Description of problem:
  On a cluster created from Ubuntu nodes, calamari-lite is not properly started on any monitor.
  
  It might be a problem with supervisor.service, because it is also not running and is disabled.

Version-Release number of selected component (if applicable):
  USM server (RHEL 7.2):
  ceph-ansible-1.0.5-25.el7scon.noarch
  ceph-installer-1.0.12-4.el7scon.noarch
  rhscon-ceph-0.0.32-1.el7scon.x86_64
  rhscon-core-0.0.33-1.el7scon.x86_64
  rhscon-core-selinux-0.0.33-1.el7scon.noarch
  rhscon-ui-0.0.47-1.el7scon.noarch

  Ceph MON (Ubuntu 16.04):
  calamari-server 1.4.5-2redhat1xenial
  ceph-base       10.2.2-16redhat1xenial
  ceph-common     10.2.2-16redhat1xenial
  ceph-mon        10.2.2-16redhat1xenial
  libcephfs1      10.2.2-16redhat1xenial
  python-cephfs   10.2.2-16redhat1xenial
  rhscon-agent    0.0.14-2redhat1xenial
  
How reproducible:
  100%

Steps to Reproduce:
1. Prepare a set of nodes (one RHEL 7.2 and at least 5 Ubuntu 16.04).
2. Install and configure the USM server on the RHEL node and configure rhscon-agent on the Ubuntu nodes.
3. Create a Ceph cluster via the USM web UI.
4. Check whether calamari-lite is running on any Ceph MON node.
  # supervisorctl status calamari-lite
  # systemctl status supervisor.service 
  
Actual results:
  calamari-lite (and also supervisor.service) is not running, and supervisor.service is not enabled to start after a machine reboot.
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  # supervisorctl status calamari-lite
    unix:///var/run/supervisor.sock no such file

  # systemctl status supervisor.service 
    ● supervisor.service - Supervisor process control system for UNIX
       Loaded: loaded (/lib/systemd/system/supervisor.service; disabled; vendor preset: enabled)
       Active: inactive (dead)
         Docs: http://supervisord.org
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
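  The two facts that matter in the output above are the enablement state on the "Loaded:" line and the state on the "Active:" line. A minimal Python sketch for pulling them out of captured `systemctl status` text, assuming the output layout shown above; the helper name `parse_unit_state` is hypothetical and not part of any Ceph or Calamari tooling:

```python
import re

def parse_unit_state(status_text):
    """Extract enablement and active state from `systemctl status` text.

    Hypothetical helper: it assumes the "Loaded: loaded (path; state; ...)"
    and "Active: state (...)" layout seen in the output above.
    """
    # Second field inside the parentheses on the Loaded: line, e.g. "disabled".
    loaded = re.search(r"Loaded:\s+\S+\s+\([^;]*;\s*([\w-]+)", status_text)
    # First word after Active:, e.g. "inactive".
    active = re.search(r"Active:\s+(\S+)", status_text)
    return {
        "enabled": loaded.group(1) if loaded else None,
        "active": active.group(1) if active else None,
    }
```

  On a live node, `systemctl is-enabled supervisor.service` and `systemctl is-active supervisor.service` report the same two values directly.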

Expected results:
  calamari-lite is properly configured and running on one Ceph MON, as required, and it is also configured to start automatically after a machine reboot.

Additional info:
  I'm not 100% sure who is responsible for configuring and starting calamari and related services, so if the problem is, for example, in ceph-installer or ceph-ansible, please reassign this bug to the proper component.

  It might be related to Bug 1305259.

Comment 1 Daniel Horák 2016-07-13 09:15:26 UTC
Just a note: I also noticed that the supervisor service is named differently on RHEL and on Ubuntu.
On RHEL it is supervisord.service, but on Ubuntu it is just supervisor.service.

Comment 2 Nishanth Thomas 2016-07-13 13:31:15 UTC
Yes Daniel, that is the root cause of this issue.

https://github.com/ceph/calamari/blob/master/cthulhu/cthulhu/calamari_ctl.py#L260 tries to start supervisord (as specified in /opt/calamari/salt-local/services.sls), whereas on Ubuntu it should be supervisor.
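
One way to handle the naming difference is to choose the service name per distribution. A minimal Python sketch, assuming the distribution is identified by the `ID` field of /etc/os-release; the function name and the fallback to supervisord for every non-Ubuntu system are hypothetical illustrations, not the fix that was actually applied to calamari_ctl.py or services.sls:

```python
def supervisor_service_name(os_release_text):
    """Return the supervisor service name for the distribution described
    by the given os-release contents (hypothetical helper)."""
    distro_id = ""
    for line in os_release_text.splitlines():
        key, sep, value = line.partition("=")
        # Match the ID field only, not ID_LIKE or VERSION_ID.
        if sep and key.strip() == "ID":
            distro_id = value.strip().strip("\"'").lower()
            break
    # Per this bug: Ubuntu names the service "supervisor", RHEL "supervisord".
    if distro_id == "ubuntu":
        return "supervisor"
    # Assumption: every other distribution uses the RHEL name.
    return "supervisord"
```

Usage would look like `supervisor_service_name(open("/etc/os-release").read())`, with the result passed to whatever code starts and enables the service.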

Comment 8 Daniel Horák 2016-07-28 07:44:00 UTC
Tested on:
USM Server (RHEL 7.2):
  ceph-ansible-1.0.5-31.el7scon.noarch
  ceph-installer-1.0.14-1.el7scon.noarch
  rhscon-ceph-0.0.36-1.el7scon.x86_64
  rhscon-core-0.0.36-1.el7scon.x86_64
  rhscon-core-selinux-0.0.36-1.el7scon.noarch
  rhscon-ui-0.0.50-1.el7scon.noarch

Ceph MON (Ubuntu 16.04):
  ii  calamari-server 1.4.7-2redhat1xenial    amd64  Inktank package containing the Calamari management server
  ii  ceph-base       10.2.2-23redhat1xenial  amd64  common ceph daemon libraries and management tools
  ii  ceph-common     10.2.2-23redhat1xenial  amd64  common utilities to mount and interact with a ceph storage cluster
  ii  ceph-mon        10.2.2-23redhat1xenial  amd64  monitor server for the ceph storage system
  ii  libcephfs1      10.2.2-23redhat1xenial  amd64  Ceph distributed file system client library
  ii  python-cephfs   10.2.2-23redhat1xenial  amd64  Python libraries for the Ceph libcephfs library
  ii  rhscon-agent    0.0.16-2redhat1xenial   all    SKYNET is the event agent for SKYRING. Each storage node managed

The supervisor and calamari-lite services are properly running on one Ceph MON.

>> VERIFIED

Comment 10 errata-xmlrpc 2016-08-23 19:44:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1755.html