Bug 1695850 - ceph-ansible containerized Ceph MDS is limited to 1 CPU core by default - not enough
Summary: ceph-ansible containerized Ceph MDS is limited to 1 CPU core by default - not enough
Keywords:
Status: NEW
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: Ceph-Ansible
Version: 3.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 4.0
Assignee: Guillaume Abrioux
QA Contact: ceph-qe-bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-04-03 19:30 UTC by Ben England
Modified: 2019-04-04 18:48 UTC
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:



Description Ben England 2019-04-03 19:30:41 UTC
Description of problem:

CephFS metadata-intensive workloads, such as those involving large directories and small files, will perform far slower with containerized Ceph than with non-containerized Ceph. Ideally there should be no significant difference between containerized and non-containerized Ceph for almost all workloads.

Version-Release number of selected component (if applicable):

RHCS 3.2 (and RHCS 4)

How reproducible:

Should be every time. I haven't compared the two directly, but I have measured ceph-mds CPU consumption many times for such workloads, and it is typically well above 1 core, usually more like 3-4 cores.

Steps to Reproduce:
1. Install containerized Ceph
2. Run a metadata-intensive workload where the Ceph MDS is the bottleneck
3. Observe CPU consumption of the ceph-mds process (see the command sketch below)
4. Compare to CPU consumption of ceph-mds in non-containerized Ceph
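
For step 3, something like the following works on the MDS host (pidstat comes from the sysstat package; the pgrep pattern assumes the process name is ceph-mds):

  # sample per-process CPU usage of the MDS every 2 seconds
  pidstat -u -p $(pgrep -d, ceph-mds) 2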

Actual results:

It is expected that a containerized-Ceph CephFS user will get 1/2-1/3 of the metadata performance of bare-metal Ceph.

Expected results:

The user should get something resembling bare-metal Ceph performance with containerized Ceph, without tuning, in most cases.

Additional info:

See https://github.com/ceph/ceph-ansible/blob/master/roles/ceph-mds/defaults/main.yml#L23

This shows that the default CPU cgroup limit for the MDS container is 1 core.
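
For reference, the relevant default in roles/ceph-mds/defaults/main.yml looks like this at the time of writing (excerpted from the linked file; treat the exact line as a snapshot, since the file may change):

  # roles/ceph-mds/defaults/main.yml (excerpt)
  ceph_mds_docker_cpu_limit: 1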

I think this should default to 4 CPU cores, since I have seen ceph-mds CPU utilization get that high for metadata-intensive workloads (such as smallfile), and there is typically only one such process per host. But anything > 2 would be satisfactory.

Hyperconverged environments could elect to lower it to free up cores for other processes, but I doubt they will, since this is an *upper bound* on CPU consumption, and if the MDS is not being used then the CPU resources are available for others to use.
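
Until the default changes, deployments can raise the limit themselves; a minimal sketch, assuming the usual ceph-ansible group_vars layout (the file path is illustrative):

  # group_vars/mdss.yml (illustrative path)
  ceph_mds_docker_cpu_limit: 4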

Workloads to reproduce this problem are available at:

https://github.com/distributed-system-analysis/smallfile
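
For example, a metadata-intensive create workload might look like this (option names follow the smallfile README; the thread count, file count, file size, and CephFS mountpoint are assumptions):

  # create many small files through a CephFS mount to stress the MDS
  python smallfile_cli.py --operation create --threads 8 \
      --files 100000 --file-size 4 --top /mnt/cephfs/smallfile_test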

Comment 1 Ben England 2019-04-04 18:48:47 UTC
I should have assigned this to Ceph-Ansible; sorry, Patrick.

