Bug 1515275 - jewel: rgw: folders starting with "_" underscore are not in bucket index
Summary: jewel: rgw: folders starting with "_" underscore are not in bucket index
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: RGW
Version: 2.3
Hardware: x86_64
OS: Linux
Target Milestone: rc
Target Release: 2.5
Assignee: Marcus Watts
QA Contact: vidushi
Docs Contact: Aron Gunn
Depends On: 1531279
Blocks: 1536401
Reported: 2017-11-20 13:22 UTC by Edu Alcaniz
Modified: 2018-02-21 19:46 UTC
Fixed In Version: RHEL: ceph-10.2.10-9.el7cp Ubuntu: ceph_10.2.10-6redhat1xenial
Doc Type: Bug Fix
Doc Text:
.Folders starting with an underscore (_) are not in the bucket index
Previously, a server-side copy mishandled object names starting with an underscore. This led to objects being created with two leading underscores. The Ceph Object Gateway code has been fixed to properly handle leading underscores. As a result, object names with leading underscores behave correctly.
Clone Of:
: 1531279
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:0340 normal SHIPPED_LIVE Red Hat Ceph Storage 2.5 bug fix and enhancement update 2018-02-22 00:50:32 UTC
Ceph Project Bug Tracker 19432 None None None 2017-11-20 13:26:01 UTC
Description Edu Alcaniz 2017-11-20 13:22:30 UTC
Description of problem:

When you copy to an object whose name starts with an underscore directly under the bucket (s3://bucket/_bar), the S3 copy fails. When the target path contains an additional folder (s3://bucket/foo/_bar), the S3 copy seems to succeed.

1. S3 upload (put/sync) works with an underscore name as the target:
  $ touch _bar60.txt
  $ s3cmd -c essi29-gooditest.cfg put _bar60.txt s3://testix/_bar60.txt
  upload: '_bar60.txt' -> 's3://testix/_bar60.txt'  [1 of 1]
   0 of 0     0% in    0s     0.00 B/s  done
  $ s3cmd -c essi29-gooditest.cfg --recursive ls -l s3://testix/ |grep bar60
  2017-11-16 10:21         0   d41d8cd98f00b204e9800998ecf8427e  STANDARD  s3://testix/_bar60.txt

2. But S3 copy fails if the target is an underscore object directly under the bucket:
  $ s3cmd -c essi29-gooditest.cfg cp s3://testix/orig/bar/tst/bar/bar.txt s3://testix/_bar61.txt
  WARNING: Key not found s3://testix/orig/bar/tst/bar/bar.txt

3. If you add a subfolder to the path, the S3 copy to an underscore name succeeds:
  $ s3cmd -c essi29-gooditest.cfg cp s3://testix/orig/bar/tst/bar/bar.txt s3://testix/case6/_bar61.txt
  remote copy: 's3://testix/orig/bar/tst/bar/bar.txt' -> 's3://testix/case6/_bar61.txt'

The object listing shows that the name of the object copied in step 2 has changed from _bar61.txt to __bar61.txt, while the name of the object copied under a folder is correct:
  $ s3cmd -c essi29-gooditest.cfg ls --recursive -l s3://testix/ | grep bar61
  2017-11-16 10:13         0   d41d8cd98f00b204e9800998ecf8427e  STANDARD  s3://testix/__bar61.txt
  2017-11-16 10:13         0   d41d8cd98f00b204e9800998ecf8427e  STANDARD  s3://testix/case6/_bar61.txt

4. The object can be downloaded and deleted using the "__" name:
  $ s3cmd -c essi29-gooditest.cfg  rm s3://testix/__bar61.txt
  delete: 's3://testix/__bar61.txt'
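
The listing check from step 3 can be scripted. The following is a minimal sketch, assuming the `s3cmd ls --recursive -l` output format shown above, with the s3:// URI as the last column. The function name `find_doubled_underscore_keys` is hypothetical and not part of s3cmd.

```shell
#!/bin/sh
# Hypothetical helper: read `s3cmd ls --recursive -l` output on stdin and
# print every key that sits directly under the bucket and starts with "__".
# Assumes the s3:// URI is the last whitespace-separated field, as in the
# listing above; keys containing spaces are not handled.
find_doubled_underscore_keys() {
  bucket=$1
  awk -v prefix="s3://$bucket/" '
    {
      uri = $NF
      # Only consider URIs inside the given bucket.
      if (index(uri, prefix) != 1) next
      key = substr(uri, length(prefix) + 1)
      # Directly under the bucket (no "/") and two leading underscores.
      if (index(key, "/") == 0 && substr(key, 1, 2) == "__") print uri
    }'
}
```

Possible usage against the bucket from this report:

  $ s3cmd -c essi29-gooditest.cfg ls --recursive -l s3://testix/ | find_doubled_underscore_keys testix

A key that was intentionally uploaded with two leading underscores would also match, so the output is only a list of candidates to compare against the names used in the copy commands.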

So the question is why it is not possible to copy objects like this:
  $ s3cmd cp s3://bucket/foo/bar.txt s3://bucket/_bar.txt
unless you include a subfolder in the copy target:
  $ s3cmd cp s3://bucket/foo/bar.txt s3://bucket/subfolder/_bar.txt

And why, in the first case, does the target name change from "_bar.txt" to "__bar.txt" (two leading underscores)?
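
One possible reading of the doubling, consistent with the Doc Text above, is that a key with a leading underscore is stored under an escaped internal name with one extra "_" in front, and that the copy path then exposed this internal name as the user-visible key. The sketch below only illustrates that idea. The function name `escape_leading_underscore` and the escaping rule are assumptions made for illustration and are not taken from the Ceph source.

```shell
#!/bin/sh
# Hypothetical escaping rule (an assumption, not Ceph code): a key whose
# first character is "_" gets one extra "_" prepended; any other key is
# returned unchanged.
escape_leading_underscore() {
  case $1 in
    _*) printf '_%s\n' "$1" ;;
    *)  printf '%s\n' "$1" ;;
  esac
}

escape_leading_underscore "_bar.txt"            # prints "__bar.txt"
escape_leading_underscore "subfolder/_bar.txt"  # prints "subfolder/_bar.txt"
```

Under this reading, the subfolder case is unaffected because the full key starts with "s" rather than "_", so no escaping applies to it.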

Version-Release number of selected component (if applicable):
Ceph 2.3
package-data:ceph-base-10.2.7-27.el7cp.x86_64	Fri Jul 14 14:55:34 2017	

How reproducible:

Steps to Reproduce:

Actual results:

Expected results:
It seems to be resolved in Ceph v10.2.10.

Additional info:

