Bug 1689981 - OSError: [Errno 1] Operation not permitted - failing with socket files?
Summary: OSError: [Errno 1] Operation not permitted - failing with socket files?
Keywords:
Status: NEW
Alias: None
Product: GlusterFS
Classification: Community
Component: geo-replication
Version: 4.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-03-18 14:50 UTC by Davide
Modified: 2019-03-18 14:50 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description Davide 2019-03-18 14:50:40 UTC
Description of problem:


Geo-replication starts failing during "History Crawl" on each of the three bricks, one after the other. I have enabled DEBUG for all the logs configurable through the geo-replication command.

Running glusterfs v4.16, the behaviour is as follows:
- The "History Crawl" worked fine for about one hour; it replicated some files and folders, although most of them look empty.
- At some point the worker becomes Faulty, geo-replication tries to start on another brick, that one becomes Faulty too, and so on.
- In the logs, the Python exception mentioned above is raised:
[2019-03-17 18:52:49.565040] E [syncdutils(worker /var/lib/heketi/mounts/vg_b088aec908c959c75674e01fb8598c21/brick_f90f425ecb89c3eec6ef2ef4a2f0a973/brick):332:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 311, in main
    func(args)
  File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 72, in subcmd_worker
    local.service_loop(remote)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1291, in service_loop
    g3.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 615, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1569, in crawl
    self.changelogs_batch_process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1469, in changelogs_batch_process
    self.process(batch)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1304, in process
    self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1203, in process_change
    failures = self.slave.server.entry_ops(entries)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 216, in __call__
    return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 198, in __call__
    raise res
OSError: [Errno 1] Operation not permitted

- The operation before the exception:
[2019-03-17 18:52:49.545103] D [master(worker /var/lib/heketi/mounts/vg_b088aec908c959c75674e01fb8598c21/brick_f90f425ecb89c3eec6ef2ef4a2f0a973/brick):1186:process_change] _GMaster: entries: [{'uid': 7575, 'gfid': 'e1ad7c98-f32a-4e48-9902-cc75840de7c3', 'gid': 100, 'mode': 49536, 'entry': '.gfid/5219e4b8-a1f3-4a4e-b9c7-c9b129abe671/.control_f7c33270dc9db9234d005406a13deb4375459715.6lvofzOuVnfAwOwY', 'op': 'MKNOD'}, {'gfid': 'e1ad7c98-f32a-4e48-9902-cc75840de7c3', 'entry': '.gfid/5219e4b8-a1f3-4a4e-b9c7-c9b129abe671/.control_f7c33270dc9db9234d005406a13deb4375459715', 'stat': {'atime': 1552661403.3846507, 'gid': 100, 'mtime': 1552661403.3846507, 'uid': 7575, 'mode': 49536}, 'link': None, 'op': 'LINK'}, {'gfid': 'e1ad7c98-f32a-4e48-9902-cc75840de7c3', 'entry': '.gfid/5219e4b8-a1f3-4a4e-b9c7-c9b129abe671/.control_f7c33270dc9db9234d005406a13deb4375459715.6lvofzOuVnfAwOwY', 'op': 'UNLINK'}]
[2019-03-17 18:52:49.548614] D [repce(worker /var/lib/heketi/mounts/vg_b088aec908c959c75674e01fb8598c21/brick_f90f425ecb89c3eec6ef2ef4a2f0a973/brick):179:push] RepceClient: call 56917:140179359156032:1552848769.55 entry_ops([{'uid': 7575, 'gfid': 'e1ad7c98-f32a-4e48-9902-cc75840de7c3', 'gid': 100, 'mode': 49536, 'entry': '.gfid/5219e4b8-a1f3-4a4e-b9c7-c9b129abe671/.control_f7c33270dc9db9234d005406a13deb4375459715.6lvofzOuVnfAwOwY', 'op': 'MKNOD'}, {'gfid': 'e1ad7c98-f32a-4e48-9902-cc75840de7c3', 'entry': '.gfid/5219e4b8-a1f3-4a4e-b9c7-c9b129abe671/.control_f7c33270dc9db9234d005406a13deb4375459715', 'stat': {'atime': 1552661403.3846507, 'gid': 100, 'mtime': 1552661403.3846507, 'uid': 7575, 'mode': 49536}, 'link': None, 'op': 'LINK'}, {'gfid': 'e1ad7c98-f32a-4e48-9902-cc75840de7c3', 'entry': '.gfid/5219e4b8-a1f3-4a4e-b9c7-c9b129abe671/.control_f7c33270dc9db9234d005406a13deb4375459715.6lvofzOuVnfAwOwY', 'op': 'UNLINK'}],) ...

- The gfid in the entries above points to these control files, which are Unix sockets, as shown below:
srw-------  2 pippo users     0 Mar 14 16:32 .control_31c3a99664c1f956f949311e58434037e6a52d22
srw-------  2 pippo users     0 Mar 14 16:33 .control_a9b82937042529bca677b9f43eba9eb02ca7c5ee
srw-------  2 pippo users     0 Mar 14 16:32 .control_f429221460d52570066d9f25521011fe7e081cf5
srw-------  2 pippo users     0 Mar 15 15:50 .control_f7c33270dc9db9234d005406a13deb4375459715
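
The 'mode': 49536 value in the logged entries can be cross-checked against this listing. A minimal sketch using Python's standard stat module (the variable names are only illustrative) decodes it to a socket with 0600 permissions:

```python
import stat

# Mode value copied from the MKNOD and LINK entries in the debug log.
mode = 49536

file_type = stat.S_IFMT(mode)     # file-type bits only
permissions = stat.S_IMODE(mode)  # permission bits only

print(oct(mode))                      # 0o140600
print(file_type == stat.S_IFSOCK)     # True
print(oct(permissions))               # 0o600
print(stat.filemode(mode))            # srw-------
```

This matches the "srw-------" shown by ls, so the entries being replayed do describe Unix socket files.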

So it seems geo-replication should at least skip such files rather than raise an exception?
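
To illustrate the kind of skipping I mean, here is a rough sketch that filters socket entries out of a batch shaped like the one in the debug log. This is not gsyncd code: socket_gfids and drop_socket_entries are hypothetical helpers, and I am assuming the mode is available either as a top-level 'mode' key (MKNOD) or under 'stat' (LINK), as in the log above.

```python
import stat

def socket_gfids(entries):
    """Collect gfids of entries whose recorded mode is a Unix socket."""
    gfids = set()
    for entry in entries:
        mode = entry.get('mode')
        if mode is None:
            # LINK entries in the log carry the mode under 'stat'.
            mode = (entry.get('stat') or {}).get('mode')
        if mode is not None and stat.S_ISSOCK(mode):
            gfids.add(entry['gfid'])
    return gfids

def drop_socket_entries(entries):
    """Return entries without any operation on a socket gfid.

    All ops for the gfid are dropped (including UNLINK, which has no
    mode of its own) so a partial sequence is never replayed.
    """
    skip = socket_gfids(entries)
    return [entry for entry in entries if entry['gfid'] not in skip]

# Entries copied from the debug log.
GFID = 'e1ad7c98-f32a-4e48-9902-cc75840de7c3'
PARENT = '.gfid/5219e4b8-a1f3-4a4e-b9c7-c9b129abe671/'
NAME = '.control_f7c33270dc9db9234d005406a13deb4375459715'
logged = [
    {'uid': 7575, 'gfid': GFID, 'gid': 100, 'mode': 49536,
     'entry': PARENT + NAME + '.6lvofzOuVnfAwOwY', 'op': 'MKNOD'},
    {'gfid': GFID, 'entry': PARENT + NAME,
     'stat': {'atime': 1552661403.3846507, 'gid': 100,
              'mtime': 1552661403.3846507, 'uid': 7575, 'mode': 49536},
     'link': None, 'op': 'LINK'},
    {'gfid': GFID, 'entry': PARENT + NAME + '.6lvofzOuVnfAwOwY',
     'op': 'UNLINK'},
]

print(drop_socket_entries(logged))  # []
```

Whether dropping the entries, logging a warning, or handling the error on the slave side is the right fix is for the maintainers to decide; the sketch only shows that the batch carries enough information to detect sockets before entry_ops is called.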


Steps to Reproduce:
1. Replicate a directory containing Unix socket files
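
A socket file for testing can be created with a few lines of Python. This is only a sketch: make_unix_socket and the ".control_test" name are made up for illustration, the temporary directory stands in for a directory on the mounted master volume, and I have not verified that a socket created this way triggers the same failure.

```python
import os
import socket
import stat
import tempfile

def make_unix_socket(directory, name=".control_test"):
    """Create a Unix socket file in `directory` and return its path."""
    path = os.path.join(directory, name)
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)  # bind() creates the socket file on disk
    finally:
        sock.close()     # the file stays behind after close()
    return path

if __name__ == "__main__":
    # Placeholder: replace with a directory on the master volume mount.
    target = tempfile.mkdtemp()
    created = make_unix_socket(target)
    print(created, stat.filemode(os.lstat(created).st_mode))
```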

Actual results:
OSError: [Errno 1] Operation not permitted is raised and the worker becomes Faulty

Expected results:
Socket files are skipped and replication continues

Additional info:

