Bug 1356068 - observing " Too many levels of symbolic links" after adding bricks and then issuing a replace brick
Summary: observing " Too many levels of symbolic links" after adding bricks and then issuing a replace brick
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: nfs
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Soumya Koduri
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1328451 1347903 1357257
 
Reported: 2016-07-13 11:02 UTC by Soumya Koduri
Modified: 2017-03-27 18:27 UTC (History)
12 users (show)

Fixed In Version: glusterfs-3.9.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1328451
: 1357257 (view as bug list)
Environment:
Last Closed: 2017-03-27 18:27:08 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Comment 1 Soumya Koduri 2016-07-13 11:05:53 UTC
This seems to be a bug in the nameless lookup resolution of the Gluster/NFS server code path (it typically happens after a server restart). The NFS xlator appears to bail out after the LOOKUP of the parent directory FH, without resolving the child file entry, and so replies with the stat of the parent directory.

On receiving attributes for an entry that are identical to those of its parent, the NFS client reported "Too many levels of symbolic links" to the application.
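The error string the application sees comes straight from errno: on Linux/glibc, "Too many levels of symbolic links" is the strerror() text for ELOOP, which the NFS client raises when a lookup reply looks like a symlink cycle. A minimal check of that mapping (illustrative only; eloop_message() is not part of the Gluster code base):

```c
#include <errno.h>
#include <string.h>

/* The message reported to the application is the strerror() text for
 * ELOOP, the errno the NFS client raises when a lookup reply looks like
 * a symlink loop (here: a child whose attributes equal its parent's). */
static const char *eloop_message(void)
{
    return strerror(ELOOP);
}
```

On glibc this returns exactly the string quoted in this bug's title.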

Comment 2 Vijay Bellur 2016-07-13 11:39:23 UTC
REVIEW: http://review.gluster.org/14911 (nfs: Reset cs->resolvedhard while resolving an entry) posted (#1) for review on master by soumya k (skoduri@redhat.com)

Comment 3 Vijay Bellur 2016-07-13 11:42:23 UTC
REVIEW: http://review.gluster.org/14911 (nfs: Reset cs->resolvedhard while resolving an entry) posted (#2) for review on master by soumya k (skoduri@redhat.com)

Comment 4 jiademing.dd 2016-07-14 02:07:14 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=1347903  Is this the same problem?

Comment 5 Soumya Koduri 2016-07-14 12:27:34 UTC
(In reply to jiademing.dd from comment #4)
> https://bugzilla.redhat.com/show_bug.cgi?id=1347903  Is this the same
> problem?

Yes, it does seem similar. You could confirm from the packet trace whether there is any invalid lookup response to the client (parent directory attributes copied as the response to the file lookup) after the server restart.

Comment 6 jiademing.dd 2016-07-15 02:45:59 UTC
(In reply to Soumya Koduri from comment #5)
> (In reply to jiademing.dd from comment #4)
> > https://bugzilla.redhat.com/show_bug.cgi?id=1347903  Is this the same
> > problem?
> 
> Yes. It does seem similar. You could confirm from the pkt trace if there is
> any invalid lookup response to the client (parent directory attributes are
> copied as a response to the file lookup) post restart of the server.

I reported this bug and appended an analysis of the problem in https://bugzilla.redhat.com/show_bug.cgi?id=1347903, but no one replied.

Comment 7 Soumya Koduri 2016-07-16 16:57:13 UTC
(In reply to jiademing.dd from comment #6)
> (In reply to Soumya Koduri from comment #5)
> > (In reply to jiademing.dd from comment #4)
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1347903  Is this the same
> > > problem?
> > 
> > Yes. It does seem similar. You could confirm from the pkt trace if there is
> > any invalid lookup response to the client (parent directory attributes are
> > copied as a response to the file lookup) post restart of the server.
> 
> I reported this bug and appended an analysis of the problem in
> https://bugzilla.redhat.com/show_bug.cgi?id=1347903, but no one replied.

Sometimes bug updates get missed because of the heavy inflow of bugs. Please set needinfo on whoever is looking at it or on the maintainer (see the sources/MAINTAINERS file), or send a mail to the gluster-devel/gluster-users mailing lists for a quicker response. Thanks!

Comment 8 Vijay Bellur 2016-07-17 07:58:54 UTC
COMMIT: http://review.gluster.org/14911 committed in master by Niels de Vos (ndevos@redhat.com) 
------
commit 3c485cb896837c8e362fd0b094325002ce806ac4
Author: Soumya Koduri <skoduri@redhat.com>
Date:   Wed Jul 13 16:24:31 2016 +0530

    nfs: Reset cs->resolvedhard while resolving an entry
    
    If an entry is not found in the inode table, nfs xlator should be
    resolving it by sending an explicit lookup to the brick process.
    But currently its broken in case of NFS3_LOOKUP fop where in the server
    bails out early resulting in sending pargfid attributes to the client.
    To fix the same reset 'cs->resolvedhard' so that an explicit lookup
    is done for the entry in the resume_fn "nfs3_lookup_resume()".
    
    Change-Id: I999f8bca7ad008526c174d13f69886dc809d9552
    Signed-off-by: Soumya Koduri <skoduri@redhat.com>
    BUG: 1356068
    Reviewed-on: http://review.gluster.org/14911
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Niels de Vos <ndevos@redhat.com>
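
The fix described in the commit message can be sketched as follows. This is a simplified, hypothetical model of the resolver state, not the actual GlusterFS code: only the resolvedhard flag and the resume-function behaviour come from the commit; the struct layout, names, and function bodies are illustrative assumptions.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of the NFS3 call state. Only the
 * resolvedhard flag is taken from the commit message; the rest is
 * illustrative, not the real nfs3_call_state_t. */
typedef struct {
    bool in_inode_table; /* was the child entry found in the inode table? */
    bool resolvedhard;   /* has an explicit (hard) lookup been done? */
    int  lookups_sent;   /* explicit LOOKUPs sent to the brick process */
} call_state;

/* Resolve the parent directory FH for an entry. Before the fix, a stale
 * resolvedhard flag could survive this step, so the resume function
 * skipped the child lookup and replied with the parent's attributes. */
static void resolve_entry(call_state *cs)
{
    /* ... parent FH resolution elided ... */
    cs->resolvedhard = false; /* the fix: force a lookup for the entry */
}

/* Resume function (cf. nfs3_lookup_resume): send an explicit LOOKUP to
 * the brick process when the entry was not resolved from the table. */
static void lookup_resume(call_state *cs)
{
    if (!cs->in_inode_table && !cs->resolvedhard) {
        cs->lookups_sent++;
        cs->resolvedhard = true;
    }
}
```

With the reset in place, an entry missing from the inode table always triggers an explicit lookup in the resume path instead of inheriting the parent's attributes.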

Comment 9 Shyamsundar 2017-03-27 18:27:08 UTC
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-3.9.0, please open a new bug report.

glusterfs-3.9.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2016-November/029281.html
[2] https://www.gluster.org/pipermail/gluster-users/

