Bug 1367266 - EINVAL errors for write, when there are write stalls before and a lookup post a rebalance of the file
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Nithya Balachandran
QA Contact:
URL:
Whiteboard:
Depends On: 1059687 1286150 1463907 1476665
Blocks: 1035040 1367285
Reported: 2016-08-16 05:56 UTC by Raghavendra G
Modified: 2018-08-29 03:18 UTC (History)

Fixed In Version: glusterfs-4.1.3 (or higher)
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1286150
: 1367285 (view as bug list)
Environment:
Last Closed: 2018-08-29 03:18:02 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description Raghavendra G 2016-08-16 05:56:03 UTC
+++ This bug was initially created as a clone of Bug #1286150 +++

+++ This bug was initially created as a clone of Bug #1059687 +++

+++ This bug was initially created as a clone of Bug #1054782 +++

2) File missing
---------------

The root cause of this issue is the following scenario (triggered by the way we were reproducing the bug, but it can happen anyway):
- The application starts a write on a file.
- Remove-brick (rebalance) is triggered on the subvolume where the file resides.
- During the actual file migration by rebalance, no write I/O happens. (This occurs with RHOS because the source of the file being copied is the web, so there are write stalls, observable with fop logging on writes.)
- After the migration (which takes about 3-5 seconds on the RHOS setup), a lookup on the same file is triggered. We triggered it by running ls -l on the file to check whether it was growing in size, hence the remark above about the reproduction method. The file's cached subvolume then changes to the new subvolume, but the fd that we hold still belongs to the old subvolume.
- A subsequent write by the application hits an fd/subvolume mismatch caused by the previous step, resulting in EINVAL from the layers below (seen in the client mount logs).
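The sequence above can be illustrated with a toy model. This is only a sketch and not GlusterFS code: the `ToyClient` class, its methods, and the subvolume names are hypothetical, and the check in `write` merely stands in for whatever the lower layers do when the fd and the cached subvolume disagree.

```python
import errno


class ToyClient:
    """Toy model of a client that caches which subvolume holds a file."""

    def __init__(self, subvol):
        self.cached_subvol = subvol  # where the client believes the file lives
        self.fd_subvol = None        # subvolume the open fd was created on

    def open(self):
        # The fd is bound to whatever subvolume is cached at open time.
        self.fd_subvol = self.cached_subvol

    def lookup(self, actual_subvol):
        # A lookup after migration refreshes only the cached subvolume;
        # the already-open fd is left untouched (the assumed failure mode).
        self.cached_subvol = actual_subvol

    def write(self, data):
        # Stand-in for the lower layers rejecting a mismatched fd.
        if self.fd_subvol != self.cached_subvol:
            raise OSError(errno.EINVAL, "fd does not belong to cached subvolume")
        return len(data)


if __name__ == "__main__":
    client = ToyClient("subvol-0")
    client.open()
    client.write(b"first chunk")   # succeeds: fd and cache agree
    client.lookup("subvol-1")      # file was migrated during a write stall
    try:
        client.write(b"second chunk")
    except OSError as exc:
        print("write failed:", errno.errorcode[exc.errno])
```

In this model the write before the lookup succeeds and the write after it fails, mirroring the order of events in the list.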

In (2) there is no data corruption: an error is sent back to the application, and Glance in this case decides to remove the image from its store because create_image failed.

This has also been simulated with a simple bash open-file case.
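The original bash case is not included in this report. As a rough approximation in Python, the client side of such a test could look like the sketch below. The function name `write_with_stall` and its parameters are hypothetical; the rebalance has to be started separately while the script is stalled, and the script itself makes no GlusterFS-specific calls.

```python
import os
import time


def write_with_stall(path, chunks, stall_seconds=5.0):
    """Write chunks through one long-lived fd, stalling and stat-ing between them.

    Returns one entry per chunk: None on success, or the errno of the failure.
    """
    results = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        for chunk in chunks:
            try:
                os.write(fd, chunk)
                results.append(None)
            except OSError as exc:
                results.append(exc.errno)
            # Write stall: the window in which the file could be migrated.
            time.sleep(stall_seconds)
            # Similar in spirit to the `ls -l` used during reproduction;
            # on a FUSE mount this may trigger a lookup on the file.
            os.stat(path)
    finally:
        os.close(fd)
    return results
```

Run against a path on the client mount, an `errno.EINVAL` entry in the returned list would correspond to the failure described here; on a local filesystem every entry is expected to be None.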

(2) will be forked into a separate bug for further analysis on possible fixes.

For (1), the upstream patch is now posted and awaiting acceptance: https://code.engineering.redhat.com/gerrit/#/c/19107/1

--- Additional comment from RHEL Product and Program Management on 2014-01-30 16:15:27 MVT ---

Since this issue was entered in bugzilla, the release flag has been
set to ? to ensure that it is properly evaluated for this release.

--- Additional comment from Shalaka on 2014-01-31 11:09:12 MVT ---

Please add doc text for this known issue.

--- Additional comment from Shalaka on 2014-02-19 15:13:26 MVT ---

Edited the doc text.

--- Additional comment from Shyamsundar on 2014-02-20 10:09:40 MVT ---

Doc text looks good.

--- Additional comment from John Skeoch on 2014-02-27 05:12:49 MVT ---

User srangana@redhat.com's account has been closed

--- Additional comment from John Skeoch on 2014-02-27 05:14:17 MVT ---

User srangana@redhat.com's account has been closed

--- Additional comment from John Skeoch on 2014-03-31 06:35:20 MVT ---

User vraman@redhat.com's account has been closed

--- Additional comment from Susant Kumar Palai on 2015-11-27 17:04:23 MVT ---

Cloning this to 3.1. To be fixed in future release.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2015-11-27 07:04:59 EST ---

This bug is automatically being proposed for the current z-stream release of Red Hat Gluster Storage 3 by setting the release flag 'rhgs-3.1.z' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-08-09 07:19:07 EDT ---

Since this bug has been approved for the RHGS 3.2.0 release of Red Hat Gluster Storage 3, through release flag 'rhgs-3.2.0+', and through the Internal Whiteboard entry of '3.2.0', the Target Release is being automatically set to 'RHGS 3.2.0'

Comment 1 Nithya Balachandran 2017-08-18 03:42:34 UTC
This will be fixed by the patches sent for https://bugzilla.redhat.com/show_bug.cgi?id=1476665.
Marking this Modified.

