Bug 1060434 - [oVirt's Gluster Integration] Status of remove-brick changed from commit-pending to unknown (?) when left untouched for a few hours
Summary: [oVirt's Gluster Integration] Status of remove-brick changed from commit-pen...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: krishnan parthasarathi
QA Contact:
URL:
Whiteboard:
Depends On: 1059662
Blocks:
 
Reported: 2014-02-01 18:50 UTC by Krutika Dhananjay
Modified: 2015-11-03 23:06 UTC
CC List: 13 users

Fixed In Version: glusterfs-3.5.0
Doc Type: Bug Fix
Doc Text:
Clone Of: 1059662
Environment:
Last Closed: 2014-04-17 11:52:51 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:



Description Krutika Dhananjay 2014-02-01 18:50:39 UTC
+++ This bug was initially created as a clone of Bug #1059662 +++

Description of problem:
-------------------------

After starting remove-brick on a volume, the setup was left untouched for a few hours in the commit-pending state. Later, on trying to retain the bricks, the retain operation failed, and the remove-brick icon changed to unknown (?).

From the events log - 

Could not find information for remove brick on volume dis_rep_vol of Cluster test_2 from CLI. Marking it as unknown.

How reproducible:
Saw it once.

Steps to Reproduce:
1. Start remove-brick on a distributed-replicate volume.
2. Let the data migration finish and the icon change to commit-pending.
3. Leave the setup untouched for a few hours (I had other clusters managed by the same engine, which I continued to use).
4. Try to retain the bricks.

Actual results:
Retain brick failed and the remove-brick icon changed to unknown (?).

Expected results:
Retain should have succeeded and the remove brick icon should have changed to remove-brick-stopped icon.


--- Additional comment from Shruti Sampat on 2014-01-30 05:41:38 EST ---

Hi,

I am seeing that after a while, the icon changes to commit-pending again and the events log says the following - 

Data Migration finished for brick(s) on volume dis_rep_vol. Please review to abort or commit.

--- Additional comment from Sahina Bose on 2014-01-30 06:51:27 EST ---

This is due to inconsistency in the glusterTasksList return, as per the logs below.
For a period in between, the task list returned no tasks and then started returning the task again (no errors in vdsm):

2014-01-30 08:34:50,634 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterTasksListVDSCommand] (DefaultQuartzScheduler_Worker-51) START, GlusterTasksListVDSCommand(HostName = rb_host4, HostId = 9ec85f51-af07-49f8-a0b5-bee6e829f37e), log id: 52c5e91e
2014-01-30 08:34:51,173 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterTasksListVDSCommand] (DefaultQuartzScheduler_Worker-51) FINISH, GlusterTasksListVDSCommand, return: [GlusterAsyncTask[4518561a-4267-46c9-a480-80c373e983f4-REMOVE_BRICK-FINISHED]

[org.ovirt.engine.core.vdsbroker.gluster.GlusterTasksListVDSCommand] (DefaultQuartzScheduler_Worker-11) [7cc81051] FINISH, GlusterTasksListVDSCommand, return: [], log id: 15240328
2014-01-30 08:35:51,513 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-11) [7cc81051] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Could not find information for remove brick on volume dis_rep_vol of Cluster test_2 from CLI. Marking it as unknown
.
.
.
.
2014-01-30 03:13:34,756 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterTasksListVDSCommand] (DefaultQuartzScheduler_Worker-57) [4166d263] START, GlusterTasksListVDSCommand(HostName = rb_host1, HostId = b3b4550a-7532-4ce2-b35e-7df1562cafa9), log id: 4716d22a
2014-01-30 03:13:35,548 INFO  [org.ovirt.engine.core.vdsbroker.gluster.GlusterTasksListVDSCommand] (DefaultQuartzScheduler_Worker-57) [4166d263] FINISH, GlusterTasksListVDSCommand, return: [GlusterAsyncTask[a980a28b-7542-43e3-b3a5-54918ecf09aa-REMOVE_BRICK-FINISHED], GlusterAsyncTask[4518561a-4267-46c9-a480-80c373e983f4-REMOVE_BRICK-FINISHED]], log id: 4716d22a

--- Additional comment from RamaKasturi on 2014-01-31 04:39:53 EST ---

I happened to see this issue on my scale config. Following are the steps to reproduce it.

1) Have four volumes: 2 distribute and 2 distribute-replicate.

2) Start remove-brick on all four volumes.

3) After a few seconds, I see that a question mark appears in the volume activities column, along with an event message that says "Could not find information for remove brick on volume vol_dis_1brick_8hosts of Cluster cluster_test_setup from CLI. Marking it as unknown."


--- Additional comment from Shruti Sampat on 2014-01-31 05:06:05 EST ---

I saw it again too. The remove-brick operation failed on a volume as some of its bricks were down, and the icon was in the failed state. I tried to start remove-brick again, but the icon changed to the unknown (?) state.

Opening the status dialog shows an empty dialog, i.e. with zero rows in the table.

--- Additional comment from Krutika Dhananjay on 2014-02-01 13:31:31 EST ---

There are two issues here:

Issue #1:
========

Steps to recreate issue #1:
==========================
a. Build RHS from source with -DDEBUG flag.
b. Create a 2-node cluster.
c. Create a 2x2 distributed-replicate volume with two bricks on each node.
d. Start the volume and FUSE-mount it.
e. Create sufficient data on the mountpoint.
f. Execute 'remove-brick start' on the volume:
   [root@calvin /]# gluster volume remove-brick dis-rep calvin:/brick/3 hobbes:/brick/4 start
   volume remove-brick start: success
   ID: 65cc4b61-2813-4e8a-86a3-d0f789dbd8bc

g. Execute 'volume status <volname>' in an infinite loop and closely monitor the status of remove-brick on node 1 (say).
h. When remove-brick is complete, execute remove-brick stop from the same node (node 1) while 'volume status <volname>' is still being executed continuously in a loop.

[root@calvin ~]# gluster volume remove-brick dis-rep calvin:/brick/3 hobbes:/brick/4 stop
Connection failed. Please check if gluster daemon is operational.

Result:
=======
glusterd crashes (SIGABRT) with assertion failure.
Attached the core file for the same.

Root-cause:
===========
'volume status' and 'volume remove-brick stop' get into a race to grab the cluster-wide lock as can be seen from the command log history on 10.70.37.162:
[2014-01-30 03:05:21.835188]  : volume status dis_rep_vol : FAILED : Another transaction is in progress. Please try again after sometime.
[2014-01-30 03:05:22.081587]  : volume status dis_vol : FAILED : Another transaction is in progress. Please try again after sometime.
[2014-01-30 03:05:24.228803]  : volume remove-brick dis_rep_vol 10.70.37.162:/rhs/brick3/b1 10.70.37.70:/rhs/brick3/b1 10.70.37.162:/rhs/brick4/b1 10.70.37.70:/rhs/brick4/b1 stop : SUCCESS

The originator glusterd remembers the op in progress in memory (by setting opinfo.op to GD_OP_DEFRAG_BRICK_VOLUME, which equals 24). 'remove-brick stop' manages to get the lock first, proceeds through the remaining stages of the state machine, and reaches the point where it sends the commit op request to its peers and waits for their responses. At this stage, 'volume status' is invoked from another CLI on the same node and fails to acquire the self-lock. As a result, glusterd proceeds to perform an unlock:

 26 gd_unlock_op_phase (struct list_head *peers, glusterd_op_t op, int op_ret,       
 25                     rpcsvc_request_t *req, dict_t *op_ctx, char *op_errstr,      
 24                     int npeers, gf_boolean_t is_locked)                          
 23 {                                                                                
 22         glusterd_peerinfo_t *peerinfo   = NULL;                                  
 21         glusterd_peerinfo_t *tmp        = NULL;                                  
 20         uuid_t              tmp_uuid    = {0};                                   
 19         int                 peer_cnt       = 0;                                  
 18         int                 ret         = -1;                                    
 17         xlator_t            *this       = NULL;                                  
 16         glusterd_conf_t     *conf       = NULL;                                  
 15         struct syncargs     args        = {0};                                   
 14                                                                                  
 13         this = THIS;                                                             
 12         conf = this->private;                                                    
 11                                                                                  
 10         /* If the lock has not been held during this                             
  9          * transaction, do not send unlock requests */                           
  8         if (!is_locked)                                                          
  7                 goto out;       
....
....
....
....
//Meanwhile in out:
 15 out:                                                                             
 16         glusterd_op_send_cli_response (op, op_ret, 0, req, op_ctx, op_errstr);
+++++++++++++++++++NOTE+++++++++++++++++++   
 17         glusterd_op_clear_op (op);                                               
 18         if (is_locked)                                                           
 19                 glusterd_unlock (MY_UUID);                                       
 20         /*                                                                       
 21          * If there are any quorum evens while the OP is in progress, process    
 22          * them.                                                                 
 23          */                                                                      
 24         if (conf->pending_quorum_action)                                         
 25                 glusterd_do_quorum_action ();                                    
 26                                                                                  
 27         return 0;                                                                
 28 }

As part of the self-unlock, since the lock was never taken, the originator glusterd proceeds to send the response to the CLI, and THEN resets opinfo.op to GD_OP_NONE (which equals 0) through the call to glusterd_op_clear_op(op) at line #17 above.

Now, when the originator receives a commit op response from its peer and executes the callback (__glusterd_commit_op_cbk()), it aggregates the peer's response in glusterd_volume_rebalance_use_rsp_dict(), where the following assertion:

 16         op = glusterd_op_get_op ();                                              
 15         GF_ASSERT ((GD_OP_REBALANCE == op) ||                                    
 14                    (GD_OP_DEFRAG_BRICK_VOLUME == op)); 

 fails since op is now reset to GD_OP_NONE.
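
Note: GF_ASSERT behaves differently in debug and non-debug builds, which is why the -DDEBUG reproduction above dies with SIGABRT while the QE machines (running regular RHS builds, presumably without -DDEBUG) only logged "Assertion failed" errors and kept going. The sketch below is a paraphrase from memory of the macro's general shape, not a verbatim copy of common-utils.h:

/* Paraphrase of GF_ASSERT, not a verbatim quote. Relies on <assert.h>
 * and gluster's logging in the real source. */
#ifdef DEBUG
/* Debug builds: a failed assertion aborts the process (SIGABRT). */
#define GF_ASSERT(x)    assert (x)
#else
/* Normal builds: log the failed expression via gf_log_callingfn() and
 * carry on, producing the "Assertion failed: ..." errors in the logs. */
#define GF_ASSERT(x)                                                    \
        do {                                                            \
                if (!(x)) {                                             \
                        gf_log_callingfn ("", GF_LOG_ERROR,             \
                                          "Assertion failed: " #x);     \
                }                                                       \
        } while (0)
#endif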

This is the reason for the following logs seen on the originator on machine 10.70.37.162 (ref: sosreports in comment #1):

[2014-01-30 03:05:22.081541] E [glusterd-syncop.c:1217:gd_sync_task_begin] 0-management: Unable to acquire lock
[2014-01-30 03:05:24.209248] E [glusterd-utils.c:7938:glusterd_volume_rebalance_use_rsp_dict] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5) [0x3e5f80df45] (-->/usr/lib64/glusterfs/3.4.0.58rhs/xlator/mgmt/glusterd.so(glusterd_big_locked_cbk+0x60) [0x7f9b90fc50b0] (-->/usr/lib64/glusterfs/3.4.0.58rhs/xlator/mgmt/glusterd.so(__glusterd_commit_op_cbk+0x5f9) [0x7f9b90fc7639]))) 0-: Assertion failed: (GD_OP_REBALANCE == op) || (GD_OP_DEFRAG_BRICK_VOLUME == op)
[2014-01-30 03:05:24.209554] E [glusterd-utils.c:7938:glusterd_volume_rebalance_use_rsp_dict] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5) [0x3e5f80df45] (-->/usr/lib64/glusterfs/3.4.0.58rhs/xlator/mgmt/glusterd.so(glusterd_big_locked_cbk+0x60) [0x7f9b90fc50b0] (-->/usr/lib64/glusterfs/3.4.0.58rhs/xlator/mgmt/glusterd.so(__glusterd_commit_op_cbk+0x5f9) [0x7f9b90fc7639]))) 0-: Assertion failed: (GD_OP_REBALANCE == op) || (GD_OP_DEFRAG_BRICK_VOLUME == op)
[2014-01-30 03:05:24.225575] E [glusterd-utils.c:7938:glusterd_volume_rebalance_use_rsp_dict] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5) [0x3e5f80df45] (-->/usr/lib64/glusterfs/3.4.0.58rhs/xlator/mgmt/glusterd.so(glusterd_big_locked_cbk+0x60) [0x7f9b90fc50b0] (-->/usr/lib64/glusterfs/3.4.0.58rhs/xlator/mgmt/glusterd.so(__glusterd_commit_op_cbk+0x5f9) [0x7f9b90fc7639]))) 0-: Assertion failed: (GD_OP_REBALANCE == op) || (GD_OP_DEFRAG_BRICK_VOLUME == op)
[2014-01-30 03:05:24.228681] E [glusterd-rpc-ops.c:120:glusterd_op_send_cli_response] 0-management: invalid operation

Issue #2:
=========
The command log history on 10.70.37.162 suggests that the 'remove-brick stop' operation finished successfully. Once any async task terminates, it no longer shows up in the output of 'volume status all tasks', because it is now history and no longer relevant. To proceed further on this part of the bug, I will need further input from the RHSC developers on what exactly happens when 'retain bricks' is clicked in the UI.

--- Additional comment from Krutika Dhananjay on 2014-02-01 13:35:27 EST ---



--- Additional comment from Krutika Dhananjay on 2014-02-01 13:41:38 EST ---

Also, I forgot to add that with the steps I used to recreate the bug, I was able to get the correct output for 'volume status all tasks' on all nodes after remove-brick was stopped, except for one discrepancy:

[root@calvin ~]# gluster volume remove-brick dis-rep calvin:/brick/3 hobbes:/brick/4 stop
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                                  (null)              500       123.3MB          1000             0             0            completed              36.00
                                  (null)                0        0Bytes          1000             0             0            completed              10.00
'remove-brick' process may be in the middle of a file migration.
The process will be fully stopped once the migration of the file is complete.
Please check remove-brick process for completion before doing any further brick related tasks on the volume.

As can be seen above, the "Node" column contains "(null)" instead of node IDs.

[root@calvin ~]# gluster volume status all tasks
Task Status of Volume dis-rep
------------------------------------------------------------------------------
There are no active volume tasks


[root@hobbes ~]# gluster volume status all tasks
Task Status of Volume dis-rep
------------------------------------------------------------------------------
There are no active volume tasks

I need some time to investigate this as well.
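
A plausible explanation for the literal "(null)" in the Node column: glibc's printf prints the string "(null)" when a NULL pointer is passed for a %s argument (a glibc convenience; the C standard leaves this undefined). So if the node's hostname/identity never makes it into the data the CLI formats the table from, this is exactly what shows up. The standalone program below only demonstrates that printf behaviour; the file name is made up and nothing in it is gluster code:

/* nullstr.c -- standalone demo, not gluster code.
 * Shows that glibc's printf renders a NULL %s argument as "(null)",
 * which matches the Node column in the remove-brick stop output above.
 * Build and run:  cc -o nullstr nullstr.c && ./nullstr */
#include <stdio.h>

int
main (void)
{
        const char *node_name = NULL;  /* stands in for a hostname that was
                                          never filled into the status table */
        printf ("%40s %16d %20s\n", node_name, 500, "completed");
        return 0;
}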

Comment 1 Krutika Dhananjay 2014-02-01 19:01:03 UTC
Posted upstream for review:
http://review.gluster.org/#/c/6885/
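
The change subject is "glusterd: Reset opinfo.op ONLY if lock succeeded" (see comments 3-5 below). Going only by that subject and the gd_unlock_op_phase() excerpt in the description, the shape of the change is roughly the sketch below; it is not the exact upstream diff, which is at the review link above:

/* Sketch only, based on the commit subject -- not the upstream patch.
 * opinfo.op has to stay set until the commit-op callbacks from the peers
 * have run, so it should be cleared only on the path where this
 * transaction actually held the cluster-wide lock. */
out:
        glusterd_op_send_cli_response (op, op_ret, 0, req, op_ctx, op_errstr);

        if (is_locked) {
                /* A racing 'volume status' that failed to acquire the lock
                 * must not wipe out GD_OP_DEFRAG_BRICK_VOLUME while
                 * remove-brick commit responses are still pending. */
                glusterd_op_clear_op (op);
                glusterd_unlock (MY_UUID);
        }

        if (conf->pending_quorum_action)
                glusterd_do_quorum_action ();

        return 0;
}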

Comment 2 Krutika Dhananjay 2014-02-01 19:05:11 UTC
With this fix, the output of remove-brick stop correctly prints the node IDs.

[root@calvin /]# gluster volume remove-brick dis-rep calvin:/brick/3 hobbes:/brick/4 stop
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost              503       128.2MB          1000             0             0            completed              36.00
                            10.70.43.140                0        0Bytes          1000             0             0            completed              10.00
'remove-brick' process may be in the middle of a file migration.
The process will be fully stopped once the migration of the file is complete.
Please check remove-brick process for completion before doing any further brick related tasks on the volume.


Also, 'volume status all tasks' shows the correct output before and after remove-brick is stopped:

BEFORE:
Task Status of Volume dis-rep
------------------------------------------------------------------------------
Task                 : Remove brick        
ID                   : 89a745e1-540d-48fc-b4ce-da24f561cbc7
Removed bricks:     
calvin:/brick/3     
hobbes:/brick/4     
Status               : completed       


AFTER:
Task Status of Volume dis-rep
------------------------------------------------------------------------------
There are no active volume tasks

Comment 3 Anand Avati 2014-02-03 06:07:06 UTC
REVIEW: http://review.gluster.org/6885 (glusterd: Reset opinfo.op ONLY if lock succeeded) posted (#2) for review on master by Krutika Dhananjay (kdhananj@redhat.com)

Comment 4 Anand Avati 2014-02-04 10:10:09 UTC
REVIEW: http://review.gluster.org/6885 (glusterd: Reset opinfo.op ONLY if lock succeeded) posted (#3) for review on master by Krutika Dhananjay (kdhananj@redhat.com)

Comment 5 Anand Avati 2014-02-04 17:50:22 UTC
COMMIT: http://review.gluster.org/6885 committed in master by Vijay Bellur (vbellur@redhat.com) 
------
commit 797c62eb56797189b51a1dcb35009499f920a3ea
Author: Krutika Dhananjay <kdhananj@redhat.com>
Date:   Sat Feb 1 22:49:22 2014 +0530

    glusterd: Reset opinfo.op ONLY if lock succeeded
    
    Change-Id: I0244a7f61a826b32f4c2dfe51e246f2593a38211
    BUG: 1060434
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
    Reviewed-on: http://review.gluster.org/6885
    Reviewed-by: Kaushal M <kaushal@redhat.com>
    Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>

Comment 6 Anand Avati 2014-02-06 06:56:01 UTC
REVIEW: http://review.gluster.org/6922 (glusterd: Reset opinfo.op ONLY if lock succeeded) posted (#1) for review on release-3.5 by Krutika Dhananjay (kdhananj@redhat.com)

Comment 7 Anand Avati 2014-02-08 03:00:30 UTC
COMMIT: http://review.gluster.org/6922 committed in release-3.5 by Vijay Bellur (vbellur@redhat.com) 
------
commit ef37b4b6a22fe04b5a1789c45b28c8d7e6fe764a
Author: Krutika Dhananjay <kdhananj@redhat.com>
Date:   Sat Feb 1 22:49:22 2014 +0530

    glusterd: Reset opinfo.op ONLY if lock succeeded
    
    ... and also initialise @this before doing anything else.
    
    Change-Id: I0244a7f61a826b32f4c2dfe51e246f2593a38211
    BUG: 1060434
    Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
    Reviewed-on: http://review.gluster.org/6885
    Reviewed-by: Kaushal M <kaushal@redhat.com>
    Reviewed-by: Krishnan Parthasarathi <kparthas@redhat.com>
    Tested-by: Gluster Build System <jenkins@build.gluster.com>
    Reviewed-on: http://review.gluster.org/6922
    Reviewed-by: Vijay Bellur <vbellur@redhat.com>

Comment 8 Niels de Vos 2014-04-17 11:52:51 UTC
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-3.5.0, please reopen this bug report.

glusterfs-3.5.0 has been announced on the Gluster Developers mailing list [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/6137
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

