Bug 1599587 - [geo-rep]: Worker still ACTIVE after killing bricks on brick-mux enabled setup
Summary: [geo-rep]: Worker still ACTIVE after killing bricks on brick-mux enabled setup
Keywords:
Status: ON_QA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: geo-replication
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHGS 3.5.0
Assignee: Mohit Agrawal
QA Contact: Rahul Hinduja
URL:
Whiteboard:
Depends On:
Blocks: 1483541 1600145 1601314
 
Reported: 2018-07-10 06:24 UTC by Rochelle
Modified: 2019-04-11 09:44 UTC
CC List: 13 users

Fixed In Version: glusterfs-6.0-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Cloned to: 1600145
Environment:
Last Closed:
Target Upstream Version:



Description Rochelle 2018-07-10 06:24:14 UTC
Description of problem:
=======================
The ACTIVE brick processes for a geo-replication session were killed, but the corresponding workers remain ACTIVE even after the bricks go down.

Before the bricks were killed:
-----------------------------
[root@dhcp42-18 scripts]# gluster volume geo-replication master 10.70.43.116::slave status
 
MASTER NODE     MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED                  
-----------------------------------------------------------------------------------------------------------------------------------------------------
10.70.42.18     master        /rhs/brick1/b1    root          10.70.43.116::slave    10.70.42.246    Active     Changelog Crawl    2018-07-10 01:09:32          
10.70.42.18     master        /rhs/brick2/b4    root          10.70.43.116::slave    10.70.42.246    Active     Changelog Crawl    2018-07-10 01:06:17          
10.70.42.18     master        /rhs/brick3/b7    root          10.70.43.116::slave    10.70.42.246    Active     Changelog Crawl    2018-07-10 01:06:17          
10.70.41.239    master        /rhs/brick1/b2    root          10.70.43.116::slave    10.70.43.116    Passive    N/A                N/A                          
10.70.41.239    master        /rhs/brick2/b5    root          10.70.43.116::slave    10.70.43.116    Passive    N/A                N/A                          
10.70.41.239    master        /rhs/brick3/b8    root          10.70.43.116::slave    10.70.43.116    Passive    N/A                N/A                          
10.70.43.179    master        /rhs/brick1/b3    root          10.70.43.116::slave    10.70.42.128    Passive    N/A                N/A                          
10.70.43.179    master        /rhs/brick2/b6    root          10.70.43.116::slave    10.70.42.128    Passive    N/A                N/A                          
10.70.43.179    master        /rhs/brick3/b9    root          10.70.43.116::slave    10.70.42.128    Passive    N/A                N/A                          
[root@dhcp42-18 scripts]# gluster v status
Status of volume: gluster_shared_storage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.41.239:/var/lib/glusterd/ss_bri
ck                                          49152     0          Y       28814
Brick 10.70.43.179:/var/lib/glusterd/ss_bri
ck                                          49152     0          Y       27173
Brick dhcp42-18.lab.eng.blr.redhat.com:/var
/lib/glusterd/ss_brick                      49152     0          Y       9969 
Self-heal Daemon on localhost               N/A       N/A        Y       10879
Self-heal Daemon on 10.70.41.239            N/A       N/A        Y       29525
Self-heal Daemon on 10.70.43.179            N/A       N/A        Y       27892
 
Task Status of Volume gluster_shared_storage
-----------------------------------------------------------------------------



After the bricks were killed using gf_attach:
---------------------------------------------
[root@dhcp42-18 scripts]# gluster v status
Status of volume: gluster_shared_storage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.41.239:/var/lib/glusterd/ss_bri
ck                                          49152     0          Y       28814
Brick 10.70.43.179:/var/lib/glusterd/ss_bri
ck                                          49152     0          Y       27173
Brick dhcp42-18.lab.eng.blr.redhat.com:/var
/lib/glusterd/ss_brick                      49152     0          Y       9969 
Self-heal Daemon on localhost               N/A       N/A        Y       10879
Self-heal Daemon on 10.70.41.239            N/A       N/A        Y       29525
Self-heal Daemon on 10.70.43.179            N/A       N/A        Y       27892
 
Task Status of Volume gluster_shared_storage
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: master
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.42.18:/rhs/brick1/b1            N/A       N/A        N       N/A  
Brick 10.70.41.239:/rhs/brick1/b2           49152     0          Y       28814
Brick 10.70.43.179:/rhs/brick1/b3           49152     0          Y       27173
Brick 10.70.42.18:/rhs/brick2/b4            N/A       N/A        N       N/A  
Brick 10.70.41.239:/rhs/brick2/b5           49152     0          Y       28814
Brick 10.70.43.179:/rhs/brick2/b6           49152     0          Y       27173
Brick 10.70.42.18:/rhs/brick3/b7            N/A       N/A        N       N/A  
Brick 10.70.41.239:/rhs/brick3/b8           49152     0          Y       28814
Brick 10.70.43.179:/rhs/brick3/b9           49152     0          Y       27173
Self-heal Daemon on localhost               N/A       N/A        Y       10879
Self-heal Daemon on 10.70.41.239            N/A       N/A        Y       29525
Self-heal Daemon on 10.70.43.179            N/A       N/A        Y       27892
 
Task Status of Volume master
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@dhcp42-18 scripts]# gluster volume geo-replication master 10.70.43.116::slave status
 
MASTER NODE     MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED                  
-----------------------------------------------------------------------------------------------------------------------------------------------------
10.70.42.18     master        /rhs/brick1/b1    root          10.70.43.116::slave    10.70.42.246    Active     Changelog Crawl    2018-07-10 01:11:33          
10.70.42.18     master        /rhs/brick2/b4    root          10.70.43.116::slave    10.70.42.246    Active     Changelog Crawl    2018-07-10 01:12:02          
10.70.42.18     master        /rhs/brick3/b7    root          10.70.43.116::slave    10.70.42.246    Active     Changelog Crawl    2018-07-10 01:12:18          
10.70.41.239    master        /rhs/brick1/b2    root          10.70.43.116::slave    10.70.43.116    Passive    N/A                N/A                          
10.70.41.239    master        /rhs/brick2/b5    root          10.70.43.116::slave    10.70.43.116    Passive    N/A                N/A                          
10.70.41.239    master        /rhs/brick3/b8    root          10.70.43.116::slave    10.70.43.116    Passive    N/A                N/A                          
10.70.43.179    master        /rhs/brick1/b3    root          10.70.43.116::slave    10.70.42.128    Passive    N/A                N/A                          
10.70.43.179    master        /rhs/brick2/b6    root          10.70.43.116::slave    10.70.42.128    Passive    N/A                N/A                          
10.70.43.179    master        /rhs/brick3/b9    root          10.70.43.116::slave    10.70.42.128    Passive    N/A                N/A                          



Version-Release number of selected component (if applicable):
=============================================================
[root@dhcp42-18 scripts]# rpm -qa | grep gluster
glusterfs-rdma-3.12.2-13.el7rhgs.x86_64
glusterfs-server-3.12.2-13.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
glusterfs-client-xlators-3.12.2-13.el7rhgs.x86_64
glusterfs-cli-3.12.2-13.el7rhgs.x86_64
python2-gluster-3.12.2-13.el7rhgs.x86_64
libvirt-daemon-driver-storage-gluster-3.9.0-14.el7_5.6.x86_64
glusterfs-3.12.2-13.el7rhgs.x86_64
glusterfs-api-3.12.2-13.el7rhgs.x86_64
gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64
glusterfs-libs-3.12.2-13.el7rhgs.x86_64
vdsm-gluster-4.19.43-2.3.el7rhgs.noarch
glusterfs-fuse-3.12.2-13.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-13.el7rhgs.x86_64
glusterfs-events-3.12.2-13.el7rhgs.x86_64


How reproducible:
=================
2/2


Steps to Reproduce:
1. Create a geo-replication session (3x3 master and slave volumes)
2. Mount the master and slave volumes
3. Create files on the master
4. Kill the ACTIVE bricks using gf_attach (see the sketch below)
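
A minimal sketch of step 4, assuming a brick-mux setup like the one above. gf_attach ships with glusterfs-server and can detach a brick from the multiplexed brick process over its unix-domain socket; the socket path below is a placeholder, not one captured from this run (the actual socket lives under /var/run/gluster/ and its name depends on the brick process).

# detach (kill) one of the ACTIVE bricks on 10.70.42.18
gf_attach -d /var/run/gluster/<brick-process-socket> /rhs/brick1/b1

Repeat for /rhs/brick2/b4 and /rhs/brick3/b7 to take all three ACTIVE bricks on that node down, then re-check 'gluster v status master' to confirm they show Online "N".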

Actual results:
===============
The workers for the killed bricks still remain ACTIVE


Expected results:
================
The 3 ACTIVE workers should go to FAULTY, and the 3 PASSIVE workers should become ACTIVE and take over the syncing
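
To verify the failover, re-run the same status command used above; per the expectation stated here (not output observed in this run), the three workers on 10.70.42.18 should report Faulty and one Passive worker per replica set should take over as Active:

gluster volume geo-replication master 10.70.43.116::slave status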

