Bug 1479446 - Rebalance estimate (ETA) shows wrong details (the initial 10-minute wait message reappears) while rebalance is still in progress
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: distribute
Version: rhgs-3.3
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Target Release: RHGS 3.4.z Batch Update 2
Assignee: Nithya Balachandran
QA Contact: Prasad Desala
URL:
Whiteboard:
Depends On:
Blocks: 1479528 1511271
Reported: 2017-08-08 14:50 UTC by nchilaka
Modified: 2018-12-17 17:07 UTC (History)

Fixed In Version: glusterfs-3.12.2-27
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1479528 (view as bug list)
Environment:
Last Closed: 2018-12-17 17:07:02 UTC


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:3827 None None None 2018-12-17 17:07:17 UTC

Description nchilaka 2017-08-08 14:50:56 UTC
Description of problem:
==============================
I ran a remove-brick operation to convert a 2x2 volume to 1x2 while I/O was in progress from three different Ganesha mounts.

At a later stage (perhaps more than 80% complete), the message "The estimated time for rebalance to complete will be unavailable for the first 10 minutes." appeared again.

I think this happens when the estimated rebalance time has elapsed but the rebalance itself has not yet completed.
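
To make the hypothesis above concrete, here is a toy model in Python. It is not glusterfs code: the function `eta_message`, its parameters, and the fallback branch are all hypothetical, written only to show how an estimate that runs out before the work finishes could bring the warm-up message back.

```python
WAIT_MSG = ("The estimated time for rebalance to complete will be "
            "unavailable for the first 10 minutes.")


def eta_message(elapsed_secs, estimated_total_secs, warmup_secs=600):
    """Toy model of the suspected behaviour (hypothetical, not the real logic)."""
    # During the warm-up window no estimate is shown at all.
    if elapsed_secs < warmup_secs:
        return WAIT_MSG

    remaining = estimated_total_secs - elapsed_secs
    if remaining <= 0:
        # Suspected faulty path: the estimate has been used up, nothing
        # sensible is left to print, so the warm-up message is shown again.
        return WAIT_MSG

    hours, rest = divmod(remaining, 3600)
    minutes, seconds = divmod(rest, 60)
    return ("Estimated time left for rebalance to complete : "
            f"{hours}:{minutes:02d}:{seconds:02d}")
```

Under this model, an estimated total of 40 minutes would give a normal ETA at 0:39:16 and the wait message again at 0:40:28, which matches the pattern in the status output that follows.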







Last login: Tue Aug  8 19:32:38 2017 from 10.70.35.77
[root@dhcp46-42 ~]# gluster v rebal nrep2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost             5145         7.4MB         10594             0             0          in progress        0:06:38
       dhcp46-101.lab.eng.blr.redhat.com             4142        21.7MB          8722             0             0          in progress        0:06:38
The estimated time for rebalance to complete will be unavailable for the first 10 minutes.
volume rebalance: nrep2: success
[root@dhcp46-42 ~]# gluster v rebal nrep2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost             5993        31.3MB         11970             0             0          in progress        0:08:38
       dhcp46-101.lab.eng.blr.redhat.com             5050        26.6MB         10415             0             0          in progress        0:08:38
The estimated time for rebalance to complete will be unavailable for the first 10 minutes.
volume rebalance: nrep2: success

[root@dhcp46-42 ~]# gluster v rebal nrep2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost             8059        62.0MB         16022             0             0          in progress        0:13:13
       dhcp46-101.lab.eng.blr.redhat.com             7208        76.2MB         14071             0             0          in progress        0:13:13
Estimated time left for rebalance to complete :        0:47:28
volume rebalance: nrep2: success
[root@dhcp46-42 ~]# gluster v rebal nrep2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost            10699       110.9MB         21188             0             0          in progress        0:19:58
       dhcp46-101.lab.eng.blr.redhat.com             9949       119.4MB         16739             0             0          in progress        0:19:58
Estimated time left for rebalance to complete :        0:47:25
volume rebalance: nrep2: success

[root@dhcp46-42 ~]# gluster v rebal nrep2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost            16839       151.7MB         28114             0             0          in progress        0:33:23
       dhcp46-101.lab.eng.blr.redhat.com            16754       184.3MB         27528             0             0          in progress        0:33:23
Estimated time left for rebalance to complete :        0:00:48
volume rebalance: nrep2: success



[root@dhcp46-42 ~]# 
[root@dhcp46-42 ~]# 
[root@dhcp46-42 ~]# gluster v rebal nrep2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost            20687       192.2MB         32058             0             0          in progress        0:39:16
       dhcp46-101.lab.eng.blr.redhat.com            20965       189.6MB         32669             0             0          in progress        0:39:16
Estimated time left for rebalance to complete :        0:00:06
volume rebalance: nrep2: success
[root@dhcp46-42 ~]# 

============== SEE FROM BELOW ==================

[root@dhcp46-42 ~]# gluster v rebal nrep2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost            21521       192.8MB         33069             0             0          in progress        0:40:28
       dhcp46-101.lab.eng.blr.redhat.com            22456       189.6MB         35708             0             0          in progress        0:40:28
The estimated time for rebalance to complete will be unavailable for the first 10 minutes.
volume rebalance: nrep2: success
[root@dhcp46-42 ~]# gluster v rebal nrep2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost            21669       192.8MB         33372             0             0          in progress        0:40:36
       dhcp46-101.lab.eng.blr.redhat.com            22614       189.6MB         35708             0             0          in progress        0:40:36
The estimated time for rebalance to complete will be unavailable for the first 10 minutes.
volume rebalance: nrep2: success
[root@dhcp46-42 ~]# gluster v rebal nrep2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost            21718       192.8MB         33372             0             0          in progress        0:40:40
       dhcp46-101.lab.eng.blr.redhat.com            22667       189.6MB         36020             0             0          in progress        0:40:40
The estimated time for rebalance to complete will be unavailable for the first 10 minutes.
volume rebalance: nrep2: success
[root@dhcp46-42 ~]# 
[root@dhcp46-42 ~]# gluster v rebal nrep2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost            23842       194.1MB         37488             0             0          in progress        0:43:47
       dhcp46-101.lab.eng.blr.redhat.com            23440       285.5MB         39635             0             0            completed        0:43:29
The estimated time for rebalance to complete will be unavailable for the first 10 minutes.
volume rebalance: nrep2: success


Version-Release number of selected component (if applicable):
[root@dhcp46-42 ~]# rpm -qa|grep gluster
glusterfs-api-3.8.4-38.el7rhgs.x86_64
python-gluster-3.8.4-34.el7rhgs.noarch
glusterfs-server-3.8.4-38.el7rhgs.x86_64
gluster-nagios-addons-0.2.9-1.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.4-16.el7rhgs.x86_64
glusterfs-3.8.4-38.el7rhgs.x86_64
glusterfs-cli-3.8.4-38.el7rhgs.x86_64
glusterfs-rdma-3.8.4-38.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
libvirt-daemon-driver-storage-gluster-3.2.0-14.el7_4.2.x86_64
vdsm-gluster-4.17.33-1.2.el7rhgs.noarch
glusterfs-libs-3.8.4-38.el7rhgs.x86_64
glusterfs-fuse-3.8.4-38.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-38.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-38.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-38.el7rhgs.x86_64





Steps to Reproduce:
1. Started with a 1x2 volume, ran add-brick to convert it to 2x2, and ran a rebalance (some files were skipped).
2. Ran a Linux untar from one client and lookups from another client (running until the end). Ran rename, move, chmod, and chgrp from a third client, but only for a while; these operations completed well before the rebalance reached this state.
3. Observed the rebalance ETA.

Actual results:
==========
The ETA again shows the initial 10-minute wait message.
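
The symptom can be checked mechanically against captured status output. The sketch below is a hypothetical helper, not part of gluster: it assumes the text layout shown in the description (an "in progress" status followed by the run time in h:m:s at the end of each node row) and flags output where the wait message is printed although a node has already run for more than 10 minutes.

```python
import re

WAIT_MSG = ("The estimated time for rebalance to complete will be "
            "unavailable for the first 10 minutes.")

# Matches the tail of a node row such as "... in progress        0:40:28".
RUNTIME_RE = re.compile(r"in progress\s+(\d+):(\d{2}):(\d{2})\s*$")


def max_in_progress_runtime(status_output):
    """Return the longest run time in seconds among 'in progress' rows, or None."""
    longest = None
    for line in status_output.splitlines():
        match = RUNTIME_RE.search(line)
        if match:
            hours, minutes, seconds = (int(part) for part in match.groups())
            total = hours * 3600 + minutes * 60 + seconds
            longest = total if longest is None else max(longest, total)
    return longest


def wait_message_is_stale(status_output, threshold_secs=600):
    """True if the wait message appears after the 10-minute window has passed."""
    runtime = max_in_progress_runtime(status_output)
    return (runtime is not None
            and runtime > threshold_secs
            and WAIT_MSG in status_output)
```

Feeding it the saved text of one `gluster v rebal nrep2 status` call should return False for the samples taken at 0:06:38 and 0:08:38, and True for the samples from 0:40:28 onwards.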

Comment 2 nchilaka 2017-08-08 14:52:04 UTC
Rebalance status at the end:
[root@dhcp46-42 ~]# gluster v rebal nrep2 status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status  run time in h:m:s
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost            23842       194.1MB         37488             0             0            completed        0:44:21
       dhcp46-101.lab.eng.blr.redhat.com            23440       285.5MB         39635             0             0            completed        0:43:29
volume rebalance: nrep2: success

Comment 3 Nithya Balachandran 2017-08-08 17:29:11 UTC
Is this reproducible?

Comment 6 nchilaka 2018-04-05 09:19:14 UTC
Prasad, can you check this as part of your testing (comment #3, i.e. whether this is reproducible)?

Comment 17 Prasad Desala 2018-12-05 12:44:33 UTC
Verified this BZ on glusterfs version 3.12.2-30. Followed the same steps as in the description; the rebalance ETA was displayed as expected.

Moving this BZ to Verified.

Comment 18 errata-xmlrpc 2018-12-17 17:07:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3827

