
Bug 1362666

Summary: oo-admin-move should move gears to nodes with enough free space + buffer space
Product: OpenShift Container Platform
Reporter: Rory Thrasher <rthrashe>
Component: Unknown
Assignee: Sally <somalley>
Status: CLOSED ERRATA
QA Contact: Johnny Liu <jialiu>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 2.2.0
CC: agrimm, aos-bugs, jokerman, libra-bugs, mmccomas, mwoodson, somalley, tiwillia
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: rubygem-openshift-origin-msg-broker-mcollective-1.36.2.2-1.el6op, rubygem-openshift-origin-node-1.38.6.3-1.el6op, openshift-origin-msg-node-mcollective-1.30.2.2-1.el6op
Doc Type: Bug Fix
Doc Text:
Cause: A gear move did not take into consideration the amount of free space available on the node the gear was moved to.
Consequence: Gears could be moved to a node whose free space was less than what the gear required, resulting in gears on that node failing.
Fix: The gear move process now considers the amount of free space on each node when determining which node it should move the gear to.
Result: Gears are no longer moved to a node whose storage space is not adequate for the gear.
Story Points: ---
Clone Of: 1122084
Environment:
Last Closed: 2016-08-24 19:47:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On: 1122084
Bug Blocks: 1277547

Comment 3 Johnny Liu 2016-08-08 10:59:26 UTC
Re-tested this bug with openshift-origin-msg-node-mcollective-1.30.2.1-1.el6op.noarch using the 2.2/2016-08-05.1 puddle; it failed.

The move still failed even though the node had enough disk space under /var/lib/openshift for the gear.

# oo-admin-move --gear_uuid jialiu-ruby18app-1 -i node1.ose22-auto.com.cn
URL: http://ruby18app-jialiu.ose22-auto.com.cn
Login: jialiu
App UUID: 57a864f182611da20d0002b1
Gear UUID: 57a864f182611da20d0002b1
DEBUG: Source district uuid: 57a8243482611d52e5000001
DEBUG: Destination district uuid: 57a8243482611d52e5000001
DEBUG: Getting existing app 'ruby18app' status before moving
DEBUG: Gear component 'ruby-1.8' was running
DEBUG: Unpublishing routing information for gear 'jialiu-ruby18app-1'
DEBUG: Stopping existing app cartridge 'ruby-1.8' before moving
DEBUG: Force stopping existing app before moving
DEBUG: Gear platform is 'linux'
DEBUG: Moving failed.  Rolling back gear 'jialiu-ruby18app-1' in 'ruby18app' with delete on 'node1.ose22-auto.com.cn'
Gear 'jialiu-ruby18app-1' cannot be moved to 'node1.ose22-auto.com.cn'.  Not enough disk space, node would be > 95% full after move.

It seems the following two lines of code were not merged into the RPM package:
+Facter.add(:node_disk_free) { setcode { results['node_disk_free'] } }
+Facter.add(:node_total_size) { setcode { results['node_total_size'] } }
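
For reference, a minimal sketch of the kind of check the error message implies. This is not the actual broker code: the method names, the `gear_size` argument, and the assumption that all three values use the same unit are hypothetical. Only the fact names `node_disk_free` and `node_total_size` and the 95% limit come from this report.

```ruby
# Hypothetical helper, NOT the actual OpenShift implementation.
# All sizes are assumed to be in the same unit (for example 1K blocks).
MAX_USAGE_PERCENT = 95.0 # limit taken from the "> 95% full after move" message

# Projected usage of the destination node, in percent, once the gear is added.
def usage_after_move_percent(node_disk_free, node_total_size, gear_size)
  raise ArgumentError, 'node_total_size must be positive' unless node_total_size > 0

  used_after = (node_total_size - node_disk_free) + gear_size
  used_after * 100.0 / node_total_size
end

# The message rejects moves that leave the node "> 95%" full,
# so exactly 95% is treated as still acceptable here.
def move_allowed?(node_disk_free, node_total_size, gear_size)
  usage_after_move_percent(node_disk_free, node_total_size, gear_size) <= MAX_USAGE_PERCENT
end

# Example with made-up numbers: a mostly empty node and a nearly full one.
move_allowed?(7_400, 7_800, 100) # => true
move_allowed?(320, 7_800, 100)   # => false
```

If the two facts are missing on the node, a check like this has nothing valid to compute with, which would be consistent with the move being rejected even though the disk has room.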

Comment 6 Johnny Liu 2016-08-09 01:48:16 UTC
Verified this bug with 2.2/2016-08-08.1; it passes.

node1:
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_dhcp128178-lv_root
                       18G  6.1G   11G  38% /
tmpfs                 1.9G     0  1.9G   0% /dev/shm
/dev/vda1             477M   99M  353M  22% /boot
/dev/loop0            7.8G   36M  7.4G   1% /var/lib/openshift

node2:
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_dhcp128178-lv_root
                       18G   13G  3.4G  80% /
tmpfs                 1.9G     0  1.9G   0% /dev/shm
/dev/vda1             477M   99M  353M  22% /boot
/dev/loop0            7.8G  7.1G  320M  96% /var/lib/openshift

Moving a gear from node1 to node2 is not allowed:
# oo-admin-move --gear_uuid jialiu-php54app-1 -i node2.ose22-auto.com.cn
URL: http://php54app-jialiu.ose22-auto.com.cn
Login: jialiu
App UUID: 57a8643882611da20d00029b
Gear UUID: 57a8643882611da20d00029b
DEBUG: Source district uuid: 57a8243482611d52e5000001
DEBUG: Destination district uuid: 57a8243482611d52e5000001
DEBUG: Getting existing app 'php54app' status before moving
DEBUG: Gear component 'php-5.4' was running
DEBUG: Unpublishing routing information for gear 'jialiu-php54app-1'
DEBUG: Stopping existing app cartridge 'php-5.4' before moving
DEBUG: Force stopping existing app before moving
DEBUG: Gear platform is 'linux'
DEBUG: Moving failed.  Rolling back gear 'jialiu-php54app-1' in 'php54app' with delete on 'node2.ose22-auto.com.cn'
Gear 'jialiu-php54app-1' cannot be moved to 'node2.ose22-auto.com.cn'.  Not enough disk space, node would be > 95% full after move.


Moving a gear from node2 to node1 succeeded:
# oo-admin-move --gear_uuid jialiu-php53app-1 -i node1.ose22-auto.com.cn
URL: http://php53app-jialiu.ose22-auto.com.cn
Login: jialiu
App UUID: 57a85a0482611da20d000134
Gear UUID: 57a85a0482611da20d000134
DEBUG: Source district uuid: 57a8243482611d52e5000001
DEBUG: Destination district uuid: 57a8243482611d52e5000001
DEBUG: Getting existing app 'php53app' status before moving
DEBUG: Gear component 'php-5.3' was running
DEBUG: Unpublishing routing information for gear 'jialiu-php53app-1'
DEBUG: Stopping existing app cartridge 'php-5.3' before moving
DEBUG: Force stopping existing app before moving
DEBUG: Gear platform is 'linux'
DEBUG: Creating new account for gear 'jialiu-php53app-1' on node1.ose22-auto.com.cn
DEBUG: Moving content for app 'php53app', gear 'jialiu-php53app-1' to node1.ose22-auto.com.cn
Agent pid 17734
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 17734 killed;
DEBUG: Moving system components for app 'php53app', gear 'jialiu-php53app-1' to node1.ose22-auto.com.cn
Agent pid 17742
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 17742 killed;
DEBUG: Starting cartridge 'php-5.3' in 'php53app' after move on node1.ose22-auto.com.cn
DEBUG: Fixing DNS and mongo for gear 'jialiu-php53app-1' after move
DEBUG: Changing server identity of 'jialiu-php53app-1' from 'node2.ose22-auto.com.cn' to 'node1.ose22-auto.com.cn'
DEBUG: Updating routing information for gear 'jialiu-php53app-1' after move
DEBUG: Deconfiguring old app 'php53app' on node2.ose22-auto.com.cn after move
Successfully moved gear with uuid 'jialiu-php53app-1' of app 'php53app' from 'node2.ose22-auto.com.cn' to 'node1.ose22-auto.com.cn'
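
The two outcomes above match the df figures for /var/lib/openshift. A small sketch that recomputes the Use% column from the Used and Avail values, assuming df derives it as used / (used + avail) rounded up, which is my understanding of GNU df. The method name is hypothetical and the sizes are converted to MiB by hand from the `df -h` output.

```ruby
# Hypothetical helper: recompute df's Use% from the Used and Avail columns.
# Assumes the percentage is rounded up to the next whole number.
def df_use_percent(used, avail)
  (used * 100.0 / (used + avail)).ceil
end

# /var/lib/openshift on each node, in MiB (G values multiplied by 1024).
node1 = df_use_percent(36, 7.4 * 1024)  # => 1
node2 = df_use_percent(7.1 * 1024, 320) # => 96

puts "node1: #{node1}% used, node2: #{node2}% used"
```

node2 is already above 95% before any gear is added, so any move to it is rejected, while node1 has plenty of headroom.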

Comment 8 errata-xmlrpc 2016-08-24 19:47:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-1773.html