Bug 1362666 - oo-admin-move should move gears to nodes with enough free space + buffer space
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Unknown
Version: 2.2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Sally
QA Contact: Johnny Liu
URL:
Whiteboard:
Depends On: 1122084
Blocks: 1277547
Reported: 2016-08-02 19:39 UTC by Rory Thrasher
Modified: 2016-08-24 19:47 UTC (History)

Fixed In Version: rubygem-openshift-origin-msg-broker-mcollective-1.36.2.2-1.el6op, rubygem-openshift-origin-node-1.38.6.3-1.el6op, openshift-origin-msg-node-mcollective-1.30.2.2-1.el6op
Doc Type: Bug Fix
Doc Text:
Cause: A gear move did not take into account the amount of free space available on the destination node.
Consequence: Gears could be moved to a node with less free space than the gear required, causing gears on that node to fail.
Fix: The gear move process now considers the free space on each node when determining which node to move the gear to.
Result: Gears are no longer moved to a node whose storage space is inadequate for the gear.
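
The check described in the Doc Text can be sketched roughly as follows. This is a minimal illustration, not the actual broker or node code: the `Node` struct, the method names, and the sample sizes are all hypothetical, and the preference for the node with the most free space is an assumption. Only the 95% threshold is taken from the `oo-admin-move` error message quoted in the comments on this bug.

```ruby
# Hypothetical sketch; none of these names come from the OpenShift code base.
Node = Struct.new(:name, :total_kb, :free_kb)

# Threshold quoted in the oo-admin-move error message ("> 95% full after move").
MAX_PERCENT_FULL = 95

# Percentage of the node's disk that would be in use once the gear is copied over.
def percent_full_after_move(node, gear_kb)
  used_after = (node.total_kb - node.free_kb) + gear_kb
  used_after * 100.0 / node.total_kb
end

# A node is acceptable if it stays at or below the threshold after the move.
def fits?(node, gear_kb, max_percent = MAX_PERCENT_FULL)
  percent_full_after_move(node, gear_kb) <= max_percent
end

# Assumption: among acceptable nodes, prefer the one with the most free space.
def pick_destination(nodes, gear_kb)
  nodes.select { |n| fits?(n, gear_kb) }.max_by(&:free_kb)
end

# Illustrative numbers only.
nodes = [
  Node.new("node1", 8_000_000, 7_500_000),
  Node.new("node2", 8_000_000, 300_000)
]
gear_kb = 100_000

dest = pick_destination(nodes, gear_kb)
puts dest ? "move to #{dest.name}" : "no node has enough free space"
```

Leaving headroom below 100% corresponds to the "buffer space" in the bug title: a node that would end up nearly full is rejected even if the gear would technically fit.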
Clone Of: 1122084
Environment:
Last Closed: 2016-08-24 19:47:16 UTC
Target Upstream Version:


Links
System: Red Hat Product Errata
ID: RHSA-2016:1773
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat OpenShift Enterprise 2.2.10 security, bug fix, and enhancement update
Last Updated: 2016-08-24 23:41:18 UTC

Comment 3 Johnny Liu 2016-08-08 10:59:26 UTC
Retested this bug with openshift-origin-msg-node-mcollective-1.30.2.1-1.el6op.noarch using the 2.2/2016-08-05.1 puddle; it failed.

Even when the node has enough disk space under /var/lib/openshift for the gear, the move still fails.

# oo-admin-move --gear_uuid jialiu-ruby18app-1 -i node1.ose22-auto.com.cn
URL: http://ruby18app-jialiu.ose22-auto.com.cn
Login: jialiu
App UUID: 57a864f182611da20d0002b1
Gear UUID: 57a864f182611da20d0002b1
DEBUG: Source district uuid: 57a8243482611d52e5000001
DEBUG: Destination district uuid: 57a8243482611d52e5000001
DEBUG: Getting existing app 'ruby18app' status before moving
DEBUG: Gear component 'ruby-1.8' was running
DEBUG: Unpublishing routing information for gear 'jialiu-ruby18app-1'
DEBUG: Stopping existing app cartridge 'ruby-1.8' before moving
DEBUG: Force stopping existing app before moving
DEBUG: Gear platform is 'linux'
DEBUG: Moving failed.  Rolling back gear 'jialiu-ruby18app-1' in 'ruby18app' with delete on 'node1.ose22-auto.com.cn'
Gear 'jialiu-ruby18app-1' cannot be moved to 'node1.ose22-auto.com.cn'.  Not enough disk space, node would be > 95% full after move.

It seems the following two lines of code were not merged into the RPM package:
+Facter.add(:node_disk_free) { setcode { results['node_disk_free'] } }
+Facter.add(:node_total_size) { setcode { results['node_total_size'] } }
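
For context, the two facts above read their values from a `results` hash. The following is a rough sketch of how such a hash might be populated from `df` output; it is an assumption, not the code shipped in the package. The `disk_facts` helper, the sample output, and the choice of 1 KB blocks as the unit are all hypothetical. Only the key names `node_disk_free` and `node_total_size` come from the lines quoted above.

```ruby
# Illustrative `df -P -k /var/lib/openshift` output; the numbers are made up.
# The -P (POSIX) flag keeps each filesystem on a single line, which avoids
# the wrapped device names that plain `df -h` can produce.
SAMPLE_DF = <<~OUT
  Filesystem     1024-blocks    Used Available Capacity Mounted on
  /dev/loop0         8125880   36864   7669664       1% /var/lib/openshift
OUT

# Hypothetical helper: extract total and available space (in 1 KB blocks)
# from the last line of the df output. Mount points containing spaces are
# not handled in this sketch.
def disk_facts(df_output)
  fields = df_output.lines.last.split
  {
    'node_total_size' => Integer(fields[1]),
    'node_disk_free'  => Integer(fields[3])
  }
end

results = disk_facts(SAMPLE_DF)
puts results
```

On a real node the sample string would be replaced by the output of the `df` command itself, and each entry would then be exposed through a `Facter.add` block like the two shown above.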

Comment 6 Johnny Liu 2016-08-09 01:48:16 UTC
Verified this bug with the 2.2/2016-08-08.1 puddle: PASS.

node1:
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_dhcp128178-lv_root
                       18G  6.1G   11G  38% /
tmpfs                 1.9G     0  1.9G   0% /dev/shm
/dev/vda1             477M   99M  353M  22% /boot
/dev/loop0            7.8G   36M  7.4G   1% /var/lib/openshift

node2:
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_dhcp128178-lv_root
                       18G   13G  3.4G  80% /
tmpfs                 1.9G     0  1.9G   0% /dev/shm
/dev/vda1             477M   99M  353M  22% /boot
/dev/loop0            7.8G  7.1G  320M  96% /var/lib/openshift
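
As a quick cross-check of the two `/var/lib/openshift` lines above, the Use% column can be recomputed from the Used and Avail columns. This is a sketch under two assumptions: that `df` reports Use% as used / (used + avail) rounded up, and that the `-h` suffixes are powers of 1024. The helper names are hypothetical, and because `df -h` already rounds the sizes it prints, the result is only approximate.

```ruby
UNITS = { 'K' => 1024, 'M' => 1024**2, 'G' => 1024**3, 'T' => 1024**4 }.freeze

# Convert a human-readable df -h size such as "7.1G" or "320M" to bytes.
def to_bytes(human)
  return 0.0 if human == '0'
  match = human.match(/\A(\d+(?:\.\d+)?)([KMGT])\z/)
  raise ArgumentError, "unrecognised size: #{human}" unless match
  match[1].to_f * UNITS.fetch(match[2])
end

# Assumed df behaviour: used / (used + avail), rounded up to a whole percent.
def use_percent(used, avail)
  u = to_bytes(used)
  a = to_bytes(avail)
  (u * 100 / (u + a)).ceil
end

puts "node1: #{use_percent('36M', '7.4G')}%"
puts "node2: #{use_percent('7.1G', '320M')}%"
```

With these figures node1 comes out at 1% and node2 at 96%, matching the transcripts. node2 is therefore already above the 95% limit before any gear is added, which is consistent with the rejected move below.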

Moving a gear from node1 to node2 is not allowed:
# oo-admin-move --gear_uuid jialiu-php54app-1 -i node2.ose22-auto.com.cn
URL: http://php54app-jialiu.ose22-auto.com.cn
Login: jialiu
App UUID: 57a8643882611da20d00029b
Gear UUID: 57a8643882611da20d00029b
DEBUG: Source district uuid: 57a8243482611d52e5000001
DEBUG: Destination district uuid: 57a8243482611d52e5000001
DEBUG: Getting existing app 'php54app' status before moving
DEBUG: Gear component 'php-5.4' was running
DEBUG: Unpublishing routing information for gear 'jialiu-php54app-1'
DEBUG: Stopping existing app cartridge 'php-5.4' before moving
DEBUG: Force stopping existing app before moving
DEBUG: Gear platform is 'linux'
DEBUG: Moving failed.  Rolling back gear 'jialiu-php54app-1' in 'php54app' with delete on 'node2.ose22-auto.com.cn'
Gear 'jialiu-php54app-1' cannot be moved to 'node2.ose22-auto.com.cn'.  Not enough disk space, node would be > 95% full after move.


Moving a gear from node2 to node1 succeeds:
# oo-admin-move --gear_uuid jialiu-php53app-1 -i node1.ose22-auto.com.cn
URL: http://php53app-jialiu.ose22-auto.com.cn
Login: jialiu
App UUID: 57a85a0482611da20d000134
Gear UUID: 57a85a0482611da20d000134
DEBUG: Source district uuid: 57a8243482611d52e5000001
DEBUG: Destination district uuid: 57a8243482611d52e5000001
DEBUG: Getting existing app 'php53app' status before moving
DEBUG: Gear component 'php-5.3' was running
DEBUG: Unpublishing routing information for gear 'jialiu-php53app-1'
DEBUG: Stopping existing app cartridge 'php-5.3' before moving
DEBUG: Force stopping existing app before moving
DEBUG: Gear platform is 'linux'
DEBUG: Creating new account for gear 'jialiu-php53app-1' on node1.ose22-auto.com.cn
DEBUG: Moving content for app 'php53app', gear 'jialiu-php53app-1' to node1.ose22-auto.com.cn
Agent pid 17734
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 17734 killed;
DEBUG: Moving system components for app 'php53app', gear 'jialiu-php53app-1' to node1.ose22-auto.com.cn
Agent pid 17742
unset SSH_AUTH_SOCK;
unset SSH_AGENT_PID;
echo Agent pid 17742 killed;
DEBUG: Starting cartridge 'php-5.3' in 'php53app' after move on node1.ose22-auto.com.cn
DEBUG: Fixing DNS and mongo for gear 'jialiu-php53app-1' after move
DEBUG: Changing server identity of 'jialiu-php53app-1' from 'node2.ose22-auto.com.cn' to 'node1.ose22-auto.com.cn'
DEBUG: Updating routing information for gear 'jialiu-php53app-1' after move
DEBUG: Deconfiguring old app 'php53app' on node2.ose22-auto.com.cn after move
Successfully moved gear with uuid 'jialiu-php53app-1' of app 'php53app' from 'node2.ose22-auto.com.cn' to 'node1.ose22-auto.com.cn'

Comment 8 errata-xmlrpc 2016-08-24 19:47:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-1773.html

