Bug 1062848 - [RHS-RHOS] Root disk corruption on a nova instance booted from a cinder volume after a remove-brick/rebalance
Summary: [RHS-RHOS] Root disk corruption on a nova instance booted from a cinder volume after a remove-brick/rebalance
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: distribute
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Nithya Balachandran
QA Contact: storage-qa-internal@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 1286133
 
Reported: 2014-02-08 08:56 UTC by shilpa
Modified: 2015-11-27 11:43 UTC
CC: 3 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1286133
Environment:
Last Closed: 2015-11-27 11:43:02 UTC


Attachments
Log messages from VM instance (deleted), 2014-02-08 08:58 UTC, shilpa

Description shilpa 2014-02-08 08:56:04 UTC
Description of problem:
When a nova instance is rebooted while a rebalance (remove-brick) is in progress on the gluster volume backing its cinder volume, the root filesystem is mounted R/O after the instance comes back up and corruption messages are seen.


Version-Release number of selected component (if applicable):
glusterfs-3.4.0.59rhs-1.el6_4.x86_64

How reproducible: Always


Steps to Reproduce:
1. Create two 6*2 distribute-replicate volumes called glance-vol and cinder-vol for glance images and cinder volumes respectively.

2. Tag the volumes with group virt
   # gluster volume set glance-vol group virt
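
The report shows the tag only for glance-vol; presumably the same setting was applied to cinder-vol (an assumption, not shown above):

   # gluster volume set cinder-vol group virt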

3. Set the storage.owner-uid and storage.owner-gid of glance-vol to 161
   # gluster volume set glance-vol storage.owner-uid 161
   # gluster volume set glance-vol storage.owner-gid 161
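
Not shown in the report: cinder-vol would presumably get analogous ownership settings using the cinder user's uid/gid (typically 165 on RHEL-based RHOS installs; verify with id cinder):

   # gluster volume set cinder-vol storage.owner-uid 165
   # gluster volume set cinder-vol storage.owner-gid 165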

4. On the RHOS machine, mount the RHS glance-vol on /mnt/gluster/glance/images and start the glance-api service. Also configure nova/glance so that nova instances use images from the gluster glance-vol.
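
A minimal sketch of how step 4 might be done, assuming the glusterfs native client and the default glance filesystem store (the server address and config tooling are illustrative, not taken from this report):

# mount -t glusterfs 10.70.37.120:/glance-vol /mnt/gluster/glance/images
# openstack-config --set /etc/glance/glance-api.conf DEFAULT filesystem_store_datadir /mnt/gluster/glance/images
# service openstack-glance-api restart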

5. Mount the RHS cinder-vol on /var/lib/cinder/volumes and configure RHOS to use the RHS volume for cinder storage.
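
An illustrative cinder setup using the GlusterFS driver of that release; the shares file path and server address are assumptions:

# echo "10.70.37.120:/cinder-vol" > /etc/cinder/shares.conf
# openstack-config --set /etc/cinder/cinder.conf DEFAULT volume_driver cinder.volume.drivers.glusterfs.GlusterfsDriver
# openstack-config --set /etc/cinder/cinder.conf DEFAULT glusterfs_shares_config /etc/cinder/shares.conf
# service openstack-cinder-volume restart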

6. Create a glance image, then create a cinder volume and copy the image to the volume.
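
For reference, a hypothetical glance image-create that would produce the image-id used in the cinder command below (image name and source file are made up):

# glance image-create --name rhel-guest --disk-format qcow2 --container-format bare --file rhel-guest.qcow2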

# cinder create --display-name vol3 --image-id dfac4c39-7946-4baa-9fb3-444ec6348a88 10

7. Boot a nova instance from the bootable cinder volume.

# nova boot --flavor 2 --boot-volume 71973975-7952-4d66-a3d8-3cd38de18431 instance-5

# getfattr -d -etext -m. -n trusted.glusterfs.pathinfo /var/lib/cinder/mnt/4db90e5492997091a102ba6ad764dade/volume-71973975-7952-4d66-a3d8-3cd38de18431
getfattr: Removing leading '/' from absolute path names
# file: var/lib/cinder/mnt/4db90e5492997091a102ba6ad764dade/volume-71973975-7952-4d66-a3d8-3cd38de18431
trusted.glusterfs.pathinfo="(<DISTRIBUTE:cinder-vol-dht> (<REPLICATE:cinder-vol-replicate-0> <POSIX(/rhs/brick1/c2):rhs-vm2:/rhs/brick1/c2/volume-71973975-7952-4d66-a3d8-3cd38de18431> <POSIX(/rhs/brick1/c1):rhs-vm1:/rhs/brick1/c1/volume-71973975-7952-4d66-a3d8-3cd38de18431>))"

8. Now run a remove-brick on the bricks shown in the above output.

# gluster v remove-brick cinder-vol 10.70.37.180:/rhs/brick1/c1 10.70.37.120:/rhs/brick1/c2 start
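
Migration progress can be checked before moving to step 9 (standard remove-brick status subcommand, not shown in the original report):

# gluster v remove-brick cinder-vol 10.70.37.180:/rhs/brick1/c1 10.70.37.120:/rhs/brick1/c2 status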

9. While the volume 71973975-7952-4d66-a3d8-3cd38de18431 is being migrated, reboot the instance created from this volume in step 7.
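
The reboot itself can be issued with the nova CLI (instance name per step 7; shown as an example, not from the report):

# nova reboot instance-5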

10. Check the instance console once it has rebooted and look for corruption error messages. Once the instance is up, the rootfs /dev/vda is mounted R/O. Running fsck manually to correct the errors does not help; the instance is rendered unusable.
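
One way to capture the console output for the corruption messages without the dashboard (example only, not from the report):

# nova console-log instance-5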

Expected results:

The rootfs should be mounted R/W after the reboot, and no corruption messages should be seen.


Additional info:

Sosreports and a VM screenshot are attached.

Comment 1 shilpa 2014-02-08 08:58:17 UTC
Created attachment 860851 [details]
Log messages from VM instance

Comment 2 shilpa 2014-02-08 09:07:39 UTC
sosreports in http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1062848/

Comment 4 Susant Kumar Palai 2015-11-27 11:43:02 UTC
Cloning this to 3.1. To be fixed in a future release.

