Bug 1366454 - Build failed after upgrade OCP 3.2 to OCP 3.3 on Atomic 7.2.5
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: docker
Version: 7.2
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Daniel Walsh
QA Contact: atomic-bugs@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-08-12 03:01 UTC by wewang
Modified: 2019-03-06 02:45 UTC (History)
10 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-10-18 14:28:03 UTC


Attachments
Docker service (deleted), 2016-08-16 02:30 UTC, Anping Li

Description wewang 2016-08-12 03:01:32 UTC
Version-Release number of selected component
openshift v3.2.1.9
openshift v3.3.0.17


How reproducible:
sometimes

Description of problem:
After upgrading from OCP 3.2 to OCP 3.3, the image upgrade build failed with the error "Failed to remove container".

Steps to Reproduce:
First prepare data in OCP 3.2 env 
1. Create the projects (python33 and python34).

2. Create the app with the registry:
   $ oc new-app -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/build/python34-imageupgrade-stibuild.json
3. Check the image streams:
   $ oc get is
   # If python-34-rhel7 has no tag, import it:
   $ oc import-image python-34-rhel7 --from registry.access.redhat.com/openshift3/rhscl/python-34-rhel7:latest --confirm=true --insecure=true
4. Check the build and pods:
   # oc get pods
NAME               READY     STATUS    RESTARTS   AGE
database-1-sz8j3   1/1       Running   0          15h
frontend-1-6rirh   1/1       Running   0          15h
frontend-1-lomb3   1/1       Running   0          15h

5. Upgrade to OCP 3.3 
6. Import the image into OpenShift:
   $ oc import-image python-34-rhel7 --from brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhscl/python-34-rhel7 --confirm=true --insecure=true --all=true
7. Check the build:
   $ oc get build
NAME                    TYPE      FROM          STATUS     STARTED         DURATION
python-sample-build-1   Source    Git@fa3c4f3   Complete   3 days ago      19m33s
python-sample-build-2   Source    Git@fa3c4f3   Failed     9 minutes ago   5m36s

   $ oc build-logs python-sample-build-2
error: Execution of post execute step failed
warning: Failed to remove container "s2i_brew_pulp_docker01_web_prod_ext_phx2_redhat_com_8888_rhscl_python_34_rhel7_latest_0cf7c20f": API error (500): Driver devicemapper failed to remove root filesystem b934bbc032f6475b922808fe011a5d5f5c82333d30cd007846b8f854fa194a7a: mount still active
error: build error: building python34/python-sample-build-2:d41a3ed9 failed when committing the image due to error: No such container: b934bbc032f6475b922808fe011a5d5f5c82333d30cd007846b8f854fa194a7a

Actual results:
After the image is imported into OpenShift (steps 6 and 7), the build fails.

Expected results:
After the new image is imported, the build should succeed.

Comment 3 Devan Goodwin 2016-08-12 15:34:35 UTC
Probably going to have to pull in help from Docker folks as this looks outside the scope of upgrade.

From docker journalctl:

Aug 12 15:10:14 openshift-211.lab.eng.nay.redhat.com forward-journal[3258]: time="2016-08-12T15:10:14.687241693Z" level=error msg="Handler for GET /containers/bd426c2f513e70f72e6d4650923133e3d65e9eb2a59ab16d99222e25cb52429d/json returned error: nosuchcontainer: No such container: bd426c2f513e70f72e6d4650923133e3d65e9eb2a59ab16d99222e25cb52429d"
Aug 12 15:10:14 openshift-211.lab.eng.nay.redhat.com forward-journal[3258]: time="2016-08-12T15:10:14.687285403Z" level=error msg="Handler for GET /containers/bd426c2f513e70f72e6d4650923133e3d65e9eb2a59ab16d99222e25cb52429d/json returned error: No such container: bd426c2f513e70f72e6d4650923133e3d65e9eb2a59ab16d99222e25cb52429d"
Aug 12 15:10:14 openshift-211.lab.eng.nay.redhat.com forward-journal[3258]: time="2016-08-12T15:10:14.688544347Z" level=error msg="Handler for GET /containers/08ad718af0d03d93ff4dec1386c43b6383ccca26181a96272d2a7290ccc4b54a/json returned error: nosuchcontainer: No such container: 08ad718af0d03d93ff4dec1386c43b6383ccca26181a96272d2a7290ccc4b54a"
Aug 12 15:10:14 openshift-211.lab.eng.nay.redhat.com forward-journal[3258]: time="2016-08-12T15:10:14.688595414Z" level=error msg="Handler for GET /containers/08ad718af0d03d93ff4dec1386c43b6383ccca26181a96272d2a7290ccc4b54a/json returned error: No such container: 08ad718af0d03d93ff4dec1386c43b6383ccca26181a96272d2a7290ccc4b54a"
Aug 12 15:10:14 openshift-211.lab.eng.nay.redhat.com forward-journal[3258]: time="2016-08-12T15:10:14.691144807Z" level=error msg="Handler for GET /containers/7e9855f83322f6284928b1e831547013579109e0dcf331dcb7a8b240dad183cc/json returned error: nosuchcontainer: No such container: 7e9855f83322f6284928b1e831547013579109e0dcf331dcb7a8b240dad183cc"
Aug 12 15:10:14 openshift-211.lab.eng.nay.redhat.com forward-journal[3258]: time="2016-08-12T15:10:14.691176687Z" level=error msg="Handler for GET /containers/7e9855f83322f6284928b1e831547013579109e0dcf331dcb7a8b240dad183cc/json returned error: No such container: 7e9855f83322f6284928b1e831547013579109e0dcf331dcb7a8b240dad183cc"

Version is currently 1.10.3. 

Can't remove any Docker containers. Reassigning to RHEL Docker for assistance.

-bash-4.2# docker run -ti fedora /bin/bash
Unable to find image 'fedora:latest' locally
Trying to pull repository virt-openshift-05.lab.eng.nay.redhat.com:5000/fedora ... 
latest: Pulling from virt-openshift-05.lab.eng.nay.redhat.com:5000/fedora
a3ed95caeb02: Already exists 
93410896e1b1: Pull complete 
Digest: sha256:7e17ce78fa3ed97214fbc964077aa48582876722ee0199fda8a1695dac0db619
Status: Downloaded newer image for virt-openshift-05.lab.eng.nay.redhat.com:5000/fedora:latest
bash-4.3# exit
-bash-4.2# docker rm 12f939aa68eb
Failed to remove container (12f939aa68eb): Error response from daemon: Driver devicemapper failed to remove root filesystem 12f939aa68ebd45d23240d3c4444f62cb1e1b79878e60e2b3516d33891a7db6a: remove /var/lib/docker/devicemapper/mnt/a2c736892aa1f5ca2dc629906c63c1c6081e6f4505d4cf1693faa721ed084637: device or resource busy
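
A "mount still active" or "device or resource busy" error on container removal often means the container's device-mapper mount is still visible in some other process's mount namespace. The following is only a rough diagnostic sketch, not something that was run in this bug: it lists the PIDs whose mountinfo mentions a given mount ID. The function name pids_holding_mount and its optional proc_root argument are hypothetical and exist only for illustration and testing.

```shell
#!/usr/bin/env bash
# List PIDs whose mount namespace still references the given mount ID
# (for example the hash under /var/lib/docker/devicemapper/mnt/).
# proc_root defaults to /proc; it is a parameter only so the helper
# can be exercised against a fake directory tree.
pids_holding_mount() {
    local mount_id="$1" proc_root="${2:-/proc}"
    local f
    for f in "$proc_root"/[0-9]*/mountinfo; do
        # Processes may exit while we iterate, so skip unreadable files.
        [ -r "$f" ] || continue
        if grep -q -- "$mount_id" "$f" 2>/dev/null; then
            basename "$(dirname "$f")"
        fi
    done
}

# Possible usage as root, with the mount ID taken from the error message:
#   pids_holding_mount a2c736892aa1f5ca2dc629906c63c1c6081e6f4505d4cf1693faa721ed084637
```

If this prints PIDs that belong to services other than docker, that would be consistent with the mount having leaked into their namespaces.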

Comment 5 Daniel Walsh 2016-08-15 16:19:44 UTC
Could you attach the docker.service file you are using? We expect that MountFlags=slave is not set in the systemd unit file, which could cause a problem like this on RHEL systems.
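
As a quick way to check this on a host, a small helper along the following lines could report whether a unit file sets MountFlags= in its [Service] section. This is a sketch under assumptions: the function name unit_mountflags is hypothetical, and it only inspects the single file it is given, so it will not see settings that come from drop-in files.

```shell
#!/usr/bin/env bash
# Print the MountFlags= value from the [Service] section of a unit file.
# Prints nothing if the directive is absent from that section.
unit_mountflags() {
    local unit_file="$1"
    awk '
        # Track which section we are in; only [Service] is of interest.
        /^\[/ { in_service = ($0 == "[Service]"); next }
        in_service && /^MountFlags[ \t]*=/ {
            sub(/^MountFlags[ \t]*=[ \t]*/, "")
            value = $0          # a later assignment overrides an earlier one
        }
        END { if (value != "") print value }
    ' "$unit_file"
}

# Possible usage:
#   unit_mountflags /usr/lib/systemd/system/docker.service
```

An empty result would suggest the directive is missing from that file.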

Comment 6 Anping Li 2016-08-16 02:30:58 UTC
Created attachment 1191035
Docker service

Comment 7 Daniel Walsh 2016-08-16 12:48:34 UTC
Could you add

MountFlags=slave

to your docker.service file and try again? This should be the default now in RHEL.

Comment 8 Anping Li 2016-08-16 13:24:18 UTC
I don't know how to add MountFlags=slave on Atomic. Compared with the other Atomic 7.2.5 host, there is no MountFlags=slave in my environment.

I doubt whether I am using the correct Atomic host. I upgraded it to a new version to see what happens.

* 2016-06-06 18:12:07     7.2.5       4bf265cf86     rhel-atomic-host     rhel-atomic-host:rhel-atomic-host/7/x86_64/standard

Comment 9 smahajan@redhat.com 2016-08-16 15:58:44 UTC
Anping,

I am trying to add the `MountFlags=slave` in the systemd unit file, but failing to update the file.

I believe the filesystem mounted at /usr is ro (read only). 

$ cat /proc/mounts | grep /dev/mapper/atomicos-root
/dev/mapper/atomicos-root /usr xfs ro,seclabel,relatime,attr2,inode64,noquota 0 0 

Do you know how to make it writable so that I can update the unit file?

Shishir
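
The read-only check above can be scripted. The sketch below parses a mounts table and reports whether the last entry for a mount point carries the ro option; the last entry is used because a later mount on the same path shadows an earlier one. The function name mount_is_readonly and its optional mounts_file argument are hypothetical, added only for illustration and testing.

```shell
#!/usr/bin/env bash
# Return 0 if the most recent entry for mount_point in the mounts table
# has the "ro" option, and 1 otherwise (including when it is not mounted).
mount_is_readonly() {
    local mount_point="$1" mounts_file="${2:-/proc/mounts}"
    awk -v mp="$mount_point" '
        $2 == mp {
            # Field 4 holds the comma-separated mount options.
            ro = 0
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == "ro") ro = 1
            seen = 1
        }
        END { if (seen && ro) exit 0; exit 1 }
    ' "$mounts_file"
}

# Possible usage:
#   if mount_is_readonly /usr; then echo "/usr is read-only"; fi
```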

Comment 10 Anping Li 2016-08-18 10:25:59 UTC
Run ostree admin unlock and then modify the file. But it is hanging with the following messages now.

Waiting for sysroot lock...
Waiting for sysroot lock...
Waiting for sysroot lock...
Waiting for sysroot lock...
Waiting for sysroot lock...
Waiting for sysroot lock...
Waiting for sysroot lock...
Waiting for sysroot lock...

Comment 11 Alex Jia 2016-08-18 10:48:40 UTC
(In reply to Anping Li from comment #10)
> Run ostree admin unlock and then modify the file. But it is hanging with the
> following messages now.
> 
> Waiting for sysroot lock...

Although ostree admin unlock / atomic host unlock makes /usr writable, it is an overlayfs, so I'm not sure it's okay for this case.

[cloud-user@atomic-00 atomic]$ sudo atomic host unlock
Development mode enabled.  A writable overlayfs is now mounted on /usr.
All changes there will be discarded on reboot.

[cloud-user@atomic-00 atomic]$ grep overlay /proc/mounts 
overlay /usr overlay rw,seclabel,relatime,lowerdir=usr,upperdir=/var/tmp/ostree-unlock-ovl.HTESMY/upper,workdir=/var/tmp/ostree-unlock-ovl.HTESMY/work 0 0

Comment 12 Alex Jia 2016-08-18 10:50:47 UTC
(In reply to Alex Jia from comment #11)
> (In reply to Anping Li from comment #10)
> > Run ostree admin unlock and then modify the file. But it is hanging with
> > the following messages now.
> > 
> > Waiting for sysroot lock...
> 
> Although ostree admin unlock / atomic host unlock makes /usr writable, it is
> an overlayfs, so I'm not sure it's okay for this case.
> 

In particular, an overlayfs doesn't work well with SELinux in enforcing mode.

Comment 13 Alex Jia 2016-08-18 10:57:59 UTC
(In reply to Anping Li from comment #8)
> I don't know how to add MountFlags=slave on Atomic. Compared with the other
> Atomic 7.2.5 host, there is no MountFlags=slave in my environment.
> 
> I doubt whether I am using the correct Atomic host. I upgraded it to a new
> version to see what happens.
> 
> * 2016-06-06 18:12:07     7.2.5       4bf265cf86     rhel-atomic-host    
> rhel-atomic-host:rhel-atomic-host/7/x86_64/standard

Append MountFlags=slave to the [Service] section of /usr/lib/systemd/system/docker.service, then restart the docker service.
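
An alternative that avoids touching the read-only /usr is a systemd drop-in under /etc, which systemd merges into the unit. The sketch below was not tried in this bug and rests on assumptions: the function name write_mountflags_dropin and the file name mountflags.conf are hypothetical, and the directory defaults to the conventional drop-in location for docker.service.

```shell
#!/usr/bin/env bash
# Write a drop-in that sets MountFlags=slave for docker.service.
# dropin_dir is a parameter so the helper can be tried in a scratch directory.
write_mountflags_dropin() {
    local dropin_dir="${1:-/etc/systemd/system/docker.service.d}"
    mkdir -p "$dropin_dir"
    cat > "$dropin_dir/mountflags.conf" <<'EOF'
[Service]
MountFlags=slave
EOF
}

# On a real host (as root), something like:
#   write_mountflags_dropin
#   systemctl daemon-reload
#   systemctl restart docker
```

Because the drop-in lives under /etc, it should survive a reboot, unlike edits made to /usr through an unlock overlay.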

Comment 14 Anping Li 2016-08-19 01:37:46 UTC
I have added MountFlags=slave to the docker service.

The OpenShift STI build doesn't work. Shall we remove /var/lib/docker? BTW, you can do anything on this environment.

bash-4.2# oc logs cakephp-mysql-example-4-build -n cakephpmysql
Downloading "https://github.com/openshift/cakephp-ex.git" ...

---> Installing application source...


Pushing image 172.30.160.151:5000/cakephpmysql/cakephp-mysql-example:latest ...
Pushed 2/4 layers, 50% complete
Pushed 2/4 layers, 53% complete
Pushed 2/4 layers, 52% complete
Registry server Address: 
Registry server User Name: serviceaccount
Registry server Email: serviceaccount@example.org
Registry server Password: <<non-empty>>
error: build error: Failed to push image: open /var/lib/docker/devicemapper/mnt/fb60ae7e0b9324d6196d81a1a1813710f364647d1903b74036eea34f1ba06e06/rootfs/usr/include/asm-generic/bitsperlong.h: no such file or directory

Comment 15 Daniel Walsh 2016-08-19 20:35:30 UTC
Certainly seems like /var/lib/docker is hosed up

