Bug 1512679 - Failed docker builds leave temporary containers on node
Summary: Failed docker builds leave temporary containers on node
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Build
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.7.z
Assignee: Cesar Wong
QA Contact: Wenjing Zheng
URL:
Whiteboard:
Duplicates: 1515358 (view as bug list)
Depends On:
Blocks: 1533181
 
Reported: 2017-11-13 20:12 UTC by Cesar Wong
Modified: 2018-01-25 03:35 UTC (History)
CC List: 6 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The OpenShift Docker builder invokes the Docker build API without the ForceRmTemp flag.
Consequence: Containers from failed builds remain on the node where the build ran. These containers are not recognized by the kubelet for garbage collection, so they accumulate until the node runs out of space.
Fix: Modified the Docker build API call from the OpenShift Docker builder to force the removal of temporary containers.
Result: Containers from failed builds no longer remain on the node where a Docker build ran.
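The fix described above hinges on one build option: the flag that tells the Docker daemon to remove intermediate containers even when the build fails. As an illustration only, here is a minimal sketch using a hypothetical mirror of the Docker Engine API's build-options struct (field names assumed to follow the `Remove`/`ForceRemove` convention of the Docker Go API; this is not the actual origin builder code):

```go
package main

import "fmt"

// ImageBuildOptions is a hypothetical mirror of the Docker Engine
// API's build options, reduced to the two cleanup-related fields.
type ImageBuildOptions struct {
	Remove      bool // remove intermediate containers after a SUCCESSFUL build
	ForceRemove bool // remove intermediate containers even when the build FAILS
}

// buildOptions returns the options a builder would pass to avoid
// leaking containers from failed builds: both flags enabled.
func buildOptions() ImageBuildOptions {
	return ImageBuildOptions{Remove: true, ForceRemove: true}
}

func main() {
	opts := buildOptions()
	fmt.Println(opts.ForceRemove)
}
```

With only `Remove` set, a build that fails mid-`RUN` leaves its last container behind, which is exactly the leak this bug describes; `ForceRemove` covers the failure path as well.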
Clone Of:
Clones: 1538413 (view as bug list)
Environment:
Last Closed: 2017-12-18 13:23:56 UTC




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:3464 normal SHIPPED_LIVE Red Hat OpenShift Container Platform 3.7 bug fix and enhancement update 2017-12-18 18:22:05 UTC

Description Cesar Wong 2017-11-13 20:12:40 UTC
Description of problem:
After a Docker strategy build fails on a node, a container representing that build remains on the node. The container is not cleaned up by the kubelet because it is not a container managed by Kubernetes. These leftover containers accumulate, eventually causing the node to run out of space.

Version-Release number of selected component (if applicable):
All versions

How reproducible:
Always

Steps to Reproduce:
1. Create a Docker build that will fail:
   printf 'FROM openshift/origin:latest\nRUN exit 1\n' | oc new-build -D - --name failing-build
2. Wait for the build to finish
3. Inspect containers on the node where the build ran with 'docker ps -a'
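The Dockerfile in step 1 depends on `\n` being expanded to a real newline; plain `echo` does not do this portably across shells, whereas `printf` does. A minimal sketch of generating the two-line Dockerfile text:

```shell
# printf interprets \n as a newline in all POSIX shells;
# echo "...\n..." may print the backslash sequence literally (e.g. in bash).
printf 'FROM openshift/origin:latest\nRUN exit 1\n'
```

Piping this into `oc new-build -D -` feeds the Dockerfile from stdin, as in the reproduction steps.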

Actual results:
A container corresponding to the failing RUN instruction ('exit 1') remains on the node.

Expected results:
No containers related to the failed build should exist on the node


Additional info:

Comment 1 Cesar Wong 2017-11-13 20:20:50 UTC
PR https://github.com/openshift/origin/pull/17285

Comment 2 Cesar Wong 2017-11-27 15:14:35 UTC
PR for origin master https://github.com/openshift/origin/pull/17283

Comment 4 Dongbo Yan 2017-12-06 05:58:20 UTC
Verified
# openshift version
openshift v3.7.11
kubernetes v1.7.6+a08f5eeb62
etcd 3.2.8

Comment 5 Ben Parees 2017-12-06 20:30:19 UTC
*** Bug 1515358 has been marked as a duplicate of this bug. ***

Comment 8 errata-xmlrpc 2017-12-18 13:23:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3464

