Bug 1597903 - OpenShift on Openstack - pending csrs on scaleup
Summary: OpenShift on Openstack - pending csrs on scaleup
Keywords:
Status: CLOSED DUPLICATE of bug 1597904
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Auth
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Simo Sorce
QA Contact: Chuan Yu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-07-03 21:36 UTC by Matt Bruzek
Modified: 2018-07-05 01:37 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-07-05 01:37:34 UTC
Target Upstream Version:



Description Matt Bruzek 2018-07-03 21:36:31 UTC
Description of problem:

We have automation to install OpenShift on OpenStack in a repeatable way. The recent 3.10 install completes successfully. When we attempt to scale up to 250 nodes, the install gets stuck on the approval step and I see several hundred certificate signing requests (CSRs) in the Pending state.

The scale-up operation ran until about 161 nodes had joined and then failed to approve further nodes. The failing task was:

TASK [Approve bootstrap nodes] *************************************************
task path: /home/cloud-user/openshift-ansible/playbooks/openshift-node/private/join.yml:40

Version-Release number of selected component (if applicable):
$ oc version
oc v3.10.10
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://lb-0.scale-ci.example.com:8443
openshift v3.10.10
kubernetes v1.10.0+b81c8f8

$ git describe
v3.10.0-rc.0-115-g1d59617

How reproducible: Often. We can frequently reproduce this CSR problem.


Steps to Reproduce:
1. Install OpenStack
2. Install OpenShift on OpenStack
3. Attempt to scale up to 250 nodes and observe the failure to approve nodes.

Actual results:

The openshift-ansible playbook openshift-ansible/playbooks/openshift-node/scaleup.yml fails with the following error:


TASK [Approve bootstrap nodes] *************************************************
task path: /home/cloud-user/openshift-ansible/playbooks/openshift-node/private/join.yml:40
Tuesday 03 July 2018  12:56:29 -0400 (0:00:00.179)       0:08:23.501 **********
fatal: [master-1.scale-ci.example.com]: FAILED! => {"changed": true, "finished": false, "msg": "Timed out accepting certificate signing requests. Failing as requested.

When I checked the cluster, I saw just over 500 CSRs in the "Pending" state.

root@master-1: /home/openshift # oc get csr --all-namespaces | grep Pending | wc -l                                                       
507 
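
For triage, the pending requests can be listed by name instead of only counted. The following is a minimal sketch and not part of the original report: pending_csrs is a hypothetical helper name, and it assumes that `oc get csr` prints a table with a header row, the CSR name in the first column, and the condition (for example Pending or Approved,Issued) in the last column.

```shell
# Hypothetical helper: print the names of CSRs whose last column is exactly "Pending".
# It reads `oc get csr` table output on stdin, so it also works on saved output.
pending_csrs() {
  # NR > 1 skips the header row; $NF is the last whitespace-separated field.
  awk 'NR > 1 && $NF == "Pending" { print $1 }'
}

# Possible usage against a live cluster (assumption, not run here):
#   oc get csr | pending_csrs | wc -l
#   oc get csr | pending_csrs | xargs oc adm certificate approve
```

Matching the whole condition column avoids counting rows that merely contain the string "Pending" elsewhere. Approving the listed requests by hand with `oc adm certificate approve` may unblock the nodes, but it does not explain why the playbook timed out.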

Expected results:
I expected the scale up to succeed.

Additional info:

I will attach the logs in further comments.

Comment 1 Xiaoli Tian 2018-07-05 01:37:34 UTC

*** This bug has been marked as a duplicate of bug 1597904 ***

