Bug 1367199 - iptablesSyncPeriod should default to 30s OOTB
Summary: iptablesSyncPeriod should default to 30s OOTB
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.3.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Samuel Munilla
QA Contact: Mike Fiedler
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-08-15 20:22 UTC by Mike Fiedler
Modified: 2016-09-27 09:44 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-09-27 09:44:12 UTC


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:1933 normal SHIPPED_LIVE Red Hat OpenShift Container Platform 3.3 Release Advisory 2016-09-27 13:24:36 UTC

Description Mike Fiedler 2016-08-15 20:22:43 UTC
Description of problem:

iptablesSyncPeriod was made configurable in 

https://github.com/openshift/openshift-docs/issues/1051
https://github.com/openshift/openshift-ansible/pull/743

The default value (5 seconds) in the code and the installer is too aggressive.  Based on scalability testing during 3.3 with over 7K nodes, iptables consumes too much CPU at this setting.

Version-Release number of selected component (if applicable): 

3.3.0.18


How reproducible: Always


Steps to Reproduce:
1. Install an HA cluster (3 masters, 3 etcd, 2 router/registry, 300 nodes).
2. Run cluster-loader from the SVT Git repo to create 5000 projects (20K pods) using the node vertical test configuration.


Actual results:

Not all pods run - only about 7K can start
iptables consumes a full core and stays pegged for most of the test
Many errors like this show up in the system log:

Aug 15 16:00:33 mvirt-m-1 atomic-openshift-node: E0815 16:00:33.497082   49875 node_iptables.go:64] Syncing openshift iptables failed: Failed to ensure rule {nat POSTROUTING [-s 10.128.0.0/10 ! -d 10.128.0.0/10 -j MASQUERADE]} exists: error checking rule: exit status 4: iptables: Resource temporarily unavailable.
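
To gauge how often this failure occurs, the node log lines could be tallied by exit status. The following is only a minimal sketch: the function name `count_sync_failures` and the regular expression are illustrative assumptions based on the single log line quoted above, not part of any OpenShift tooling.

```python
import re
from collections import Counter

# Assumed pattern, derived only from the log line quoted in this report.
PATTERN = re.compile(r"Syncing openshift iptables failed: .*exit status (\d+)")


def count_sync_failures(lines):
    """Return a Counter mapping iptables exit status to number of failed syncs."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

It could be fed the lines of the system log, for example `count_sync_failures(open(path))` for whichever log file the node writes to.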



Expected results:

No errors starting all 20K pods

Comment 1 Timothy St. Clair 2016-08-15 20:24:39 UTC
Default upstream sync period is 30s.

Comment 2 Mike Fiedler 2016-08-15 20:33:15 UTC
The default should be 30s. Kubernetes originally had 5s upstream but has since changed it to 30s in the code.

Comment 3 openshift-github-bot 2016-08-16 17:22:48 UTC
Commits pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/735729b08506be08b2a5215a8c1d628cac6d7741
Bug 1367199 - iptablesSyncPeriod should default to 30s OOTB

Update the default to thirty seconds.

https://github.com/openshift/openshift-ansible/commit/dcfddb882554b7f1a9aa1f4024ba9eb2ebf07204
Merge pull request #2306 from smunilla/BZ1367199

Bug 1367199 - iptablesSyncPeriod should default to 30s OOTB

Comment 5 Gan Huang 2016-08-23 02:14:44 UTC
Verified with atomic-openshift-utils-3.3.13-1.git.0.7435ce7.el7.noarch

$ grep "iptables_sync_period" /usr/share/ansible/openshift-ansible/roles/openshift_facts/library/openshift_facts.py
                                    iptables_sync_period='30s',

Comment 6 Gan Huang 2016-08-23 02:46:29 UTC
iptablesSyncPeriod is set to 30s by default in node-config.yaml.

# grep "iptablesSyncPeriod" /etc/origin/node/node-config.yaml 
iptablesSyncPeriod: "30s"
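
The same check could be scripted across many nodes. This is a rough sketch that assumes the setting appears on a single `iptablesSyncPeriod:` line, as in the grep output above; `read_sync_period` is a hypothetical helper and uses a regular expression rather than a real YAML parser.

```python
import re


def read_sync_period(config_text):
    """Return the iptablesSyncPeriod value from node-config text, or None if absent."""
    match = re.search(
        r'^\s*iptablesSyncPeriod:\s*"?([^"\s#]+)"?',
        config_text,
        re.MULTILINE,
    )
    return match.group(1) if match else None
```

For example, `read_sync_period(open("/etc/origin/node/node-config.yaml").read())` would be expected to return `30s` on a node installed with the fix.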

Comment 8 Mike Fiedler 2016-08-23 12:32:18 UTC
I will verify, but this is actually hard to verify. The 30 seconds is a MAX limit on the resync interval. Changes to services or endpoints will still force resyncs at more frequent intervals. tstclair, I think a separate bz/issue should be opened to establish a MIN sync interval. Agree?
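
The point about the sync period being an upper bound can be illustrated with a toy model. This is not the proxy's actual code: it assumes every service or endpoint change triggers an immediate sync and that the periodic timer restarts after each sync. `sync_times` and its parameters are hypothetical names.

```python
def sync_times(event_times, period, horizon):
    """Simulate sync times: each event syncs at once, idle gaps are capped at `period`."""
    syncs = []
    last = 0  # time of the most recent sync (simulation starts at t=0)
    pending = sorted(event_times)
    i = 0
    while True:
        next_periodic = last + period
        if i < len(pending) and pending[i] <= next_periodic:
            # A change arrives before the timer fires, so it forces an earlier sync.
            t = pending[i]
            i += 1
        else:
            # No change in time, so the periodic resync fires.
            t = next_periodic
        if t > horizon:
            break
        syncs.append(t)
        last = t
    return syncs
```

With `period=30` and no events, syncs happen only every 30 seconds. With a steady stream of changes, as during a large pod rollout, syncs happen as often as the changes arrive, which is why a minimum interval would be needed to bound the load.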

Comment 9 Mike Fiedler 2016-09-02 11:55:07 UTC
Opened https://bugzilla.redhat.com/show_bug.cgi?id=1371971 to follow up on the min resync interval.

Verified in 3.3.0.28 that iptablesSyncPeriod is correctly set to 30 seconds during install.

Comment 11 errata-xmlrpc 2016-09-27 09:44:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1933

