
Bug 1692566

Summary: Master should not use port tcp/9 check to check if node is healthy for automatically assigning egress IP
Product: OpenShift Container Platform
Reporter: Ryan Howe <rhowe>
Component: Networking
Assignee: Casey Callendrello <cdc>
Status: CLOSED NOTABUG
QA Contact: Meng Bo <bmeng>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 3.11.0
CC: aos-bugs
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-03-28 15:38:05 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description Ryan Howe 2019-03-25 21:59:54 UTC
Description of problem:

The check for whether a node is healthy is a plain TCP connection attempt to port 9. If the attempt fails with "No route to host" rather than "Connection refused", the check considers the node unhealthy.

https://github.com/openshift/origin/blob/release-3.11/pkg/network/common/egressip.go#L473-L496

# telnet 10.0.88.167 9
Trying 10.0.88.167...
telnet: connect to address 10.0.88.167: No route to host


Some firewalls are set up to reject with icmp-host-unreachable, or to drop, traffic to ports that are not open. As a result, the egress controller thinks the node is unhealthy and moves the egress IP.

If we do a health check, it should rely on the kubelet port, which we know should be open.

Comment 1 Ryan Howe 2019-03-28 15:38:05 UTC
This is not a bug. The traffic will arrive over tun0 and be accepted by the "-A INPUT -i tun0 -j ACCEPT" rule before hitting the RHEL firewall rule.
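
The ordering argument can be illustrated with a toy first-match model of the INPUT chain. This is only a sketch: the rule type, the verdict function, and the example chain are hypothetical, real iptables matching involves far more than the inbound interface, and the trailing REJECT rule stands in for whatever the RHEL firewall is configured to do.

```go
package main

// rule is a toy stand-in for one INPUT rule; it models only the inbound
// interface match and the target.
type rule struct {
	inIface string // "" matches any interface
	target  string // "ACCEPT", "REJECT", or "DROP"
}

// verdict returns the target of the first rule that matches the inbound
// interface, mirroring first-match evaluation of a chain.
func verdict(chain []rule, inIface string) string {
	for _, r := range chain {
		if r.inIface == "" || r.inIface == inIface {
			return r.target
		}
	}
	return "ACCEPT" // assumed default chain policy for this sketch
}

// exampleChain places the tun0 ACCEPT rule ahead of a catch-all REJECT,
// as described in the comment above.
func exampleChain() []rule {
	return []rule{
		{inIface: "tun0", target: "ACCEPT"}, // -A INPUT -i tun0 -j ACCEPT
		{inIface: "", target: "REJECT"},     // stand-in for the host firewall rule
	}
}
```

In this model, a probe arriving on tun0 is accepted by the first rule and never reaches the REJECT rule, whereas the same probe arriving on another interface would be rejected.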


https://github.com/openshift/origin/blob/release-3.11/pkg/network/common/egressip.go#L463-L481