Bug 1514854 - httpd does not serve iPXE files [NEEDINFO]
Summary: httpd does not serve iPXE files
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: instack-undercloud
Version: 11.0 (Ocata)
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: RHOS Maint
QA Contact: mlammon
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-11-18 23:17 UTC by Siggy Sigwald
Modified: 2018-01-22 16:09 UTC (History)
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-01-22 16:09:31 UTC
bfournie: needinfo? (ssigwald)


Description Siggy Sigwald 2017-11-18 23:17:50 UTC
Description of problem:

What is working:
*) We can get the IP address from DHCP server.

What is not working:
*) httpd ignores packets arriving on port 8088 from remote nodes; it only works locally.

We found that httpd is not serving the requests sent to the server:

[root@fsldir01 ~]# tcpdump -i br-ctlplane -nn port 8088
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on br-ctlplane, link-type EN10MB (Ethernet), capture size 262144 bytes
21:23:37.120568 IP 192.168.24.100.53509 > 192.168.24.10.8088: Flags [S], seq 828634003:828634017, win 65532, options [nop,nop,TS val 11523 ecr 0,nop,nop,sackOK,nop,nop,nop,nop,mss 1460], length 14
21:23:37.496464 IP 192.168.24.100.53509 > 192.168.24.10.8088: Flags [S], seq 828634003:828634017, win 65532, options [nop,nop,TS val 11530 ecr 0,nop,nop,sackOK,nop,nop,nop,nop,mss 1460], length 14
21:23:38.265415 IP 192.168.24.100.53509 > 192.168.24.10.8088: Flags [S], seq 828634003:828634017, win 65532, options [nop,nop,TS val 11544 ecr 0,nop,nop,sackOK,nop,nop,nop,nop,mss 1460], length 14
21:23:39.803376 IP 192.168.24.100.53509 > 192.168.24.10.8088: Flags [S], seq 828634003:828634017, win 65532, options [nop,nop,TS val 11572 ecr 0,nop,nop,sackOK,nop,nop,nop,nop,mss 1460], length 14
21:23:42.879222 IP 192.168.24.100.53509 > 192.168.24.10.8088: Flags [S], seq 828634003:828634017, win 65532, options [nop,nop,TS val 11628 ecr 0,nop,nop,sackOK,nop,nop,nop,nop,mss 1460], length 14
^C
5 packets captured
5 packets received by filter
0 packets dropped by kernel
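
All five packets in the capture carry the same source port and sequence number, so they look like retransmissions of a single SYN that never gets a reply. A minimal Python sketch, using only the timestamps copied from the capture above, shows the spacing between them (the helper name `retransmit_gaps` is made up for illustration):

```python
from datetime import datetime

# Timestamps copied from the five SYN lines in the tcpdump capture.
SYN_TIMES = [
    "21:23:37.120568",
    "21:23:37.496464",
    "21:23:38.265415",
    "21:23:39.803376",
    "21:23:42.879222",
]


def retransmit_gaps(stamps):
    """Return the delay in seconds between consecutive packets."""
    times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


gaps = retransmit_gaps(SYN_TIMES)
for gap in gaps:
    print(f"{gap:.3f} s")
```

Each gap is roughly twice the previous one, which is the pattern expected from a client backing off while it waits for a SYN-ACK that does not arrive. This only describes what the capture shows; it does not say where the reply is lost or whether one is sent at all.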

Locally this works:

$ curl http://192.168.24.10:8088/inspector.ipxe

We also tried with SELinux disabled. Before that, with SELinux in Enforcing mode, we reset the SELinux context to be sure:

# restorecon -Rv /httpboot/

And changed the permissions:

# chown -R ironic:ironic /httpboot/

We have updated the ironic.conf file with:

    [pxe]
    pxe_config_template=$pybasedir/drivers/modules/ipxe_config.template
    uefi_pxe_config_template=$pybasedir/drivers/modules/ipxe_config.template
    pxe_bootfile_name=undionly.kpxe
    uefi_pxe_bootfile_name=ipxe.efi
    ipxe_enabled=True
    ipxe_timeout=60
    http_root=/httpboot
    http_url=http://$my_ip:8088
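
For reference, the URL a node would request follows from `http_url` once `$my_ip` is expanded. The Python sketch below is only an illustration: `string.Template` stands in for the option substitution that oslo.config performs, the value `192.168.24.10` for `my_ip` is assumed from the curl command above, and `boot_script_url` is a hypothetical helper, not part of ironic.

```python
import configparser
from string import Template

# Subset of the [pxe] section quoted in this report.
PXE_SECTION = """
[pxe]
ipxe_enabled=True
http_root=/httpboot
http_url=http://$my_ip:8088
"""


def boot_script_url(conf_text, my_ip, script="inspector.ipxe"):
    """Expand $my_ip in http_url and append the script name."""
    # interpolation=None keeps configparser from touching the value itself.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    base = Template(parser["pxe"]["http_url"]).safe_substitute(my_ip=my_ip)
    return f"{base.rstrip('/')}/{script}"


# my_ip is an assumption taken from the curl test in this report.
print(boot_script_url(PXE_SECTION, "192.168.24.10"))
```

With the assumed address this prints the same URL that the local curl test fetches successfully.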

/etc/ironic-inspector/dnsmasq.conf is configured with ipxe.

Together we verified that the configuration in /etc/httpd/conf.d/10-ipxe_vhost.conf is correct and that httpd is listening on all interfaces on port 8088:

[root@undercloud ~]# ss -tulnp | grep 8088
tcp    LISTEN     0      128      :::8088                 :::*                   users:(("httpd",pid=6128,fd=14),("httpd",pid=6127,fd=14),("httpd",pid=6126,fd=14),("httpd",pid=6125,fd=14),("httpd",pid=6124,fd=14),("httpd",pid=6123,fd=14),("httpd",pid=6122,fd=14),("httpd",pid=6121,fd=14),("httpd",pid=6106,fd=14))


Version-Release number of selected component (if applicable):

openstack-ironic-api-7.0.2-2.el7ost.noarch
openstack-ironic-common-7.0.2-2.el7ost.noarch
openstack-ironic-conductor-7.0.2-2.el7ost.noarch
openstack-ironic-inspector-5.0.1-2.el7ost.noarch
puppet-ironic-10.4.1-1.el7ost.noarch
python-ironic-inspector-client-1.11.0-1.el7ost.noarch
python-ironic-lib-2.5.2-1.el7ost.noarch
python-ironicclient-1.11.1-1.el7ost.noarch

How reproducible:
Every time

Comment 1 Bob Fournier 2017-11-19 23:04:15 UTC
As Andreas noted in the case, can you check the iptables output to verify that traffic from this MAC address is not being blocked (although if you are capturing the SYN request from this MAC, it should be getting through)?

Can you check for any errors logged in /var/log/httpd/ or in the journal?

Comment 2 Dmitry Tantsur 2017-11-20 09:30:31 UTC
Hi!

Ironic does not manage the httpd instance used for iPXE; it's just a normal Apache installed by instack-undercloud. Changing the component to instack-undercloud until we have a better suspect. Please do not modify ironic.conf: it is extremely unlikely to have any effect on your situation, and it may break things for you later.

Can you download the files from any other machine on the network, other than the overcloud nodes? If yes, check that STP is not enabled on the leaf switch ports corresponding to the nodes - https://docs.openstack.org/ironic/latest/admin/troubleshooting.html#dhcp-during-pxe-or-ipxe-is-inconsistent-or-unreliable. Please make sure you don't have MTU inconsistencies anywhere along the path. And +1 to double-checking the firewall and SELinux logs.
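
A quick way to run the "other machine" check is a small Python probe that separates "the TCP handshake fails" from "HTTP fails". This is only a sketch: `tcp_connects` and `fetch_status` are hypothetical helper names, and the address and file name in the usage part are the ones quoted in this report.

```python
import socket
import urllib.error
import urllib.request


def tcp_connects(host, port, timeout=3.0):
    """Return True if a TCP handshake with host:port completes in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable hosts.
        return False


def fetch_status(url, timeout=3.0):
    """Return the HTTP status code for url, or None on a network failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, just not with a success status.
        return err.code
    except OSError:
        return None


if __name__ == "__main__":
    # Address and file name as given in this report.
    print("handshake ok:", tcp_connects("192.168.24.10", 8088))
    print("http status:", fetch_status("http://192.168.24.10:8088/inspector.ipxe"))
```

If the handshake succeeds from another machine but not from the affected node, the problem is more likely on the node or its switch port than in httpd.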

Comment 3 Bob Fournier 2017-11-21 11:23:40 UTC
According to Sai, the networking team says the TCP SYN checksum from the IBM node is incorrect. We are waiting for the customer to respond after disabling tx and rx checksum offload, to see if that helps.
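
For context on what an "incorrect" TCP checksum means: the receiver recomputes the Internet checksum (RFC 1071) over an IPv4 pseudo-header plus the TCP segment and drops the segment if the result is not zero. The Python sketch below illustrates that computation on a simplified 20-byte SYN built from the addresses, ports and sequence number in the capture. It is not the customer's actual packet, which also carries TCP options and 14 bytes of payload, and `build_syn` is a hypothetical helper.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 ones'-complement checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum of a TCP segment including the IPv4 pseudo-header."""
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(segment)))
    return internet_checksum(pseudo + segment)


def build_syn(src_port, dst_port, seq, checksum=0):
    """Simplified SYN: 20-byte header, no options, no payload."""
    return struct.pack("!HHIIBBHHH", src_port, dst_port, seq, 0,
                       5 << 4, 0x02, 65532, checksum, 0)


SRC, DST = "192.168.24.100", "192.168.24.10"
syn = build_syn(53509, 8088, 828634003)
good = build_syn(53509, 8088, 828634003, tcp_checksum(SRC, DST, syn))

# Recomputing over a segment with a valid checksum field yields 0.
print(tcp_checksum(SRC, DST, good) == 0)
```

A segment that fails this check is discarded by the receiving TCP stack before any listening application sees it, which would match httpd never answering the SYNs seen in the capture.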

Comment 10 Siggy Sigwald 2017-12-07 16:27:51 UTC
Hi,
The customer came back after upgrading the firmware for his NICs; he still has issues. Please see the last 2 images attached.

Comment 11 Bob Fournier 2017-12-08 14:47:34 UTC
Thanks Siggy. From the screenshot in c8, it looks like the situation has regressed compared to before the NIC firmware update. iPXE is downloaded but cannot use the NIC, and it reports "Link status: unknown". The NIC is showing as an Emulex OCl11102-F-X Virtual Fabric Adapter 2-port 10GB LOM running F/W 11.2.1193.

So either:
1) Cable or switch port are faulty (unlikely as iPXE was able to be downloaded)
2) The switch port is in SpanningTree Listening/Learning (possible but customer has confirmed STP is disabled)
3) The NIC is misconfigured or its configuration is incompatible with the iPXE version (probable)

In any of these cases there isn't anything Director can do here.

Some questions:
1) Is this the only system under test, or have other similar systems either successfully PXE booted or also failed?
2) This NIC is highly configurable and is currently configured for Advanced mode. Have any configuration changes been attempted, and have they had any effect? Is its configuration the same as prior to the upgrade, when iPXE was able to establish a connection to Director?
3) Would it be possible to re-attempt an alternate boot method as was done in the case and confirm the system can boot when not using iPXE?

Comment 12 Bob Fournier 2017-12-13 23:33:20 UTC
Adding needinfo for previous questions.

Comment 13 Bob Fournier 2018-01-22 14:29:21 UTC
Any update on this? It's been 6 weeks since we requested more info.

Comment 14 Dave Maley 2018-01-22 16:09:31 UTC
case has closed so closing bz

