Bug 156397 - LTC13414-32-bit ping6 on 64-bit kernel not working
Summary: LTC13414-32-bit ping6 on 64-bit kernel not working
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: David Woodhouse
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks: 168424
Reported: 2005-04-29 18:59 UTC by Issue Tracker
Modified: 2007-11-30 22:07 UTC (History)
CC List: 5 users

Fixed In Version: RHSA-2006-0144
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-03-15 15:57:43 UTC


Attachments
Patch. (deleted)
2005-10-19 11:48 UTC, David Woodhouse


Links
System: Red Hat Product Errata
ID: RHSA-2006:0144
Priority: qe-ready
Status: SHIPPED_LIVE
Summary: Moderate: Updated kernel packages available for Red Hat Enterprise Linux 3 Update 7
Last Updated: 2006-03-15 05:00:00 UTC

Description Issue Tracker 2005-04-29 18:59:35 UTC
Escalated to Bugzilla from IssueTracker

Comment 9 David Howells 2005-10-13 14:40:20 UTC
Okay... I borrowed an x86_64 machine that had RHEL-3 installed. The ping6 
installed by default is 64-bit and works. If I stick a 32-bit i386 ping6 on 
there, that doesn't work, just like with ppc32/ppc64. 
 
It may still be in the arch 32->64 bit translation, since it's mostly the same 
for both archs. 

Comment 10 David Howells 2005-10-13 16:29:41 UTC
I used gdb to examine the parameters supplied to recvmsg() in userspace 
[strace won't show them unless the syscall returns successfully]: 
 
Breakpoint 1, 0x0ff4b8ec in recvmsg () from /lib/tls/libc.so.6 
(gdb) i r $r3 $r4 $r5 
r3             0x6              6               [arg 0: sockfd] 
r4             0xffffd618       4294956568      [arg 1: msg] 
r5             0x0              0               [arg 2: flags] 
 
(gdb) x/7 $r4   [struct msghdr *msg] 
0xffffd618:     0xffffc598  [msg_name] 
                0x00000080  [msg_namelen == 128] 
                0xffffd640  [msg_iov] 
                0x00000001  [msg_iovlen] 
0xffffd628:     0xffffc618  [msg_control] 
                0x00001000  [msg_controllen == 4096] 
                0x00000000  [msg_flags] 
 
(gdb) x/2 0xffffd640   [struct iovec *msg->msg_iov] 
0xffffd640:     0x1003a008  [iov_base] 
                0x00001070  [iov_len == 4208] 
 
(gdb) fini 
Run till exit from #0  0x0ff4b8ec in recvmsg () from /lib/tls/libc.so.6 
0x10003b14 in ?? () 
 
The parameters here look reasonable, and in any case, the syscall isn't 
returning EINVAL or EFAULT. 
 
I instrumented sys_recvmsg32() in the ppc64 kernel: 
 
+ 
+               printk("recvmsg(,{%p,%d,%p,%lu,%p,%lu,%x},,,)\n", 
+                      kern_msg.msg_name, kern_msg.msg_namelen, 
+                      kern_msg.msg_iov, (unsigned long) kern_msg.msg_iovlen, 
+                      kern_msg.msg_control,
+                      (unsigned long) kern_msg.msg_controllen,
+                      kern_msg.msg_flags); 
+               printk("iov[0] = {%p,%lu}\n", 
+                      kern_msg.msg_iov[0].iov_base, 
+                      kern_msg.msg_iov[0].iov_len); 
+               printk("recvmsg(,,%d,%x,)\n", total_len, user_flags); 
+ 
                err = sock->ops->recvmsg(sock, &kern_msg, total_len, 
                                         user_flags, &scm); 
+ 
+               printk("recvmsg() = %d\n", err); 
+ 
 
Which gave results that look exactly like the userspace results, except where 
the addressed objects have been teleported to kernelspace: 
 
recvmsg(,{c00000000e0afb40,128,c00000000e0afa80,1,00000000ffffc618,4096,0},,,) 
iov[0] = {000000001003a008,4208} 
recvmsg(,,4208,0,) 
recvmsg() = -11 [EAGAIN] 
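
For anyone comparing the gdb dump with the printk output, here is a minimal sketch of the widening step a 32-bit compatibility path has to perform on the message header. The struct and function names (`msghdr32_sketch`, `msghdr64_sketch`, `widen_msghdr`) are made up for illustration and are not the kernel's own definitions. The sketch models only the zero-extension of 32-bit values, not the copying of msg_name and msg_iov into kernel space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit view of the header: seven 32-bit words,
 * in the same order as the gdb dump of the userspace struct. */
struct msghdr32_sketch {
    uint32_t msg_name;
    uint32_t msg_namelen;
    uint32_t msg_iov;
    uint32_t msg_iovlen;
    uint32_t msg_control;
    uint32_t msg_controllen;
    uint32_t msg_flags;
};

/* Hypothetical 64-bit view: pointers and size fields widen to 64 bits. */
struct msghdr64_sketch {
    uint64_t msg_name;
    uint32_t msg_namelen;
    uint64_t msg_iov;
    uint64_t msg_iovlen;
    uint64_t msg_control;
    uint64_t msg_controllen;
    uint32_t msg_flags;
};

/* Zero-extend every 32-bit pointer and length into the 64-bit layout. */
static struct msghdr64_sketch widen_msghdr(const struct msghdr32_sketch *in)
{
    struct msghdr64_sketch out;

    assert(in != NULL);
    out.msg_name       = in->msg_name;       /* uint32_t -> uint64_t zero-extends */
    out.msg_namelen    = in->msg_namelen;
    out.msg_iov        = in->msg_iov;
    out.msg_iovlen     = in->msg_iovlen;
    out.msg_control    = in->msg_control;
    out.msg_controllen = in->msg_controllen;
    out.msg_flags      = in->msg_flags;
    return out;
}
```

Feeding in the values from the gdb dump, msg_control 0xffffc618 widens to 0x00000000ffffc618, which is the value the instrumented kernel printed, so this part of the translation looks consistent.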
 

Comment 11 David Howells 2005-10-13 17:27:49 UTC
I've instrumented sys_recvmsg() too, to see how the parameters given to the 
64-bit ping6 are laid out when passed on to the protocol handler: 
 
ping6: recvmsg64() 
recvmsg64(,{c00000000ea4baa8,128,c00000000ea4b9e0,1,000001ff7fffe2a0,4096,0},,,) 
iov64[0] = {000000001003b010,4208} 
recvmsg64(,,4208,0,) 
recvmsg64() = 64 
64 bytes from fec0:ac10:1269:4242:20e:a6ff:fe20:4978: icmp_seq=0 ttl=64 time=0.566 ms 
ping6: recvmsg64() 
recvmsg64(,{c00000000ea4baa8,128,c00000000ea4b9e0,1,000001ff7fffe2a0,4096,0},,,) 
iov64[0] = {000000001003b010,4208} 
recvmsg64(,,4208,40,) 
recvmsg64() = -11 
 
Note that the first call to recvmsg() looks almost identical to the 32-bit 
version, apart from the fact that it returns successfully. The second call has 
an extra flag set (0x40, which is MSG_DONTWAIT on Linux), and fails with 
EAGAIN, but this seems reasonable as I think it's just to clean up extra 
copies of the ping reply. 
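
To illustrate why an EAGAIN from that second call is unremarkable, here is a small userspace sketch. It assumes a Unix-domain datagram socketpair as a stand-in for the raw ICMPv6 socket, and the helper name `errno_from_empty_recvmsg` is made up. It calls recvmsg() with MSG_DONTWAIT on a socket with nothing queued and reports the resulting errno.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Returns the errno produced by a non-blocking recvmsg() on an empty
 * socket, 0 if the call unexpectedly succeeded, or -1 if the socket
 * pair could not be created. */
static int errno_from_empty_recvmsg(void)
{
    int sv[2];
    char buf[64];
    struct iovec iov;
    struct msghdr msg;
    ssize_t n;
    int saved;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
        return -1;

    iov.iov_base = buf;
    iov.iov_len = sizeof(buf);

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    /* Nothing has been sent, so with MSG_DONTWAIT this fails at once
     * instead of blocking. */
    n = recvmsg(sv[0], &msg, MSG_DONTWAIT);
    saved = (n < 0) ? errno : 0;

    close(sv[0]);
    close(sv[1]);
    return saved;
}
```

With no datagram pending, the call fails with EAGAIN (the same value as EWOULDBLOCK on Linux). That matches the -11 seen after the reply had already been consumed.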

Comment 12 Ernie Petrides 2005-10-13 21:13:25 UTC
Fixing "hardware" field.

Comment 13 David Howells 2005-10-14 09:24:50 UTC
Fixing "hardware" field back again. 

Comment 14 David Woodhouse 2005-10-19 11:29:29 UTC
The problem here seems to be that the 32-bit compatibility setsockopt() in ppc64
and x86_64 manually swaps each pair of 32-bit words in the 'struct icmp6_filter'
argument when setting a filter....

        for (i = 0; i < 8; i += 2) {
                u32 tmp = kfilter.data[i];

                kfilter.data[i] = kfilter.data[i + 1];
                kfilter.data[i + 1] = tmp;
        }

I don't quite understand why it's doing this, but just bypassing it and letting
setsockopt(SOL_ICMPV6, ICMPV6_FILTER) through unmangled appears to make ping6 work
correctly. This also seems at first glance to match what the 2.6 kernel does.

It seems strange that someone added code for doing a conversion which is
entirely gratuitous though -- I need to double-check that removing it is really
the correct thing to do.
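
A rough sketch of how that swap could make echo replies disappear follows. It makes three assumptions: the filter is read as eight 32-bit words with "bit set means block", as in the RFC 3542 macros; ping6 blocks everything and then passes Echo Reply (type 129); and the kernel reads the words unswapped. The struct and helper names are hypothetical, not the kernel's.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ECHO_REPLY_TYPE 129u   /* ICMPv6 Echo Reply */

/* Hypothetical stand-in for struct icmp6_filter: 256 bits, one per type. */
struct filter_sketch {
    uint32_t data[8];
};

/* Set every bit, which blocks every ICMPv6 type. */
static void filter_block_all(struct filter_sketch *f)
{
    memset(f->data, 0xFF, sizeof(f->data));
}

/* Clear the bit for one type so that it passes. */
static void filter_set_pass(struct filter_sketch *f, unsigned type)
{
    assert(type < 256);
    f->data[type >> 5] &= ~(UINT32_C(1) << (type & 31));
}

/* A type passes when its bit is clear. */
static int filter_will_pass(const struct filter_sketch *f, unsigned type)
{
    assert(type < 256);
    return (f->data[type >> 5] & (UINT32_C(1) << (type & 31))) == 0;
}

/* The same pairwise word swap as in the compat setsockopt() code. */
static void swap_word_pairs(struct filter_sketch *f)
{
    int i;

    for (i = 0; i < 8; i += 2) {
        uint32_t tmp = f->data[i];

        f->data[i] = f->data[i + 1];
        f->data[i + 1] = tmp;
    }
}
```

Under those assumptions, type 129 lives in word 4, bit 1. After the swap the cleared bit sits in word 5, so type 161 passes while Echo Reply is blocked. That would be consistent with the 32-bit ping6 never seeing a reply.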

Comment 15 David Woodhouse 2005-10-19 11:38:54 UTC
Definitely looks like it can go. Here's the patch where the offending code was
removed from 2.6:

http://www.kernel.org/git/?p=linux/kernel/git/tglx/history.git;a=commit;h=531066f2b238b5aef235be9027fa3464f6b2d125

I'll generate a patch for 2.4 to remove the various instances of the same
conversion.


Comment 16 David Woodhouse 2005-10-19 11:48:34 UTC
Created attachment 120155
Patch.

Appears to affect only x86_64 and ppc64 (of the platforms we care about for
RHEL3).

Comment 17 Ernie Petrides 2005-10-27 02:21:09 UTC
A fix for this problem has just been committed to the RHEL3 U7
patch pool this evening (in kernel version 2.4.21-37.7.EL).


Comment 23 Red Hat Bugzilla 2006-03-15 15:57:43 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0144.html


