Bug 178131 - syslog-only netdump still tries to dump memory
Summary: syslog-only netdump still tries to dump memory
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Dave Anderson
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: RHEL3U8CanFix
Reported: 2006-01-17 22:07 UTC by Bryan Mason
Modified: 2007-11-30 22:07 UTC

Fixed In Version: RHSA-2006-0437
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-07-20 13:41:47 UTC
Target Upstream Version:




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2006:0437 normal SHIPPED_LIVE Important: Updated kernel packages for Red Hat Enterprise Linux 3 Update 8 2006-07-20 13:11:00 UTC

Description Bryan Mason 2006-01-17 22:07:28 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.0.7-1.1.fc4 Firefox/1.0.7

Description of problem:
According to /etc/sysconfig/netdump,

# Alternatively, to merely syslog all messages without doing network
# crash dumps, you can set only SYSLOGADDR and leave NETDUMPADDR unset.
# You can also set both. 

However, if this configuration is used, the crash signature is successfully sent to the netdump-server, but netdump then still appears to try to dump memory contents. The crashed system hangs on the message "< netdump activated - performing handshake with the server . >" and does not reboot.

Version-Release number of selected component (if applicable):
netdump-0.7.7-2 kernel-2.4.21-37.EL

How reproducible:
Always

Steps to Reproduce:
On the server: 
1. Reconfigure syslog to accept remote messages (set SYSLOGD_OPTIONS="-m 0 -r" in /etc/sysconfig/syslog).
2. Restart syslog.
3. Start netdump-server (or not, this seems to make no difference in client behavior).

On the client system to be crashed:
1. Set SYSLOGADDR=<netdump-server-IP> in /etc/sysconfig/netdump.  Do not change anything else from default.
2. Start netdump
3. Crash the client by running (echo "1" > /proc/sys/kernel/sysrq; echo "c" > /proc/sysrq-trigger)  

Actual Results:  Crashed system sends crash signature to netdump-server and then hangs with the message "< netdump activated - performing handshake with the server . >" on the console.

Expected Results:  Crashed system should send crash signature to netdump-server and then reboot.

Additional info:

Additional information from Dave Anderson:

It's a RHEL3 kernel bug.

The netdump operation is kicked off only if the netconsole module has pre-registered its "netconsole_netdump" function in the kernel's "netdump_func" pointer when the module was loaded.

The module's init_netconsole() function does that registration regardless of the NETDUMPADDR or SYSLOGADDR arguments passed in. Here's the end of init_netconsole(), where the only condition on the registration is platform support:

        if (platform_supports_netdump) {
                if (netdump_register_hooks(netconsole_rx,
                                              netconsole_receive_skb,
                                              netconsole_netdump)) {
                        printk("netdump: failed to register hooks.\n");
                }
        }
        netconsole_dev = ndev;
#define STARTUP_MSG "[...network console startup...]\n"
        write_netconsole_msg(NULL, STARTUP_MSG, strlen(STARTUP_MSG));

        register_console(&netconsole);
        printk(KERN_INFO "netlog: network logging started up successfully!\n");
        return 0;
}

It should pass a NULL as the 3rd argument if no netdump target addresses were passed in.
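
A minimal user-space sketch of that suggestion, not the actual RHEL3 patch. It models only the third (dump) hook, and assumes that the crash path invokes netdump_func only when it is non-NULL. The names register_dump_hook(), init_hooks(), crash_path_would_dump() and the netdump_target_ip parameter are hypothetical stand-ins for illustration:

```c
#include <stddef.h>
#include <stdio.h>

struct pt_regs;                      /* opaque stand-in for the kernel type */

typedef void (*netdump_func_t)(struct pt_regs *);

/* Stand-in for the kernel's netdump_func pointer; starts out NULL. */
static netdump_func_t netdump_func = NULL;

/* Hypothetical simplification of netdump_register_hooks():
 * only the dump hook (the 3rd argument) is modelled here. */
static int register_dump_hook(netdump_func_t dump)
{
    netdump_func = dump;
    return 0;
}

/* Stand-in for the module's dump routine. */
static void netconsole_netdump(struct pt_regs *regs)
{
    (void)regs;
    puts("netdump would start here");
}

/* Assumption: netdump_target_ip is nonzero only when NETDUMPADDR was set.
 * With a syslog-only configuration the dump hook is registered as NULL. */
int init_hooks(unsigned int netdump_target_ip)
{
    return register_dump_hook(netdump_target_ip ? netconsole_netdump : NULL);
}

/* Assumed crash-path check: a dump is attempted only if a hook is set. */
int crash_path_would_dump(void)
{
    return netdump_func != NULL;
}
```

With this shape, calling init_hooks(0) leaves netdump_func NULL, so a syslog-only client would log the oops and skip the dump handshake.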

For example, note that the kernel's netdump_func starts out life as a NULL:

# crash
...
crash> p netdump_func
netdump_func = $1 = (void (*)(struct pt_regs *)) 0
crash>

Here's a netdump session after setting only the SYSLOGADDR in /etc/sysconfig/netdump:

# service netdump start
initializing netdump                                       [  OK  ]
# tail -10 /var/log/messages
Jan 17 09:10:59 crash netdump:: inserting netconsole module with arguments magic1=0x11111111 magic2=0x11111111 dev=eth0
source_port=6666 syslog_target_ip=0xAC105012 syslog_target_port=514 syslog_target_eth_byte0=0x00
syslog_target_eth_byte1=0x30 syslog_target_eth_byte2=0x6E syslog_target_eth_byte3=0x1E syslog_target_eth_byte4=0xFE
syslog_target_eth_byte5=0x40
Jan 17 09:10:59 crash kernel: netlog: using network device <eth0>
Jan 17 09:10:59 crash kernel: netlog: using source IP 172.16.80.17
Jan 17 09:10:59 crash kernel: netlog: using source UDP port: 6666
Jan 17 09:10:59 crash kernel: netlog: using syslog target IP 172.16.80.18, port: 514
Jan 17 09:10:59 crash kernel: netlog: using broadcast ethernet frames to send netdump packets.
Jan 17 09:10:59 crash kernel: netlog: using broadcast ethernet frames to send netdump packets.
Jan 17 09:10:59 crash kernel: netlog: using syslog target ethernet address 00:30:6e:1e:fe:40.
Jan 17 09:10:59 crash kernel: netlog: network logging started up successfully!
Jan 17 09:10:59 crash netdump: initializing netdump succeeded
[root@crash root]#
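
As an aside on reading the log above: the syslog_target_ip=0xAC105012 module argument appears to be the syslog target's IPv4 address packed into a 32-bit hex value, which the kernel then reports as 172.16.80.18. A small sketch of that decoding, assuming the most significant byte is the first octet; format_ipv4() is a hypothetical helper, not part of netdump:

```c
#include <stdint.h>
#include <stdio.h>

/* Format a packed IPv4 value (first octet in the top byte) as
 * dotted-quad text. Returns what snprintf() returns. */
int format_ipv4(uint32_t packed, char *buf, size_t len)
{
    return snprintf(buf, len, "%u.%u.%u.%u",
                    (unsigned)((packed >> 24) & 0xFF),
                    (unsigned)((packed >> 16) & 0xFF),
                    (unsigned)((packed >> 8) & 0xFF),
                    (unsigned)(packed & 0xFF));
}
```

Passing 0xAC105012 and a 16-byte buffer yields the same address that the "using syslog target IP" line reports.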

If NETDUMPADDR had been set, you'd see a bunch of "netdump_target_eth_byte*" values above in the "inserting" message.

But looking at the kernel's netdump_func pointer, you can see it has been set:

# crash
...
crash> p netdump_func
netdump_func = $1 = (void (*)(struct pt_regs *)) 0xe2d8e710
crash> sym 0xe2d8e710
e2d8e710 (t) netconsole_netdump
crash>

Comment 1 Dave Anderson 2006-01-17 22:36:31 UTC
Changed component to "kernel" from "netdump", as it's a bug with
the netconsole kernel module.

Linda -- can you link this to RHEL3-U8 and give it a devel_ack?

Comment 3 Dave Anderson 2006-01-20 17:05:01 UTC
Please give a qa_ack+ to this BZ.

QA procedure:

1. Set up only SYSLOGADDR in /etc/sysconfig/netdump, pointing to a remote
   system whose /etc/sysconfig/syslog file has the "-r" flag turned on.
2. Do a "service netdump start".
3. Crash the system with alt-sysrq-c or "echo c > /proc/sysrq-trigger".
4. Verify: 
   - the oops message made it to the remote /var/log/messages.
   - the client did not attempt to do a netdump operation.


Comment 4 Ernie Petrides 2006-02-18 00:23:05 UTC
A fix for this problem has just been committed to the RHEL3 U8
patch pool this evening (in kernel version 2.4.21-40.2.EL).


Comment 8 Red Hat Bugzilla 2006-07-20 13:41:48 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0437.html


