Bug 453961 - cluster-snmp deadlocks snmpd
Summary: cluster-snmp deadlocks snmpd
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: clustermon
Version: 4
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Ryan McCabe
QA Contact: Cluster QE
Duplicates: 484880
Depends On:
Reported: 2008-07-03 14:58 UTC by Ryan McCabe
Modified: 2009-05-25 20:34 UTC
CC List: 6 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2009-05-25 20:34:15 UTC

System | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2009:1064 | normal | SHIPPED_LIVE | clustermon bug-fix update | 2009-05-25 20:34:12 UTC

Description Ryan McCabe 2008-07-03 14:58:45 UTC
+++ This bug was initially created as a clone of Bug #453600 +++

Description of problem:
The SNMPD plugin for clustersuite uses the
ClusterMonitoring::ClusterMonitor::get_cluster() method to retrieve the cluster

This in turn calls ClientSocket::recv() -> read_restart().

The read_restart function is designed to fill a buffer with all data currently
buffered on the socket and to return when the underlying read() returns with EAGAIN.

This will only work if the socket has O_NONBLOCK set. Using this method on a
blocking socket will cause the thread calling get_cluster() to block
indefinitely waiting for additional data to arrive on the socket.
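
For illustration, here is a minimal sketch of the drain-until-EAGAIN pattern described above. It is not the clustermon source: set_nonblocking() and drain_available() are hypothetical names standing in for the socket setup and for read_restart(), and only POSIX fcntl() and read() are assumed. The loop terminates on "no more data" only because the descriptor has O_NONBLOCK set; without that flag the final read() would block instead of failing with EAGAIN.

```cpp
#include <cerrno>
#include <cstddef>
#include <string>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: switch a descriptor to non-blocking mode.
// Returns true on success.
bool set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return false;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Hypothetical stand-in for a read_restart()-style loop: collect
// everything currently buffered on fd and return once read() reports
// EAGAIN. This only terminates that way if fd is non-blocking.
std::string drain_available(int fd) {
    std::string out;
    char buf[4096];
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            out.append(buf, static_cast<std::size_t>(n));
            continue;
        }
        if (n == 0)
            break;                      // peer closed the connection
        if (errno == EINTR)
            continue;                   // interrupted, retry the read
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            break;                      // nothing more buffered right now
        break;                          // other error; real code should report it
    }
    return out;
}
```

A caller would invoke set_nonblocking(fd) once after the connection is established and before the first drain_available(fd) call, which is the same ordering the proposed fix relies on.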

Version-Release number of selected component (if applicable):
0.10.0-5.el5 contains the defect but it is masked by bug 441947; rebuilding the
package to avoid the dlopen problem or using a later package (e.g. 0.12.0-7.el5)
allows the bug to be triggered.

How reproducible:

Steps to Reproduce:
1. Configure a cluster with snmpd enabled on the nodes
2. Enable cluster-snmp
3. Try to access a REDHAT-CLUSTER-MIB object, e.g. REDHAT-CLUSTER-MIB::rhcMIBVersion.0

Actual results:
$ cat /etc/snmp/snmpd.conf
dlmod RedHatCluster     /usr/lib/cluster-snmp/
rocommunity public
$ snmpwalk -v2c -c public localhost
[tons of output, works fine but doesn't show REDHAT-CLUSTER-MIB::RedHatCluster]
$ snmpwalk -v2c -c public localhost REDHAT-CLUSTER-MIB::RedHatCluster
Timeout: No Response from localhost
$ snmpwalk -v2c -c public localhost
Timeout: No Response from localhost

After this snmpd can only be interrupted by SIGKILL.

Expected results:
MIB output correctly, no hang of snmpd.

Additional info:
Analysis & proposed patch from Adrien Kunysz

-- Additional comment on 2008-07-01 10:41 EST --
Created an attachment (id=310677)
Set sock.nonblocking(true) in ClusterMonitor::get_cluster()

-- Additional comment on 2008-07-01 11:27 EST --
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

-- Additional comment on 2008-07-03 10:56 EST --
Thanks for the patch. Applied to the current CVS trees.

Comment 1 Ryan McCabe 2009-02-17 20:33:26 UTC
*** Bug 484880 has been marked as a duplicate of this bug. ***

Comment 4 Brian Brock 2009-04-28 21:20:55 UTC
fix verified in clustermon-0.11.2-1.el4

I can run the snmpwalk commands above, and do not see any hangs or timeouts.

Comment 6 errata-xmlrpc 2009-05-25 20:34:15 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.
