Bug 1356542 - pam_sss fails to close files
Summary: pam_sss fails to close files
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: sssd
Version: 6.8
Hardware: All
OS: Linux
unspecified
high
Target Milestone: rc
: ---
Assignee: SSSD Maintainers
QA Contact: Steeve Goveas
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-07-14 10:48 UTC by Anand Buddhdev
Modified: 2016-10-19 08:15 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-10-19 08:15:48 UTC



Description Anand Buddhdev 2016-07-14 10:48:03 UTC
Description of problem:

pam_sss.so opens /var/lib/sss/mc/passwd, /var/lib/sss/mc/group and /var/lib/sss/mc/initgroups, but does not close them. When sssd refreshes these files, the old ones that are still held open are not deleted from disk, and they eventually cause the disk to fill up.
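
For readers unfamiliar with the mechanism described above, here is a minimal Python sketch of what "deleted but held open" means on a POSIX filesystem. It is only an illustration and has nothing to do with the sssd code itself; the function name `unlink_while_open` and the temporary file are made up for the example.

```python
import os
import tempfile


def unlink_while_open(payload: bytes) -> dict:
    """Create a file, delete its name while it is open, and report what remains."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)                # removes the directory entry only
        info = os.fstat(fd)            # the descriptor still refers to the inode
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, len(payload))
        return {
            "path_exists": os.path.exists(path),
            "link_count": info.st_nlink,
            "size": info.st_size,
            "data_intact": data == payload,
        }
    finally:
        os.close(fd)                   # only now can the filesystem reclaim the blocks


if __name__ == "__main__":
    print(unlink_while_open(b"cached entries"))
```

The file's blocks stay allocated for as long as any process keeps a descriptor (or a mapping) on it, which is why lsof can still show it with "(deleted)" after the name.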

Version-Release number of selected component (if applicable):

1.13.3-22

How reproducible:

Always

Steps to Reproduce:
1. Enable sssd on a CentOS 6.8 system and enable LDAP auth
2. Log into and out of the server a few times with SSH
3. Finally log in again, and become root
4. Run "lsof -nP | fgrep passwd | fgrep deleted | head"

Actual results:

vmtoolsd   1130      root    3r      REG              253,2    8406312       6275 /var/lib/sss/mc/passwd (deleted)
VGAuthSer  1167      root    3r      REG              253,2    8406312       6275 /var/lib/sss/mc/passwd (deleted)
nslcd      1330     nslcd    4r      REG              253,2    8406312       6275 /var/lib/sss/mc/passwd (deleted)

Expected results:

We should not see such files hanging around. A server that sees a lot of login activity, for example over SSH, keeps accumulating such deleted-but-held-open files, which eventually causes /var to fill up, even though it is not immediately obvious what is filling it.

Additional info:

Comment 1 Lukas Slebodnik 2016-07-14 11:02:50 UTC
(In reply to Anand Buddhdev from comment #0)

A) It isn't pam_sss.so but libnss_sss.so, which glibc loads into every process that requests identity information from sssd.

B) libnss_sss.so does not create files in /var/lib/sss/mc/. It just opens these files.

C) If you see the string "(deleted)" in the lsof output, it just means that the file was removed from disk. The sssd client (libnss_sss.so) will detect the removed file and reopen the files with the next request (getpwnam, getgrnam, ...), so /var will not fill up.

BTW, files in /var/lib/sss/mc are usually removed when sssd is restarted, and I do not expect sssd to be restarted every minute.
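
To make point C concrete, the sketch below shows one generic way a client can notice that the file behind its open descriptor has been removed or replaced, and then reopen it. This is an assumption-laden illustration in Python, not the actual libnss_sss.so logic; `is_stale` and `demo` are hypothetical names, and the check (link count plus device/inode comparison) is just a common technique.

```python
import os
import tempfile


def is_stale(fd: int, path: str) -> bool:
    """Return True if the open descriptor no longer matches the file at path."""
    held = os.fstat(fd)
    if held.st_nlink == 0:             # the file we hold has been unlinked
        return True
    try:
        current = os.stat(path)
    except FileNotFoundError:
        return True
    return (held.st_dev, held.st_ino) != (current.st_dev, current.st_ino)


def demo() -> tuple:
    """Replace a file under an open reader, then reopen on the next 'request'."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "passwd")
    with open(path, "wb") as f:
        f.write(b"old cache")
    fd = os.open(path, os.O_RDONLY)
    before = is_stale(fd, path)

    os.unlink(path)                    # the cache file is removed ...
    with open(path, "wb") as f:        # ... and a new one is created
        f.write(b"new cache")
    after = is_stale(fd, path)

    if after:                          # drop the deleted file, open the new one
        os.close(fd)
        fd = os.open(path, os.O_RDONLY)
    reopened = is_stale(fd, path)
    content = os.read(fd, 64)

    os.close(fd)
    os.unlink(path)
    os.rmdir(directory)
    return before, after, reopened, content


if __name__ == "__main__":
    print(demo())
```

Note that a check like this only runs when the process makes another request. A process that never asks again keeps its old descriptor until it exits.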

Comment 3 Lukas Slebodnik 2016-07-14 15:38:37 UTC
(In reply to Lukas Slebodnik from comment #1)
> B) libnss_sss.so does not create files in /var/lib/sss/mc/. It just opens
> these files.
> 
> C) If you see the string "(deleted)" in the lsof output, it just means that
> the file was removed from disk. The sssd client (libnss_sss.so) will detect
> the removed file and reopen the files with the next request (getpwnam,
> getgrnam, ...), so /var will not fill up.
> 
I am 100% sure that the files are reopened with the next NSS request
and /var cannot be filled up.

The bug was fixed in 1.13.2:
https://fedorahosted.org/sssd/ticket/2726.

If you think it does not work and /var can be filled up, then please provide
a reproducer. Otherwise I will close the ticket as NOT A BUG in a week.

Comment 4 Lukas Slebodnik 2016-07-27 13:40:36 UTC
If you do not agree with the statement in comment 3, then please provide a reproducer.

Comment 5 Jakub Hrozek 2016-08-03 20:14:12 UTC
There was no reply in this bug for ~3 weeks. I'm going to close it. Please reopen if you disagree with Lukas' explanation.

Comment 6 Phil Anderson 2016-09-05 05:43:36 UTC
I'm getting the same behaviour. Sure, each process only holds one handle open and probably re-opens it at the next request, but is the design of sssd really such that every process holds 8MB+ of disk space in /var while running? Seems like a bug to me.

Comment 7 Lukas Slebodnik 2016-09-05 07:36:03 UTC
It is very unlikely that every process "holds 8MB+". They would share the same deleted file, unless you restart sssd after starting each process and each process calls "getpwnam, getgrnam, ..." just once and therefore cannot reopen the memory cache.

BTW, please do not comment on closed BZs if you do not want to reopen them.
And if you still think it's a bug, then please provide detailed steps for reproducing it.

Comment 8 Phil Anderson 2016-09-05 07:46:03 UTC
Although there were 1000 or so processes showing deleted passwd files open, I looked closer at the lsof output and you are right: there are only a handful of unique inodes, so ignore me.
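
A quick way to do the same check is to count distinct inode numbers among the "(deleted)" entries instead of counting lines. The Python sketch below assumes the lsof column layout shown in the description, where the NODE column immediately precedes the file name; `unique_deleted_inodes` is a hypothetical helper, and the sample is the three lines from the original report.

```python
def unique_deleted_inodes(lsof_output: str) -> set:
    """Collect the distinct NODE values of entries that lsof marks as deleted."""
    inodes = set()
    for line in lsof_output.splitlines():
        if not line.rstrip().endswith("(deleted)"):
            continue
        fields = line.split()
        # NAME is the first field starting with "/"; NODE is the field before it.
        for index, field in enumerate(fields):
            if field.startswith("/") and index > 0:
                inodes.add(fields[index - 1])
                break
    return inodes


SAMPLE = """\
vmtoolsd   1130      root    3r      REG              253,2    8406312       6275 /var/lib/sss/mc/passwd (deleted)
VGAuthSer  1167      root    3r      REG              253,2    8406312       6275 /var/lib/sss/mc/passwd (deleted)
nslcd      1330     nslcd    4r      REG              253,2    8406312       6275 /var/lib/sss/mc/passwd (deleted)
"""

if __name__ == "__main__":
    print(unique_deleted_inodes(SAMPLE))
```

Three processes sharing one inode pin the file's space once, not three times, so the number of unique inodes is what matters for disk usage.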

Comment 9 Anand Buddhdev 2016-09-30 14:04:04 UTC
Hello Lukas,

Apologies for this late reply. But the problem still persists for us, and I have more information.

> A) It isn't pam_sss.so but libnss_sss.so, which glibc loads into every
> process that requests identity information from sssd.
> 
> B) libnss_sss.so does not create files in /var/lib/sss/mc/. It just opens
> these files.
> 
> C) If you see the string "(deleted)" in the lsof output, it just means that
> the file was removed from disk. The sssd client (libnss_sss.so) will detect
> the removed file and reopen the files with the next request (getpwnam,
> getgrnam, ...), so /var will not fill up.
> 
> BTW, files in /var/lib/sss/mc are usually removed when sssd is restarted,
> and I do not expect sssd to be restarted every minute.

On our servers, there is a Nagios check that runs every few minutes and does:

/usr/sbin/sss_cache -u $USER

This causes sssd to recreate /var/lib/sss/mc/passwd, /var/lib/sss/mc/group and /var/lib/sss/mc/initgroups. This server has many hundreds of SSH logins, and each SSH process calls getpwnam, so it opens those files for reading. The SSH process is long-running, so those files are held open. In the meantime, sss_cache is being called frequently, so each new SSH process is opening NEW files in /var/lib/sss/mc and the older SSH processes are holding open deleted files.

This is the reason that the /var partition fills up. The only way we can fix this is to reboot the server when /var is full, and then after the reboot, all is well. But slowly, /var starts filling up as the server gets more and more SSH connections.

When I use lsof, I can see all these deleted files, and each one has a different inode.
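
The accumulation pattern described in this comment can be imitated with a small Python simulation. It is only a sketch under stated assumptions: `simulate` is a hypothetical name, a temporary directory stands in for /var/lib/sss/mc, each loop iteration stands in for one cache refresh, and each descriptor that is never closed stands in for a long-running process that does not make another lookup.

```python
import os
import tempfile


def simulate(refreshes: int, size: int = 4096) -> dict:
    """Recreate a file repeatedly while one reader per generation keeps it open."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "passwd")
    held = []                                    # descriptors of long-running readers
    try:
        for _ in range(refreshes):
            with open(path, "wb") as f:          # a fresh cache file appears
                f.write(b"x" * size)
            held.append(os.open(path, os.O_RDONLY))  # a new reader opens it
            os.unlink(path)                      # the next refresh removes it
        stats = [os.fstat(fd) for fd in held]
        deleted = [s for s in stats if s.st_nlink == 0]
        return {
            "deleted_inodes": len({s.st_ino for s in deleted}),
            "bytes_pinned": sum(s.st_size for s in deleted),
        }
    finally:
        for fd in held:
            os.close(fd)
        os.rmdir(directory)


if __name__ == "__main__":
    print(simulate(3))
```

Every generation that still has a reader keeps its own inode alive, so the pinned space grows with the number of refreshes rather than with the number of processes.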

Comment 11 Lukas Slebodnik 2016-10-19 08:15:48 UTC
(In reply to Anand Buddhdev from comment #9)
> On our servers, there is a Nagios check that runs every few minutes and does:
> 
> /usr/sbin/sss_cache -u $USER
> 
There's no reason for running this command periodically.

sssd fetches current group membership with each authentication.

