Note: This is a beta release of Red Hat Bugzilla 5.0. The data contained within is a snapshot of the live data so any changes you make will not be reflected in the production Bugzilla. Also email is disabled so feel free to test any aspect of the site that you want. File any problems you find or give feedback here.
Bug 1364144 - performance issue related to compat-plugin while adding member to a large static group.
Summary: performance issue related to compat-plugin while adding member to a large sta...
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: slapi-nis
Version: 7.4
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: rc
: ---
Assignee: Alexander Bokovoy
QA Contact: Kaleem
URL:
Whiteboard:
Depends On:
Blocks: 1467835
TreeView+ depends on / blocked
 
Reported: 2016-08-04 14:27 UTC by German Parente
Modified: 2018-10-23 09:10 UTC (History)
7 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:


Attachments (Terms of Use)
internal operations between ADD and RESULT (deleted)
2016-10-10 15:35 UTC, German Parente
no flags Details

Description German Parente 2016-08-04 14:27:20 UTC
Description of problem:

Prasad has done a great analysis of the issue I am describing here.

Customer has a group of 1400 members more or less but plenty of nested groups. This group is member of 13 other groups.



Adding a new member to this group takes 53 seconds. We have asked for pstacks and we see this:

=====================================
Thread 21 (Thread 0x7fbad9ffb700 (LWP 23197)):
#0  0x00007fbb16f081a7 in slapi_log_access () from /usr/lib64/dirsrv/libslapd.so.0
#1  0x00007fbb16f16bea in op_shared_search () from /usr/lib64/dirsrv/libslapd.so.0
#2  0x00007fbb16f278de in search_internal_callback_pb () from /usr/lib64/dirsrv/libslapd.so.0
#3  0x00007fbb07dd27bd in wrap_search_internal_get_entry () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#4  0x00007fbb07dc7287 in format_maybe_add_sdn_list () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#5  0x00007fbb07dc9875 in format_deref_rx.isra.6 () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#6  0x00007fbb07dc9e79 in format_deref_r () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#7  0x00007fbb07dc7e19 in format_expand () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#8  0x00007fbb07dc8f66 in format_format () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#9  0x00007fbb07dcab79 in format_get_data_set () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#10 0x00007fbb07dccd43 in format_collect () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#11 0x00007fbb07dc7e19 in format_expand () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#12 0x00007fbb07dcd3b1 in format_ifeq () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#13 0x00007fbb07dc7e19 in format_expand () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#14 0x00007fbb07dc8f66 in format_format () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#15 0x00007fbb07dcab79 in format_get_data_set () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#16 0x00007fbb07dcc39b in format_link () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#17 0x00007fbb07dc7e19 in format_expand () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#18 0x00007fbb07dc8f66 in format_format () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#19 0x00007fbb07dcab79 in format_get_data_set () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#20 0x00007fbb07dbd459 in backend_set_entry_from () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#21 0x00007fbb07dc0b16 in backend_shr_set_entry_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#22 0x00007fbb16f3858d in send_ldap_search_entry_ext () from /usr/lib64/dirsrv/libslapd.so.0
#23 0x00007fbb16f38dcc in send_ldap_search_entry () from /usr/lib64/dirsrv/libslapd.so.0
#24 0x00007fbb16f15ad3 in iterate.isra.0.constprop.3 () from /usr/lib64/dirsrv/libslapd.so.0
#25 0x00007fbb16f15c7a in send_results_ext.constprop.2 () from /usr/lib64/dirsrv/libslapd.so.0
#26 0x00007fbb16f1765e in op_shared_search () from /usr/lib64/dirsrv/libslapd.so.0
#27 0x00007fbb16f278de in search_internal_callback_pb () from /usr/lib64/dirsrv/libslapd.so.0
#28 0x00007fbb07dc29e6 in backend_shr_update_references_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#29 0x00007fbb07dd069f in map_data_foreach_map () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#30 0x00007fbb07dc0dfb in backend_shr_update_references () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#31 0x00007fbb07dc43f1 in backend_shr_modify_cb.part.20 () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#32 0x00007fbb07dc47d1 in backend_shr_betxn_post_modify_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#33 0x00007fbb16f227d8 in plugin_call_func () from /usr/lib64/dirsrv/libslapd.so.0
#34 0x00007fbb16f22aa3 in plugin_call_plugins () from /usr/lib64/dirsrv/libslapd.so.0
#35 0x00007fbb0a1d43c9 in ldbm_back_modify () from /usr/lib64/dirsrv/plugins/libback-ldbm.so
#36 0x00007fbb16f104e1 in op_shared_modify () from /usr/lib64/dirsrv/libslapd.so.0
#37 0x00007fbb16f1182f in do_modify () from /usr/lib64/dirsrv/libslapd.so.0
#38 0x00007fbb173f5320 in connection_threadmain ()
#39 0x00007fbb1510496b in _pt_root () from /lib64/libnspr4.so
#40 0x00007fbb14aa5dc5 in start_thread () from /lib64/libpthread.so.0
#41 0x00007fbb147d2ced in clone () from /lib64/libc.so.6
=====================================

The quantity of total internal operations is more or less:

=====================================
[20/Jul/2016:15:41:58 -0500] conn=180 op=5 MOD dn="cn=role_il_users,cn=groups,cn=accounts,dc=dev,dc=fpd,dc=cat,dc=com"
[20/Jul/2016:15:42:54 -0500] conn=180 op=5 RESULT err=0 tag=103 nentries=0 etime=56 csn=578fe293000300040000

in between we have 408302 operations !

most of them taking less than a second and searching only the user entry. Except one taking 34 secs. but we cannot match internal op and its result (a RFE was already logged)
======================================


So, we decide to disable compat plugin:

# ipa-compat-manage disable

and the command is completed in only 4 seconds from cli side but from ldap side:

[03/Aug/2016:12:05:27 -0500] conn=14 op=5 MOD dn="cn=role_il_users,cn=groups,cn=accounts,dc=dev,dc=fpd,dc=cat,dc=com"
[03/Aug/2016:12:05:27 -0500] conn=14 op=5 RESULT err=0 tag=103 nentries=0 etime=0 csn=57a224de001200040000

The quantity of internal ops is 57.

Version-Release number of selected component (if applicable):

slapi-nis-0.54-9.el7_2.x86_64


How reproducible: always


Steps to Reproduce: we could re-generated customer scenario if needed.

We could give more information on the operations by uploading customer access logs if needed.

Regards,

Prasad and German.

Comment 2 thierry bordaz 2016-08-04 16:52:03 UTC
Some comments:

In fact one MOD triggers 400K internal searches. 
 * The vast majority of them are base search of user/host entries.
 * 60% of those entries are searched > 100 times. 
 * for each entry, at least one search retrieves all attributes.
 * searches are done under the same txn, so those entries do not change during the MOD

Comment 5 Prasad Kulkarni 2016-09-20 09:01:07 UTC
Hello,

Here is the brief description customer has mentioned about the issue :

In IPA master-replica environment, create a group which is member of 13-15 other groups each containing about 500-1400 users.

Now, try addition/removal of users to that group, which takes about a minute to finish and will make both master and replica unresponsive to any ldap queries or login attempts during the time.

Make sure compat plugin is enabled on both the systems.

Comment 8 thierry bordaz 2016-10-10 15:00:56 UTC
Hi German,

Thanks for reporting this new related case.
This bug is high priority to 7.4 and will be fixed in that release.
Can you confirm that during this long MOD, DS has a similar pstack https://bugzilla.redhat.com/show_bug.cgi?id=1364144#c0.
Did you reproduce it in-house or give some detail how to reproduce it.

thanks

Comment 10 German Parente 2016-10-10 15:35:14 UTC
Created attachment 1208892 [details]
internal operations between ADD and RESULT


Note You need to log in before you can comment on or make changes to this bug.