Bug 1061425 - Packages search in Satellite web UI takes up to a couple minutes to complete
Summary: Packages search in Satellite web UI takes up to a couple minutes to complete
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Spacewalk
Classification: Community
Component: WebUI
Version: 2.0
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: ---
Assignee: Stephen Herr
QA Contact: Red Hat Satellite QA List
URL:
Whiteboard:
Depends On:
Blocks: space21
Reported: 2014-02-04 20:13 UTC by Stephen Herr
Modified: 2014-03-04 13:09 UTC (History)
CC List: 6 users

Fixed In Version: spacewalk-java-2.1.150-1 spacewalk-search-2.1.14-1
Doc Type: Bug Fix
Doc Text:
Clone Of: 1035429
Environment:
Last Closed: 2014-03-04 13:08:05 UTC



Description Stephen Herr 2014-02-04 20:13:52 UTC
+++ This bug was initially created as a clone of Bug #1035429 +++

Description of problem:

As reported by a customer and reproduced on an internal Satellite with the customer's database, a package search for 'postgres' in the search box in the top right of the web UI takes up to 160 seconds to complete, and a search for 'postfix' takes about 30 seconds.

After some debugging, I see that the search results from rhn-search return very quickly, but the code in PackageSearchAction performs various actions on the rhn-search results, to exclude the matches that don't meet the selected criteria. These additional steps can take quite a while.

If I narrow down the search terms on the Package Search page, for example with "Specific channel you have access to" set to "RHN Tools for RHEL (v. 6 for 64-bit x86_64)", then the total search completes in less than 1 second. Repeating the search with a base channel selected, regardless of whether the package actually exists in the channel, results in longer total page loads again.

How reproducible:

100%.

Steps to Reproduce:
1.) On Satellite 5.6 with several base channels and RHN Tools child channels synced, and several clones of the channels as well, search for the "osad" package in the search box in the top right of the web UI.

2.) See that the results take several seconds (sometimes up to a couple minutes) to load.

3.) Repeat the search for "osad" or "postfix" on the Package Search page, with the different search option radio buttons selected.

4.) See similar results.

Actual results:

Slow search result page loading for package search (up to a couple minutes).

Expected results:

Package search result page loading within a few seconds.

Additional info:

When "Packages of a specific architecture in any channel you have access to" is selected, the largest amount of time is spent in a Hibernate query that selects only the package matches that are in channels with the selected arch types:

***
./com/redhat/rhn/domain/rhnpackage/PackageFactory.java:

        if (archLabels != null && !archLabels.isEmpty()) {

            params.put("channel_arch_labels", archLabels);
            results = singleton.listObjectsByNamedQuery("Package.searchByIdAndArches",
                    params);

./com/redhat/rhn/domain/rhnpackage/Package_satellite.hbm.xml:

    <query name="Package.searchByIdAndArches">
            <![CDATA[select distinct p.packageName.id, p.packageName.name, p.summary,
                        p.packageArch.label, p.description, p.packageEvr.epoch,
                        p.packageEvr.version, p.packageEvr.release
                       from com.redhat.rhn.domain.rhnpackage.Package as p,
                            com.redhat.rhn.domain.channel.Channel as c
                      where p.id IN (:pids)
                        and c.channelArch.label IN (:channel_arch_labels)
                        and p in elements(c.packages)
                      ]]>
    </query>
***

When "Only channels relevant to your systems" is selected, the slow part is in a similar query that selects only the package matches that are in channels with systems subscribed to them:

***
./java/code/src/com/redhat/rhn/domain/rhnpackage/PackageFactory.java:

        else if (relevantFlag) {
            results =
                singleton.listObjectsByNamedQuery("Package.relevantSearchById", params);
        }

./java/code/src/com/redhat/rhn/domain/rhnpackage/Package_satellite.hbm.xml:

    <query name="Package.relevantSearchById">
            <![CDATA[select distinct p.packageName.id, p.packageName.name, p.summary,
                        p.packageArch.label, p.description, p.packageEvr.epoch,
                        p.packageEvr.version, p.packageEvr.release
                       from com.redhat.rhn.domain.rhnpackage.Package as p,
                            com.redhat.rhn.domain.channel.Channel as c,
                            com.redhat.rhn.domain.server.Server as s
                      where p.id IN (:pids)
                        and c IN elements (s.channels)
                        and p IN elements (c.packages)
                      ]]>
    </query>
***

When a specific base channel is selected, the slow part is in the filterByChannel code:


***
./java/code/src/com/redhat/rhn/frontend/action/channel/PackageSearchAction.java:

    private List<PackageOverview> filterByChannel(User user, Long channelId,
                                                  List<PackageOverview> pkgs) {

        Channel channel = ChannelManager.lookupByIdAndUser(channelId, user);
        List<PackageDto> allPackagesList = ChannelManager.listAllPackages(channel);

        // Convert the package list into a map for quicker lookup
        Map<String, String> packageNamesMap =
            new HashMap<String, String>(allPackagesList.size());

        for (PackageDto dto : allPackagesList) {
            String name = dto.getName();
            packageNamesMap.put(name, name);
        }

        // Iterate results and remove if not in the channel
        List<PackageOverview> newResult = new ArrayList<PackageOverview>();
        for (PackageOverview pkg : pkgs) {
            String packageName = pkg.getPackageName();
            if (packageNamesMap.get(packageName) != null) {
                newResult.add(pkg);
            }
        }

        return newResult;
    }
***

In my testing with the "Packages of a specific architecture in any channel you have access to" option, the following diff for the Hibernate query resulted in much quicker search times:

***
# diff -pruN com/redhat/rhn/domain/rhnpackage/Package_satellite.hbm.xml.bak com/redhat/rhn/domain/rhnpackage/Package_satellite.hbm.xml
--- com/redhat/rhn/domain/rhnpackage/Package_satellite.hbm.xml.bak	2013-11-21 18:05:14.862438311 -0500
+++ com/redhat/rhn/domain/rhnpackage/Package_satellite.hbm.xml	2013-11-22 08:50:54.339714722 -0500
@@ -193,8 +193,8 @@ PUBLIC "-//Hibernate/Hibernate Mapping D
                        from com.redhat.rhn.domain.rhnpackage.Package as p,
                             com.redhat.rhn.domain.channel.Channel as c
                       where p.id IN (:pids)
+                        and c IN elements(p.channels)
                         and c.channelArch.label IN (:channel_arch_labels)
-                        and p in elements(c.packages)
                       ]]>
     </query>
***

The original query compared the list of package ids returned by rhn-search to the list of all packages in all channels that match the list of selected channel arches. Grabbing the entire list of packages in all these channels seems very inefficient. My modified query just looks at the list of channels associated with each matching package id, then compares each channel's arch to the list of selected channel arches. This completely avoids grabbing the full list of packages for each channel.
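
To picture the difference in traversal direction described here, the following is a minimal in-memory sketch. It is not the Spacewalk code and says nothing about the SQL that Hibernate actually generates; the class `ArchFilterSketch`, its method names, and the sample channel labels are hypothetical, and plain maps stand in for the `c.packages` and `p.channels` associations.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only: maps stand in for the Hibernate associations.
public class ArchFilterSketch {

    // Original direction: walk every package of every channel with a selected
    // arch and keep the ones that are also search hits.
    public static Set<Long> viaChannelPackages(Set<Long> hitIds,
                                               Map<String, Set<Long>> packagesByChannel,
                                               Map<String, String> archByChannel,
                                               Set<String> selectedArches) {
        Set<Long> result = new TreeSet<>();
        for (Map.Entry<String, Set<Long>> e : packagesByChannel.entrySet()) {
            if (!selectedArches.contains(archByChannel.get(e.getKey()))) {
                continue;
            }
            for (Long pid : e.getValue()) { // touches the channel's full package list
                if (hitIds.contains(pid)) {
                    result.add(pid);
                }
            }
        }
        return result;
    }

    // Modified direction: start from the search hits and inspect only the
    // channels each hit belongs to.
    public static Set<Long> viaPackageChannels(Set<Long> hitIds,
                                               Map<Long, Set<String>> channelsByPackage,
                                               Map<String, String> archByChannel,
                                               Set<String> selectedArches) {
        Set<Long> result = new TreeSet<>();
        for (Long pid : hitIds) {
            Set<String> channels =
                channelsByPackage.getOrDefault(pid, Collections.<String>emptySet());
            for (String channel : channels) {
                if (selectedArches.contains(archByChannel.get(channel))) {
                    result.add(pid);
                    break;
                }
            }
        }
        return result;
    }

    // Builds the package -> channels view from the channel -> packages view.
    public static Map<Long, Set<String>> invert(Map<String, Set<Long>> packagesByChannel) {
        Map<Long, Set<String>> channelsByPackage = new HashMap<>();
        for (Map.Entry<String, Set<Long>> e : packagesByChannel.entrySet()) {
            for (Long pid : e.getValue()) {
                channelsByPackage.computeIfAbsent(pid, k -> new HashSet<>()).add(e.getKey());
            }
        }
        return channelsByPackage;
    }

    public static void main(String[] args) {
        Map<String, Set<Long>> packagesByChannel = new HashMap<>();
        packagesByChannel.put("base-x86_64", new HashSet<>(Arrays.asList(1L, 2L, 3L)));
        packagesByChannel.put("base-i386", new HashSet<>(Arrays.asList(3L, 4L)));

        Map<String, String> archByChannel = new HashMap<>();
        archByChannel.put("base-x86_64", "channel-x86_64");
        archByChannel.put("base-i386", "channel-ia32");

        Set<Long> hits = new HashSet<>(Arrays.asList(2L, 4L));
        Set<String> arches = Collections.singleton("channel-x86_64");

        // Both calls print [2]; only the amount of data visited differs.
        System.out.println(viaChannelPackages(hits, packagesByChannel, archByChannel, arches));
        System.out.println(viaPackageChannels(hits, invert(packagesByChannel), archByChannel, arches));
    }
}
```

Both methods return the same ids. The second one does work proportional to the number of search hits rather than to the size of the channels, which is the effect the modified query is aiming for.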

I don't have any proposed patches for the other two search options ("Only channels relevant to your systems" and "Specific channel you have access to"), but hopefully they can be improved as well as part of the same update.

--- Additional comment from Mark Huth on 2013-11-27 17:46:15 EST ---

With the original rhn.jar file in place, search for package 'postgres' and it will take up to 3 minutes for the result page to be displayed.  

Note: rhn-search cleanindex has been run but it made no change in performance, neither did db-control gather-stats.

Use Tasos' patched rhn.jar (with patch from comment #0):
# cp /usr/share/rhn/lib/rhn.jar.00985786 /usr/share/rhn/lib/rhn.jar
# service rhn-satellite restart

... and repeat the search for 'postgres' and now it takes only about 10 seconds to complete.  tasos++

--- Additional comment from Mark Huth on 2013-11-27 21:07:40 EST ---

The queries Tasos mentioned in comment #0 are exactly the same on Satellite 5.5, but they complete within 5 seconds on Satellite 5.5.  That is, log in to the 5.5 satellite and search for the 'postgres' package and results are returned within 5 seconds.  However, the same search on the 5.6 satellite takes 3 minutes.

On both Satellites I have done a cleanindex and a gather-stats.

--- Additional comment from Tomas Lestach on 2013-11-29 04:05:13 EST ---

Hello Tasos,

I applied your patch from the #Description as ...

spacewalk.git: 8cf56428862e2165d9b74b6933cee225d1622af6


Thank you!
(Keeping the BZ status on NEW, as the patch addresses just 1/3 of the BZ report.)

--- Additional comment from Mark Huth on 2013-12-02 02:26:46 EST ---

Had a bit of a look at this today.  Re this line from comment #0:

"When a specific base channel is selected, the slow part is in the filterByChannel code:"

In particular this line is the one that is slow:
        List<PackageDto> allPackagesList = ChannelManager.listAllPackages(channel);
... which involves the 'all_packages_in_channel' package query.

I confirmed this query performs slowly on both the 5.6 and 5.5 versions of the customer's database (actually 5.5 was slower than 5.6), so at least this isn't a new problem in 5.6 then.
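
One possible direction for this case, sketched under assumptions and not necessarily what was later committed, is to ask only which of the search hits exist in the channel instead of loading every package in the channel first. The interface `ChannelNameLookup` and its method `namesPresent` are hypothetical; in a real implementation they would be backed by a query restricted to the candidate names, while here an in-memory map stands in for the database.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: none of these names exist in Spacewalk.
public class ChannelFilterSketch {

    /** Hypothetical lookup: which of the candidate names exist in the channel. */
    public interface ChannelNameLookup {
        Set<String> namesPresent(long channelId, Collection<String> candidateNames);
    }

    // Keeps the search hits that are in the channel, preserving their order.
    public static List<String> filterByChannel(long channelId, List<String> hitNames,
                                               ChannelNameLookup lookup) {
        // Only the search hits are sent to the lookup, never the whole channel.
        Set<String> present = lookup.namesPresent(channelId, new HashSet<>(hitNames));
        List<String> kept = new ArrayList<>();
        for (String name : hitNames) {
            if (present.contains(name)) {
                kept.add(name);
            }
        }
        return kept;
    }

    /** In-memory stand-in for the database, for demonstration only. */
    public static ChannelNameLookup inMemory(Map<Long, Set<String>> namesByChannel) {
        return (channelId, candidates) -> {
            Set<String> out = new HashSet<>(candidates);
            out.retainAll(namesByChannel.getOrDefault(channelId, Collections.<String>emptySet()));
            return out;
        };
    }

    public static void main(String[] args) {
        Map<Long, Set<String>> namesByChannel = new HashMap<>();
        namesByChannel.put(101L, new HashSet<>(Arrays.asList("osad", "postfix")));

        List<String> hits = Arrays.asList("postfix", "postgresql", "osad");
        System.out.println(filterByChannel(101L, hits, inMemory(namesByChannel)));
    }
}
```

Whether this helps in practice depends on how the restricted query performs against the real schema, which this sketch does not measure.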

--- Additional comment from Mark Huth on 2014-01-07 00:57:38 EST ---

The "Only channels relevant to your systems" search involves the Package.relevantSearchById query, and it seems to have a problem similar to the one Tasos highlighted in comment #0.  

Could it be modified in a similar way?

[root@example rhnpackage]# diff -u Package_satellite.hbm.xml.orig Package_satellite.hbm.xml
--- Package_satellite.hbm.xml.orig	1980-01-01 00:00:00.000000000 +1000
+++ Package_satellite.hbm.xml	2014-01-07 15:02:41.636695616 +1000
@@ -215,8 +215,8 @@
                             com.redhat.rhn.domain.channel.Channel as c,
                             com.redhat.rhn.domain.server.Server as s
                       where p.id IN (:pids)
+                        and c IN elements (p.channels)
                         and c IN elements (s.channels)
-                        and p IN elements (c.packages)
                       ]]>
     </query>


As for the "Specific channel you have access to" search, as mentioned in update #4, its slowness appears to be mainly caused by the 'all_packages_in_channel' query in Package_queries.xml.  I will try running that query manually and get an explain plan ...

--- Additional comment from Mark Huth on 2014-01-13 01:01:39 EST ---

ATM it's got a modified rhn.jar on it with the following patch applied to Package_satellite.hbm.xml:

[root@example rhnpackage]# diff -u Package_satellite.hbm.xml.orig Package_satellite.hbm.xml
--- Package_satellite.hbm.xml.orig	1980-01-01 00:00:00.000000000 +1000
+++ Package_satellite.hbm.xml	2014-01-07 15:02:41.636695616 +1000
@@ -193,8 +193,8 @@
                        from com.redhat.rhn.domain.rhnpackage.Package as p,
                             com.redhat.rhn.domain.channel.Channel as c
                       where p.id IN (:pids)
+                        and c IN elements(p.channels)
                         and c.channelArch.label IN (:channel_arch_labels)
-                        and p in elements(c.packages)
                       ]]>
     </query>
 
@@ -215,8 +215,8 @@
                             com.redhat.rhn.domain.channel.Channel as c,
                             com.redhat.rhn.domain.server.Server as s
                       where p.id IN (:pids)
+                        and c IN elements (p.channels)
                         and c IN elements (s.channels)
-                        and p IN elements (c.packages)
                       ]]>
     </query>

These patches improve the search times for "Only channels relevant to your systems" from about 3 minutes to 30 seconds and for "Packages of a specific architecture in any channel you have access to" from about 3 minutes to 5 seconds.  I gave the patched rhn.jar to the customer and they confirmed similar results.  Hopefully there is still room for improvement with the "Only channels relevant to your systems" query.

Comment 1 Stephen Herr 2014-02-04 21:17:22 UTC
Previously committed:
8cf56428862e2165d9b74b6933cee225d1622af6

Adding:
f480feb2469cdbb8d19c296db398dc5a98d6b18a
24ee3a62be4144e05b714eccd0f1e6b83dde9a56

Comment 2 Stephen Herr 2014-02-10 21:50:39 UTC
In the course of investigating fixing this bug, I have encountered several other problems with package search that I have fixed along the way. They are:

1) Searching "Only channels relevant to your systems" did not search for packages that were relevant to *your* systems; it would search for packages that were relevant to some system somewhere, regardless of whether you could see it or not. This has been updated to search only for packages that are in channels that your systems (systems you can see) are subscribed to.

2) Searching "Packages of a specific architecture in any channel you have access to" did not search for packages in channels you had access to, but rather any channel of that architecture anywhere, regardless of whether you have access to it or not. This has been updated to search only in channels the user has access to with the selected architectures.

3) In the API, the packages.search.* methods would always return results with a package provider of "unknown". This has been updated to return the package provider if we know who it is.
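
The behaviour described in item 1 can be pictured with a small in-memory sketch. It is not the committed code; the class `RelevantSearchSketch`, its method names, and the sample data are hypothetical, and maps stand in for the system-to-channel and package-to-channel associations.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only: none of these names exist in Spacewalk.
public class RelevantSearchSketch {

    // Channels subscribed to by systems the user can see, not by any system anywhere.
    public static Set<String> relevantChannels(Set<String> visibleSystems,
                                               Map<String, Set<String>> channelsBySystem) {
        Set<String> channels = new TreeSet<>();
        for (String system : visibleSystems) {
            channels.addAll(channelsBySystem.getOrDefault(system, Collections.<String>emptySet()));
        }
        return channels;
    }

    // Keeps the search hits that sit in at least one relevant channel.
    public static List<Long> filterRelevant(List<Long> hitIds,
                                            Map<Long, Set<String>> channelsByPackage,
                                            Set<String> relevantChannels) {
        List<Long> kept = new ArrayList<>();
        for (Long pid : hitIds) {
            Set<String> pkgChannels =
                channelsByPackage.getOrDefault(pid, Collections.<String>emptySet());
            if (!Collections.disjoint(pkgChannels, relevantChannels)) {
                kept.add(pid);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> channelsBySystem = new HashMap<>();
        channelsBySystem.put("sys1", new HashSet<>(Arrays.asList("chanA")));
        channelsBySystem.put("sys2", new HashSet<>(Arrays.asList("chanB")));

        Map<Long, Set<String>> channelsByPackage = new HashMap<>();
        channelsByPackage.put(1L, new HashSet<>(Arrays.asList("chanA")));
        channelsByPackage.put(2L, new HashSet<>(Arrays.asList("chanB")));

        // The user can see sys1 only, so package 2 (chanB only) is filtered out.
        Set<String> relevant = relevantChannels(Collections.singleton("sys1"), channelsBySystem);
        System.out.println(filterRelevant(Arrays.asList(1L, 2L), channelsByPackage, relevant));
    }
}
```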

Comment 3 Stephen Herr 2014-02-12 23:53:08 UTC
Disregard #2 in comment 2. The other two are valid though.

Committing re-worked package search to Spacewalk master:
5b36a2e9cc24ec983218d48410139578e1895426

Comment 4 Matej Kollar 2014-03-04 13:08:05 UTC
Spacewalk 2.1 has been released.
https://fedorahosted.org/spacewalk/wiki/ReleaseNotes21

Comment 5 Matej Kollar 2014-03-04 13:09:00 UTC
Spacewalk 2.1 has been released.
https://fedorahosted.org/spacewalk/wiki/ReleaseNotes21

