Bug 1669186 - Manifest upload task takes too much time
Summary: Manifest upload task takes too much time
Alias: None
Product: Red Hat Satellite 6
Classification: Red Hat
Component: Subscription Management
Version: 6.5.0
Hardware: Unspecified
OS: Unspecified
Priority: medium (1 vote)
Target Milestone: 6.5.0
Assignee: Justin Sherrill
QA Contact: Roman Plevka
Duplicates: 1669216
Depends On:
Blocks: 1669216
Reported: 2019-01-24 14:53 UTC by Roman Plevka
Modified: 2019-04-10 14:07 UTC (History)
7 users

Fixed In Version: tfm-rubygem-katello-
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1669216 1684703 1698513
Last Closed:
Target Upstream Version:

Attachments
traceback (deleted), 2019-03-19 16:00 UTC, Roman Plevka

External Trackers: Foreman Issue Tracker 25981 (Last Updated: 2019-02-05 03:03:09 UTC)

Comment 4 Roman Plevka 2019-01-24 15:32:46 UTC
Here are some timings for 10 consecutive manifest imports:


Comment 6 Brad Buckingham 2019-01-25 16:13:30 UTC
*** Bug 1669216 has been marked as a duplicate of this bug. ***

Comment 11 Mike McCune 2019-01-31 17:53:25 UTC
All the below are on identical hardware, importing the manifest attached to this bug:

* 6.3.5 Satellite: 1m36sec

* 6.4.1 Satellite with external database: 2m49s

* 6.5 SNAP 6: 0m33s

* 6.5 SNAP 13: 3m53s

There is a definite gradual regression. The 0m33s result almost looks like an anomaly, and I wonder if we were skipping something critical, but we should at least get back to the same times as 6.3, in the 1.5-minute range, if we can.

Comment 12 John Mitsch 2019-01-31 18:05:19 UTC
We have added steps to the manifest import, such as storing the product content in the database. This provides better searching and avoids calls to Candlepin later; my understanding is that this is a trade-off of manifest import time for better searching and fewer backend calls later.

This and other changes have been around since Katello 3.6, so they may help explain the upward trend in manifest import time, but they don't explain the recent change in the 6.5 snaps. I'm also wondering whether something was missing or skipped in the previous 6.5 snaps that is now functioning in the latest snaps.
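The trade-off described above can be sketched in a few lines. This is a hypothetical illustration, not Katello's actual code: the `ProductContentCache` class, its method names, and the in-memory hash standing in for the database are all invented for the example; the point is simply that paying a fetch-and-store cost per product at import time makes later lookups local.

```ruby
# Hypothetical sketch of the import-time trade-off: persist product
# content locally during import so later lookups avoid a round trip
# to the backend (Candlepin, in Satellite's case).
class ProductContentCache
  def initialize(backend)
    @backend = backend # any object answering #fetch_product_content(id)
    @store = {}        # stands in for the database table
  end

  # Slower import: one extra fetch-and-store per product...
  def import(product_ids)
    product_ids.each { |id| @store[id] = @backend.fetch_product_content(id) }
  end

  # ...but later reads are local and searchable, with no backend call.
  def content_for(product_id)
    @store.fetch(product_id)
  end
end
```

Each product added to `import` lengthens the import task, which is consistent with the gradual slowdown since Katello 3.6 noted above.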

Roman, do we know for sure that the time regressed in the 6.4 snap? I see per the following comment that the times were low in 6.5 snaps:

But having data on the 6.4 snaps and releases would help tell us more. From Mike's testing it seems the manifest took ~3 minutes in 6.4, so there doesn't seem to be a change there.

Comment 16 Justin Sherrill 2019-02-05 03:03:09 UTC
Created redmine issue from this bug

Comment 18 Bryan Kearney 2019-02-08 05:03:49 UTC
Moving this bug to POST for triage into Satellite 6 since the upstream issue has been resolved.

Comment 20 Mike McCune 2019-03-01 22:16:34 UTC
This bug was cloned and is still going to be included in the 6.4.3 release. It no longer has the sat-6.4.z+ flag or the 6.4.3 Target Milestone set; those are now on the 6.4.z cloned bug. Please see the Clones field to track the progress of this bug in the 6.4.3 release.

Comment 22 Justin Sherrill 2019-03-06 15:12:40 UTC
Hey Roman, 

There are several steps to the manifest import process, and I didn't compare the overall time (including the Candlepin import), but if you look at the Actions::Candlepin::Owner::ImportProducts step within the Dynflow console of the manifest import Foreman task, it was taking a large portion of the overall time. That particular step should see a speedup of 70-90%.
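The shape of the bulk-importing change can be illustrated with plain string building. This is an assumption-laden sketch, not Katello's implementation (which goes through ActiveRecord): the table name `katello_product_contents`, the column list, and both helper methods are invented for the example. The point is that one multi-row INSERT replaces N single-row statements, so the database round trips drop from N to 1.

```ruby
# Hypothetical comparison: per-row INSERTs versus one bulk INSERT.
# rows is an array of [cp_id, name] pairs.

# Before: one statement (and one round trip) per row.
def single_row_inserts(rows)
  rows.map do |cp_id, name|
    "INSERT INTO katello_product_contents (cp_id, name) VALUES ('#{cp_id}', '#{name}');"
  end
end

# After: a single multi-row statement, one round trip total.
def bulk_insert(rows)
  values = rows.map { |cp_id, name| "('#{cp_id}', '#{name}')" }.join(", ")
  "INSERT INTO katello_product_contents (cp_id, name) VALUES #{values};"
end
```

For a manifest with thousands of product-content rows, collapsing per-row statements into one is consistent with the 70-90% speedup of the ImportProducts step reported above.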

Jonathan did some overall timings; you can find them here:

manifest import prior (clean environment): 350 seconds
manifest refresh prior: 90 seconds

manifest refresh with change: 43 seconds
manifest import with change (existing content): 36 seconds
manifest import with change (no content): 50 seconds

Comment 23 Roman Plevka 2019-03-19 13:56:43 UTC
Hello Justin,
so I tried to verify this with the same test that produced the earlier timings, and the new timings look really great!

21.97s call[1/10]
15.51s call[2/10]
10.38s call[3/10]
10.57s call[4/10]
15.53s call[5/10]
12.35s call[6/10]
10.55s call[7/10]
10.51s call[8/10]
10.48s call[9/10]
10.41s call[10/10]

However, the last iteration errored out with the following message:

PG::UniqueViolation: ERROR:  duplicate key value violates unique constraint "index_katello_pools_on_cp_id"
DETAIL:  Key (cp_id)=(4028f9536995fe480169960bde580a6c) already exists.
: INSERT INTO "katello_pools" ("cp_id", "created_at", "updated_at", "organization_id") VALUES ($1, $2, $3, $4) RETURNING "id"

Is it possible that this is related to the changes you made?

- I tried running the test (in batches of 10) several times, and this error occurs in almost every batch. The operations are carried out synchronously.

Comment 24 Justin Sherrill 2019-03-19 14:05:30 UTC
Hey Roman, 

That error shouldn't be related to this change, as it converted products and product content to use bulk importing, but the error is related to 'pools'. I'm happy to take a closer look, though; do you have a traceback?

It's possible that my change caused the error to surface: a step that was taking a long time now isn't, which exposed an existing race condition. The more I think about it, the more I would put my money on that. I think the fix should be fairly simple, if you want to handle it as part of this BZ.
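The race described here, and the retry-based fix discussed later in the thread, can be sketched in plain Ruby. This is a hypothetical illustration, not the actual patch: `DuplicateKeyError` stands in for the duplicate-key error a real driver would raise (in Rails, ActiveRecord::RecordNotUnique wrapping PG::UniqueViolation), a hash stands in for the katello_pools table, and `find_or_create_pool` is an invented helper. The idea is that when a concurrent import wins the race on the unique cp_id index, the loser treats the existing row as success instead of failing.

```ruby
# Stand-in for the duplicate-key error raised when the INSERT
# violates the unique index on cp_id.
class DuplicateKeyError < StandardError; end

# Hypothetical insert-with-retry: on a duplicate-key error, use the
# row the concurrent worker already created; otherwise retry a
# bounded number of times.
def find_or_create_pool(pools, cp_id, retries: 3)
  attempts = 0
  begin
    # Simulated INSERT: raises if the unique index is violated.
    raise DuplicateKeyError, cp_id if pools.key?(cp_id)
    pools[cp_id] = { cp_id: cp_id, created: true }
  rescue DuplicateKeyError
    existing = pools[cp_id]
    return existing if existing # another worker won the race; reuse its row
    attempts += 1
    retry if attempts < retries
    raise
  end
end
```

Making the retry count a parameter also speaks to the later observation that a fixed cap of 3 retries may not be enough under heavy concurrency.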


Comment 25 Nikhil Kathole 2019-03-19 14:10:52 UTC
I see BZ [1] for the same error.



Comment 26 Roman Plevka 2019-03-19 16:00:16 UTC
Created attachment 1545740 [details]

attaching the requested traceback

Comment 27 Justin Sherrill 2019-03-19 18:02:34 UTC
If anyone is open to testing the upstream pr:

it would be very helpful; it's hard to reproduce since it's a race condition.

Comment 28 Roman Plevka 2019-03-20 10:38:50 UTC
(In reply to Justin Sherrill from comment #27)
> If anyone is open to testing the upstream pr:
> it would be very helpful; it's hard to reproduce since it's a race condition.

Thanks for the quick fix; however, 3 retries might not be enough:

Comment 29 Roman Plevka 2019-04-05 06:39:56 UTC
Will you include the race condition fix mentioned in Comment #27, so I can properly test and verify this BZ?

Comment 31 Brad Buckingham 2019-04-05 18:00:48 UTC
Hi Roman,

I am clearing the needinfo for Justin and placing this back to ON_QA.

The fix that was referenced in comment 27 was associated with bug 1686604. That bug was delivered in snap 22 and has since been verified.

Comment 32 Roman Plevka 2019-04-08 12:07:37 UTC
Thank you for the confirmation.

With the commit in place, we are now good to move this to VERIFIED.
