Note: This is a beta release of Red Hat Bugzilla 5.0. The data contained within is a snapshot of the live data so any changes you make will not be reflected in the production Bugzilla. Also email is disabled so feel free to test any aspect of the site that you want. File any problems you find or give feedback here.
Bug 1056064 - [SCALE] Events are being pulled from audit_log in a very inefficient way
Summary: [SCALE] Events are being pulled from audit_log in a very inefficient way
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.3.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
: 3.4.0
Assignee: Liran Zelkha
QA Contact: Eldad Marciano
Whiteboard: infra
Depends On:
Blocks: GSS_RHEV_33_BETA 1057561 1061165 rhev3.4beta 1142926
TreeView+ depends on / blocked
Reported: 2014-01-21 13:45 UTC by Tomas Dosek
Modified: 2018-12-05 17:00 UTC (History)
14 users (show)

Fixed In Version: ovirt-3.4.0-beta2
Doc Type: Bug Fix
Doc Text:
Previously, an inefficient query caused information pulled from the database to take a long time. This resulted in the query taking longer than expected to display results. Now, an updated query has fixed the problem. As a result, information is displayed more quickly.
Clone Of:
: 1057561 1061165 (view as bug list)
Last Closed: 2014-06-09 15:09:09 UTC
oVirt Team: Infra
Target Upstream Version:

Attachments (Terms of Use)
Postgres log (deleted)
2014-01-21 13:45 UTC, Tomas Dosek
no flags Details

System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2014:0506 normal SHIPPED_LIVE Moderate: Red Hat Enterprise Virtualization Manager 3.4.0 update 2014-06-09 18:55:38 UTC
oVirt gerrit 23567 None None None Never
oVirt gerrit 24048 None None None Never
oVirt gerrit 24094 None None None Never

Description Tomas Dosek 2014-01-21 13:45:30 UTC
Created attachment 853237 [details]
Postgres log

Description of problem:

Events displayed in UI (used by reports and API as well) are being pulled from database by very inefficient query.

This was found during scalability testing done by strategic EMEA customer.

The query is (taken from postgres logs with enhanced verbosity level):
[pid 3766]DEBUG:  00000: parse <unnamed>: SELECT * FROM (SELECT * FROM audit_log WHERE ( audit_log_id IN (SELECT audit_log.audit_log_id FROM  audit_log   WHERE  audit_log.severity = '10' ))  ORDER BY audit_log_id DESC ) as T1 OFFSET (1 -1) LIMIT 100

Version-Release number of selected component (if applicable):
is29 (basically any RHEV-M until now)

How reproducible:
100 %

Steps to Reproduce:
1. enhance log verbosity of postgres (in postgresql.conf):
        log_error_verbosity = verbose
        log_min_messages = debug2
        log_line_prefix = '[pid %p %r]'
2. restart postgres
3. go to events in portal, or fetch event via api

Actual results:
The query used is very inefficient and takes very long time to finish.

Expected results:
We should most likely use just one SELECT (ain't ;))

Additional info:
Hopefully not needed. But still for the record attaching the postgres log

Comment 2 Itamar Heim 2014-01-21 22:15:58 UTC
liran/eli - didn't we fix something in these for 3.4?

Comment 3 Eli Mesika 2014-01-22 09:07:13 UTC
(In reply to Itamar Heim from comment #2)
> liran/eli - didn't we fix something in these for 3.4?

As far as I remember Liran handled that for a diffrent query for 3.4 
Liran ?

Comment 4 Eli Mesika 2014-01-22 09:14:37 UTC
Actually , looking again in the query described in BZ description , it seems that this is the query that populate the Alerts TAB ( WHERE  audit_log.severity = '10' ) => 10 is the severity of Alerts 

So, we had optimized the regular query that populates the Events TAB but not that one

Comment 6 Tareq Alayan 2014-02-17 10:17:38 UTC
what is considered to be OK time  ?
now it takes ~2seconds is it ok?

tested on ovirt-engine-3.4.0-0.7.beta2.el6.noarch

Comment 7 Tomas Dosek 2014-02-17 10:39:37 UTC
That depends on the amount events you have in your audit_log. Is that a clear environment or some long-termly used one?

Comment 8 Tareq Alayan 2014-02-17 11:30:55 UTC
clear environment

Comment 9 Tomas Dosek 2014-02-17 11:35:28 UTC
In that case 2 seconds is quite a lot, but we should try to feed the audit log with some entries to see how much UI limits this and what is the performance degradation per 100, 1000, 10 000 lines of audit log (10k should be the upper limit where we start cleaning up the audit_log table automatically so there's no need to test more than that).

Comment 10 Tareq Alayan 2014-02-17 11:43:06 UTC
functional testing point of view: i see that the audit_log and event_log are viewed ok. 

Gil, do u see apropriate to do scale testing here by Yuri's team? 
Or shall i move it to verify?

Comment 11 Gil Klein 2014-02-17 16:18:41 UTC
Eldad, would you be able to verify this on the scale lab?

Comment 13 Eldad Marciano 2014-05-13 11:03:08 UTC
Not reproduced 

not in scale setup and even small one,

tested on 3.3 and also 3.4

Comment 14 errata-xmlrpc 2014-06-09 15:09:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

Note You need to log in before you can comment on or make changes to this bug.