Note: This is a beta release of Red Hat Bugzilla 5.0. The data contained within is a snapshot of the live data so any changes you make will not be reflected in the production Bugzilla. Also email is disabled so feel free to test any aspect of the site that you want. File any problems you find or give feedback here.

Bug 1510432

Summary: [RFE] Dashboard and history queries use lots of big temp file (increase pg work_mem)
Product: [oVirt] ovirt-engine Reporter: Roy Golan <rgolan>
Component: Database.CoreAssignee: Roy Golan <rgolan>
Status: CLOSED CURRENTRELEASE QA Contact: guy chen <guchen>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.2.0CC: bugs, lleistne, mgoldboi, mperina, rgolan, sradco
Target Milestone: ovirt-4.2.0Keywords: FutureFeature, Improvement, Performance
Target Release: 4.2.0Flags: rule-engine: ovirt-4.2+
mgoldboi: planning_ack+
mperina: devel_ack+
pstehlik: testing_ack+
Hardware: Unspecified   
OS: Unspecified   
Fixed In Version: Doc Type: Enhancement
Doc Text:
We have found out that by default PostgreSQL set its work_mem parameter to 4MB, but this is not enough even for medium sized setups and when this memory is filled up, PostgreSQL creates many temporary files, which significantly affects the database performance. So we have decided to change the default value of work_mem to 8MB, which seemed to be enough to medium sized setups. For automatically provisioned databases engine-setup performs increase of the value automatically if it's set to lower value, but for unmanaged local or remote databases administrators needs to increase it manually to at least 8MB (engine-setup raises an error for unmanaged databases).
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-27 07:25:30 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 1538716    
Description Flags
PgBadger report demonstaring the size of temp files generated
Number of temp files generated
Temporary files activity none

Description Roy Golan 2017-11-07 12:02:00 UTC
Description of problem:
To perform sort and avg function a big data sets, postgres needs memory. When the memory is out, it will create temp files to store the intermediate data.

Observing an installation with ~60 hosts and few hundreds of vms, I see hundreds of temp files, with gigs of info, create over a period of an hour.

On pg 9.2 the work_mem parameter is 1mb which is very low for this kind of workload. 

pg 9.5 default work_mem is 4mb.

Changing the value to 8mb resulted almost no usage of temp file, and no visible increase of mem usage. Probably it is freed fast enough or just low number of client connections needs it (dashboard + 2 dwhd connections?)

4mb may be too low but okay for small installations

8mb did real difference.

16mb may completely remove temp files, but may be if a lot of connections will need the memory for complex queries.

Comment 1 Oved Ourfali 2017-11-08 06:46:39 UTC
So I assume your suggesting is to change the setting to 8mb?
If so, please change the title accordingly to reflect what's requested in this bug.


Comment 2 Yaniv Kaul 2017-11-08 07:06:48 UTC
BTW, it's also interesting that the files are created on /usr/local/... - which means on the local disk, even if the PG DB itself is stored on a fast shared storage of some kind!

Comment 3 Roy Golan 2017-11-08 07:25:38 UTC
Oved I'd like to describe the problem in the subject and not the solution, cause this might lead to other solution (or no solution at all :) )

(In reply to Yaniv Kaul from comment #2)
> BTW, it's also interesting that the files are created on /usr/local/... -
> which means on the local disk, even if the PG DB itself is stored on a fast
> shared storage of some kind!

It is an interesting trade-off and really depends on on the machine setup.

Since the setting is per instance, potentially this may be a case were we will want to have 2 different pg instances, one for DWH and one for engine. Currently I don't see engine connections utilizing the work_mem but this may change when if we will change a frequent query in the engine.

Comment 4 Yaniv Kaul 2017-11-16 18:31:01 UTC
Roy, are you pushing this to 4.2?

Comment 5 Roy Golan 2017-11-19 07:05:14 UTC
Let me clear this out with mperina and emesika first, cause I was under the impression they are going to.

Comment 6 Roy Golan 2017-11-19 07:26:04 UTC
Here is an excerpt from a pgbadger report analysis (will attach parts of it to this bug)
from a system with 30/700/1300 hosts/vms/disks

 Temporary Files Activity

    Temp Files

Day 	Hour 	Count 	Total size 	Average size
Oct 31	15	660	5.76 GiB	8.94 MiB
	16	1,418	11.70 GiB	8.45 MiB
	17	715	2.33 GiB	3.34 MiB

Top queries responsible for that is *audit_log*, *tags*, and *dashboard queries* (naturally needs sorting and hashing to produce the charts)

 Queries generating the most temporary files (N)
Rank 	Count 	Total size 	Min size 	Max size 	Avg size 	Query
*1 	748 	15.99 GiB 	21.86 MiB 	21.91 MiB 	21.89 MiB 	*
SELECT * FROM ( ( SELECT DISTINCT audit_log.* FROM audit_log WHERE NOT deleted ) ORDER BY audit_log_id DESC ) AS t1 offset ( 0 0 ) LIMIT 0;

*2 	97 	1.51 GiB 	15.91 MiB 	15.91 MiB 	15.91 MiB 	*
SELECT upper ( CAST ( entity_id AS char ( 0 ) ) ) AS entity_join_id, upper ( CAST ( parent_id AS char ( 0 ) ) ) AS parent_join_id FROM v4_1_latest_tags_relations upper ( CAST ( parent_id AS char ( 0 ) ) ) AS parent_join_id FROM v4_1_latest_tags_relations upper ( CAST ( parent_id AS char ( 0 ) ) ) AS parent_join_id FROM v4_1_latest_tags_relations;

*3 	92 	800.98 MiB 	4.82 KiB 	38.48 MiB 	8.71 MiB 	*
SELECT name, avg ( total ) AS total, avg ( used ) AS used, max ( previous_used ) AS previous_used, max ( previous_total ) AS previous_total, max ( vm_disk_usage_percent ) AS usage_percent FROM ( SELECT history_datetime, vms.vm_name AS name, sum ( vm_disk_size_mb ) AS total, sum ( vm_disk_actual_size_mb ) AS used, COALESCE ( max ( previous_actual_size ), 0 ) AS previous_used, COALESCE ( max ( previous_total_size ), 0 ) AS previous_t

Comment 7 Roy Golan 2017-11-19 07:35:30 UTC
Note: The impact of this settings is that each connection can allocate 
this amount of memory to perform the query.
Before we go on and increase this number to make sure any
installation size would fit into the work_mem we first optimize the
queries to consume less memory(is sorting needed, is merging needed, do
we have the right indexes to work on the proper dataset size etc)

Comment 8 Roy Golan 2017-11-19 08:24:38 UTC
Created attachment 1355054 [details]
PgBadger report demonstaring the size of temp files generated

Comment 9 Roy Golan 2017-11-19 08:26:15 UTC
Created attachment 1355055 [details]
Number of temp files generated

Comment 10 Roy Golan 2017-11-19 08:27:46 UTC
Created attachment 1355056 [details]
Temporary files activity

Comment 11 Roy Golan 2017-11-19 11:04:25 UTC
Further analysis show we are generating sql query that does distinct * event though we don't need to. This generates very large structure in order to sort it to make sure we are distinct. See this explain analyze with distinct *

Explain Analyze for audit_log query generate by the search[1] with 143,000 records:

#  with distinct audit_log.*                              : cost=149597 rows=142941 width=2091
#  with distinct audit_log.audit_log_id                   : cost=14809  rows=142941 width=8
# data set size to process (rows * width / 1024 / 1024)   : 285mb vs 1mb

[1] EXPLAIN ANALYZE SELECT * FROM ( ( SELECT DISTINCT audit_log.* FROM audit_log WHERE NOT deleted ) ORDER BY audit_log_id DESC ) AS t1  LIMIT 100;

Comment 12 Roy Golan 2017-11-19 16:44:06 UTC
correction for comment 11 - the 'distinct' keyword needs to go away and not adding a field to the distinct expression

First part of the solution is to eliminate useless distinct on search queries for tables/view which are unique by they're very definition. This alone creates a far more small dataset and demands few resources.
Since search queries, speciially for audit_log, are common queries the UI fires (nt only) it should be a nice improvement.

Example of one enviornment with ~150,000 records, the differences are notable.

Each record is a query, each column result is with/without distinct usage:

search query:
avg query time:  400ms/90ms (3x-4x improvement)
Disk usage Sort: 45456kb/0kb
Query plan: Sorting and merging/Index Scan

Comment 13 Pavel Stehlik 2017-11-22 09:33:49 UTC
On scale should be easily reproducible, thx, P.

Comment 14 Sandro Bonazzola 2017-11-24 11:41:24 UTC
Does this change the minimum and recommended requirements in terms of memory to be allocated for the engine?

Comment 15 Roy Golan 2017-11-26 08:14:00 UTC
I don't think so but we need the infra/scale QE to verify my analysis. The production setup I was tweaking + my dev setup didn't show increased consumption yet.

Comment 17 guy chen 2018-04-22 14:15:55 UTC
On a system with 4,000 Vms and 400 hosts work_mem is set to 8MB and tmp folder is mounted to the main storage, I can see very little temp_files activity,thus verified the bug.

postgres=# SHOW work_mem;
(1 row)

postgres=# select datname,temp_files,temp_bytes from pg_stat_database;
  datname  | temp_files | temp_bytes 
 template1 |          0 |          0
 template0 |          0 |          0
 postgres  |          0 |          0
 engine    |        135 |  168329476

DB server
-=>>ls -ltr /var/opt/rh/rh-postgresql95/lib/pgsql/data/base/pgsql_tmp/
total 0

DWH server
-=>>ls -ltr /mnt/dwh_custom_pg/pg_data/base/pgsql_tmp/
total 0

Comment 18 Sandro Bonazzola 2018-04-27 07:25:30 UTC
This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017.

Since the problem described in this bug report should be
resolved in oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.