PostgreSQL and heap-only-tuples updates - part 3

Nov 26, 2018 · 8 min read
post Postgres

Here is a series of articles that will focus on a new feature in version 11.

During the development of this version, a feature caught my attention. It can be found in releases notes : https://www.postgresql.org/docs/11/static/release-11.html

Allow heap-only-tuple (HOT) updates for expression indexes when the values of the expressions are unchanged (Konstantin Knizhnik)

I admit that this is not very explicit and this feature requires some knowledge about how postgres works, that I will try to explain through several articles:

  1. How MVCC works and heap-only-tuples updates
  2. When postgres do not use heap-only-tuple updates and introduction to the new feature in v11
  3. Impact on performances

This feature was disabled in 11.1 because it could lead to instance crashes1. I chose to publish these articles because they help to understand the mechanism of HOT updates and the benefits that this feature could bring.

I thank Guillaume Lelarge for his review of this article ;).

Impact on performance

Here is a simple test to demonstrate the benefits of this feature. We could expect performance gains because postgres avoids updating the indexes, as well as in terms of index size, as seen above, fragmentation is avoided.

 1CREATE TABLE t5 (c1 jsonb, c2 int,c3 int);
 2CREATE INDEX ON t5 ((c1->>'prenom')) ;
 3CREATE INDEX ON t5 (c2);
 4INSERT INTO t5 VALUES ('{ "prenom":"adrien" , "valeur" : "1"}'::jsonb,1,1);
 5INSERT INTO t5 VALUES ('{ "prenom":"guillaume" , "valeur" : "2"}'::jsonb,2,2);
 6\dt+ t5
 7                   List of relations
 8 Schema | Name | Type  |  Owner   | Size  | Description
 9--------+------+-------+----------+-------+-------------
10 public | t5   | table | postgres | 16 kB |
11(1 row)
12
13\di+ t5*
14                           List of relations
15 Schema |    Name     | Type  |  Owner   | Table | Size  | Description
16--------+-------------+-------+----------+-------+-------+-------------
17 public | t5_c2_idx   | index | postgres | t5    | 16 kB |
18 public | t5_expr_idx | index | postgres | t5    | 16 kB |
19(2 rows)

Then this pgbench test:

1\set id  random(1, 100000)
2\set id2  random(1, 100000)
3
4UPDATE t5 SET c1 = '{"valeur": ":id", "prenom": "guillaume"}' WHERE c2=2;
5UPDATE t5 SET c1 = '{"valeur": ":id2", "prenom": "adrien"}' WHERE c2=1;

That we execute for 60 seconds with recheck_on_update=on (default):

 1pgbench -f test.sql -n -c6 -T 120
 2transaction type: test.sql
 3scaling factor: 1
 4query mode: simple
 5number of clients: 6
 6number of threads: 1
 7duration: 120 s
 8number of transactions actually processed: 2743163
 9latency average = 0.262 ms
10tps = 22859.646914 (including connections establishing)
11tps = 22859.938191 (excluding connections establishing)
12
13 \dt+ t5*
14                    List of relations
15 Schema | Name | Type  |  Owner   |  Size  | Description
16--------+------+-------+----------+--------+-------------
17 public | t5   | table | postgres | 376 kB |
18(1 row)
19\di+ t5*
20                           List of relations
21 Schema |    Name     | Type  |  Owner   | Table | Size  | Description
22--------+-------------+-------+----------+-------+-------+-------------
23 public | t5_c2_idx   | index | postgres | t5    | 16 kB |
24 public | t5_expr_idx | index | postgres | t5    | 32 kB |
25(2 rows)
26
27SELECT * from pg_stat_user_tables where relname = 't5';
28-[ RECORD 1 ]-------+------------------------------
29relid               | 8890622
30schemaname          | public
31relname             | t5
32seq_scan            | 4
33seq_tup_read        | 0
34idx_scan            | 7999055
35idx_tup_fetch       | 7999055
36n_tup_ins           | 4
37n_tup_upd           | 7999055
38n_tup_del           | 0
39n_tup_hot_upd       | 7998236
40n_live_tup          | 2
41n_dead_tup          | 0
42n_mod_since_analyze | 0
43last_vacuum         |
44last_autovacuum     | 2018-09-19 06:29:37.690575+00
45last_analyze        |
46last_autoanalyze    | 2018-09-19 06:29:37.719911+00
47vacuum_count        | 0
48autovacuum_count    | 5
49analyze_count       | 0
50autoanalyze_count   | 5

Now with recheck_on_update=off. So, why the same data set as before but this time the index is created with this order: CREATE INDEX ON t5 ((c1->>'prenom')) WITH (recheck_on_update=off);

 1pgbench -f test.sql -n -c6 -T 120
 2transaction type: test.sql
 3scaling factor: 1
 4query mode: simple
 5number of clients: 6
 6number of threads: 1
 7duration: 120 s
 8number of transactions actually processed: 1065688
 9latency average = 0.676 ms
10tps = 8880.679565 (including connections establishing)
11tps = 8880.796478 (excluding connections establishing)
12
13\dt+ t5
14                    List of relations
15 Schema | Name | Type  |  Owner   |  Size   | Description
16--------+------+-------+----------+---------+-------------
17 public | t5   | table | postgres | 9496 kB |
18(1 row)
19
20\di+ t5*
21                           List of relations
22 Schema |    Name     | Type  |  Owner   | Table |  Size  | Description
23--------+-------------+-------+----------+-------+--------+-------------
24 public | t5_c2_idx   | index | postgres | t5    | 768 kB |
25 public | t5_expr_idx | index | postgres | t5    | 58 MB  |
26(2 rows)
27
28select * from pg_stat_user_tables where relname = 't5';
29-[ RECORD 1 ]-------+------------------------------
30relid               | 8890635
31schemaname          | public
32relname             | t5
33seq_scan            | 2
34seq_tup_read        | 0
35idx_scan            | 2131376
36idx_tup_fetch       | 2131376
37n_tup_ins           | 2
38n_tup_upd           | 2131376
39n_tup_del           | 0
40n_tup_hot_upd       | 19
41n_live_tup          | 2
42n_dead_tup          | 0
43n_mod_since_analyze | 0
44last_vacuum         |
45last_autovacuum     | 2018-09-19 06:34:42.045905+00
46last_analyze        |
47last_autoanalyze    | 2018-09-19 06:34:42.251183+00
48vacuum_count        | 0
49autovacuum_count    | 3
50analyze_count       | 0
51autoanalyze_count   | 3
recheck_on_update on off Gain
TPS 22859 8880 157%
t5 size 376 kB 9496 kB -96%
t5_c2_idx size 16 kB 768 kB -98%
t5_expr_idx size 32 kB 58 MB -99.9%

The performance difference is quite impressive, as well as the size of the tables and indexes.

I did the first test again by disabling the autovacuum and here is the result:

 1pgbench -f test.sql -n -c6 -T 120
 2transaction type: test.sql
 3scaling factor: 1
 4query mode: simple
 5number of clients: 6
 6number of threads: 1
 7duration: 120 s
 8number of transactions actually processed: 2752479
 9latency average = 0.262 ms
10tps = 22937.271749 (including connections establishing)
11tps = 22937.545872 (excluding connections establishing)
12
13
14select * from pg_stat_user_tables where relname = 't5';
15-[ RECORD 1 ]-------+--------
16relid               | 8890643
17schemaname          | public
18relname             | t5
19seq_scan            | 2
20seq_tup_read        | 0
21idx_scan            | 5504958
22idx_tup_fetch       | 5504958
23n_tup_ins           | 2
24n_tup_upd           | 5504958
25n_tup_del           | 0
26n_tup_hot_upd       | 5504258
27n_live_tup          | 2
28n_dead_tup          | 2416
29n_mod_since_analyze | 5504960
30last_vacuum         |
31last_autovacuum     |
32last_analyze        |
33last_autoanalyze    |
34vacuum_count        | 0
35autovacuum_count    | 0
36analyze_count       | 0
37autoanalyze_count   | 0
38
39\di+ t5*
40List of relations
41-[ RECORD 1 ]------------
42Schema      | public
43Name        | t5_c2_idx
44Type        | index
45Owner       | postgres
46Table       | t5
47Size        | 16 kB
48Description |
49-[ RECORD 2 ]------------
50Schema      | public
51Name        | t5_expr_idx
52Type        | index
53Owner       | postgres
54Table       | t5
55Size        | 40 kB
56Description |
57
58\dt+ t5
59List of relations
60-[ RECORD 1 ]---------
61Schema      | public
62Name        | t5
63Type        | table
64Owner       | postgres
65Size        | 1080 kB
66Description |

Then the second test:

 1pgbench -f test.sql -n -c6 -T 120
 2transaction type: test.sql
 3scaling factor: 1
 4query mode: simple
 5number of clients: 6
 6number of threads: 1
 7duration: 120 s
 8number of transactions actually processed: 881434
 9latency average = 0.817 ms
10tps = 7345.208875 (including connections establishing)
11tps = 7345.304797 (excluding connections establishing)
12
13select * from pg_stat_user_tables where relname = 't5';
14-[ RECORD 1 ]-------+--------
15relid               | 8890651
16schemaname          | public
17relname             | t5
18seq_scan            | 2
19seq_tup_read        | 0
20idx_scan            | 1762868
21idx_tup_fetch       | 1762868
22n_tup_ins           | 2
23n_tup_upd           | 1762868
24n_tup_del           | 0
25n_tup_hot_upd       | 23
26n_live_tup          | 2
27n_dead_tup          | 1762845
28n_mod_since_analyze | 1762870
29last_vacuum         |
30last_autovacuum     |
31last_analyze        |
32last_autoanalyze    |
33vacuum_count        | 0
34autovacuum_count    | 0
35analyze_count       | 0
36autoanalyze_count   | 0
37
38\di+ t5*
39List of relations
40-[ RECORD 1 ]------------
41Schema      | public
42Name        | t5_c2_idx
43Type        | index
44Owner       | postgres
45Table       | t5
46Size        | 600 kB
47Description |
48-[ RECORD 2 ]------------
49Schema      | public
50Name        | t5_expr_idx
51Type        | index
52Owner       | postgres
53Table       | t5
54Size        | 56 MB
55Description |
56
57\dt+ t5*
58List of relations
59-[ RECORD 1 ]---------
60Schema      | public
61Name        | t5
62Type        | table
63Owner       | postgres
64Size        | 55 MB
65Description |
recheck_on_update on off Gain
TPS 22937 7345 212%
t5 size 1080 kB 55 MB -98%
t5_c2_idx size 16 kB 600 kB -97%
t5_expr_idx size 40 kB 56 MB -99.9%

Once again, the performance gap is significant, as is the size of tables and indexes. We also note the importance of leaving the autovacuum activated.

Why do we have such a large difference on the indexes and the table?

For the indexes, this is due to the mechanism explained above. Postgres was able to chain the records by avoiding updating the index. The index has nevertheless slightly increased in size, it may happen that postgres cannot make a HOT. For example when there is no more space in the block.

As for the size of the table, during the test with autovacuum activated, the autovacuum had more difficulty to pass on the table with the HOT disabled. The index growing, it resulted in more “work”. During the test without autovacuum, the difference is explained by the fact that even a simple SELECT can prune blocs2.

Remember that this feature was removed with version 11.1. I wrote these articles shortly after the release of version 11.0 and I still choose to publish them in order to explain how UPDATES HOT works. Hopefully this feature will be fixed in future versions.