Speeding up Query Execution
Andres is a PostgreSQL committer and developer, focusing on replication, scalability and robustness. Before Citus Data he worked as a PostgreSQL developer and consultant at 2ndQuadrant and as a freelance consultant in the areas of databases and software engineering. He has been developing Postgres and other open source projects since 2005. In his free time he enjoys climbing, diving and reading paper books.
PostgreSQL query execution performance is pretty good for many typical OLTP-type queries (i.e. individually short and concurrently issued), but performance for queries processing large amounts of data (e.g. OLAP queries) is sometimes a bit lacking. Some of those concerns can be addressed by increasing parallelism, and work on that is in progress, with the first queries being parallelized in PostgreSQL 9.6. But even so, single-process performance remains very important, both for overall query speed and for efficient use of resources.

Among the prominent issues leading to inadequate performance:
- expression evaluation (e.g. WHERE (x*2) < 3) and projection are performed in a recursive manner
- interpreting on-disk tuples is not particularly fast, and is done at a very high frequency
- some of the data structures used are not appropriate for current hardware, despite being good ideas in years past
- tuples are processed one-by-one, from the bottom of the plan-tree to the top. That leads to poor cache locality and redundantly executed code.

In this talk I'll discuss a few of these problems and their solutions. Some of them concern recently integrated changes, others proposed changes. E.g.
- integration of a newer hashtable design for hash aggregates
- a faster expression evaluation and projection design, using an opcode-dispatch design
- using just-in-time compilation to speed up critical parts of query execution
- rewriting query execution to handle batches of tuples

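To make the contrast between recursive evaluation and opcode dispatch concrete, here is a minimal sketch in Python. It is not PostgreSQL's code (which is written in C and structured differently); the names `Var`, `Const`, `BinOp`, `compile_expr`, `eval_steps` and the stack-based design are all assumptions made for illustration. It evaluates the abstract's example expression `(x*2) < 3` both ways: once by walking the expression tree for every row, and once by flattening the tree a single time into a linear list of steps that a simple loop then dispatches on.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical expression-tree node types (illustration only).
@dataclass
class Var:
    name: str


@dataclass
class Const:
    value: int


@dataclass
class BinOp:
    op: str  # "*" or "<"
    left: "Expr"
    right: "Expr"


Expr = Union[Var, Const, BinOp]


def eval_recursive(node, row):
    """Recursive evaluation: one nested call per tree node, for every row."""
    if isinstance(node, Var):
        return row[node.name]
    if isinstance(node, Const):
        return node.value
    left = eval_recursive(node.left, row)
    right = eval_recursive(node.right, row)
    if node.op == "*":
        return left * right
    if node.op == "<":
        return left < right
    raise ValueError(f"unsupported operator: {node.op}")


# Opcodes for the flattened form (names are assumptions).
OP_VAR, OP_CONST, OP_MUL, OP_LT = range(4)
BINARY_OPCODES = {"*": OP_MUL, "<": OP_LT}


def compile_expr(node, steps=None):
    """Flatten the tree once into a post-order list of (opcode, argument)."""
    if steps is None:
        steps = []
    if isinstance(node, Var):
        steps.append((OP_VAR, node.name))
    elif isinstance(node, Const):
        steps.append((OP_CONST, node.value))
    else:
        compile_expr(node.left, steps)
        compile_expr(node.right, steps)
        steps.append((BINARY_OPCODES[node.op], None))
    return steps


def eval_steps(steps, row):
    """Opcode dispatch: a single flat loop over the steps, no recursion."""
    stack = []
    for opcode, arg in steps:
        if opcode == OP_VAR:
            stack.append(row[arg])
        elif opcode == OP_CONST:
            stack.append(arg)
        elif opcode == OP_MUL:
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif opcode == OP_LT:
            right = stack.pop()
            left = stack.pop()
            stack.append(left < right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


# WHERE (x*2) < 3
expr = BinOp("<", BinOp("*", Var("x"), Const(2)), Const(3))
program = compile_expr(expr)  # done once per query, not once per row
rows = [{"x": 0}, {"x": 1}, {"x": 2}]
print([eval_steps(program, r) for r in rows])  # [True, True, False]
```

In Python the two variants will not differ much in speed; the point is the shape of the work. The flat form does the tree walk once up front, so the per-row cost is a tight loop over a small array, which in a C implementation is friendlier to the CPU's branch predictor and caches than a chain of nested function calls.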
- 50 min
- PGConf US 2017