Tomas Vondra

Tomas Vondra

blog about Postgres code and community

Tuning AIO in PostgreSQL 18

PostgreSQL 18 was stamped earlier this week, and as usual there’s a lot of improvements. One of the big architectural changes is asynchronous I/O (AIO), allowing asynchronous scheduling of I/O, giving the database more control and better utilizing the storage. I’m not going to explain how AIO works, or present detailed benchmark results. There have been multiple really good blog posts about that. There’s also a great talk from pgconf.dev 2025 about AIO, and a recent “Talk...

Using JWT to establish a trusted context for RLS

Row-level security (RLS) is a great feature. It allows restricting access to rows by applying filters defined by a policy. It’s a tool useful for cases when the data set can’t be split into separate databases. Sadly, using RLS may be quite cumbersome. RLS requires some sort of “trusted context” for the RLS policies. The policies need to filter using data the user can’t change. If the filter uses some sort of “tenant ID”, and the user can change it to an ...

Fun and weirdness with SSDs

When I started working with Postgres (or databases in general) 25 years ago, storage systems looked very different. All storage was “spinning rust” - rotational disks with various interfaces (SATA/SAS/…) and speeds (7.2K/10k/15k/…). The spindle speed was the main performance determining feature, and everyone knew what IOPS and bandwidth to expect from a disk. The general behavior was pretty much the same. With SSDs it’s more complicated. The interface may be the s...

So why don't we pick the optimal query plan?

Last week I posted about how we often don’t pick the optimal plan. I got asked about difficulties when trying to reproduce my results, so I’ll address that first (I forgot to mention a couple details). I also got questions about how to best spot this issue, and ways to mitigate this. I’ll discuss that too, although I don’t have any great solutions, but I’ll briefly discuss a couple possible planner/executor improvements that might allow handling this better.

How often is the query plan optimal?

The basic promise of a query optimizer is that it picks the “optimal” query plan. But there’s a catch - the plan selection relies on cost estimates, calculated from selectivity estimates and cost of basic resources (I/O, CPU, …). So the question is, how often do we actually pick the “fastest” plan? And the truth is we actually make mistakes quite often. Consider the following chart, with durations of a simple SELECT query with a range condition. The condition...