Tomas Vondra

Tomas Vondra

blog about Postgres code and community

Using JWT to establish a trusted context for RLS

Row-level security (RLS) is a great feature. It allows restricting access to rows by applying filters defined by a policy. It’s a tool useful for cases when the data set can’t be split into separate databases. Sadly, using RLS may be quite cumbersome. RLS requires some sort of “trusted context” for the RLS policies. The policies need to filter using data the user can’t change. If the filter uses some sort of “tenant ID”, and the user can change it to an ...

Fun and weirdness with SSDs

When I started working with Postgres (or databases in general) 25 years ago, storage systems looked very different. All storage was “spinning rust” - rotational disks with various interfaces (SATA/SAS/…) and speeds (7.2K/10k/15k/…). The spindle speed was the main performance determining feature, and everyone knew what IOPS and bandwidth to expect from a disk. The general behavior was pretty much the same. With SSDs it’s more complicated. The interface may be the s...

So why don't we pick the optimal query plan?

Last week I posted about how we often don’t pick the optimal plan. I got asked about difficulties when trying to reproduce my results, so I’ll address that first (I forgot to mention a couple details). I also got questions about how to best spot this issue, and ways to mitigate this. I’ll discuss that too, although I don’t have any great solutions, but I’ll briefly discuss a couple possible planner/executor improvements that might allow handling this better.

How often is the query plan optimal?

The basic promise of a query optimizer is that it picks the “optimal” query plan. But there’s a catch - the plan selection relies on cost estimates, calculated from selectivity estimates and cost of basic resources (I/O, CPU, …). So the question is, how often do we actually pick the “fastest” plan? And the truth is we actually make mistakes quite often. Consider the following chart, with durations of a simple SELECT query with a range condition. The condition...

Benchmarking is hard, sometimes ...

I do a fair number of benchmarks, not only to validate patches, but also to find interesting (suspicious) stuff to improve. It’s an important part of my development workflow. And it’s fun ;-) But we’re dealing with complex systems (hardware, OS, DB, application), and that brings challenges. Every now and then I run into something that I don’t quite understand. Consider a read-only pgbench, the simplest workload there is, with a single SELECT doing lookup by PK. If you do ...