From time to time I get to investigate issues with some sort of a batch process. It’s getting more and more common that such processes use very high memory limits (maintenance_work_mem and work_mem). I suppose some DBAs follow the logic that “more is better”, not realizing it can hurt the performance quite a bit.
Let me demonstrate this using an example I ran across while testing a fix for parallel builds of GIN indexes. The bug is not particularly interesting or complex, but it required a fairly high value for maintenance_work_mem (the initial report used 20GB).
PostgreSQL 18 was stamped earlier this week, and as usual there’s a lot of improvements. One of the big architectural changes is asynchronous I/O (AIO), allowing asynchronous scheduling of I/O, giving the database more control and better utilizing the storage.
I’m not going to explain how AIO works, or present detailed benchmark results. There have been multiple really good blog posts about that. There’s also a great talk from pgconf.dev 2025 about AIO, and a recent “Talk...
Row-level security (RLS) is a great feature. It allows restricting access to rows by applying filters defined by a policy. It’s a tool useful for cases when the data set can’t be split into separate databases.
Sadly, using RLS may be quite cumbersome. RLS requires some sort of “trusted context” for the RLS policies. The policies need to filter using data the user can’t change. If the filter uses some sort of “tenant ID”, and the user can change it to an ...
When I started working with Postgres (or databases in general) 25 years ago, storage systems looked very different. All storage was “spinning rust” - rotational disks with various interfaces (SATA/SAS/…) and speeds (7.2K/10k/15k/…). The spindle speed was the main performance determining feature, and everyone knew what IOPS and bandwidth to expect from a disk. The general behavior was pretty much the same.
With SSDs it’s more complicated. The interface may be the s...
Last week I posted about how we often don’t pick the optimal plan. I got asked about difficulties when trying to reproduce my results, so I’ll address that first (I forgot to mention a couple details). I also got questions about how to best spot this issue, and ways to mitigate this. I’ll discuss that too, although I don’t have any great solutions, but I’ll briefly discuss a couple possible planner/executor improvements that might allow handling this better.