Good time to test io_method (for Postgres 18)

We’re now in the “feature freeze” phase of Postgres 18 development. That means no new features will get in - only bugfixes and cleanups of already committed changes. The goal is to test and stabilize the code before a release. PG 18 beta1 was released a couple days ago, so it’s a perfect time to do some testing and benchmarking.

One of the fundamental changes in PG 18 is going to be support for asynchronous I/O. With beta1 out, it’s the right time to run your tests and benchmarks against this new feature, both to check correctness and to look for regressions.

Until now Postgres did only synchronous file I/O, using the traditional API read(), write() and a couple variants. A couple places also did explicit prefetching by calling posix_fadvise, but I wouldn’t count that as proper async I/O. For a long time this worked well enough, but we’ve been hitting more and more limitations of synchronous I/O.

Lukas Fittl published a really nice introduction, explaining the basics of how it works, why, how to collect and interpret stats, and so on. It’s a really nice explanation, clearer than I could write - and now I don’t have to. Go read it.

I’m not sure I’d describe async I/O primarily as a performance improvement. But its other goals are mostly about development and internals, so performance is probably how users will see it.

However, it also brings some new configuration options:

io_method - How is the async I/O actually handled?
io_workers - Number of I/O workers for io_method = worker.

The question is what the defaults for these parameters should be, and we need some help with that.

The io_method has three possible values:

sync - Regular sync I/O with prefetching using fadvise, where each backend performs its own I/O. This is the “backwards compatibility” choice; it’s as close to PG17 as possible.
worker - There’s a pool of I/O workers performing the I/O. Backends queue requests, workers handle them and notify the backends. The number of workers is set by io_workers, by default 3.
io_uring - Uses liburing to pass I/O requests to the kernel io_uring interface. This is the most modern option, but also optional (requires --with-liburing).
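To make the three options concrete, here is a minimal Python sketch that enumerates configurations worth trying and renders each one as postgresql.conf lines. The helper names (config_matrix, as_conf) and the list of worker counts are my own illustrative choices, not anything shipped with Postgres; only the parameter names and values come from the list above.

```python
from typing import Iterator

# The three io_method values described above.
IO_METHODS = ("sync", "worker", "io_uring")


def config_matrix(worker_counts=(1, 3, 8, 16)) -> Iterator[dict]:
    """Yield one dict of settings per configuration to try.

    worker_counts is an arbitrary illustrative choice (3 is the current
    default); pick values that make sense for your hardware.
    """
    for method in IO_METHODS:
        if method == "worker":
            # io_workers only matters for io_method = worker.
            for n in worker_counts:
                yield {"io_method": method, "io_workers": n}
        else:
            yield {"io_method": method}


def as_conf(settings: dict) -> str:
    """Render a settings dict as postgresql.conf lines."""
    return "\n".join(f"{name} = {value}" for name, value in settings.items())


for settings in config_matrix():
    print(as_conf(settings))
    print()
```

Each printed group can be pasted into postgresql.conf. As far as I know, a change to io_method takes effect only after a server restart, and io_uring is accepted only on builds that include liburing support.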

There is no “correct” default value. Each option has its pros and cons, and may work better or worse depending on the hardware, workload, etc. For now the default is worker with 3 workers, but that’s temporary. We wanted to test with the “properly” asynchronous methods, and io_uring is optional (and not available on some platforms).

We’ll need to pick the actual default in a couple months, perhaps in July or so.

And this is where you can help! Do you have some sort of test suite for your application (running on Postgres)? Even better if you have some benchmarks with I/O intensive queries, or less common hardware.

Please try running that on PG 18 with different io_method and io_workers values, and let us know. Did it work? Was it faster or slower?
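When reporting “faster or slower”, it helps to express each configuration relative to a baseline. The following is a small sketch, assuming you have collected several wall-clock runtimes per configuration yourself; the function names and the label format (for example "worker/3") are hypothetical, and sync is used as the baseline only because it is the closest to PG17.

```python
from statistics import median


def compare_to_baseline(timings: dict, baseline: str = "sync") -> dict:
    """Return each configuration's median runtime divided by the baseline's.

    timings maps a label (e.g. "worker/3") to a list of runtimes in seconds.
    A ratio below 1.0 means faster than the baseline, above 1.0 slower.
    """
    if baseline not in timings:
        raise KeyError(f"no timings for baseline {baseline!r}")
    base = median(timings[baseline])
    return {label: median(runs) / base for label, runs in timings.items()}


def report(ratios: dict) -> str:
    """Format the ratios as one line per configuration, fastest first."""
    lines = []
    for label, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
        lines.append(f"{label:<12} {ratio:.2f}x")
    return "\n".join(lines)
```

Calling report(compare_to_baseline(timings)) on your own measurements gives a short table to include in a report. The median is used here to dampen the effect of a single noisy run; it is a design choice of this sketch, not a recommendation from the Postgres project.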

Perhaps you have other arguments for picking a particular default value? For example, worker has the benefit that the worker processes are responsible not just for the I/O, but also for tasks like checksum verification. PG 18 enables checksums by default for new clusters, and the verification costs a bit of CPU time.

Conclusions

Help us with stabilizing PG 18, and with figuring out the defaults for io_method and io_workers. Install beta1, and run as many tests and benchmarks as possible. Ideally, you’d have some application-specific test or benchmark. Now is a great time to run it on PG 18 beta1, and report any interesting results to pgsql-hackers.

Do you have feedback on this post? Please reach out by e-mail to tomas@vondra.me.

Tomas Vondra
