Good time to test io_method (for Postgres 18)
We’re now in the “feature freeze” phase of Postgres 18 development. That means no new features will get in - only bugfixes and cleanups of already committed changes. The goal is to test and stabilize the code before a release. PG 18 beta1 was released a couple days ago, so it’s a perfect time to do some testing and benchmarking.
One of the fundamental changes in PG 18 is going to be support for asynchronous I/O. And with beta1 out, it’s the right time to run your tests and benchmarks against this new feature, both to check correctness and to catch performance regressions.
Until now Postgres did only synchronous file I/O, using the traditional API read(), write() and a couple variants. A couple places also did explicit prefetching by calling posix_fadvise, but I wouldn’t count that as proper async I/O. For a long time this worked well enough, but we’ve been hitting more and more limitations of synchronous I/O.
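To make the distinction concrete, here is a minimal Python sketch of that older pattern: a prefetch hint issued through posix_fadvise, followed by an ordinary blocking read. It only illustrates the idea and is not how Postgres implements it (Postgres is written in C). The function name, the prefetch distance and the file layout are assumptions made up for this example; the 8kB block size matches the Postgres default.

```python
import os

BLOCK_SIZE = 8192  # default Postgres block size


def read_blocks_with_prefetch(path, block_numbers, distance=4):
    """Read blocks one by one, hinting the kernel about blocks needed soon.

    Hypothetical helper: the hint is advisory only, and every read below
    is still synchronous, so the caller waits for each block in turn.
    """
    can_prefetch = hasattr(os, "posix_fadvise")  # not available everywhere
    fd = os.open(path, os.O_RDONLY)
    blocks = []
    try:
        for i, blkno in enumerate(block_numbers):
            # Tell the kernel which block we will want `distance` steps ahead.
            if can_prefetch and i + distance < len(block_numbers):
                ahead = block_numbers[i + distance]
                os.posix_fadvise(fd, ahead * BLOCK_SIZE, BLOCK_SIZE,
                                 os.POSIX_FADV_WILLNEED)
            # The actual read blocks until the data is available.
            blocks.append(os.pread(fd, BLOCK_SIZE, blkno * BLOCK_SIZE))
    finally:
        os.close(fd)
    return blocks
```

The hint may let the kernel start loading data into the page cache early, but the process never learns whether or when that happened, and it still has to issue the blocking read. That is why it hardly counts as async I/O.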
Lukas Fittl published a really nice introduction, explaining the basics of how it works, why, how to collect and interpret stats, and so on. It’s a really nice explanation, clearer than I could write - and now I don’t have to. Go read it.
I’m not sure I’d describe async I/O primarily as a performance improvement, because its other goals are more about development and internals. But users will probably see it that way.
However, it also brings some new configuration options:
io_method - How is the async I/O actually handled?
io_workers - Number of I/O workers for io_method = worker.
The question is what should be the defaults for these parameters, and we need some help with that.
The io_method has three possible values:
sync - Regular sync I/O with prefetching using fadvise, with each backend performing its own I/O. This is the “backwards compatibility” choice; it’s as close to PG17 as possible.
worker - There’s a pool of I/O workers performing the I/O. Backends queue requests, workers handle them and notify the backends. The number of workers is set by io_workers, by default 3.
io_uring - Uses liburing to pass I/O requests to the kernel io_uring interface. This is the most modern option, but also optional (requires --with-liburing).
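As a rough illustration of what a test matrix over these options might look like, the Python sketch below enumerates the combinations and renders each one as postgresql.conf lines. The helper names and the particular io_workers counts are assumptions picked for the example, not recommendations; only the parameter names and the three io_method values come from the text above.

```python
IO_METHODS = ("sync", "worker", "io_uring")


def config_matrix(worker_counts=(3, 8, 16)):
    """Return the list of settings to try (hypothetical helper).

    io_workers only matters for io_method = worker, so it is varied
    only for that method. The counts are arbitrary example values.
    """
    configs = []
    for method in IO_METHODS:
        if method == "worker":
            for count in worker_counts:
                configs.append({"io_method": method, "io_workers": count})
        else:
            configs.append({"io_method": method})
    return configs


def as_conf_lines(config):
    """Render one configuration as postgresql.conf lines."""
    return "\n".join(f"{name} = {value}" for name, value in config.items())


if __name__ == "__main__":
    for config in config_matrix():
        print(as_conf_lines(config))
        print()
```

Keep in mind that io_uring is only usable on builds configured with --with-liburing, so that entry needs to be skipped elsewhere.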
There is no “correct” default value. Each option has its pros and cons, and may work better or worse depending on the hardware, workload, etc. For now the default is worker with 3 workers, but that’s temporary. We wanted to test with the “properly” asynchronous methods, and io_uring is optional (and not available on some platforms).
We’ll need to pick the actual default in a couple months, perhaps in July or so.
And this is where you can help! Do you have some sort of test suite for your application (running on Postgres)? Even better if you have some benchmarks with I/O intensive queries, or less common hardware.
Please try running that on PG 18 with different io_method and io_workers values, and let us know. Did it work? Was it faster or slower?
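When reporting results, it helps to summarize repeated runs instead of quoting a single timing. Here is a small Python sketch of one way to do that, assuming you have already collected run times (in seconds) per configuration. The function name, the labels and the choice of sync as the baseline are assumptions for illustration; how you collect the timings is up to your own benchmark.

```python
from statistics import median


def compare_to_baseline(timings, baseline="sync"):
    """Summarize benchmark runs relative to a baseline configuration.

    `timings` maps a configuration label (e.g. "sync", "worker/3") to a
    list of run times in seconds. Returns, for each label, a tuple of
    (median seconds, percent change vs. the baseline median). A negative
    change means the configuration was faster than the baseline.
    """
    base = median(timings[baseline])
    report = {}
    for label, runs in timings.items():
        med = median(runs)
        report[label] = (med, (med - base) / base * 100.0)
    return report
```

The median is used here because it is less sensitive to a single outlier run than the mean. If the runs vary a lot, include the raw timings in the report too.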
Perhaps you have other arguments for picking a particular default value? For example, worker has the benefit that the worker processes are responsible not just for the I/O, but also for tasks like checksum verification. PG 18 enables checksums by default for new clusters, and the verification costs a bit of CPU time.
Conclusions
Help us with stabilizing PG 18, and with figuring out the defaults for io_method and io_workers. Install the beta1, and run as many tests and benchmarks as possible. Ideally, you’d have some application-specific test/benchmark. Now is a great time to run it on PG 18 beta1, and report the interesting results to pgsql-hackers.