There are multiple tools to run benchmarks on Postgres, but pgbench is probably the most widely used one. The workload is very simple and perhaps a bit synthetic, but almost everyone is familiar with it and it’s a very convenient way to do quick tests and assessments. It was improved in various ways (e.g. to do partitioning), but the initial data load is still serial - only a single process does the COPY. Which annoys me - it may take a lot of time before I can start with the benchmarks it...
A couple days ago I had a bit of free time in the evening, and I was bored, so I decided to play with BOLT a little bit. No, not the dog from a Disney movie, the BOLT tool from LLVM project, aimed at optimizing binaries. It took me a while to get it working, but the results are unexpectedly good, in some cases up to 40%. So let me share my notes and benchmark results, and maybe there’s something we can learn from it. We’ll start by going through a couple rabbit holes first, though.
Time for yet another “first patch” idea post ;-) This time it’s about BRIN indexes. Postgres has a contrib module called amcheck, meant to check logical consistency of objects (tables and indexes). At the moment the module supports heap relations (i.e. tables) and B-Tree indexes (by far the most commonly used index type). There is a patch adding support for GiST and GIN indexes, and the idea is to also allow checking BRIN indexes.
I’ve submitted a lot of talk proposals to a lot of Postgres conferences over the years. Some got accepted, many more were not. And I’ve been on the other side of this process too, as a member of the CfP committee responsible for selecting talks. So let me give you a couple suggestions on how to write a good talk proposal.
Let me present another “first patch” idea, related to a runtime stats on access to files storing data. Having this kind of information would be very valuable on instances with many files (which can happen for many reasons).
This is a very different area than the patch idea, which was about an extension. The runtime stats are at the core of the system, and so is the interaction with the file systems. But it’s still fairly isolated, and thus suitable for new contributors.