I'm learning about PostgreSQL's clustering abilities, and I would like to compare the performance of the same query against a table before and after it has been clustered.
I generated 25 million user events and ran the query before and after clustering.
Yet running EXPLAIN ANALYSE doesn't give consistent results between runs in either case, so it's hard to compare the values. Before clustering the query takes ~100-200ms, and after clustering the timing is about the same, though I do see Heap Fetches: 0 in that case.
My question is: how do I benchmark the query before and after clustering so that I can analyze the difference? Are there any tools that allow me to do that? Maybe I can collect stats from multiple query runs and visualize the percentiles in each case?
I have seen that it's possible to collect the sum of the values and compare that, but is there a way to get percentiles? Maybe you use some data visualization tools for that?
`Heap Fetches: 0` could mean your test ended up being satisfied by an index-only scan, for example. In that case it doesn't matter whether your table is freshly clustered or a completely unordered mess full of unvacuumed tuples, because the table isn't being scanned at all. Everything came from the index, which is a separate object whose entire point is to keep things ordered inside it.
`EXPLAIN ANALYZE` is the right tool; it's just that not all queries benefit from their target table being `CLUSTER`ed. If yours can get everything it needs from the index, that's actually better. If you're certain `CLUSTER` should help, make sure the table is `ANALYZE`d after clustering, before you start your tests. Once your measurements start making sense, move on to `pgbench`.
Use `explain (analyze, verbose, buffers, settings)`, and share the results if you need help with it.
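
On the percentile question: here is a minimal sketch of one way to summarize repeated runs, assuming you have already captured the text output of `EXPLAIN ANALYZE` for each run (from psql or a driver of your choice). The helper names `execution_time_ms` and `summarize` are made up for this illustration, and the only thing it relies on is the `Execution Time: ... ms` line that `EXPLAIN ANALYZE` prints at the end of a plan.

```python
import re
import statistics

# Matches the "Execution Time: 123.456 ms" line at the end of EXPLAIN ANALYZE output.
EXEC_TIME_RE = re.compile(r"Execution Time:\s*([0-9.]+)\s*ms")


def execution_time_ms(explain_output: str) -> float:
    """Extract the reported execution time (in ms) from one EXPLAIN ANALYZE output."""
    match = EXEC_TIME_RE.search(explain_output)
    if match is None:
        raise ValueError("no 'Execution Time' line found in the plan text")
    return float(match.group(1))


def summarize(times_ms):
    """Return min, p50, p95, p99 and max for a list of timings (needs >= 2 values)."""
    # quantiles(n=100) returns 99 cut points; cut point i-1 is the i-th percentile.
    cuts = statistics.quantiles(times_ms, n=100, method="inclusive")
    return {
        "min": min(times_ms),
        "p50": cuts[49],
        "p95": cuts[94],
        "p99": cuts[98],
        "max": max(times_ms),
    }
```

Run the query a few dozen times before clustering and again after, feed each set of timings to `summarize`, and compare the two dictionaries. Discarding the first few runs of each set is probably worthwhile, since they tend to reflect a cold cache rather than the table layout.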