jli/brin-overlap

Postgres BRIN index visualization

This repo contains code for visualizing what your timestamp BRIN index looks like. It generates plots that look like this:

example BRIN visualization

Why visualize BRIN indexes?

BRIN indexes have pitfalls. They are best suited to append-only tables whose timestamp column is monotonically non-decreasing. Deleting rows, or inserting rows out of order or concurrently, can produce pathological indexes that cause a lot of unnecessary data to be read. See: https://blog.crunchydata.com/blog/avoiding-the-pitfalls-of-brin-indexes-in-postgres

Ideal BRIN index

An ideal BRIN index has no overlapping block ranges: each timestamp value maps to exactly 1 block range. When querying for a time span, the extra data read is at most 2 block ranges, one at each end of the span (up to 2 MB, with the default pages_per_range of 128 and block size of 8 kB). An ideal BRIN example:
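The 2 MB figure follows directly from the defaults. A quick check of the arithmetic, assuming the start and end of the queried time span each fall partway into one block range:

```python
# Worst-case extra data read for an ideal (non-overlapping) BRIN index.
PAGES_PER_RANGE = 128     # Postgres default for the BRIN pages_per_range setting
BLOCK_SIZE_BYTES = 8192   # default Postgres block size (8 kB)

range_bytes = PAGES_PER_RANGE * BLOCK_SIZE_BYTES  # size of one block range
extra_bytes = 2 * range_bytes                     # one partial range at each end

print(f"one block range: {range_bytes // 1024} kB")
print(f"worst-case extra read: {extra_bytes // 1024**2} MB")
```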

example of ideal BRIN index

Problematic BRIN index

Overlapping isn't necessarily bad, as long as each block range still covers a small time window. The real problem is when your block ranges are (1) heavily overlapping and (2) cover large time spans. The following is an example from a real database:

example BRIN visualization with overlaps

Here, there's a large stack of block ranges from ~2020-01 to 2020-03. Queries for a single day's data in this date range will match many block ranges, resulting in a lot of unnecessary data being read. This will show up in EXPLAIN ANALYZE as a high Rows Removed by Index Recheck value (example EXPLAIN on a bad BRIN).
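To make the cost concrete, here is a minimal sketch of how a one-day query ends up matching several wide block ranges. The block numbers and timestamps are made up for illustration, and `matching_ranges` is a hypothetical helper, not code from this repo:

```python
from datetime import datetime

def matching_ranges(block_ranges, query_start, query_end):
    """Return block ranges whose [min, max] interval intersects the query interval."""
    return [
        (blknum, lo, hi)
        for blknum, lo, hi in block_ranges
        if lo <= query_end and hi >= query_start
    ]

# Hypothetical data: (first block number, min timestamp, max timestamp)
block_ranges = [
    (0,   datetime(2020, 1, 1), datetime(2020, 1, 3)),
    (128, datetime(2020, 1, 2), datetime(2020, 3, 1)),   # wide, overlapping
    (256, datetime(2020, 1, 5), datetime(2020, 2, 20)),  # wide, overlapping
    (384, datetime(2020, 3, 2), datetime(2020, 3, 4)),
]

# A query for a single day inside the overlapping stack
hits = matching_ranges(block_ranges, datetime(2020, 2, 1), datetime(2020, 2, 2))
print([blknum for blknum, _, _ in hits])
```

Every matched block range has to be read and rechecked, even if only a few of its rows actually fall on the queried day.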

How to use these tools

Export BRIN internals with export_brin_items.sh

PGDB=your-db-hostname
PGUSER=your-db-user
PGPASSWORD=your-db-user-password
PGIDX=your-brin-index-name
./export_brin_items.sh

export_brin_items.sh uses the pageinspect extension to output a CSV containing the BRIN index's internals.

Note: it's expected that the script will output an error like: ERROR: block number N is out of range for relation "TABLE_idx". The script simply keeps querying until it hits this error.
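The "keep querying until it errors" pattern can be sketched as follows. This is an illustration only, not the script's actual code: `fetch_page` is a hypothetical callable standing in for a pageinspect query on one index page, `first_page` is an assumed parameter, and `LookupError` stands in for the database's "out of range" error.

```python
def export_all_pages(fetch_page, first_page=0):
    """Collect rows page by page until fetch_page signals the page is out of range."""
    rows = []
    page = first_page
    while True:
        try:
            rows.extend(fetch_page(page))
        except LookupError:  # stand-in for the "block number N is out of range" error
            break            # expected: we walked past the last page of the index
        page += 1
    return rows

# Fake three-page index, used in place of a real database connection
fake_index = {0: ["a", "b"], 1: ["c"], 2: ["d", "e"]}

def fake_fetch(page):
    if page not in fake_index:
        raise LookupError(f"block number {page} is out of range")
    return fake_index[page]

print(export_all_pages(fake_fetch))
```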

Compute and plot overlaps with bro_viz.py

./bro_viz.py -i brinexport_20210412_174650.csv

bro_viz.py reads the exported CSV from the previous step and generates an SVG plot showing the block ranges in the index.

For large tables with many block ranges, computing the overlap can take a while. bro_viz.py saves the intermediate BrinOverlap data as JSON. You can pass this JSON back to bro_viz.py as input if you'd like to tweak the rendering options (e.g., width or colormap) without recomputing the overlap.

Other tools

  • bro_relevant_blocks.py: Specify a date (-d) or date range (-d and -d2) and see what block ranges match.
  • bro_timespan_hist.py: Render a histogram of block range time spans.
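As a rough idea of what a time span histogram measures, the sketch below buckets block ranges by how many whole days each one covers. It is not the implementation in bro_timespan_hist.py; `span_histogram` and the sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta

def span_histogram(block_ranges, bucket=timedelta(days=1)):
    """Count block ranges by time span, in whole multiples of `bucket`."""
    counts = Counter()
    for lo, hi in block_ranges:
        counts[(hi - lo) // bucket] += 1  # timedelta // timedelta yields an int
    return dict(sorted(counts.items()))

# Hypothetical (min timestamp, max timestamp) pairs, one per block range
block_ranges = [
    (datetime(2020, 1, 1), datetime(2020, 1, 1, 12)),  # half a day
    (datetime(2020, 1, 1), datetime(2020, 1, 2, 6)),   # just over 1 day
    (datetime(2020, 1, 3), datetime(2020, 1, 4, 23)),  # almost 2 days
    (datetime(2020, 1, 1), datetime(2020, 1, 31)),     # 30 days: a problem range
]

hist = span_histogram(block_ranges)
for days, count in hist.items():
    print(f"{days:>3} day(s): {'#' * count}")
```

A healthy index piles up in the smallest buckets; a long tail of wide spans points at the problematic block ranges described above.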

Reference
