June 30, 2022
Qbeast
A few days ago I ran into Qbeast, an open-source project built on top of Delta Lake that I needed to dig into.
This introductory post explains it quite well: https://qbeast.io/qbeast-format-enhanced-data-lakehouse/
The project is quite good, and since everything is documented it seems helpful if you need to write your own custom data source. And since I'm in love with note-taking, I want to dig into the following three topics:
- Explaining how the format works (including optimizations)
- Describing how the sampling pushdown is implemented
- Understanding the table tolerance

1. Qbeast format
This would be better explained with diagrams. Remember Delta Lake? We had a _delta_log folder with log files pointing to the data files. Now Qbeast has extended this _delta_log and added some new properties.
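To make the "files pointing to files" idea concrete, here is a minimal sketch that walks a table's _delta_log folder and lists the data files it points to. It relies only on the Delta Lake convention that each commit is a JSON file with one action per line, and it assumes any extra per-file properties live in the optional `tags` map of each `add` action. The helper name `list_added_files` is made up for illustration, and the sketch ignores `remove` actions and Parquet checkpoints, so it is not a faithful reader of the table state.

```python
import json
from pathlib import Path


def list_added_files(table_path):
    """Return (path, tags) for every `add` action found in the table's _delta_log.

    Hypothetical helper for illustration: it does not replay `remove`
    actions or read checkpoints, so it may list files that are no longer live.
    """
    log_dir = Path(table_path) / "_delta_log"
    added = []
    # Commit files are named with a zero-padded version, so a plain sort
    # visits them in commit order.
    for commit in sorted(log_dir.glob("*.json")):
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                add = action["add"]
                # `tags` is optional in the log; fall back to an empty dict.
                added.append((add["path"], add.get("tags") or {}))
    return added
```

Calling `list_added_files("/path/to/table")` on a plain Delta table would typically return empty tag maps, while a table with extended metadata would show the extra properties next to each file path.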