Exciting to see Quokka at the top of Hacker News (written by Ziheng Wang).
In ~1000 lines of Python, Quokka is a high-performance, fault-tolerant query engine built on
1⃣ Ray (@raydistributed) - distributed execution
2⃣ Polars - fast dataframes
3⃣ Arrow (@ApacheArrow) - fast I/O
We tried to design #Ray to be as flexible as possible, which makes it possible to build not only scalable applications with Ray, but also entire scalable systems and products on top of it.
@raydistributed's flexibility comes from separating functionality into two layers
✅ Lower-level core APIs for scalable Python (tasks and actors).
✅ Higher-level libraries built on top of the core APIs (for scaling data ingest, deep learning, serving, etc).
The lower-level APIs are very general (because they are lower level) and can be used to scale arbitrary Python code.
The higher-level libraries are performant and easier to use out of the box for specific use cases (data ingest, training, inference, serving, etc).
I think Python is so successful in large part due to its great library ecosystem.
Similarly, @raydistributed will thrive because of its rich ecosystem of scalable libraries and applications (and it's critical that they interoperate well).
Also, a gem from the documentation (which has yet to be updated):
"Quokka is not fault tolerant, though it will be by the end of 2022. This is how I intend to be collecting my PhD, so you can be pretty darn sure it will happen."