Ok, perhaps let's pump the brakes for a second. Systems and database engineering isn't dead; this paper didn't just replace Postgres with a GPU and a copy of the deep learning book. (1/n)
arxiv-vanity.com/papers/1712.01…
I mean, what's been done here is fascinating (partly in its simplicity). Replacing B-Trees - which are functions from keys to ranges on disks - with a learned function is kind of cool.
The fact that the learned representation is more compact is very neat. But also it's not really a surprise that, given the entire dataset, we can construct a more compact function than a B-tree which is *designed* to support efficient updates.
The interesting part, to me, is the integration of this kind of computation with existing database engines: how we leverage GPUs for max parallelism, how we possibly integrate training into a transactional environment, etc.
That's systems engineering, right there.
But we must avoid the trap of extrapolating from this early success past all the future work still needed to make it practical.
The paper is at best speculative on issues of updates and concurrency. B-trees are still required where the model can't achieve the required accuracy - even in the static case.
The basic idea of an index that might be _wrong_ means there's still a place for computations of precise bounds to compensate: the paper doesn't throw away B-trees, and neither should you, yet.
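To make the "might be wrong, so bound it" idea concrete, here is a minimal sketch of my own, not the paper's implementation. It assumes a static sorted array and a single least-squares line as the model. The names `LearnedIndex`, `fit_linear` and `max_err` are hypothetical. The model guesses a position, and a binary search over the recorded worst-case error window does the precise part.

```python
import bisect


def fit_linear(keys):
    """Least-squares fit of position ~ slope * key + intercept over sorted keys."""
    n = len(keys)
    mean_key = sum(keys) / n
    mean_pos = (n - 1) / 2
    var = sum((k - mean_key) ** 2 for k in keys)
    if var == 0:  # all keys equal: fall back to a constant prediction
        return 0.0, mean_pos
    cov = sum((k - mean_key) * (i - mean_pos) for i, k in enumerate(keys))
    slope = cov / var
    return slope, mean_pos - slope * mean_key


class LearnedIndex:
    """Hypothetical static index: a linear model plus a bounded local search."""

    def __init__(self, keys):
        if not keys:
            raise ValueError("keys must be non-empty")
        self.keys = sorted(keys)
        self.slope, self.intercept = fit_linear(self.keys)
        # Worst-case prediction error over the stored keys. Because the data
        # is static, this bound holds for every key that is actually present.
        self.max_err = max(
            abs(self._predict(k) - i) for i, k in enumerate(self.keys)
        )

    def _predict(self, key):
        pos = int(round(self.slope * key + self.intercept))
        return min(max(pos, 0), len(self.keys) - 1)

    def lookup(self, key):
        """Return the position of key in the sorted array, or None if absent."""
        guess = self._predict(key)
        lo = max(guess - self.max_err, 0)
        hi = min(guess + self.max_err + 1, len(self.keys))
        # The precise fix-up step: binary search restricted to the error window.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return None
```

Note what the sketch leaves out: inserts and deletes would invalidate `max_err`, which is exactly the update problem the paper is speculative about.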
(In fact, I do like the decomposition of a computation into a probabilistic fn + a smaller 'fix-up' data structure. Sorting naturally fits that model - as the paper mentions in future work. I wonder what else does?)
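For sorting, that decomposition might look like the sketch below. This is my illustration, not anything from the paper, which only lists sorting as future work. The "model" here is an assumed stand-in (linear min-max scaling of numeric values) and `approx_then_fix_sort` is a hypothetical name. The model places each value near its final position, and an insertion sort repairs whatever disorder is left.

```python
def approx_then_fix_sort(values):
    """Sort numbers by approximate placement followed by an exact fix-up pass."""
    n = len(values)
    if n < 2:
        return list(values)
    lo, hi = min(values), max(values)
    if lo == hi:
        return list(values)

    # Approximate step: a stand-in "model" predicts each value's position by
    # linear scaling. A learned CDF would play this role in a real system.
    buckets = [[] for _ in range(n)]
    for v in values:
        pos = int((v - lo) / (hi - lo) * (n - 1))
        buckets[pos].append(v)
    out = [v for bucket in buckets for v in bucket]

    # Fix-up step: insertion sort is cheap when elements are already close to
    # their final positions, and it guarantees an exactly sorted result.
    for i in range(1, n):
        v = out[i]
        j = i - 1
        while j >= 0 and out[j] > v:
            out[j + 1] = out[j]
            j -= 1
        out[j + 1] = v
    return out
```

The better the model fits the distribution, the less work the fix-up pass has to do; with a poor fit it degrades toward a plain insertion sort.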
(Another concern: given that the model performs best when trained on the whole data set - I strongly doubt B-trees are the best we can do with the current state of the art.)
More generally: database and systems engineering continue to exist and be relevant because of advances like these, not in spite of them. Research combines big steps and little ones. There's much more work to do than say "all the world's a model" and break for an early lunch. (n/n)
