Can’t wait for this panel discussion at #current22 with @takidau @notamyfromdbt @krisajenkins @AdiPolak @esammer - it’s gonna be awesome!
(and it’s being live-streamed - make sure you tune in!)
Are we going to have batch and streaming forever, or will they converge? @esammer says that at the heart of systems the lambda architecture will go away and kappa will eventually win out. Once data is in the DW, batch may remain because it is familiar to analytics engineers.
@notamyfromdbt - Microbatching gets used to simulate streaming while keeping the same familiar toolset, but it doesn’t scale
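For anyone unfamiliar with the term, here is a minimal sketch of microbatching. It is my own illustration rather than anything shown on the panel, and the names `micro_batches`, `batch_sum` and `batch_size` are hypothetical. The idea is that events are buffered into small chunks so that existing batch-style code can process them unchanged.

```python
from typing import Iterable, Iterator, List


def micro_batches(events: Iterable[int], batch_size: int) -> Iterator[List[int]]:
    """Group an event stream into small lists so batch-style code can process it."""
    batch: List[int] = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, partially filled batch
        yield batch


def batch_sum(batch: List[int]) -> int:
    """Stand-in for an existing batch job: it only ever sees a whole batch."""
    return sum(batch)


# Hypothetical input: seven events processed three at a time.
results = [batch_sum(b) for b in micro_batches(range(1, 8), batch_size=3)]
print(results)
```

Each result only becomes available once its batch has filled, so latency in this sketch depends directly on the batch size.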
What’s holding people back from streaming? @takidau says streaming answers everything, but the complexity is currently a challenge. Also, low latency isn’t always necessary enough to be worth the effort
Is streaming a superset? @esammer says in theory, yes. However, some optimisations are easier to do in batch than in streaming (e.g. checkpoints aren’t needed in batch)
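To make the checkpoint point concrete, here is a toy sketch (my own assumption of the general idea, not something the panel presented). A long-running streaming job periodically saves its position and state so it can resume after a failure, whereas a batch job over bounded input can simply be re-run from the start. All names here (`run_streaming_count`, `run_batch_count`, `fail_at`, `checkpoint_every`) are hypothetical, and real stream processors do this in a more involved, distributed way.

```python
from typing import Dict, List, Optional, Tuple

Checkpoint = Tuple[int, Dict[str, int]]  # (next offset to read, counts so far)


def run_streaming_count(
    events: List[str],
    checkpoint: Optional[Checkpoint] = None,
    checkpoint_every: int = 2,
    fail_at: Optional[int] = None,
) -> Checkpoint:
    """Count events from the checkpointed offset, saving state periodically."""
    offset, counts = checkpoint if checkpoint else (0, {})
    counts = dict(counts)
    saved: Checkpoint = (offset, dict(counts))
    for i in range(offset, len(events)):
        if fail_at is not None and i == fail_at:
            return saved  # simulated crash: only checkpointed state survives
        counts[events[i]] = counts.get(events[i], 0) + 1
        if (i + 1) % checkpoint_every == 0:
            saved = (i + 1, dict(counts))
    return (len(events), counts)


def run_batch_count(events: List[str]) -> Dict[str, int]:
    """Batch equivalent: bounded input, so a failure just means re-running it."""
    counts: Dict[str, int] = {}
    for event in events:
        counts[event] = counts.get(event, 0) + 1
    return counts


events = ["a", "b", "a", "c", "a"]
crashed = run_streaming_count(events, fail_at=3)           # dies before event 3
resumed = run_streaming_count(events, checkpoint=crashed)  # picks up from offset 2
print(crashed, resumed, run_batch_count(events))
```

After the simulated crash, the resumed streaming run ends with the same counts as the batch run, but only because of the extra checkpointing machinery that the batch version doesn't need.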
@notamyfromdbt notes that, unlike the modern data stack on the batch side, there isn’t yet a consensus or best practice on the streaming side (echoing @bennstancil’s point from his talk earlier - make it boring)

