“A lot of the time you don’t need real time” *gasp* #current22
“A lot of the Modern Data Stack is marketing bullshit” #current22

OMG I love this talk
ELT vs ETL
#current22
dbt sits on top of your data platform of choice
dbt workflow #Current22
The view of batch vs streaming from @notamyfromdbt #current22
Strong resonance with @esammer’s talk yesterday
The convergence of the batch and streaming worlds #current22
Being able to write the same SQL, without needing to code for time windows etc., is more accessible and makes it feel much more like a regular database #current22
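A minimal sketch of the idea in that last point, written in Python rather than SQL and not taken from the talk: the same aggregation, roughly `SELECT customer_id, SUM(amount) ... GROUP BY customer_id`, can be computed as a one-off batch scan or kept up to date event by event, and the person asking the question shouldn't have to care which. The event schema and the names `batch_totals` and `StreamingTotals` are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical event schema: (customer_id, amount)
Event = Tuple[str, float]


def batch_totals(events: Iterable[Event]) -> Dict[str, float]:
    """Batch style: scan the full dataset once and aggregate."""
    totals: Dict[str, float] = defaultdict(float)
    for customer_id, amount in events:
        totals[customer_id] += amount
    return dict(totals)


class StreamingTotals:
    """Streaming style: maintain the same aggregate incrementally."""

    def __init__(self) -> None:
        self._totals: Dict[str, float] = defaultdict(float)

    def on_event(self, event: Event) -> None:
        # Update running state as each event arrives.
        customer_id, amount = event
        self._totals[customer_id] += amount

    def result(self) -> Dict[str, float]:
        return dict(self._totals)


events = [("a", 10.0), ("b", 5.0), ("a", 2.5)]

stream = StreamingTotals()
for event in events:
    stream.on_event(event)

# Both paths answer the same question with the same result.
assert stream.result() == batch_totals(events)
```

The query logic is identical in both paths; only the execution model differs, which is the part a platform can hide behind plain SQL.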
Working with streaming data in dbt and Snowflake. Streaming and batch nodes in the same lineage chart #current22
Who owns all of this? The analytics engineer.
🎯 Shouldn’t be talking about batch vs streaming, but what your company needs #Current22
Where could it go wrong? #current22

(Love that @notamyfromdbt looks forward on this and not just a best case scenario)
🔥 #current22
Spicy take, but an important one. *Why* does it need to be real-time? What are they doing with that data? What is the business impact if we *don’t* have real-time? #Current22
Sometimes though you *do* need real-time. @notamyfromdbt’s example was an airline that had a screen showing when to close the gate for a flight. A five-minute SLA was OK, and it was all that was possible with the tooling at the time - but real-time *would* have been better #Current22
Sometimes it’s not either/or - it’s both #Current22
The world is shifting. It should be about the use cases, not batch vs streaming. The analytics engineer is well placed to own this intersection. #current22

