Jay Kreps
May 16, 2023 · 23 tweets · 7 min read
1. Here’s a summary of my #kafkasummit talk for those who missed the live stream, with links and pics of all the things we announced!
2. This was probably the most new open source stuff, cool at-scale cloud details, and new Confluent features I’ve ever had in a keynote. But it was too exciting to leave anything out.
3. I gave an overview of the stack that is emerging around Kafka, the Data Streaming Platform. There are a couple of components to this. There’s Kafka of course, but increasingly the other key layers are connectors, stream processing, and tools to govern streaming data.
4. Kafka is taking off and there is exciting innovation happening in open source and commercial products at each layer of this stack. I covered some of the open source work and some new product announcements from Confluent.
5. First, the core stream itself, Apache Kafka. As usual with open source, the work is done by a broad base of committers from dozens of companies with a strong commitment to making Kafka succeed.
6. There is a rich roadmap of features including the recent work on ZooKeeper-free Kafka, the upcoming tiered storage work, and the new work on Queues for Kafka.
7. Queuing in Kafka is one of the more exciting bits, and those wanting to know more can check out KIP-932 for details on the proposal. cwiki.apache.org/confluence/dis…
8. I also covered a bit about the internals of the Kafka service offered in Confluent Cloud and gave an in-depth walkthrough of Kora, the core engine that runs our cloud.
9. Some of the advantages of Kora include 30x improvements in elasticity, 10x advantage in resilience, scalable infinite storage, substantial improvements in performance, and the cost structure that enables our Cost Challenge confluent.io/en-gb/blog/und…
10. There is more to say there than fits in twitter, so I’ll link out to the longer blog I did on Kora. confluent.io/en-gb/blog/clo…
11. Okay, that is the stream layer, now let’s move up the stack a bit and talk about the rest of the data streaming platform.
12. The first announcement is Custom Connectors in Confluent Cloud. This means that in addition to the 70+ fully managed connectors we offer, you can now bring any Kafka Connector and run it in Confluent Cloud.
13. Next is Data Quality Rules in Stream Governance. These let you check richer assertions about your data, beyond just syntactic correctness.
14. Finally the big one: Flink!!! This is probably the biggest and most important product effort at Confluent since we launched Confluent Cloud. I could not be more excited.
15. Why is this so important? I think stream processing plays the same role for streaming data that databases play for stored data: it makes building applications easier. And Flink is in a leadership position in stream processing.
16. Today we are opening up early access to our fully managed Flink SQL offering. We have an exciting roadmap beyond that.
17. Why Flink? Well, it’s a combination of an amazing community and a fantastic platform.
18. What makes our Flink offering special? Two dimensions I’ll highlight. First, it’s truly cloud native.
19. Second, it’s fully integrated with the rest of Confluent Cloud, so all the parts of the Data Streaming Platform talk to each other natively: Kafka, connectors, Flink, and Governance all work together. If you create a topic, it shows up automatically in Flink for SQL queries.
20. We are still early in this effort, but I couldn’t be more excited about the team, the progress so far, and where it is heading.
21. Finally, one last thing! All of this is about sharing data within an organization. But often companies need to share beyond their own walls. Can we make that equally easy? We can, and that is where Stream Sharing comes in.
22. Okay, so all these announcements may leave you wanting more details. This blog and the recording of my talk both go into a bit more depth, and we’ll have deep dive blogs and online sessions on many of these in the weeks ahead. confluent.io/en-gb/blog/str… kafka-summit.org/events/kafka-s…
23. Whew! That’s a lot. Huge thanks to the team at Confluent and the larger open source community for sustaining the incredible pace of innovation. Big things are happening in streaming!
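
A footnote to tweet 13, since "richer assertions beyond syntactic correctness" is easier to see in code. What follows is a minimal Python sketch of the general idea, not Confluent's Data Quality Rules API. The record fields (`user_id`, `age`), the `Rule` class, and both functions are hypothetical names made up for illustration. A schema-style check only confirms that fields exist with the right types; the rules then assert things about the values themselves.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    """A named assertion about the contents of a record (hypothetical type)."""
    name: str
    check: Callable[[dict], bool]


def is_syntactically_valid(record: dict) -> bool:
    """Schema-style check: required fields exist and have the expected types."""
    return isinstance(record.get("user_id"), str) and isinstance(record.get("age"), int)


# Assumed example rules: these look at values, which a type check cannot do.
RULES = [
    Rule("user_id_not_empty", lambda r: len(r["user_id"]) > 0),
    Rule("age_in_range", lambda r: 0 <= r["age"] <= 150),
]


def failed_rules(record: dict) -> list[str]:
    """Return the names of violated rules; an empty list means the record passes."""
    if not is_syntactically_valid(record):
        return ["schema"]
    return [rule.name for rule in RULES if not rule.check(record)]
```

A record such as `{"user_id": "u1", "age": -5}` is well formed as far as types go, yet `failed_rules` flags it under `age_in_range`. That gap between "parses fine" and "makes sense" is the kind of thing such rules are meant to catch.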
