Project idea 1️⃣: A stand-alone tool for compacting the schema history topic of Debezium connectors, allowing for faster start-up of connectors with large histories.
Project idea 2️⃣: Porting the Debezium Cassandra connector to Debezium Server, allowing for a unified user experience across all the different connectors.
Project idea 3️⃣: Implementing a JDBC sink connector which is fully aware of the Debezium event format and semantics, allowing for a simpler configuration of end-to-end data pipelines.
Interested in any of these project ideas, or have your own idea for a potential #Debezium GSoC project? Let us know via the mailing list (link above) or the chat at debezium.zulipchat.com/#narrow/stream…
⏱️ Just ten more days until the release of @java 17, the next version with long-term support! To shorten the wait a bit, I'll do one tweet per day on a cool feature added since Java 11 (the previous LTS), introducing just some of the changes that make the upgrade worthwhile. Let's go!
🔟 Ambiguous null pointer exceptions were a true annoyance in the past. Not a problem any longer since Java 14: Helpful NPEs (JEP 358, openjdk.java.net/jeps/358) now show exactly which variable is null. A very nice improvement to #OpenJDK, previously available only in SAP's JVM.
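Here's a minimal sketch of what that looks like (class and accessor names made up for illustration; note the detailed messages are only enabled by default as of Java 15, via -XX:+ShowCodeDetailsInExceptionMessages):

```java
public class HelpfulNpeDemo {

    record Address(String city) {}
    record Customer(Address address) {}

    public static void main(String[] args) {
        var customer = new Customer(null);

        // Throws an NPE whose message pinpoints the culprit, roughly:
        // "Cannot invoke HelpfulNpeDemo$Address.city() because the return
        //  value of HelpfulNpeDemo$Customer.address() is null"
        System.out.println(customer.address().city().toUpperCase());
    }
}
```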
9️⃣ Varying load and new app instances must start up quickly? Check out class-data sharing (CDS), whose developer experience has improved a lot with JEP 350 (Dynamic CDS Archives, Java 13); also, way more classes are archivable since Java 15. More details here: morling.dev/blog/smaller-f…
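If you haven't tried dynamic CDS archives yet, the basic flow is just two JVM flags; a quick sketch, with app-cds.jsa and app.jar as placeholder names:

```
# Trial run: record the loaded classes into an archive upon exit (JEP 350, Java 13+)
java -XX:ArchiveClassesAtExit=app-cds.jsa -jar app.jar

# Subsequent runs: start from the archive, skipping repeated class loading work
java -XX:SharedArchiveFile=app-cds.jsa -jar app.jar
```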
#Postgres as an event store -- Thanks a lot for all the super-insightful answers! It looks like using a jsonb[] for modeling an event stream isn't ideal performance-wise, but there are several great pointers to using #Postgres for event sourcing here. Mentioned solutions include... 1/4
A short 🧵 on @apachekafka topic creation (triggered by @niko_nava, thanks!): who should create Kafka topics, how to make sure they have the right settings, and how to avoid dependencies between producer and consumer(s)? Here's my take:
2️⃣ Don't use broker-side topic auto-creation! You'll lack fine-grained control over different settings for different topics; merely polling, or requesting metadata, will trigger creation based on global settings. Plus, some cloud services don't expose auto-creation to begin with.
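If you want to be sure no topic gets created by accident, disable auto-creation on the broker explicitly (it defaults to enabled):

```
# server.properties on the broker
auto.create.topics.enable=false
```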
3️⃣ Instead, the producer side should be in charge of creating topics: that's where you have the knowledge about the required settings (replication factor, number of partitions, retention policy, etc.) for each topic. Depending on your requirements, there are different approaches...
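One option is doing it programmatically via Kafka's Admin API when the producing application starts up. A rough sketch (topic name and settings are made up for the example; a real application would also handle the topic already existing):

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.NewTopic;

public class TopicSetUp {

    public static void main(String[] args) throws Exception {
        var props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");

        try (Admin admin = Admin.create(props)) {
            // six partitions, replication factor three, one week of retention
            var orders = new NewTopic("orders", 6, (short) 3)
                    .configs(Map.of("retention.ms", "604800000"));

            admin.createTopics(List.of(orders)).all().get();
        }
    }
}
```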
Agreed, the term is sub-par. But hear me out, the architecture is not. Let's talk about a few common misconceptions about Serverless!
1β£ "Serverless means no servers"
There *are* servers involved, but it's not on you to run and operate them. Instead, the serverless provider manages the platform, scaling things up (and down) as needed. Fewer things to take care of, billed per use.
Myth: BUSTED!
2β£ "Serverless is cheaper"
Pay-per-use makes low- and medium-volume workloads really cheap. But pricing is complex: number of requests, assigned RAM/CPU, API gateways, traffic, etc. Depending on your workload (e.g. high, sustained load), other options like VMs can be the better deal.
Thanks for all the votes and insightful answers to the poll on the usage of @java's var! Not unexpectedly, replies ranged from "using var all the time" to "don't see the point of it". Yet one third of respondents never using var at all was a surprise to me. Some recurring themes from the replies in this 🧵.
1️⃣ Readability vs. writability: some argued that var optimizes for writing code (fewer characters to type) at the cost of reading code (less explicit type info). I don't think that's the intention behind var; in fact, more (redundant, repetitive) code may read worse.
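To illustrate the trade-off (example types made up): the explicit declaration spells out the type twice, while var states it once and stays fully statically typed:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class VarDemo {

    record Order(String id) {}

    public static void main(String[] args) {
        // Explicit variant: type information repeated on both sides
        Map<String, List<Order>> ordersExplicit = new HashMap<>();

        // var variant: shorter to write, the inferred type is still there
        var orders = new HashMap<String, List<Order>>();

        orders.computeIfAbsent("alice", key -> new ArrayList<>())
              .add(new Order("A-1"));
        System.out.println(orders);
    }
}
```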
2️⃣ var only or primarily used in test code: described by several folks as a good starting point for getting their feet wet with local variable type inference, before using it more widely.
Single message transformations (SMTs) are an invaluable feature of @ApacheKafka Connect, enabling tons of use cases with a small bit of coding, or even just configuration of existing, ready-to-use SMTs. Here are some applications in the context of change data capture: (1/7)
* Converting data types and formats: date/time formats are the most common example here, e.g. converting millisecond timestamps into strings adhering to a specific date format (2/7)
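Kafka Connect ships with a ready-made SMT for exactly this; a sketch of the connector config (field name and format pattern are placeholders):

```
transforms=tsToString
transforms.tsToString.type=org.apache.kafka.connect.transforms.TimestampConverter$Value
transforms.tsToString.field=created_at
transforms.tsToString.target.type=string
transforms.tsToString.format=yyyy-MM-dd'T'HH:mm:ss'Z'
```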
* Creating an "anti-corruption layer", shielding consumers from legacy schemas or ensuring compatibility after schema changes; e.g. could use an SMT to choose more meaningful field names, or re-add a field using its old name after a column rename, easing consumer migration (3/7)