Aurimas Griciūnas

Feb 21 • 8 tweets • 4 min read

Here is a short refresher on 𝗔𝗖𝗜𝗗 𝗣𝗿𝗼𝗽𝗲𝗿𝘁𝗶𝗲𝘀 𝗼𝗳 𝗗𝗕𝗠𝗦 (𝗗𝗮𝘁𝗮𝗯𝗮𝘀𝗲 𝗠𝗮𝗻𝗮𝗴𝗲𝗺𝗲𝗻𝘁 𝗦𝘆𝘀𝘁𝗲𝗺).

🧵

#Data #DataEngineering #MLOps #MachineLearning #DataScience

You may be taking ACID properties for granted when you use transactional databases.

If you are interviewing for Data Engineering roles, expect to be asked to explain what the concept means.

👇

Let’s take a closer look.

A transaction is a sequence of steps performed on a database as a single logical unit of work.

The ACID transaction model keeps the database reliable by guaranteeing four properties for every transaction:

👇

➡️ 𝗔𝘁𝗼𝗺𝗶𝗰𝗶𝘁𝘆 - Each transaction either completes in full or the database reverts to the state it was in before the transaction started.
➡️ 𝗖𝗼𝗻𝘀𝗶𝘀𝘁𝗲𝗻𝗰𝘆 - A transaction takes the database from one valid state to another: every defined rule (for example, constraints) holds both before and after it.
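
To make Atomicity and Consistency concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, the transfer helper and the names alice and bob are all made up for illustration; the CHECK constraint stands in for a consistency rule.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"  # consistency rule
    )
    with conn:  # commits the inserts when the block exits cleanly
        conn.executemany(
            "INSERT INTO accounts VALUES (?, ?)",
            [("alice", 100), ("bob", 50)],
        )
    return conn


def transfer(conn, src, dst, amount):
    """Move money between accounts as one transaction.

    Returns True if the transfer was committed, False if it was rolled back.
    """
    try:
        with conn:  # commit on success, roll back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # If this debit would make the balance negative, the CHECK
            # constraint fails and the credit above is rolled back too.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

Calling transfer(conn, "alice", "bob", 30) succeeds and both rows change together. Calling transfer(conn, "alice", "bob", 500) violates the constraint, so neither row changes, even though the credit to bob had already been executed inside the transaction.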

👇

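➡️ 𝗜𝘀𝗼𝗹𝗮𝘁𝗶𝗼𝗻 - Concurrent transactions do not interfere with each other: the result is as if they had run one after another. How strictly this holds in practice depends on the configured isolation level.
➡️ 𝗗𝘂𝗿𝗮𝗯𝗶𝗹𝗶𝘁𝘆 - Once a transaction is committed, its changes persist even in the case of system failure.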
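
Isolation and Durability can be sketched in a similar way with Python's built-in sqlite3 module. The counters table and the function name are assumptions for illustration. Closing and reopening the database file only approximates Durability; it does not simulate a real crash.

```python
import os
import sqlite3
import tempfile


def isolation_and_durability_demo():
    """Return what a second connection sees before commit, after commit,
    and after the database file is reopened."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")  # hypothetical file name
        writer = sqlite3.connect(path)
        reader = sqlite3.connect(path)

        writer.execute(
            "CREATE TABLE counters (name TEXT PRIMARY KEY, value INTEGER)"
        )
        with writer:  # committed immediately
            writer.execute("INSERT INTO counters VALUES ('hits', 1)")

        def read(conn):
            rows = conn.execute(
                "SELECT value FROM counters WHERE name = 'hits'"
            ).fetchall()
            return rows[0][0]

        # The UPDATE opens a transaction on the writer and is not committed yet.
        writer.execute("UPDATE counters SET value = 2 WHERE name = 'hits'")
        seen_before_commit = read(reader)  # isolation: uncommitted change is hidden

        writer.commit()
        seen_after_commit = read(reader)  # committed change is now visible

        writer.close()
        reader.close()

        # Reopen the file: the committed value was persisted to disk.
        reopened = sqlite3.connect(path)
        seen_after_reopen = read(reopened)
        reopened.close()

    return seen_before_commit, seen_after_commit, seen_after_reopen
```

With SQLite's default settings the function returns (1, 2, 2): the reader never observes the in-flight update, and the committed value is still there after all connections are closed and the file is opened again.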

👇

Most relational DBMSes provide ACID guarantees:

👉 MySQL (with the default InnoDB storage engine)
👉 PostgreSQL
👉 Microsoft SQL Server
👉 …

👇

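Many NoSQL databases relax these guarantees by default - they follow another model called BASE (Basically Available, Soft state, Eventually consistent). BASE favors availability and promises only eventual consistency: replicas may briefly disagree but converge over time. Example databases:

👉 Cassandra
👉 MongoDB (eventually consistent reads are possible from secondaries, though multi-document ACID transactions are supported since version 4.0)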
👉 …
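
Eventual consistency can be illustrated with a toy in-memory model. This is only a conceptual sketch: ToyReplicatedStore and its methods are invented for illustration and do not reflect how Cassandra or MongoDB actually replicate data.

```python
from collections import deque


class ToyReplicatedStore:
    """A toy key-value store with one primary and lagging replicas."""

    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = deque()  # queued (replica_index, key, value) updates

    def write(self, key, value):
        # The write is acknowledged as soon as replica 0 has it;
        # the other replicas receive it later.
        self.replicas[0][key] = value
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def read(self, key, replica_index):
        # A read may hit any replica and can therefore return stale data.
        return self.replicas[replica_index].get(key)

    def replicate_all(self):
        # Deliver every queued update; afterwards all replicas agree.
        while self.pending:
            index, key, value = self.pending.popleft()
            self.replicas[index][key] = value
```

Right after store.write("x", 1), a read from replica 0 returns 1 while a read from replica 2 still returns None. Once replicate_all() has run, every replica returns 1. That window of disagreement is what BASE accepts in exchange for availability.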

👇

👋 I am Aurimas.

I will help you Level Up in #MLOps, #MachineLearning, #DataEngineering, #DataScience and the overall #Data space.

𝗙𝗼𝗹𝗹𝗼𝘄 𝗺𝗲 and hit 🔔

Join a growing community of Data Professionals by subscribing to my 𝗡𝗲𝘄𝘀𝗹𝗲𝘁𝘁𝗲𝗿: newsletter.swirlai.com/p/sai-17-patte…


