You are very likely to run into a Distributed Compute System or Framework in your career. It could be Spark, Hive, Presto or any other.
Also, it is very likely that these Frameworks will be reading data from distributed storage. It could be HDFS, S3 etc.
These Frameworks utilize multiple CPU Cores for loading data and performing Distributed Compute in parallel.
How files are stored in your Storage System is key to utilizing distributed Read and Compute efficiently.
➡️ Splittable Files are files that can be partially read by several processes at the same time.
➡️ In distributed file or block storage, files are stored in chunks called blocks.
➡️ Block sizes vary between different storage systems.
➡️ If your file is Non-Splittable and is bigger than a block in storage - it will be split across blocks but will only be read by a Single CPU Core, which might cause Idle CPU time.
➡️ If your file is Splittable - multiple cores can read it at the same time (one core per block).
➡️ If possible - prefer Splittable File types.
➡️ If you are forced to use Non-Splittable files - manually partition them into sizes that fit into a single FS Block to utilize more CPU Cores (see the sketch below).
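Here is a minimal PySpark sketch of the difference in practice. The bucket paths and partition count are hypothetical placeholders; the key point is that a gzipped CSV (non-splittable) lands in a single partition, so you either repartition right after the read or rewrite the data in a splittable format such as Parquet.

```python
# Minimal PySpark sketch: dealing with a non-splittable (gzip) file.
# Paths and partition counts below are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("splittable-files-demo").getOrCreate()

# A .csv.gz file is non-splittable: no matter how large it is,
# it is read into a single partition by a single task / CPU core.
df = spark.read.csv("s3://my-bucket/raw/events.csv.gz", header=True)
print(df.rdd.getNumPartitions())  # typically 1 for a gzipped file

# Repartition right after the (single-core) read so that all
# downstream transformations run in parallel across cores.
df = df.repartition(64)

# Better: rewrite the data in a splittable, compressed format
# (Parquet + snappy) so future jobs can read it in parallel from the start.
df.write.mode("overwrite").parquet("s3://my-bucket/curated/events/")
```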
Usually, MLOps Engineers are the professionals tasked with building out the ML Platform in an organization.
This means that the skill set required is very broad - naturally, very few people start off with the full set of skills needed to brand themselves as an MLOps Engineer. This is why I would not choose this role if you are just entering the market.
So how do we implement a Production Grade Batch Machine Learning Model Pipeline The MLOps Way?
1: Everything starts in version control: the Machine Learning Training Pipeline is defined in code; once merged to the main branch, it is built and triggered.
2: Feature preprocessing stage: Features are retrieved from the Feature Store, validated, and passed to the next stage. Any feature-related metadata is saved to an Experiment Tracking System.
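A minimal sketch of what such a preprocessing stage might look like in code, assuming MLflow as the Experiment Tracking System; the fetch_features helper and the validation thresholds are hypothetical stand-ins for your Feature Store client and your own data checks.

```python
# Hypothetical sketch of the feature preprocessing stage (step 2).
# `fetch_features` stands in for a real Feature Store client (e.g. Feast);
# MLflow plays the role of the Experiment Tracking System.
import mlflow
import pandas as pd


def fetch_features(entity_ids: list) -> pd.DataFrame:
    # Placeholder for a real Feature Store query.
    raise NotImplementedError


def validate(features: pd.DataFrame) -> None:
    # Minimal sanity checks; a real pipeline would use a dedicated
    # validation library and agreed-upon expectations.
    assert not features.empty, "No features returned from the Feature Store"
    assert features.isna().mean().max() < 0.05, "Too many missing values"


def preprocessing_stage(entity_ids: list) -> pd.DataFrame:
    with mlflow.start_run(run_name="feature-preprocessing"):
        features = fetch_features(entity_ids)
        validate(features)
        # Save feature-related metadata to the Experiment Tracking System.
        mlflow.log_param("n_rows", len(features))
        mlflow.log_param("feature_columns", list(features.columns))
    return features
```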
How do we Decompose Real Time Machine Learning Service Latency, and why should you care to understand the pieces as an ML Engineer?
Usually, what the users of your Machine Learning Service care about is the total endpoint latency - the time between when a request is made against the Service (1.) and when the response is received (6.).
Certain SLAs will be established for what the acceptable latency is, and you will need to meet them. Being able to decompose the total latency is even more important, as you can then improve each piece independently. Let's see how.
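As an illustration, here is a minimal, self-contained sketch of latency decomposition; the four stage names and the dummy work inside them are hypothetical, and the point is simply that each piece is timed on its own so it can be optimized independently.

```python
# A toy sketch of decomposing total endpoint latency into stages.
# The stage bodies are placeholders for feature lookup, preprocessing,
# model inference and postprocessing in a real service.
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def timed(stage):
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = (time.perf_counter() - start) * 1000  # milliseconds


def handle_request(request):
    with timed("feature_lookup"):
        features = {"f1": 0.3, "f2": 1.7}            # e.g. online Feature Store read
    with timed("preprocessing"):
        vector = [features["f1"], features["f2"]]    # e.g. encoding / scaling
    with timed("inference"):
        score = sum(vector) / len(vector)            # e.g. model.predict(vector)
    with timed("postprocessing"):
        response = {"score": score}                  # e.g. thresholding, serialization
    return response


print(handle_request({"user_id": "123"}))
print(timings)  # per-stage latency in ms; the sum approximates total endpoint latency
```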
Apache Spark is an extremely popular distributed processing framework that utilizes in-memory processing to speed up task execution. Its higher-level libraries (SQL, streaming, machine learning) are built on top of the Spark Core layer.
As a warm-up exercise for later deeper dives and tips, today we focus on some architecture basics.
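To ground those basics, here is a tiny, self-contained PySpark snippet illustrating the core programming model: the driver builds a lazy plan of transformations, and only an action triggers distributed execution on the executors. The master setting and the numbers are hypothetical, chosen so the example runs locally.

```python
# A small sketch of the Spark execution model: transformations are lazy,
# actions trigger a distributed job. local[*] runs executor threads locally.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .master("local[*]")
    .appName("spark-architecture-basics")
    .getOrCreate()
)

df = spark.range(1_000_000)                           # distributed DataFrame of ids
doubled = df.withColumn("doubled", F.col("id") * 2)   # transformation: lazy, no work yet
print(doubled.agg(F.sum("doubled")).collect())        # action: runs tasks on the executors

spark.stop()
```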
In its simplest form, a Data Contract is an agreement between Data Producers and Data Consumers about what the Data being produced should look like, what SLAs it should meet, and what its semantics are.
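One way to make such an agreement concrete is to express the schema part as code; below is a minimal sketch using pydantic, where the event fields, SLA wording and owner are hypothetical examples rather than a prescribed standard.

```python
# A minimal sketch of a Data Contract expressed as code.
# Field names, SLAs and semantics below are hypothetical examples.
from datetime import datetime
from pydantic import BaseModel, Field


class OrderEvent(BaseModel):
    """The schema the Data Producer agrees to emit."""
    order_id: str
    user_id: str
    amount: float = Field(ge=0, description="Order total in EUR")
    created_at: datetime


# The non-schema parts of the contract (SLAs, semantics, ownership)
# can live next to the schema so producers and consumers read one artifact.
CONTRACT = {
    "owner": "checkout-team",
    "schema": OrderEvent,
    "freshness_sla": "event available for consumers within 15 minutes",
    "semantics": {"amount": "gross order amount, including VAT"},
}
```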
What does a Real Time Search or Recommender System Design look like?
The graph was inspired by the amazing work of @eugeneyan
Recommender and Search Systems are among the biggest money makers for most companies when it comes to Machine Learning.
Both Systems are inherently similar. Their goal is to return a list of recommended items given a certain context - it could be a search query on an e-commerce website, or a list of recommended songs given the song you are currently listening to on Spotify.
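To make that shared shape concrete, here is a toy two-stage sketch (cheap candidate retrieval over the whole catalogue, then heavier ranking over the shortlist) with made-up embeddings and a dot-product stand-in for the ranking model.

```python
# Toy two-stage retrieve-then-rank sketch with synthetic data.
import numpy as np

rng = np.random.default_rng(42)
item_embeddings = rng.normal(size=(10_000, 32))   # full catalogue of items
query_embedding = rng.normal(size=32)             # search query / currently playing song


def retrieve(query, items, k=100):
    """Stage 1: fast similarity search over the whole catalogue."""
    scores = items @ query
    return np.argsort(-scores)[:k]                # indices of the top-k candidates


def rank(query, candidate_ids, items):
    """Stage 2: stand-in for a heavier ranking model, applied to candidates only."""
    scores = items[candidate_ids] @ query         # imagine a learned ranker here
    return candidate_ids[np.argsort(-scores)]


candidates = retrieve(query_embedding, item_embeddings)
recommended = rank(query_embedding, candidates, item_embeddings)
print(recommended[:10])                           # final ordered list of item ids
```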