Metrics collection is a popular system design interview question. There are two ways metrics data can be collected, pull or push. It is a routine debate. In this post, we will take a look at the pull model. 1/8
Figure 1 shows data collection with a pull model over HTTP. We have dedicated metric collectors which pull metrics values from the running applications periodically. 2/8
In this approach, the metrics collector needs to know the complete list of service endpoints to pull data from. One naive approach is to use a file to hold DNS/IP information for every service endpoint on the “metric collector” servers. 3/8
While the idea is simple, this approach is hard to maintain in a large-scale environment where servers are added or removed frequently, and we want to ensure that metric collectors don’t miss out on collecting metrics from any new servers. 4/8
The good news is that we have a scalable solution available through Service Discovery, provided by Kubernetes, Zookeeper, etc., wherein services register their availability and the metrics collector can be notified whenever the list of service endpoints changes (Figure 2) 5/8
Figure 3 explains the pull model in detail.
1️⃣ The metrics collector fetches configuration metadata of service endpoints from Service Discovery. Metadata include pulling interval, IP addresses, timeout and retries parameters, etc. 6/8
2️⃣ The metrics collector pulls metrics data via a pre-defined HTTP endpoint (for example, /metrics). To expose the endpoint, a client library usually needs to be added to the service. In Figure 3, the service is Web Servers. 7/8
3️⃣ Optionally, the metrics collector registers a change event notification with Service Discovery to receive an update whenever the service endpoints change. Alternatively, the metrics collector can poll for endpoint changes periodically. 8/8
• • •
Missing some Tweet in this thread? You can try to
force a refresh
Model Context Protocol (MCP) is a new system introduced by Anthropic to make AI models more powerful.
It is an open standard (also being run as an open-source project) that allows AI models (like Claude) to connect to databases, APIs, file systems, and other tools without needing custom code for each new integration.
MCP follows a client-server model with 3 key components:
1 - Host: AI applications like Claude that provide the environment for AI interactions so that different tools and data sources can be accessed. The host runs the MCP Client.
Kubernetes (K8S) is an open-source container orchestration platform originally developed by Google and now maintained by CNCF.
Here’s how developers interact with Kubernetes:
1 - Developers create manifest files describing the application.
2 - Kubernetes takes these manifest files, validates them, and deploys the applications across its cluster of worker nodes.
3 - Kubernetes manages the entire lifecycle of the application.
Kubernetes is made up of two main components:
1 - Control Plane: It is like the brain of Kubernetes and consists of the following parts:
- API Server: It receives all incoming requests from users or CLI.
1 - Collaboration Tools
Software development is a social activity. Learn to use collaboration tools like Jira, Confluence, Slack, MS Teams, Zoom, etc.
2 - Programming Languages
Pick and master one or two programming languages. Choose from options like Java, Python, JavaScript, C#, Go, etc.
3 - API Development
Learn the ins and outs of API Development approaches such as REST, GraphQL, and gRPC.
4 - Web Servers and Hosting
Know about web servers as well as cloud platforms like AWS, Azure, GCP, and Kubernetes
5 - Authentication and Testing
Learn how to secure your applications with authentication techniques such as JWTs, OAuth2, etc. Also, master testing techniques like TDD, E2E Testing, and Performance Testing
6 - Databases
Learn to work with relational (Postgres, MySQL, and SQLite) and non-relational databases (MongoDB, Cassandra, and Redis).
Twitter has enforced very strict rate limiting. Some people cannot even see their own tweets.
Rate limiting is a very important yet often overlooked topic. Let's use this opportunity to take a look at what it is and the most popular algorithms.
A thread.
#RateLimitExceeded
What is rate limiting? Rate limiting controls the rate at which users or services can access a resource. Here are some examples:
- A user can send a message no more than 2 per second
- One can create a maximum of 10 accounts per day from the same IP address
Fixed Window Counter
The algorithm divides the timeline into fixed-size time windows and assigns a counter for each window. Each request increments the counter by some value. Once the counter reaches the threshold, subsequent requests are blocked until the new time window begins