Oliver Jumpertz

Some things you also have to consider when building services and software systems:

- Logging
- Metrics
- Tracing

🧵👇

1) Logging
Logging may be obvious for many devs, but there's more to it than just doing it.

Choosing a format that can be processed easily should be a priority.

After that, how those logs are collected and where they can be viewed is also pretty important.

1.1) Log Format

Plain text may be easy to read, but can sometimes be pretty difficult to process automatically.

You should also consider so-called tags: a map of key-value pairs attached to each log entry. Setting variables like request ids there lets you follow the execution of a single call.

1.2) Log Format

Logging in machine-processable formats like JSON may be a good idea. This also opens up the possibility of processing logs automatically and indexing them, e.g. in Elastic.
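
To make the idea of JSON logs with tags more concrete, here is a minimal sketch using only Python's standard library. The `JsonFormatter` class, the `orders` logger name, and the `tags` field are assumptions made up for this illustration, not part of any particular logging library.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # "tags" is a hypothetical attribute passed via the `extra` argument.
            "tags": getattr(record, "tags", {}),
        }
        return json.dumps(payload)


def build_logger(stream) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("orders")  # hypothetical service name
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


# Write to an in-memory buffer here; a real service would typically use stdout.
buffer = io.StringIO()
log = build_logger(buffer)
log.info("order created", extra={"tags": {"request_id": "req-123"}})
line = buffer.getvalue().strip()
print(line)
```

Because every line is a self-contained JSON object, a collector can parse it without guessing at a text layout, and the request id in `tags` can be used to filter all entries that belong to one call.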

1.3) Log Collection

Logs must be collected. Especially in distributed systems, where instances of services come and go, it's crucial to gather logs from each individual instance, maybe aggregate them, and then store them in a central place.

1.4) Log Processing

After log collection, you still need to give the people who need it access to all that information.

Maybe you have to unify different log formats into one, and then put the result into Elastic, Splunk, or whatever indexing system you use.
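
Unifying formats could look roughly like the sketch below. It assumes two made-up input shapes: a JSON line with `level`, `message`, and a `tags` map, and a plain-text line laid out as `<LEVEL> <request id> <message>`. Both layouts and the `normalize` function are hypothetical; real systems will have their own formats.

```python
import json
import re

# Hypothetical plain-text layout: "<LEVEL> <request id> <message>"
PLAIN_PATTERN = re.compile(
    r"^(?P<level>[A-Z]+) (?P<request_id>\S+) (?P<message>.*)$"
)


def normalize(line: str) -> dict:
    """Map a JSON or plain-text log line to one common dict shape."""
    line = line.strip()
    if line.startswith("{"):
        raw = json.loads(line)
        return {
            "level": raw.get("level", "UNKNOWN"),
            "request_id": raw.get("tags", {}).get("request_id"),
            "message": raw.get("message", ""),
        }
    match = PLAIN_PATTERN.match(line)
    if match:
        return match.groupdict()
    # Keep lines that cannot be parsed instead of silently dropping them.
    return {"level": "UNKNOWN", "request_id": None, "message": line}


json_line = '{"level": "INFO", "message": "order created", "tags": {"request_id": "req-123"}}'
text_line = "ERROR req-456 payment failed"
unified = [normalize(json_line), normalize(text_line)]
print(unified)
```

Once every entry has the same keys, it can be handed to whatever indexing system is in use without per-service special cases.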

2) Metrics

Metrics help a lot with system monitoring.

The number of requests to a certain API, the number of errors, the number of open database connections, and detailed information about available system resources are crucial information that at least the ops team needs.

2.1) Collecting Metrics

You will have to build metric collection into your services.

There are a lot of libraries and frameworks for it.

But no matter the solution you choose, you still have to define the metrics for each individual service and add them to your code.
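
As a rough illustration of what "defining a metric and adding it to your code" means, here is a tiny hand-rolled counter. In practice you would most likely use an existing metrics library; the `Counter` class and the `http_requests_total` metric below are assumptions for this sketch only. The output follows the Prometheus text exposition format.

```python
import threading
from collections import defaultdict


class Counter:
    """A tiny, thread-safe counter with labels (illustrative only)."""

    def __init__(self, name: str, help_text: str) -> None:
        self.name = name
        self.help_text = help_text
        self._values = defaultdict(int)
        self._lock = threading.Lock()

    def inc(self, **labels: str) -> None:
        """Increase the counter for one combination of label values."""
        key = tuple(sorted(labels.items()))
        with self._lock:
            self._values[key] += 1

    def render(self) -> str:
        """Render all series in the Prometheus text exposition format."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        with self._lock:
            for key, value in sorted(self._values.items()):
                label_text = ",".join(f'{k}="{v}"' for k, v in key)
                lines.append(f"{self.name}{{{label_text}}} {value}")
        return "\n".join(lines) + "\n"


# Defined once per service, then incremented wherever a request is handled.
requests_total = Counter("http_requests_total", "Number of handled HTTP requests.")
requests_total.inc(path="/orders", status="200")
requests_total.inc(path="/orders", status="200")
requests_total.inc(path="/orders", status="500")
print(requests_total.render())
```

The important part is not the class itself but the two `inc` call sites: someone has to decide which events matter and place those calls in the service code.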

2.2) Metric Collection

Like logs, metrics must be collected.

They are either written to files or stdout, or made available at a special HTTP endpoint (as with Prometheus, e.g.).

No matter the solution chosen, you need to collect them.
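
On the collecting side, something has to read those metrics from every instance. The sketch below assumes each instance exposes Prometheus-style text and that the text has already been fetched; the instance names and the `parse_samples` helper are hypothetical. The parser is deliberately simplified: it skips comment lines and ignores details such as optional timestamps.

```python
from typing import Dict


def parse_samples(text: str) -> Dict[str, float]:
    """Parse simplified Prometheus-style text into {series: value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Stand-in for text scraped from two instances of the same service.
scraped = {
    "orders-1": '# TYPE http_requests_total counter\nhttp_requests_total{status="500"} 3\n',
    "orders-2": '# TYPE http_requests_total counter\nhttp_requests_total{status="500"} 4\n',
}

# Collect per instance, then aggregate into one total per series.
per_instance = {name: parse_samples(text) for name, text in scraped.items()}
totals: Dict[str, float] = {}
for samples in per_instance.values():
    for series, value in samples.items():
        totals[series] = totals.get(series, 0.0) + value
print(totals)
```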

2.3) Metric Processing

All those collected metrics must be processed, maybe unified, and then put somewhere for people to view.

You'll probably want to create dashboards, set up alerts on certain thresholds, etc.
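
An alert on a threshold boils down to a comparison like the one below. The 5% default and the function names are assumptions picked for this sketch; monitoring tools usually let you express the same rule in their own configuration.

```python
def error_rate(errors: int, total: int) -> float:
    """Share of failed requests; 0.0 when there was no traffic."""
    return errors / total if total else 0.0


def should_alert(errors: int, total: int, threshold: float = 0.05) -> bool:
    """Fire when the error rate is strictly above the threshold."""
    return error_rate(errors, total) > threshold


print(should_alert(errors=10, total=100))
print(should_alert(errors=1, total=100))
```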

3) Tracing

Distributed systems with many microservices that all talk to each other are one thing in particular:

Difficult to debug.

Tracing helps by making the path a request takes through the system more transparent.

3.1) Implementing Tracing

There are libraries and methodologies that help you get started with tracing.

You will, however, still have to instrument your code to start tracing with your services.

3.2) Collecting Traces

I think you now see where this goes. Traces also need to be collected.

The larger the system, the more different formats or methodologies there may be, for various reasons.

All of those need to be collected and put into a central place.

3.3) Processing Traces

As with all the other things in this thread, you'll want your traces to be viewable by everyone who needs them.

Maybe you need to clean that data before you can show it to your users.
