Prometheus Data & Query Model (thread)

Basics first, bear with me.

Everything starts from a METRIC.

A metric is a measurement of a system that you want to track. Metric names are identifiers.

Ex: http_requests_total, gc_duration_seconds, etc.
Every metric is measured at a certain TIME and has a certain VALUE.

Timestamps always have millisecond precision.

Value is always float64, even if it looks like an integer.

A metric value, aka a sample, is a (value, timestamp) pair.
A metric can be LABEL-ed to allow more fine-grained control over measurements.

Label names are identifiers too (in a common programming sense).

Label values are always strings.

Ex: http_requests_total{method="GET", status="200"}
Last but not least in the data model...

TIME SERIES - a series of (value, timestamp) samples attributed to a certain metric and label set.

Ex:

- series A: requests{method="GET"}
- series B: requests{method="PUT"}
- series C: queries{type="SELECT"}
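
To make the data model concrete, here is a minimal Python sketch. It is not how Prometheus stores anything internally; `Sample`, `series_key`, and the `tsdb` dict are hypothetical names made up for illustration, and the sample values are invented.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Sample:
    timestamp_ms: int  # timestamps have millisecond precision
    value: float       # values are always floats, even if they look like integers


# A series is identified by its metric name plus its label set.
SeriesKey = Tuple[str, FrozenSet[Tuple[str, str]]]


def series_key(metric: str, **labels: str) -> SeriesKey:
    """Build a hashable identity from a metric name and its labels."""
    return (metric, frozenset(labels.items()))


# Hypothetical in-memory "TSDB": series identity -> samples ordered by time.
tsdb: Dict[SeriesKey, List[Sample]] = {
    series_key("requests", method="GET"): [Sample(1000, 5.0), Sample(2000, 7.0)],
    series_key("requests", method="PUT"): [Sample(1000, 1.0), Sample(2000, 1.0)],
    series_key("queries", type="SELECT"): [Sample(1000, 42.0)],
}

# Same metric name, different label sets -> different series.
requests_series = [key for key in tsdb if key[0] == "requests"]
```

Here `requests_series` holds two keys (series A and B), while series C lives under a different metric name.
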
The Prometheus docs also mention different metric types:

- counter
- gauge
- histogram
- summary

But for the query model, the metric type doesn't really matter - in the end, everything boils down to series of (value, timestamp) tuples.
Query model begins...

Prometheus introduces the concept of a vector, without really explaining why it's called that.

Well, in programming, "vector" and "array" are often used as synonyms. I.e. it's a finite sequence of homogeneous elements.

So, what are those vectors in Prometheus?
Since we are dealing with a TSDB, my first thought was that a vector is a bunch of samples corresponding to a certain time range.

But it's not...
Let's take a closer look at PromQL.

The simplest possible PromQL query consists just of a metric name.

But!

Metric != Series

There is usually a bunch of series behind a single metric name. As many as there are unique label sets.
So a query like `http_requests_total` would return as many samples as there are unique time series sharing the "http_requests_total" name.

And it's a vector! Or, more precisely, an INSTANT vector.

Did you notice that we haven't mentioned any time ranges here?
Each element of an instant vector:

- belongs to a different time series
- shares the same timestamp as all other elements.

But how to specify that timestamp?
A plain PromQL selector doesn't carry a timestamp for its instant vector (newer Prometheus versions do offer an `@` modifier to pin one).

A timestamp is specified separately! For instance, in the API request.

Or it defaults to `now()`.
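
Here is a rough Python sketch of what evaluating an instant vector selector looks like under this model. It is a simplification, not the real engine: `instant_vector`, `db`, and the data are hypothetical, and the only Prometheus-specific detail assumed is the default 5-minute lookback window used to find each series' most recent sample.

```python
from typing import Dict, List, Tuple

Sample = Tuple[int, float]   # (timestamp_ms, value)
SeriesId = Tuple[str, str]   # (metric name, label set as text)

LOOKBACK_MS = 5 * 60 * 1000  # 5 minutes, Prometheus's default lookback


def instant_vector(
    db: Dict[SeriesId, List[Sample]], metric: str, eval_ts: int
) -> Dict[SeriesId, Sample]:
    """Return one sample per matching series, all stamped with eval_ts."""
    result: Dict[SeriesId, Sample] = {}
    for series_id, samples in db.items():
        if series_id[0] != metric:
            continue
        # Values of samples that fall inside the lookback window (samples are time-ordered).
        recent = [value for ts, value in samples if eval_ts - LOOKBACK_MS < ts <= eval_ts]
        if recent:
            # Every element shares the evaluation timestamp, not its original one.
            result[series_id] = (eval_ts, recent[-1])
    return result


# Made-up data: two series behind one metric name, plus an unrelated metric.
db = {
    ("http_requests_total", 'method="GET"'): [(10_000, 3.0), (25_000, 4.0)],
    ("http_requests_total", 'method="POST"'): [(12_000, 1.0), (27_000, 2.0)],
    ("gc_duration_seconds", ""): [(11_000, 0.5)],
}

vector = instant_vector(db, "http_requests_total", eval_ts=30_000)
```

`vector` ends up with two elements, one per series, and both carry the timestamp `30_000` even though the underlying samples were scraped at different moments.
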
So, how to plot a bunch of time series on a graph if you can select only an instant vector and not a piece of a time series?

You need to send a range query!

A range query consists of these parts:

- instant vector selector
- start time
- end time
- time step
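
In terms of the Prometheus HTTP API, the difference shows up in the request parameters: `/api/v1/query` takes `query` and an optional `time`, while `/api/v1/query_range` takes `query`, `start`, `end`, and `step`. A small sketch that only builds the URLs (no request is sent); the `localhost:9090` address, the helper names, and the timestamps are assumptions for illustration.

```python
from typing import Optional
from urllib.parse import urlencode

BASE = "http://localhost:9090"  # assumed address of a local Prometheus server


def instant_query_url(query: str, time: Optional[int] = None) -> str:
    """URL for an instant query; without `time` the server uses its current time."""
    params = {"query": query}
    if time is not None:
        params["time"] = time
    return f"{BASE}/api/v1/query?{urlencode(params)}"


def range_query_url(query: str, start: int, end: int, step: int) -> str:
    """URL for a range query: same selector, plus start, end, and step."""
    params = {"query": query, "start": start, "end": end, "step": step}
    return f"{BASE}/api/v1/query_range?{urlencode(params)}"


instant_url = instant_query_url("http_requests_total", time=1609746000)
range_url = range_query_url("http_requests_total", 1609746000, 1609749600, 15)
```

Note that the PromQL expression is identical in both URLs; only the surrounding time parameters change.
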
That's where the official docs start to suck. They avoid explaining the idea of instant and range queries.

An instant query with an instant vector selector returns a single vector.

A range query with the same vector selector returns a [timestamped] vector of [instant] vectors!
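
One way to picture this: a range query behaves like the same instant query evaluated at every step between start and end. A simplified Python sketch of that idea (`eval_instant`, `eval_range`, and the data are hypothetical, and the real engine is far more optimized):

```python
from typing import Dict, List, Tuple

Sample = Tuple[int, float]       # (timestamp_ms, value)
DB = Dict[str, List[Sample]]     # series id -> time-ordered samples


def eval_instant(db: DB, eval_ts: int, lookback_ms: int = 300_000) -> Dict[str, float]:
    """Latest value per series at eval_ts (within an assumed 5-minute lookback)."""
    out: Dict[str, float] = {}
    for series_id, samples in db.items():
        visible = [v for ts, v in samples if eval_ts - lookback_ms < ts <= eval_ts]
        if visible:
            out[series_id] = visible[-1]
    return out


def eval_range(db: DB, start: int, end: int, step: int) -> List[Tuple[int, Dict[str, float]]]:
    """A timestamped list of instant vectors, one per step from start to end inclusive."""
    return [(ts, eval_instant(db, ts)) for ts in range(start, end + 1, step)]


db = {
    'requests{method="GET"}': [(0, 1.0), (15_000, 2.0), (30_000, 3.0)],
    'requests{method="PUT"}': [(15_000, 10.0)],
}

frames = eval_range(db, start=0, end=30_000, step=15_000)
```

`frames` has three entries, one instant vector per step. The HTTP API returns the same data transposed: grouped per series, under the `matrix` result type.
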
Official Prometheus docs are so focused on instant and range vectors that they completely forgot to explain instant and range query concepts.

I had to google for quite some time until I finally found these two @PromLabs gems:

- promlabs.com/blog/2020/07/0…
- promlabs.com/blog/2020/06/1…
Last but not least - range vectors.

A range vector is a PromQL concept.

Ex: `http_requests_total{method="GET"}[5m]`

A range vector is like an instant vector where every value is replaced with a series of values from the specified time bucket.

I.e. it's a matrix!
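
The same idea as a Python sketch: for every series, keep all samples inside the window that ends at the evaluation time. `range_vector`, `db`, and the data are hypothetical, and the exact boundary handling is a simplification.

```python
from typing import Dict, List, Tuple

Sample = Tuple[int, float]       # (timestamp_ms, value)
DB = Dict[str, List[Sample]]     # series id -> time-ordered samples


def range_vector(db: DB, eval_ts: int, window_ms: int) -> Dict[str, List[Sample]]:
    """For each series, all samples in the window ending at eval_ts."""
    out: Dict[str, List[Sample]] = {}
    for series_id, samples in db.items():
        window = [(ts, v) for ts, v in samples if eval_ts - window_ms < ts <= eval_ts]
        if window:
            out[series_id] = window  # samples keep their original timestamps
    return out


db = {
    'http_requests_total{method="GET"}': [
        (10_000, 1.0), (70_000, 2.0), (130_000, 3.0), (400_000, 9.0),
    ],
}

# Roughly `http_requests_total{method="GET"}[5m]` evaluated at t = 360s.
matrix = range_vector(db, eval_ts=360_000, window_ms=5 * 60 * 1000)
```

Each row of `matrix` is one series and each row holds several samples, which is what makes it a matrix rather than a vector. Here the sample at `10_000` is too old and the one at `400_000` is in the future, so only the two middle samples survive.
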
The following are valid constructs:

- instant query with an instant vector selector
- range query with an instant vector selector
- instant query with a range vector selector

A range query with a range vector selector would be a 3D construct... It's simply not a thing in p8s.
