Metric reporting and storage

This page describes how metrics are reported and stored in SigNoz.

Metric types

Gauge

Say you run a commerce website and want to track the number of active users on it. You can create a gauge metric for this. The following pseudocode creates a gauge metric:

gauge = Gauge('active_users', 'Number of active users')

When a user logs in, you increment the gauge metric by 1:

gauge.add(1)

And when a user logs out, you decrement the gauge metric by 1:

gauge.add(-1)

When you add to or subtract from a gauge metric, you can optionally provide a set of tags (key-value pairs) to attach to the metric. For example:

gauge.add(1, attributes={'user_type': 'paid'})

This records the gauge metric active_users with the tag user_type:paid. Tags help you filter and group metrics in the UI. The metrics SDK reports the metric, together with this tag, to the backend. The more tags you add, the more granular the metric data becomes and the more data is sent to the backend. The following is a simplified representation of the metric data that is reported to the backend:

active_users,user_type=paid;1;timestamp=1729430400

There are two components to the metric data:

  • Time series: This is the metric name and the tags. In the example above, the time series is active_users,user_type=paid.
  • Sample: This is the value of the metric and the timestamp. In the example above, the sample is 1;timestamp=1729430400.
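The split between time series and sample can be sketched in a few lines of Python. This is only an illustration of the simplified line format used on this page, not the wire format the SDK actually uses; the names Sample and parse_line are hypothetical.

```python
from typing import NamedTuple, Tuple


class Sample(NamedTuple):
    value: float
    timestamp: int  # Unix seconds, as in the simplified example above


def parse_line(line: str) -> Tuple[str, Sample]:
    """Split one simplified line into (time series, sample)."""
    # The three fields are separated by ';':
    #   <metric name and tags>;<value>;timestamp=<unix seconds>
    series, value, ts_field = line.split(";")
    timestamp = int(ts_field.split("=", 1)[1])
    return series, Sample(float(value), timestamp)


series, sample = parse_line("active_users,user_type=paid;1;timestamp=1729430400")
print(series)  # active_users,user_type=paid
print(sample)  # Sample(value=1.0, timestamp=1729430400)
```

Every line that shares the same series string belongs to the same time series; only the sample part changes from one report to the next.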

Once a time series is created, the metrics SDK continues to report the metric data to the backend until the process restarts. Continuing with the example above, the following is the metric data that is reported to the backend:

active_users,user_type=paid;1;timestamp=1729430400
active_users,user_type=paid;1;timestamp=1729430460
active_users,user_type=paid;1;timestamp=1729430520
active_users,user_type=paid;1;timestamp=1729430580
active_users,user_type=paid;1;timestamp=1729430640
active_users,user_type=paid;1;timestamp=1729430700
active_users,user_type=paid;1;timestamp=1729430760
active_users,user_type=paid;1;timestamp=1729430820
active_users,user_type=paid;1;timestamp=1729430880
active_users,user_type=paid;1;timestamp=1729430940
active_users,user_type=paid;1;timestamp=1729431000
active_users,user_type=paid;1;timestamp=1729431060
active_users,user_type=paid;1;timestamp=1729431120

Here the value of the metric is 1 and the timestamp is the time when the metric was reported. The metrics SDK reports metric data to the backend at a regular interval, which is determined by the reporting_interval configuration. The default reporting interval is 1 minute. As you can see, the same value is reported multiple times with different timestamps. This is because the SDK reports once per interval for each unique time series recorded, whether or not the value has changed.
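The reporting behaviour described above can be simulated with a small, self-contained Python sketch. The Gauge class, its collect method, and the simulate helper are hypothetical stand-ins written for this illustration; they are not the API of any real metrics SDK. The 60-second constant mirrors the 1-minute default mentioned above.

```python
from typing import Dict, List, Optional, Tuple

REPORTING_INTERVAL_S = 60  # mirrors the 1-minute default described above


class Gauge:
    """Toy gauge that keeps one running value per unique tag set."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._values: Dict[Tuple[Tuple[str, str], ...], int] = {}

    def add(self, amount: int, attributes: Optional[Dict[str, str]] = None) -> None:
        # Sort the tags so the same tag set always maps to the same series.
        key = tuple(sorted((attributes or {}).items()))
        self._values[key] = self._values.get(key, 0) + amount

    def collect(self, timestamp: int) -> List[str]:
        # One line per time series, even if its value has not changed.
        lines = []
        for key, value in self._values.items():
            tags = "".join(f",{k}={v}" for k, v in key)
            lines.append(f"{self.name}{tags};{value};timestamp={timestamp}")
        return lines


def simulate(gauge: Gauge, start: int, intervals: int) -> List[str]:
    """Collect the gauge once per reporting interval."""
    out: List[str] = []
    for i in range(intervals):
        out.extend(gauge.collect(start + i * REPORTING_INTERVAL_S))
    return out


gauge = Gauge("active_users")
gauge.add(1, attributes={"user_type": "paid"})
lines = simulate(gauge, start=1729430400, intervals=3)
for line in lines:
    print(line)
# active_users,user_type=paid;1;timestamp=1729430400
# active_users,user_type=paid;1;timestamp=1729430460
# active_users,user_type=paid;1;timestamp=1729430520
```

Adding a second tag set, for example user_type=free, would double the number of lines emitted per interval, which is why more tags mean more data sent to the backend.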

How does the SigNoz backend store this data?

There are two versions of the metric data tables in the SigNoz backend:

v2 (legacy but continues to be supported for backward compatibility)

There are two tables in the backend database that store the metric data:

  • samples_v2: This table stores the metric values.
  • timeseries_v2: This table stores the metadata of the time series such as the metric name, tags, and description.

The samples_v2 table has the following columns:

  • metric_name: This is the name of the metric.
  • fingerprint: This is the hash of the metric name and the tags.
  • value: This is the value of the metric.
  • timestamp_ms: This is the timestamp when the metric value was reported in milliseconds.

The timeseries_v2 table has the following columns:

  • metric_name: This is the name of the metric.
  • fingerprint: This is the hash of the metric name and the tags.
  • labels: This is the tags of the time series.
  • timestamp_ms: This is the timestamp, in milliseconds, when the time series was first seen by the SigNoz ingester component since that component started. There can be multiple replicas of the ingester component, and they are independent of each other. This means that the same time series can be recorded multiple times, once by each replica that sees it.

Note: some columns are omitted from the lists above for brevity.
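To make the fingerprint column concrete, here is a minimal sketch of hashing a metric name together with its tags. The page does not say which hash function SigNoz uses, so the choice of SHA-256 truncated to 64 bits, the separator byte, and the function name fingerprint are all assumptions made for illustration only.

```python
import hashlib
from typing import Dict


def fingerprint(metric_name: str, labels: Dict[str, str]) -> int:
    """Illustrative 64-bit hash of a metric name and its tags.

    Assumption: SHA-256 truncated to 8 bytes. The real hash function
    used by the backend is not specified on this page.
    """
    # Sort the tags so the same tag set always produces the same hash,
    # regardless of the order in which the tags were supplied.
    parts = [metric_name] + [f"{k}={v}" for k, v in sorted(labels.items())]
    digest = hashlib.sha256("\x00".join(parts).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


paid = fingerprint("active_users", {"user_type": "paid"})
free = fingerprint("active_users", {"user_type": "free"})
print(paid != free)  # different tags give a different time series
```

The property that matters is that the fingerprint depends only on the metric name and the tags, so every sample of one time series carries the same fingerprint.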

The above two tables are joined on the fingerprint column to get the metric data. Continuing with the example above, the following is the data that is stored in the samples_v2 and timeseries_v2 tables:

samples_v2 table

metric_name   fingerprint  value  timestamp_ms
active_users  1234567890   1      1729430400000
active_users  1234567890   1      1729430460000
active_users  1234567890   1      1729430520000
active_users  1234567890   1      1729430580000
active_users  1234567890   1      1729430640000
active_users  1234567890   1      1729430700000
active_users  1234567890   1      1729430760000
active_users  1234567890   1      1729430820000
active_users  1234567890   1      1729430880000
active_users  1234567890   1      1729430940000
active_users  1234567890   1      1729431000000
active_users  1234567890   1      1729431060000
active_users  1234567890   1      1729431120000

timeseries_v2 table

metric_name   fingerprint  labels          timestamp_ms
active_users  1234567890   user_type=paid  1729430400000
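The join on the fingerprint column can be tried out with an in-memory SQLite database from Python's standard library. SQLite is used here only as a stand-in so the example runs anywhere; the column types and the plain-text labels value are assumptions, and the real backend schema has more columns than shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE samples_v2 "
    "(metric_name TEXT, fingerprint INTEGER, value REAL, timestamp_ms INTEGER)"
)
conn.execute(
    "CREATE TABLE timeseries_v2 "
    "(metric_name TEXT, fingerprint INTEGER, labels TEXT, timestamp_ms INTEGER)"
)

# First three sample rows from the example above.
conn.executemany(
    "INSERT INTO samples_v2 VALUES (?, ?, ?, ?)",
    [
        ("active_users", 1234567890, 1, 1729430400000),
        ("active_users", 1234567890, 1, 1729430460000),
        ("active_users", 1234567890, 1, 1729430520000),
    ],
)
conn.execute(
    "INSERT INTO timeseries_v2 VALUES (?, ?, ?, ?)",
    ("active_users", 1234567890, "user_type=paid", 1729430400000),
)

# Attach the tags to every sample by joining on fingerprint.
rows = conn.execute(
    """
    SELECT s.metric_name, t.labels, s.value, s.timestamp_ms
    FROM samples_v2 AS s
    JOIN timeseries_v2 AS t ON s.fingerprint = t.fingerprint
    ORDER BY s.timestamp_ms
    """
).fetchall()

for row in rows:
    print(row)
```

Storing the tags once in timeseries_v2 and only the fingerprint in samples_v2 keeps each sample row small, at the cost of a join at query time.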

v4

There are two tables in the backend database that store the metric data:

  • samples_v4: This table stores the metric values.
  • timeseries_v4: This table stores the metadata of the time series such as the metric name, tags, and description.

The samples_v4 table has the following columns:

  • metric_name: This is the name of the metric.
  • fingerprint: This is the hash of the metric name and the tags.
  • value: This is the value of the metric.
  • unix_milli: This is the timestamp when the metric value was reported in milliseconds.

The timeseries_v4 table has the following columns:

  • metric_name: This is the name of the metric.
  • fingerprint: This is the hash of the metric name and the tags.
  • labels: This is the tags of the time series.
  • unix_milli: This is the timestamp, in milliseconds, rounded to the nearest hour when the metric value was reported, i.e., there will be one entry for every hour for a given time series.

Note: some columns are omitted from the lists above for brevity.

The above two tables are joined on the fingerprint column to get the metric data. Continuing with the example above, the following is the data that is stored in the samples_v4 and timeseries_v4 tables:

samples_v4 table

metric_name   fingerprint  value  unix_milli
active_users  1234567890   1      1729430400000
active_users  1234567890   1      1729430460000
active_users  1234567890   1      1729430520000
...           ...          ...    ...

timeseries_v4 table

metric_name   fingerprint  labels          unix_milli
active_users  1234567890   user_type=paid  1729429200000
...           ...          ...             ...

As you can see, the timeseries_v4 table has one entry for every hour for a given time series. If the same series continues to be reported, a new entry is added for the next hour, and so on.
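The hourly entries can be reproduced with a short calculation. This sketch assumes the timestamp is truncated to the start of its hour, which is consistent with the example row (samples starting at 1729430400000 map to 1729429200000); the exact rounding rule used by the backend is not confirmed here, and hour_bucket is a hypothetical helper name.

```python
HOUR_MS = 3_600_000  # one hour in milliseconds


def hour_bucket(unix_milli: int) -> int:
    """Truncate a millisecond timestamp to the start of its hour.

    Assumption: truncation rather than rounding to the nearest hour.
    """
    return unix_milli - (unix_milli % HOUR_MS)


# The 13 one-minute samples from the example above.
sample_times = [1729430400000 + i * 60_000 for i in range(13)]

# All 13 samples fall in the same hour, so they share one timeseries_v4 entry.
buckets = sorted({hour_bucket(t) for t in sample_times})
print(buckets)  # [1729429200000]

# A sample reported in the following hour produces a new entry.
print(hour_bucket(1729432800000))  # 1729432800000
```

Under this scheme the number of timeseries_v4 rows grows with the number of hours a series stays active, not with the number of samples.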
