More and more companies are now shifting to a cloud-native & microservices-based architecture. Having an application monitoring tool is critical in this world because you can’t just log into a machine and figure out what’s going wrong.

We have spent years learning about application monitoring & observability. What are the key features an observability tool should have to enable fast resolution of issues.

cover image

In our opinion, good observability tools should have

  • Out of the box application metrics like latency, request rates, error rate, etc.
  • All telemetry signals under a single pane
  • Way to go from metrics to traces to find why some issues are happening
  • Seamless flow between metrics, traces & logs — the three pillars of observability
  • Ability to set dynamic thresholds for alerts
  • Transparency in pricing

User experience not great in current open-source tools

We found that though there are open-source tools like Prometheus & Jaeger, they don’t provide a great user experience as SaaS products do. It takes lots of time and effort to get them working, figuring out the long-term storage, etc. And if you want metrics and traces, it’s not possible as Prometheus metrics & Jaeger traces have different formats.

SaaS tools like DataDog and NewRelic do a much better job at many of these aspects:

  • They are easy to setup & get started
  • Provide out-of-box application metrics
  • Provides correlation between signals for contextual troubleshooting

But it has the following issues:

  • Crazy node-based pricing, which doesn’t make sense in today’s micro-services architecture. Any node which is live for more than 8hrs in a month is charged. So, unsuitable for spiky workloads
  • Very costly. They charge custom metrics for $5/100 metrics
  • It is cloud-only, so not suitable for companies that have concerns with sending data outside their infra
  • some tools charge based on user seats which becomes very limiting and costly if your engineering team is large
  • For any small feature, you are dependent on their roadmap. We think this is an unnecessary restriction for a product which developers use. A product used by developers should be extendible

To fill this gap we built SigNoz, an open-source alternative to DataDog.

Key Features of SigNoz - a DataDog alternative

Some of our key features which makes SigNoz vastly superior to current open-source products and a great alternative to DataDog are:

  • Metrics, traces, and logs under a single pane of glass
  • Opentelemetry-native
  • Correlation of telemetry signals based on OpenTelemetry's semantic conventions
  • Out of the box charts for application metrics
  • Seamless flow between metrics, traces & logs
  • Distributed tracing with Flamegraphs & Gantt charts to visualize user requests
  • Filtering based on tags & attributes
  • Custom aggregates on filtered traces
  • Monitoring of messaging queues
  • Infrastructure dashboards
  • Exceptions monitoring
  • Transparent usage data & pricing

Application metrics

Get out of the box p90, p99 latencies, RPS, Error rates and top endpoints for a service out of the box.

SigNoz dashboard showing popular RED metrics
SigNoz UI showing application overview metrics like 50th/90th/99th Percentile latencies, request rate and Apdex

Seamless flow between telemetry signals

Powered by OpenTelemetry's semantic conventions, you can quickly jump between telemetry signals in SigNoz. Found something suspicious in a metric, just click that point in the graph & get details of traces which may be causing the issues. Seamless, Intuitive.

Similarly, we have enabled correlation between other telemetry signals.

APM Metrics to Traces & Logs

APM metrics to Traces & Logs
Quickly jump from APM metrics to traces or logs to investigate issues further

Traces to Logs

If you see a API call taking more time than usual, you can go to related logs to investigate further.

Traces to related logs
Go from traces to related logs to get more context in debugging performance issues

Similarly you can click on detailed view of logs and then go to related trace ID to see the flow of user requests.

Logs to traces
Go from logs to related trace ID in flamegraph view to see the flow of user requests

Logs with Infrastructure metrics

While troubleshooting with logs, you can investigate the related infrastructure metrics to see if issues are happening becuase of that.

Logs to infra metrics
Check Pod and Node metrics while troubleshooting with logs

Quick filers & Advanced Query Builder for all telemetry signals

For all telemetry signals, there are quick filters to quickly filter out the data needed. We have built customized query builders for each signal to make your troublshooting work easier.

For example, query builder for traces allows you to create queries for finding the p99 latency of services.

Trace Query Builder
Create queries on your trace data by using various aggregate functions and group by options

Similarly, use quick filters to quickly filter the data that you need.

Logs quick filter
Use quick filter in logs to quickly filter out specific logs based on your use case

You can create custom metrics from filtered traces to find metrics of any type of request. Want to find p99 latency of customer_type: premium who are seeing status_code:400. Just set the filters, and you have the graph. Boom!

Flamegraphs & Gantt charts

Detailed flamegraph & Gantt charts to find the exact cause of the issue and which underlying requests are causing the problem. Is it a SQL query gone rogue or a Redis operation is causing an issue? Get more context on your spans with tags and events.

Detailed Flamegraphs & Gantt charts
Spans of a trace visualized with the help of flamegraphs and gantt charts in SigNoz dashboard

Logs Management

SigNoz provides Logs management with advanced log query builder. You can also monitor your logs in real-time using live tailing. SigNoz uses a columnar database ClickHouse to store logs, which is very efficient at ingesting and storing logs data. Columnar databases like ClickHouse are very effective in storing log data and making it available for analysis.

Logs tab in SigNoz
Logs tab in SigNoz comes equipped with advanced logs query builder and live tailing

Metrics & Dashboards

Monitor any metrics important to you. Ingest metrics from your infrastructure or applications and create customized dashboards to monitor them.

A hostmetrics dashboard
You can create any kind of customized dashboards using different visualization panel types and an advanced query builder

Exceptions Monitoring

Monitor exceptions automatically in Python, Java, Ruby, and Javascript. For other languages, just drop in a few lines of code and start monitoring exceptions.

Exceptions Monitoring in SigNoz
Exceptions Monitoring in SigNoz

Simple usage-based pricing with cost control features

Open-source SigNoz is free to self-host and use. SigNoz cloud is the easiest way of running SigNoz without any hassle of maintenance. For cloud, we have a simple usage-based pricing based on amount of data sent.

  • Logs: $0.3/GB ingested
  • Traces: $0.3/GB ingested
  • Metrics: $0.1/mil samples

Check out our pricing plans for more details. You can also estimate your monthly bill using the pricing calculator.


Getting started with SigNoz

Get Started - Free CTA


Related Content

SigNoz vs Datadog
Beware these surprises in Datadog pricing

Was this page helpful?