SigNoz - Open-Source Alternative to DataDog

More and more companies are now shifting to a cloud-native & microservices-based architecture. Having an application monitoring tool is critical in this world because you can’t just log into a machine and figure out what’s going wrong.

We have spent years learning about application monitoring & observability. What are the key features an observability tool should have to enable fast resolution of issues.

In our opinion, good observability tools should have

Out of the box application metrics like latency, request rates, error rate, etc.
All telemetry signals under a single pane
Way to go from metrics to traces to find why some issues are happening
Seamless flow between metrics, traces & logs — the three pillars of observability
Ability to set dynamic thresholds for alerts
Transparency in pricing

User experience not great in current open-source tools

We found that though there are open-source tools like Prometheus & Jaeger, they don’t provide a great user experience as SaaS products do. It takes lots of time and effort to get them working, figuring out the long-term storage, etc. And if you want metrics and traces, it’s not possible as Prometheus metrics & Jaeger traces have different formats.

SaaS tools like DataDog and NewRelic do a much better job at many of these aspects:

They are easy to setup & get started
Provide out-of-box application metrics
Provides correlation between signals for contextual troubleshooting

But it has the following issues:

Crazy node-based pricing, which doesn’t make sense in today’s micro-services architecture. Any node which is live for more than 8hrs in a month is charged. So, unsuitable for spiky workloads
Very costly. They charge custom metrics for $5/100 metrics
It is cloud-only, so not suitable for companies that have concerns with sending data outside their infra
some tools charge based on user seats which becomes very limiting and costly if your engineering team is large
For any small feature, you are dependent on their roadmap. We think this is an unnecessary restriction for a product which developers use. A product used by developers should be extendible

To fill this gap we built SigNoz, an open-source alternative to DataDog.

Key Features of SigNoz - a DataDog alternative

Some of our key features which makes SigNoz vastly superior to current open-source products and a great alternative to DataDog are:

Metrics, traces, and logs under a single pane of glass
Opentelemetry-native
Correlation of telemetry signals based on OpenTelemetry's semantic conventions
Out of the box charts for application metrics
Seamless flow between metrics, traces & logs
Distributed tracing with Flamegraphs & Gantt charts to visualize user requests
Filtering based on tags & attributes
Custom aggregates on filtered traces
Monitoring of messaging queues
Infrastructure dashboards
Exceptions monitoring
Transparent usage data & pricing

Application metrics

Get out of the box p90, p99 latencies, RPS, Error rates and top endpoints for a service out of the box.

SigNoz dashboard showing popular RED metrics — *SigNoz UI showing application overview metrics like 50th/90th/99th Percentile latencies, request rate and Apdex*

Seamless flow between telemetry signals

Powered by OpenTelemetry's semantic conventions, you can quickly jump between telemetry signals in SigNoz. Found something suspicious in a metric, just click that point in the graph & get details of traces which may be causing the issues. Seamless, Intuitive.

Similarly, we have enabled correlation between other telemetry signals.

APM Metrics to Traces & Logs

Traces to Logs

If you see a API call taking more time than usual, you can go to related logs to investigate further.

Traces to related logs — *Go from traces to related logs to get more context in debugging performance issues*

Similarly you can click on detailed view of logs and then go to related trace ID to see the flow of user requests.

Logs to traces — *Go from logs to related trace ID in flamegraph view to see the flow of user requests*

Logs with Infrastructure metrics

While troubleshooting with logs, you can investigate the related infrastructure metrics to see if issues are happening becuase of that.

Logs to infra metrics — *Check Pod and Node metrics while troubleshooting with logs*

Quick filers & Advanced Query Builder for all telemetry signals

For all telemetry signals, there are quick filters to quickly filter out the data needed. We have built customized query builders for each signal to make your troublshooting work easier.

For example, query builder for traces allows you to create queries for finding the p99 latency of services.

Trace Query Builder — *Create queries on your trace data by using various aggregate functions and group by options*

Similarly, use quick filters to quickly filter the data that you need.

Logs quick filter — *Use quick filter in logs to quickly filter out specific logs based on your use case*

You can create custom metrics from filtered traces to find metrics of any type of request. Want to find p99 latency of customer_type: premium who are seeing status_code:400. Just set the filters, and you have the graph. Boom!

Flamegraphs & Gantt charts

Detailed flamegraph & Gantt charts to find the exact cause of the issue and which underlying requests are causing the problem. Is it a SQL query gone rogue or a Redis operation is causing an issue? Get more context on your spans with tags and events.

Logs Management

SigNoz provides Logs management with advanced log query builder. You can also monitor your logs in real-time using live tailing. SigNoz uses a columnar database ClickHouse to store logs, which is very efficient at ingesting and storing logs data. Columnar databases like ClickHouse are very effective in storing log data and making it available for analysis.

*Logs tab in SigNoz comes equipped with advanced logs query builder and live tailing*

Metrics & Dashboards

Monitor any metrics important to you. Ingest metrics from your infrastructure or applications and create customized dashboards to monitor them.

A hostmetrics dashboard — *You can create any kind of customized dashboards using different visualization panel types and an advanced query builder*

Exceptions Monitoring

Monitor exceptions automatically in Python, Java, Ruby, and Javascript. For other languages, just drop in a few lines of code and start monitoring exceptions.

Simple usage-based pricing with cost control features

Open-source SigNoz is free to self-host and use. SigNoz cloud is the easiest way of running SigNoz without any hassle of maintenance. For cloud, we have a simple usage-based pricing based on amount of data sent.

Logs: $0.3/GB ingested
Traces: $0.3/GB ingested
Metrics: $0.1/mil samples

Check out our pricing plans for more details. You can also estimate your monthly bill using the pricing calculator.

SigNoz - Open-Source Alternative to DataDog

written by

Pranay Prateek

published on

reading time

User experience not great in current open-source tools