Hacker News

Ask HN: How much does distributed tracing cost: OSS like Jaeger or a paid product?

Wanted to get your input on 2 solutions for tracing needs:

1. Open source solutions like Jaeger - Is there a blog or link which analyses the cost of storing traces in a backend like Cassandra or Elastic? Do you use any compression techniques before storing to the DB?

2. Paid products like Datadog - Most of the vendors charge on a per-host basis. What's the underlying logic for charging like that?

If I were to use a vendor for this, I would like to be charged according to the number of spans sent for storage, as with logs, say $2 per million spans or $0.2 per GB. Only Datadog seems to charge on such a basis: it asks for $1.7 per million spans on top of $31 per host, which covers only 1M spans. Does Datadog give you control over which spans to visualise and store?
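A rough sketch of how the two pricing models above compare. It assumes the quoted figures combine as $31 per host per month with 1M spans included per host, plus $1.7 per additional million spans. That reading, the function names, and the 5-host / 50M-span workload are illustrative assumptions, not any vendor's actual billing rules.

```python
def per_span_cost(spans: int, usd_per_million: float = 2.0) -> float:
    """Pure usage-based pricing: pay only for the spans sent."""
    return spans / 1_000_000 * usd_per_million


def host_plus_span_cost(
    hosts: int,
    spans: int,
    usd_per_host: float = 31.0,               # figure quoted in the post
    included_spans_per_host: int = 1_000_000,  # assumed allowance per host
    usd_per_extra_million: float = 1.7,        # figure quoted in the post
) -> float:
    """Per-host fee plus overage for spans beyond the included allowance."""
    included = hosts * included_spans_per_host
    extra_spans = max(0, spans - included)
    return hosts * usd_per_host + extra_spans / 1_000_000 * usd_per_extra_million


# Hypothetical workload: 5 hosts sending 50M spans per month.
hosts, spans = 5, 50_000_000
print(f"per-span only:   ${per_span_cost(spans):.2f}")
print(f"host + overage:  ${host_plus_span_cost(hosts, spans):.2f}")
```

With these assumed inputs the per-host fee is most of the bill at low volume, and the per-span overage takes over as volume grows, so it is worth plugging in your own host and span counts.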

- Does any vendor give you control of spans to process or a clear pricing estimate for the tracing part?

What other things should I look at when using Jaeger or buying Datadog? My primary need is to monitor and debug my applications.

8 points | pranay01 posted 2 months ago | 13 Comments
akyaky said 2 months ago:

Hey! I can't speak to the exact mechanism for pricing on a large or enterprise level account, but it sounds like you're just getting started with tracing - might be worth checking out Lightstep. Pricing is pretty transparent for different tiers of non-enterprise usage and it's seats+services monitored, NOT consumption, so there shouldn't be any end-of-month surprises. (https://lightstep.com/pricing)

Lightstep and several other vendors use open source standards so worst case scenario, you have great instrumentation even if you're not paying for a product right away. /shrug

(full disclosure, I work at Lightstep albeit not in sales)

pranay01 said 2 months ago:

Checked Lightstep. They charge 1900 USD for 8 services and 199 USD for each additional service. So, if I have 15 services, it would be ~3300 USD/month, which seems a lot to me.

Also, I don't understand why they charge based on services. From that perspective, pricing based on the number of spans seems more transparent.

ankitnayan said 2 months ago:

If I have 8 services sending a combined total of 100GB of traces per day and 1 extra service that alone sends 100GB of traces per day, how is your pricing justified, either for Lightstep or for the customer?
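To put numbers on that question, here is a small sketch using the Lightstep figures quoted earlier in the thread (1900 USD for 8 services, 199 USD per additional service). It assumes a 30-day month and that the bill depends only on the service count; the function names are made up for illustration and the vendor's real terms may differ.

```python
def per_service_monthly_cost(
    services: int,
    base_usd: float = 1900.0,            # quoted price for the first 8 services
    included_services: int = 8,
    usd_per_extra_service: float = 199.0,
) -> float:
    """Monthly bill under per-service pricing (volume is not an input)."""
    extra = max(0, services - included_services)
    return base_usd + extra * usd_per_extra_service


def usd_per_gb(monthly_cost: float, gb_per_day: float, days: int = 30) -> float:
    """Effective price per GB of trace data actually sent."""
    return monthly_cost / (gb_per_day * days)


# The 15-service case from the comment above.
print(f"15 services: {per_service_monthly_cost(15):.0f} USD/month")

# 8 services sending 100 GB/day in total vs. 9 services sending 200 GB/day.
for services, gb_per_day in [(8, 100), (9, 200)]:
    cost = per_service_monthly_cost(services)
    print(f"{services} services, {gb_per_day} GB/day: "
          f"{cost:.0f} USD/month, {usd_per_gb(cost, gb_per_day):.2f} USD/GB")
```

Under these assumptions the ninth service doubles the data volume while the bill grows by only 199 USD, which is the mismatch the question points at.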

verdverm said 2 months ago:

The thing with Jaeger is you will likely need to deploy other services with it. For example, if you use Istio, pods need sidecars with non-negligible resource requirements. You'll need EFK / Prometheus.

You should include the salaries to deploy, debug, and maintain in the calculation. Not just Jaeger, all the supporting systems too. Finding people (you'll need more than one most likely) is a challenge as well.

You probably don't need distributed tracing at first; I have only seen a few people using it at this point. Start with basic logging and metrics if you have none today.

pranay01 said 2 months ago:

I understand the point about total cost of ownership. I was trying to understand the actual infrastructure cost of running Jaeger.

verdverm said 2 months ago:

It's not such an easy calculation, depends a lot on all the other things and the nature of your workloads.

ankitnayan said 2 months ago:

Would it help if I said 10 spans per request and 100 RPS?

Or even simpler: I use the dd-trace library and my application handles 100 RPS.
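For reference, the span volume implied by those two numbers can be worked out directly. The sketch below also turns it into a raw storage figure, but the 500-byte average span size is purely an assumed placeholder (real spans vary a lot with tags and logs), and it ignores sampling, compression, indexing and replication.

```python
SECONDS_PER_DAY = 86_400


def spans_per_day(rps: int, spans_per_request: int) -> int:
    """Spans generated per day at a constant request rate, with no sampling."""
    return rps * spans_per_request * SECONDS_PER_DAY


def raw_storage_gb_per_day(spans: int, avg_span_bytes: int) -> float:
    """Uncompressed payload size in GB (10^9 bytes), before any DB overhead."""
    return spans * avg_span_bytes / 1e9


daily_spans = spans_per_day(rps=100, spans_per_request=10)
daily_gb = raw_storage_gb_per_day(daily_spans, avg_span_bytes=500)  # assumed size

print(f"spans per day: {daily_spans:,}")      # 86,400,000
print(f"raw GB per day: {daily_gb:.1f}")      # 43.2 at the assumed span size
print(f"raw GB for 30 days: {daily_gb * 30:.0f}")
```

So 100 RPS at 10 spans per request is roughly 86 million spans a day before sampling; the storage side depends heavily on the span size and backend you actually run.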

verdverm said 2 months ago:

No, you need to talk about the running environment, the VMs, the platform, the other components, whether your VMs are in the same zone or region... There's a ton to consider.

Regardless, Jaeger is a small piece of monitoring and logging, and an advanced capability I have not actually seen deployed. It sounds great, but it is often unneeded, at least until you have the other things in order.

Have you thought about what happens if ingestion of logs breaks?

If your primary need is to monitor and debug, then don't worry about distributed tracing yet

ankitnayan said 2 months ago:

Just curious about the cost: why do the running environment, the VMs, or the platform matter? Region matters for data transfer costs, which I can understand.

I can relate more to pricing based on the number of spans, because eventually spans are saved to a DB, apart from the cost of running the Jaeger infra. Can you please provide more details on how the running environment, the VMs, the platform, and the other components affect the cost? It would be helpful.

verdverm said 2 months ago:

All the components of a robust monitoring system take up space. Disk, CPU, Mem, developer time

tannerbrockwell said 2 months ago:

Look at OpenTracing. This is the merger of two tracing initiatives and will provide pointers to both commercial and open source solutions. [1] There are some excellent capabilities for instrumenting an existing Java app through the JVM [2] with OpenTracing.

[1]: https://opentracing.io/
[2]: https://medium.com/opentracing/opentracing-on-kubernetes-get...

pranay01 said 2 months ago:

Yes, aware of that. Also, OpenTracing has now merged with OpenCensus to create a new project called OpenTelemetry. That is the upcoming standard now.

My question was more about the cost of running these tools.

tj_9000 said 2 months ago:

Hello, I am a civil engineering student in the final year of my Bachelor's degree. I am very interested in the field of machine learning and AI, but I am trying to work out how I can integrate my academic knowledge with machine learning. Any suggestions or ideas would be greatly appreciated. Thank you.