I took it upon myself to write a library for my current employer (4 years ago now?) that abstracted and standardized the way our Rust services instantiate and use the metrics and tracing fundamentals that OpenTelemetry provides. I recently added OTLP logging (technically via tracing events) so that baggage / context / metadata can be forwarded with the log lines. The `tracing` crate in Rust also has a macro called `instrument` that mostly auto-instruments your functions for tracing: the tracing context is extracted and propagated into your function, so the trace / span can be attached to subsequent HTTP / gRPC requests.
We did all kinds of other stuff too, like adding a method for attaching the trace-id to our Kafka messages so we can see how long the entire lifetime of a request takes (including time sitting on the queue). It's been extremely insightful.
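The trace-id-on-Kafka-messages trick is usually done by injecting a W3C `traceparent` header into the record. Here's a minimal std-only sketch of the idea; `KafkaMessage`, `inject_traceparent`, and `extract_trace_id` are hypothetical names, and a real service would use something like rdkafka's headers plus the active OpenTelemetry span context instead of hand-rolled strings:

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a Kafka record with string headers.
#[derive(Debug)]
struct KafkaMessage {
    headers: HashMap<String, String>,
    payload: Vec<u8>,
}

// W3C Trace Context propagates as a `traceparent` header of the form
// version-traceid-spanid-flags.
fn inject_traceparent(msg: &mut KafkaMessage, trace_id: &str, span_id: &str) {
    msg.headers.insert(
        "traceparent".to_string(),
        format!("00-{trace_id}-{span_id}-01"),
    );
}

// The consumer side pulls the trace id back out so its consume span can be
// parented to the producer's trace, queue time included.
fn extract_trace_id(msg: &KafkaMessage) -> Option<String> {
    msg.headers
        .get("traceparent")
        .and_then(|tp| tp.split('-').nth(1).map(str::to_string))
}
```

Because the queue time lives between the producer span and the consumer span, the gap shows up right in the trace waterfall.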
SigNoz is newer to the game. I'm glad there are more competitors and vendors using OpenTelemetry natively. We originally talked to some of the big vendors, and while they would gladly accept OpenTelemetry, they marked every metric as a "custom" metric and would charge out the wazoo for each one, far in excess of whatever was instrumented natively with their APM plugin thingamabob.
The more the better. I love OpenTelemetry, and using it in Rust has been mostly great.
Regarding monitoring Kafka execution times, absolutely agreed. At my previous job, monitoring Celery helped us understand consumer bottlenecks: at first we couldn't see background-job traces containing the Celery consumer spans at all, and when they did appear they were hours late, so an entire trace took 8 hours instead of the expected couple of minutes.
Happy to hear you've been enjoying OTel and Rust!
(Except for this, that is: https://github.com/tokio-rs/tracing/issues/2519)
I've got a whole list of puzzling bugs in the tracing <-> opentelemetry <-> datadog linkage.
``` logs_2025-12-24_0003.jsonl ```
I asked Claude to keep it in an xdg folder and it chose
``` /home/{username}/.local/share/{applicationName}/telemetry/logs ```
I also have folders for metrics and traces but those are empty.
I have never needed to look at the logs for my .NET console app, and the only reason I've looked at the logs on my ASP.NET app was to review errors when I ran into one, which frankly I don't need OpenTelemetry for.
What am I missing here? Am I using it wrong?
If you use open telemetry, where do your logs, metrics, and traces go? Do you write your own custom classes to write them to a file on the disk? Do you pay for something like datadog (congratulations on winning the lottery I guess?)
I appreciate your reply. Thank you for helping me learn.
That is, it defines a relatively small, interoperable interface that a lot of distinct products from many different vendors can "sink" their telemetry into, and then on the other end of this narrow waist a bunch of different consumers can "source" the data from.
Think of it as a fancy alternative to ILogger and similar abstractions that is cross-platform and cross-vendor. Instead of Microsoft-specific or Java-specific (or whatever-specific) sources with their own protocols and conventions, there's a single standard for the data schema that everybody can speak. It's like TCP/IP or JSON.
So your question is in some sense nonsense. It's like asking "what do you use TCP/IP for?" or "where do you put your JSON"?
The answer is: wherever you want! That's the whole point.
In Azure, you would use Application Insights, as a random example. New Relic, DataDog, Prometheus, Zipkin, Elasticsearch, or... just your console output. Simple text log files. A SQL database. Wherever!
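One concrete way to see "wherever you want" is an OpenTelemetry Collector pipeline, which receives one OTLP stream and fans it out to multiple destinations. A minimal sketch (the endpoint and file path are placeholders, and exact component options vary by Collector version):

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
  debug:                         # dump to the collector's console
  file:
    path: /var/log/otel/telemetry.jsonl
  otlphttp:                      # any OTLP-native vendor backend
    endpoint: https://collector.example.com

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [debug, file, otlphttp]
```

Swapping DataDog for SigNoz, or adding a plain file on disk, is then a config change rather than a code change.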
In more practical terms: for a solo developer working on personal projects, use .NET Aspire with Visual Studio 2026. You'll get a free "local" alternative to an APM like Application Insights or DataDog that collects the traces. Keep using the "standard" interfaces like ILogger; they forward everything to OTel if you're using it.
OTel has value for things crossing microservices in production, much less locally for a single app.
Languages running on .NET or the JVM have often enjoyed great tooling for years, so locally that tooling is much better than OTel.
The problem had nothing to do with our network infrastructure, deployment, etc.; it was entirely on their end.
I get the impression this standard is 10x more complicated than it needs to be, but the stuff it replaced was 100x too complicated.
Honestly though, I feel like it's all just a big hack working around Splunk and similar systems' inability to understand summary statistics.
Less politely: If I had shell on the telemetry servers, I could probably get the same functionality from a perl one-liner that's easier to understand + maintain.
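For context, the "summary statistics" in question are things like percentile latencies over raw samples, which are indeed cheap to compute once you hold the data. A std-only sketch with made-up sample values (nearest-rank percentile, assumes a non-empty slice):

```rust
// Nearest-rank percentile over duration samples (milliseconds): the kind of
// p50/p99 summary that telemetry backends are being asked to understand.
// Panics on an empty slice.
fn percentile(samples: &mut [u64], p: f64) -> u64 {
    samples.sort_unstable();
    let rank = ((p / 100.0) * samples.len() as f64).ceil() as usize;
    samples[rank.saturating_sub(1).min(samples.len() - 1)]
}
```

Whether a one-liner scales to production cardinality is another question, but the core computation really is this small.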