What OpenLLMetry Does
OpenLLMetry is an open-source framework for LLM observability built on top of OpenTelemetry. Created by Traceloop (co-founded by Nir Gazit, former Google ML engineer and Fiverr chief architect, and Gal Kleinman, Fiverr ML group leader), the project extends OpenTelemetry, a CNCF project, with AI-specific instrumentations for LLM providers, vector databases, and agent frameworks. With 7,000+ GitHub stars and Apache 2.0 licensing, it has become a popular choice for teams that want LLM observability without vendor lock-in. Traceloop raised a $6.1 million seed round backed by Y Combinator, Samsung NEXT, and Grand Ventures.
Core Architecture and Setup
The core insight behind OpenLLMetry is that agent execution is structurally similar to a distributed trace. Each step in an LLM pipeline (prompt construction, retrieval, API call, response processing) maps naturally to a span in a trace. By building on OpenTelemetry rather than inventing a proprietary protocol, OpenLLMetry lets you plug LLM observability into your existing monitoring stack. If you already use Datadog, New Relic, Sentry, Honeycomb, Grafana, or any other OpenTelemetry-compatible backend, OpenLLMetry sends data there with no additional platform required.
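To make the pipeline-to-span mapping concrete, here is a toy model written with only the Python standard library. It is not the OpenLLMetry or OpenTelemetry API; the names ToyTracer, Span, and answer are invented for illustration, and the retrieval and model steps are stubs.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Span:
    """One timed step in a request; parent_id links it into a tree."""
    name: str
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    start: float = 0.0
    end: float = 0.0
    attributes: Dict[str, object] = field(default_factory=dict)


class ToyTracer:
    """Hypothetical stand-in for a tracer: tracks the current span on a stack."""

    def __init__(self) -> None:
        self.trace_id = uuid.uuid4().hex
        self.finished: List[Span] = []
        self._stack: List[Span] = []

    @contextmanager
    def span(self, name: str, **attributes: object):
        parent_id = self._stack[-1].span_id if self._stack else None
        current = Span(name, self.trace_id, uuid.uuid4().hex[:16], parent_id,
                       start=time.monotonic(), attributes=dict(attributes))
        self._stack.append(current)
        try:
            yield current
        finally:
            current.end = time.monotonic()
            self._stack.pop()
            self.finished.append(current)


def answer(tracer: ToyTracer, question: str) -> str:
    """Each pipeline step from the text becomes a child span of the request."""
    with tracer.span("handle_request"):
        with tracer.span("prompt_construction"):
            prompt = f"Answer briefly: {question}"
        with tracer.span("retrieval", top_k=2):
            docs = ["doc-a", "doc-b"]  # stub for a vector database lookup
        with tracer.span("llm_call", model="placeholder-model") as llm_span:
            reply = f"(stub reply using {len(docs)} docs)"  # stub for an API call
            llm_span.attributes["prompt_chars"] = len(prompt)
        with tracer.span("response_processing"):
            return reply.strip("()")
```

Calling answer(ToyTracer(), "...") leaves five finished spans on the tracer: four children that share the root's span_id as their parent_id, followed by the root itself. A real tracer adds context propagation across processes and export to a backend, which this sketch omits.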
Setup requires just two lines of code. Import the SDK, call Traceloop.init() with your app name, and calls made through supported LLM libraries are instrumented automatically. The SDK also provides decorators for marking workflows, tasks, and agents, giving you structured traces that show how each request flows through your LLM application. SDKs are available for Python, TypeScript, Go, and Ruby, covering the primary languages used in LLM application development. Because the instrumentation is non-intrusive, you can add observability to an existing application without restructuring your code.
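A minimal sketch of that setup in Python, assuming the traceloop-sdk package is installed. The app name, function names, and the stubbed model call are placeholders, and the no-op fallback exists only so the sketch still runs where the SDK is absent. Check the SDK documentation for the options your version supports.

```python
# Assumes `pip install traceloop-sdk`. The fallback below is only for this
# sketch: it swaps in no-op decorators when the SDK is not installed.
try:
    from traceloop.sdk import Traceloop
    from traceloop.sdk.decorators import task, workflow
    HAVE_SDK = True
except ImportError:
    HAVE_SDK = False

    def _noop(name=None):
        def wrap(fn):
            return fn
        return wrap

    task = workflow = _noop


def init_tracing() -> None:
    """The two-line setup from the text: import the SDK, then initialize it."""
    if HAVE_SDK:
        # "joke_service" is a placeholder app name.
        Traceloop.init(app_name="joke_service")


@task(name="build_prompt")
def build_prompt(topic: str) -> str:
    return f"Tell me a short joke about {topic}."


@task(name="call_model")
def call_model(prompt: str) -> str:
    # Stand-in for a real LLM client call; a supported client library
    # would be instrumented automatically once init_tracing() has run.
    return f"[model reply to: {prompt}]"


@workflow(name="joke_pipeline")
def joke_pipeline(topic: str) -> str:
    # The workflow groups the two tasks into one structured trace.
    return call_model(build_prompt(topic))
```

In an application you would call init_tracing() once at startup, before the first request, and then call joke_pipeline("tracing") as usual. The decorators only add structure on top of the automatic instrumentation, so existing functions keep their signatures and return values.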
Provider Coverage and Signals
Provider coverage is broad. OpenLLMetry instruments calls to OpenAI, Anthropic, Cohere, Google Gemini, AWS Bedrock, Ollama, and more than 20 other LLM providers. Vector database instrumentations cover Pinecone, Chroma, Weaviate, and others. Framework support includes LangChain, LlamaIndex, CrewAI, Haystack, and additional agent orchestration tools. Each instrumentation automatically captures prompts, responses, token usage, latency, model parameters, and error details, which are the core signals needed to debug and optimize LLM applications.
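The sketch below illustrates how a wrapper around a client call can capture the signals listed above without changing the calling code. It is a simplified standard-library model, not OpenLLMetry's implementation; instrument, fake_completion, and the record layout are all hypothetical.

```python
import functools
import time
from typing import Any, Callable, Dict, List

# Hypothetical in-memory sink; a real instrumentation would emit span attributes.
captured: List[Dict[str, Any]] = []


def instrument(call: Callable[..., Dict[str, Any]]) -> Callable[..., Dict[str, Any]]:
    """Wrap a completion function and record one entry per call."""

    @functools.wraps(call)
    def wrapper(*, model: str, prompt: str, **params: Any) -> Dict[str, Any]:
        record: Dict[str, Any] = {"model": model, "prompt": prompt, "params": params}
        start = time.perf_counter()
        try:
            response = call(model=model, prompt=prompt, **params)
            record["response"] = response["text"]
            record["usage"] = response["usage"]
            return response
        except Exception as exc:
            # Error details are recorded, then the exception is re-raised unchanged.
            record["error"] = f"{type(exc).__name__}: {exc}"
            raise
        finally:
            record["latency_s"] = time.perf_counter() - start
            captured.append(record)

    return wrapper


def fake_completion(*, model: str, prompt: str, **params: Any) -> Dict[str, Any]:
    """Stub provider call: echoes the prompt and reports word counts as tokens."""
    if not prompt:
        raise ValueError("empty prompt")
    words = len(prompt.split())
    return {"text": prompt.upper(),
            "usage": {"input_tokens": words, "output_tokens": words}}


traced_completion = instrument(fake_completion)
```

After traced_completion(model="m", prompt="hello world", temperature=0.2), the last entry in captured holds the prompt, response, token usage, latency, and the temperature parameter. A failing call is recorded with an error field and still raises to the caller.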
All three OpenTelemetry signals are supported: traces (enabled by default), metrics, and logs (which can be enabled with a single parameter). This goes beyond most LLM-specific tools, which typically capture only traces. Having metrics and logs in the same OpenTelemetry pipeline means you can correlate LLM behavior with system-level performance, create dashboards that combine token costs with infrastructure metrics, and set alerts based on any combination of signals.
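As one example of a derived signal, the sketch below turns per-call token usage into a cost per model and applies a simple alert threshold. The prices, model names, and span record layout are hypothetical; in practice this aggregation would usually live in your metrics backend rather than in application code.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Dict, Iterable, List

# Hypothetical USD prices per 1,000 tokens; real prices vary by provider and date.
PRICE_PER_1K = {
    "model-small": {"input": Decimal("0.0005"), "output": Decimal("0.0015")},
    "model-large": {"input": Decimal("0.0100"), "output": Decimal("0.0300")},
}


def cost_by_model(spans: Iterable[Dict]) -> Dict[str, Decimal]:
    """Sum the token cost of each LLM call, grouped by model name."""
    totals: Dict[str, Decimal] = defaultdict(Decimal)
    for span in spans:
        price = PRICE_PER_1K[span["model"]]
        call_cost = (price["input"] * span["input_tokens"]
                     + price["output"] * span["output_tokens"]) / 1000
        totals[span["model"]] += call_cost
    return dict(totals)


def over_budget(totals: Dict[str, Decimal], budget_usd: Decimal) -> List[str]:
    """Return the models whose accumulated cost exceeds the budget."""
    return sorted(model for model, cost in totals.items() if cost > budget_usd)
```

Decimal is used instead of float so that small per-call amounts add up exactly. The same grouping could be extended with infrastructure metrics such as request latency to build the combined dashboards described above.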
Vendor Neutrality and Traceloop
The vendor-neutral approach is OpenLLMetry's greatest strategic advantage. Proprietary LLM observability platforms like LangSmith or Galileo require using their SDK, their platform, and their data format. If you want to switch providers or consolidate monitoring, you face a migration. OpenLLMetry uses the OpenTelemetry protocol, which means your observability data flows through standard collectors and can be routed to any compatible backend. You can send the same traces to multiple destinations simultaneously or switch backends without changing instrumentation code.
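The routing idea can be sketched with a small fan-out exporter. This is a conceptual standard-library model, not the OpenTelemetry SDK; InMemoryBackend, FanOut, and run_app are invented names, and in a real deployment the fan-out is typically handled by span processors or an OpenTelemetry Collector.

```python
from typing import Dict, Iterable, List, Protocol


class Exporter(Protocol):
    def export(self, spans: List[Dict]) -> None: ...


class InMemoryBackend:
    """Hypothetical stand-in for an observability backend."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.received: List[Dict] = []

    def export(self, spans: List[Dict]) -> None:
        self.received.extend(spans)


class FanOut:
    """Forwards every batch of spans to each configured exporter."""

    def __init__(self, exporters: Iterable[Exporter]) -> None:
        self.exporters = list(exporters)

    def export(self, spans: List[Dict]) -> None:
        for exporter in self.exporters:
            exporter.export(spans)


def run_app(exporter: Exporter) -> None:
    # Instrumentation side: this code is identical whichever backend is wired in.
    exporter.export([{"name": "llm_call", "model": "placeholder-model"}])
```

Passing FanOut([backend_a, backend_b]) to run_app delivers the same span to both destinations, and passing a single backend switches destinations without touching run_app. The point of the design is that the choice of backend is configuration, kept separate from the instrumented code.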
Traceloop, the managed platform built on OpenLLMetry, adds the insights layer that open-source tracing alone does not provide. The commercial offering includes built-in evaluation metrics for faithfulness, relevance, and safety that run automatically against real production data. Custom evaluators can be trained by annotating examples to score output the way your team would. Evaluations run automatically on pull requests or in real time, catching quality regressions before they reach users. This is where Traceloop draws the line between the free open-source library and the paid platform.
OpenTelemetry Synergy and Limitations
For teams already invested in OpenTelemetry for infrastructure monitoring, OpenLLMetry is the obvious choice for LLM observability, because it extends what you already have rather than adding another siloed tool. Because it builds on OpenTelemetry, everything OpenTelemetry already instruments is covered as well: your database calls, API requests, and HTTP transactions appear in the same traces as your LLM interactions, giving you end-to-end visibility from user request to model response and back.
The main limitations relate to the gap between tracing and actionable intelligence. OpenLLMetry gives you excellent raw data, but turning that data into insights about prompt quality, hallucination rates, or cost optimization requires either the Traceloop managed platform or building your own analysis layer. Compared to purpose-built LLM observability platforms like Langfuse, the open-source library alone lacks evaluation frameworks, prompt management, and dataset curation. The OpenTelemetry protocol itself was designed for cloud infrastructure, not AI workloads, so handling large payloads like vision model inputs required significant adaptation.
The Bottom Line
OpenLLMetry is becoming the standard instrumentation layer for LLM observability, much as OpenTelemetry became the standard for cloud observability. For teams that want to avoid vendor lock-in, integrate LLM monitoring with their existing observability stack, and retain the flexibility to switch or combine backends, it is the strongest foundation available. Start with OpenLLMetry for instrumentation, send data to your existing monitoring platform, and evaluate whether the Traceloop managed service adds enough value over your current tools to justify the additional cost.