What Datadog Does
In the enterprise observability market, Datadog has established itself as the platform that does everything. With 30,500+ customers referenced in Datadog partner materials and adoption across organizations ranging from startups to Fortune 500 companies, it remains a common recommendation when teams need unified visibility across their technology stack. The platform's strength is its breadth — infrastructure monitoring, application performance management, log management, security monitoring, real user monitoring, synthetic testing, CI visibility, cloud cost management, and LLM observability all live within a single interface with shared data correlation.
Architecture and APM
The technical foundation is an agent-based architecture: Datadog agents installed on hosts collect telemetry through more than 850 integrations and forward it to Datadog's cloud platform for processing, storage, and visualization. Because the platform is unified, an engineer investigating a latency spike can start from an APM trace, correlate it with infrastructure metrics from the affected host, check the relevant log entries, use CI visibility to verify whether a recent deployment introduced the regression, and confirm the user impact through RUM data, all without leaving a single interface or manually joining data from separate tools.
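The idea behind this kind of correlation can be illustrated with a small sketch. It is not Datadog's implementation or API: the `Span` and `LogLine` types, the `correlate` helper, and the 500 ms threshold are all hypothetical, and the sketch only assumes that traces and logs share a trace identifier that can be used as a join key.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical, simplified trace span."""
    trace_id: str
    service: str
    host: str
    duration_ms: float


@dataclass(frozen=True)
class LogLine:
    """Hypothetical log record that carries the trace ID it belongs to."""
    trace_id: str
    message: str


def correlate(spans, logs, slow_ms=500.0):
    """For each trace with at least one slow span, collect hosts and log messages."""
    # Index log messages by trace ID so each lookup is a dictionary access.
    logs_by_trace = defaultdict(list)
    for line in logs:
        logs_by_trace[line.trace_id].append(line.message)

    result = {}
    for span in spans:
        if span.duration_ms < slow_ms:
            continue
        entry = result.setdefault(
            span.trace_id,
            {"hosts": [], "logs": list(logs_by_trace.get(span.trace_id, []))},
        )
        # Record each affected host once, in first-seen order.
        if span.host not in entry["hosts"]:
            entry["hosts"].append(span.host)
    return result
```

Given one slow span on a hypothetical `host-a` and a log line with the same trace ID, `correlate` returns that trace with its host and log message grouped together, which is the manual join a unified platform saves the engineer from doing.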
Application performance monitoring captures distributed traces across microservices, generates service maps showing request flows and dependencies, and integrates error tracking directly into the APM workflow. The Continuous Profiler extends this visibility to the code level, showing function-level CPU, memory, and IO consumption in production with minimal overhead. For teams troubleshooting performance regressions, the ability to go from a slow trace to the exact function responsible is a significant advantage over platforms that stop at trace-level analysis.
Infrastructure and Log Management
Infrastructure monitoring covers hosts, containers, Kubernetes clusters, serverless functions, and cloud services across AWS, Azure, and Google Cloud. Network monitoring maps traffic flows between services and identifies network-level bottlenecks. Cloud cost management visualizes infrastructure spending and maps it back to services and teams, helping organizations understand the cost implications of their architectural decisions. This infrastructure depth is why platform engineering and SRE teams consistently choose Datadog — it provides the broadest visibility into the systems they are responsible for operating.
Log management ingests, indexes, and analyzes log data with full-text search, pattern analysis, and correlation with metrics and traces. However, log management is also where Datadog's pricing complexity becomes most apparent. Ingestion costs $0.10 per GB, but indexing — required for search and alerting — costs $1.70 per million log events. Teams that ingest hundreds of gigabytes daily can see log management become their single largest Datadog line item. Many organizations adopt a strategy of ingesting all logs but selectively indexing only the subset needed for active investigation, using log archives for long-term storage at lower cost.
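A rough estimator makes the gap between ingestion and indexing costs concrete. This is a sketch rather than a billing tool: it uses the two unit prices quoted above, and the average event size of 1,000 bytes, the 30-day month, and the `monthly_log_cost` helper itself are assumptions introduced for illustration.

```python
# Unit prices as quoted in the text; real rates depend on plan and retention.
INGEST_USD_PER_GB = 0.10
INDEX_USD_PER_MILLION_EVENTS = 1.70


def monthly_log_cost(gb_per_day, avg_event_bytes=1_000, indexed_fraction=1.0, days=30):
    """Estimate monthly log spend (hypothetical helper).

    avg_event_bytes is an assumed average log event size;
    indexed_fraction is the share of ingested events that are also indexed.
    """
    if not 0.0 <= indexed_fraction <= 1.0:
        raise ValueError("indexed_fraction must be between 0 and 1")
    if avg_event_bytes <= 0:
        raise ValueError("avg_event_bytes must be positive")

    total_gb = gb_per_day * days
    total_events = total_gb * 1e9 / avg_event_bytes

    ingest_usd = total_gb * INGEST_USD_PER_GB
    index_usd = total_events * indexed_fraction / 1e6 * INDEX_USD_PER_MILLION_EVENTS
    return {
        "ingest_usd": round(ingest_usd, 2),
        "index_usd": round(index_usd, 2),
        "total_usd": round(ingest_usd + index_usd, 2),
    }


if __name__ == "__main__":
    print(monthly_log_cost(200))                        # index everything
    print(monthly_log_cost(200, indexed_fraction=0.2))  # index one event in five
```

Under these assumptions, 200 GB per day costs about $600 a month to ingest but about $10,200 to index in full. Indexing only 20% of events brings the total down from roughly $10,800 to roughly $2,640, which is why selective indexing is such a common strategy.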
Security Monitoring
The security monitoring capabilities have expanded significantly, covering cloud security posture management, application security, code security with SAST, software composition analysis, and runtime threat detection. This positions Datadog as a platform that can serve both engineering and security teams from a single vendor. For organizations consolidating their tool stack, the ability to replace separate security scanning tools with Datadog's built-in capabilities can simplify operations, though dedicated security platforms like Snyk or Aikido Security typically offer deeper coverage in their respective domains.
Pricing and Lock-In
Pricing is Datadog's most criticized aspect and the primary reason teams evaluate alternatives. The model combines per-host infrastructure monitoring at $15 per host per month, APM at $31 per host per month with a requirement that APM hosts also have paid infrastructure monitoring, log management at per-GB ingestion plus per-event indexing costs, custom metrics at $0.05 each beyond included quotas, and separate pricing for each additional product. High-watermark billing means the 99th percentile of hourly host usage determines the monthly bill: the busiest 1% of hours (roughly seven hours in a 30-day month) are discarded, but a scaling event that lasts longer than that sets the host count billed for the entire month. A realistic mid-market estimate for 50 engineers with 200 hosts reaches $220,000 or more annually.
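The effect of percentile-based host billing can be sketched in a few lines. The per-host prices come from the text above. The rounding rule (discard the top 1% of hourly samples, rounded in the customer's favor) and the helper names `billable_hosts` and `monthly_host_cost` are assumptions for illustration, not Datadog's documented algorithm.

```python
# Per-host monthly prices as quoted in the text.
INFRA_USD_PER_HOST = 15.0
APM_USD_PER_HOST = 31.0


def billable_hosts(hourly_counts):
    """Return the highest host count among the lower 99% of hourly samples."""
    if not hourly_counts:
        raise ValueError("hourly_counts must not be empty")
    ordered = sorted(hourly_counts)
    # Integer arithmetic avoids float rounding; always keep at least one sample.
    keep = max(1, len(ordered) * 99 // 100)
    return ordered[keep - 1]


def monthly_host_cost(hosts, with_apm=True):
    """Infrastructure cost per host, plus APM on the same hosts if enabled."""
    per_host = INFRA_USD_PER_HOST + (APM_USD_PER_HOST if with_apm else 0.0)
    return hosts * per_host


if __name__ == "__main__":
    # A 720-hour month with a 200-host baseline and a burst to 400 hosts.
    short_burst = [200] * 714 + [400] * 6   # 6-hour burst
    long_burst = [200] * 696 + [400] * 24   # 24-hour burst
    for name, counts in (("6h burst", short_burst), ("24h burst", long_burst)):
        hosts = billable_hosts(counts)
        print(name, hosts, monthly_host_cost(hosts))
```

In this sketch a six-hour burst falls inside the discarded 1% and the month is billed at 200 hosts ($9,200 with APM), while a single day at 400 hosts doubles the bill to $18,400 for the whole month.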
The ecosystem lock-in concern is real but nuanced. Datadog's proprietary agent and query language create a dependency that makes migration expensive. However, the platform has invested in OpenTelemetry support, allowing teams to instrument with vendor-neutral SDKs while still sending data to Datadog. This provides a partial hedge against lock-in, though the full value of Datadog's platform comes from proprietary features that OpenTelemetry alone cannot replicate. Teams that anticipate a future migration should invest in OpenTelemetry instrumentation from the start.
AI Observability
The AI and LLM observability features position Datadog for the next generation of applications. LLM Observability monitors model calls, token consumption, prompt-response quality, and includes built-in sensitive data scanning to prevent data leakage through AI interactions. The platform can detect hallucinations, track agent reasoning chains, and correlate AI behavior with underlying infrastructure performance. For organizations building production AI applications, this unified view of both traditional and AI-specific telemetry is a genuine differentiator that few competitors match.
The Bottom Line
Datadog is the right choice for organizations that need the broadest possible observability coverage and have the budget to support it. Platform engineering teams, SRE organizations, and enterprise DevOps groups that manage complex, multi-service architectures get the most value from the unified correlation capabilities. Cost-sensitive teams should carefully model their expected spend before committing, and consider whether a combination of purpose-built tools — Sentry for error tracking, Grafana and Prometheus for metrics, and a separate log aggregation solution — could provide adequate coverage at lower cost. The market in 2026 increasingly supports hybrid approaches where Datadog covers the use cases it does best while cheaper alternatives handle high-volume telemetry like logs.