Migrating from Datadog: The OpenTelemetry Playbook (2026)
OpenTelemetry is the key to migrating from Datadog. Instrument with OTel once, send to any backend. No big-bang cutover needed.
Migration Overview
The strategy is simple: deploy the OpenTelemetry Collector alongside the Datadog agent, run both in parallel, validate the data on the new platform, and cut over when you are confident. This approach reduces risk because you keep full visibility throughout the transition. The five steps below take 2-4 weeks for a typical 50-server environment.
Assess Current Usage
Before migrating, audit which Datadog features you actually use. Most teams use 30-40% of the features they pay for. Identify what you need to replace versus what you can drop.
- Infrastructure monitoring: nearly always needed. Replace with Prometheus or OTel metrics.
- APM / Distributed tracing: needed if you have microservices. Replace with OTel traces.
- Log management: needed for debugging. Replace with Loki, OpenSearch, or vendor log ingest.
- RUM / Synthetics: evaluate whether they are actually used. Many teams enabled these but rarely check them.
- Security monitoring: evaluate separately. May need a dedicated SIEM (Splunk, Elastic).
- Custom metrics: audit which are queried. Exclude unused metrics to simplify migration.
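The custom-metric audit can be approximated with a short script. The sketch below assumes you have already collected your custom metric names and the query strings from your dashboards and monitors; the sample names and the regular expression are illustrative assumptions, and the pattern only recognizes simple `aggregator:metric.name{...}` queries.

```python
import re

# Matches the metric name that follows a space aggregator such as "avg:" or "sum:".
# Assumption: queries use the simple "aggregator:metric.name{tags}" shape.
_METRIC_IN_QUERY = re.compile(r"\b(?:avg|sum|min|max|count):([A-Za-z_][\w.]*)")


def metrics_in_queries(queries):
    """Return the set of metric names referenced by the given query strings."""
    found = set()
    for query in queries:
        found.update(_METRIC_IN_QUERY.findall(query))
    return found


def unused_metrics(custom_metrics, queries):
    """Return custom metrics that no dashboard or monitor query references."""
    return sorted(set(custom_metrics) - metrics_in_queries(queries))


# Hypothetical sample data for illustration only.
custom_metrics = ["myapp.orders.count", "myapp.cache.hits", "myapp.legacy.jobs"]
queries = [
    "avg(last_5m):sum:myapp.orders.count{env:prod}.as_count() > 100",
    "avg:myapp.cache.hits{*} by {host}",
]

print(unused_metrics(custom_metrics, queries))
```

Metrics that show up in the result are candidates to drop rather than migrate, after a manual check that nothing else depends on them.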
Instrument with OpenTelemetry
Deploy the OpenTelemetry Collector to take over the Datadog agent's role, and keep both running in parallel until cutover. Handle metrics, traces, and logs separately, since each has different migration considerations.
Metrics
Deploy the OTel Collector with the hostmetrics receiver for infrastructure metrics. For application metrics, add OTel SDK instrumentation or keep using Prometheus client libraries (the Collector's Prometheus receiver can scrape them). Either way, the Collector can export the result as Prometheus-compatible metrics.
Traces
Replace dd-trace libraries with the OTel SDK for your language (opentelemetry-python, opentelemetry-js, opentelemetry-java, etc.). The OTel SDK sends traces to the Collector via OTLP. This step touches the most code, but it produces vendor-neutral instrumentation.
Logs
Configure the OTel Collector's filelog receiver to tail log files, or use the OTLP log exporter in your application. On Kubernetes, the filelog receiver can tail container log files on each node, and the k8sattributes processor enriches those records with pod and namespace metadata. Logs require the fewest application code changes.
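To make the three signal paths concrete, here is a minimal sketch that builds a Collector configuration as a Python dictionary and checks that every pipeline only references components that are defined. The endpoint, collection interval, scraper selection, and log path are assumptions for illustration; the hostmetrics and filelog receivers and the k8sattributes processor ship in the Collector contrib distribution, and the resulting structure still has to be written into the Collector's YAML config file.

```python
import json


def build_collector_config(backend_endpoint):
    """Return a Collector config dict with metrics, traces, and logs pipelines."""
    return {
        "receivers": {
            # Infrastructure metrics from the host.
            "hostmetrics": {
                "collection_interval": "30s",
                "scrapers": {"cpu": {}, "memory": {}, "filesystem": {}, "network": {}},
            },
            # Telemetry pushed by OTel SDKs over OTLP.
            "otlp": {"protocols": {"grpc": {}, "http": {}}},
            # Container log files on a Kubernetes node (assumed path).
            "filelog": {"include": ["/var/log/pods/*/*/*.log"], "start_at": "end"},
        },
        "processors": {"k8sattributes": {}, "batch": {}},
        "exporters": {"otlp": {"endpoint": backend_endpoint}},
        "service": {
            "pipelines": {
                "metrics": {
                    "receivers": ["hostmetrics", "otlp"],
                    "processors": ["batch"],
                    "exporters": ["otlp"],
                },
                "traces": {
                    "receivers": ["otlp"],
                    "processors": ["batch"],
                    "exporters": ["otlp"],
                },
                "logs": {
                    "receivers": ["filelog", "otlp"],
                    "processors": ["k8sattributes", "batch"],
                    "exporters": ["otlp"],
                },
            }
        },
    }


def undefined_components(config):
    """List pipeline references that point at components missing from the config."""
    problems = []
    for name, pipeline in config["service"]["pipelines"].items():
        for section in ("receivers", "processors", "exporters"):
            for component in pipeline.get(section, []):
                if component not in config.get(section, {}):
                    problems.append(f"{name}: {section}/{component}")
    return problems


# "backend.example.com:4317" is a placeholder, not a real endpoint.
config = build_collector_config("backend.example.com:4317")
print(json.dumps(config, indent=2))
print(undefined_components(config))
```

During the parallel run, this Collector runs next to the Datadog agent; only the exporter endpoint changes if you later switch backends.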
Choose Your Backend
Grafana Cloud
Best for cost. Managed Prometheus + Loki + Tempo. Usage-based pricing. Open standards.
New Relic
Best for simplicity. Closest UX to Datadog. 100 GB free. Data-ingest pricing.
Self-Hosted
Best for control. Prometheus + Grafana + Loki + Tempo. Zero software cost. SRE required.
Dynatrace
Best for enterprise. Davis AI auto-discovery. Zero-config OneAgent. Full stack included.
Migrate Dashboards and Alerts
Dashboard Migration
Grafana has a Datadog dashboard converter that handles basic translations. For complex dashboards, manual recreation is needed. Prioritize your top 5-10 critical dashboards first. Rebuild others over the following weeks as the team settles into the new platform.
Alert Rule Checklist
- Export all Datadog monitors via API (GET /api/v1/monitor)
- Map each monitor's query from Datadog's query syntax to the target platform's language (for example PromQL or NRQL)
- Recreate notification channels (Slack, PagerDuty, email)
- Test every critical alert during the parallel-run period
- Verify on-call routing works end-to-end
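The first checklist item can be scripted. The sketch below builds the `GET /api/v1/monitor` request with the standard library and reduces the response to a worklist of names, types, and queries. It assumes the US1 API host (other Datadog sites use a different hostname) and that you supply your own API and application keys; the helper names are hypothetical.

```python
import json
import urllib.request
from collections import Counter

# Assumption: the US1 site. Other Datadog sites use a different API host.
DATADOG_API = "https://api.datadoghq.com"


def build_monitor_request(api_key, app_key):
    """Build the authenticated request that lists all monitors."""
    return urllib.request.Request(
        f"{DATADOG_API}/api/v1/monitor",
        headers={"DD-API-KEY": api_key, "DD-APPLICATION-KEY": app_key},
        method="GET",
    )


def fetch_monitors(api_key, app_key):
    """Call the API and return the decoded list of monitor objects."""
    request = build_monitor_request(api_key, app_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def migration_worklist(monitors):
    """Keep only the fields needed to recreate each monitor elsewhere."""
    return [
        {
            "id": m.get("id"),
            "name": m.get("name"),
            "type": m.get("type"),
            "query": m.get("query"),
        }
        for m in monitors
    ]


def count_by_type(monitors):
    """Count monitors per type to estimate the translation effort."""
    return Counter(m.get("type", "unknown") for m in monitors)
```

Calling `migration_worklist(fetch_monitors(api_key, app_key))` with real keys yields one entry per monitor, which can then be checked off as each alert is recreated and tested on the new platform.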
Metric Naming Conventions
Datadog uses dot-separated names (system.cpu.user). Prometheus uses underscores (system_cpu_user) and, by convention, adds a _total suffix to counters. The OTel Collector can transform metric names during export: configure the metricstransform processor to rename metrics where the default mapping does not give you the names you want.
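To preview how existing names will look after the switch, the naming rule can be sketched in a few lines. This is a simplified illustration of the convention described above, not the Collector's exact normalization logic, and `to_prometheus_name` is a hypothetical helper.

```python
import re

# Prometheus metric names may contain letters, digits, underscores, and colons.
_INVALID_CHARS = re.compile(r"[^a-zA-Z0-9_:]")


def to_prometheus_name(name, is_counter=False):
    """Convert a dot-separated metric name to a Prometheus-style name."""
    prom = _INVALID_CHARS.sub("_", name)
    # Names must not start with a digit.
    if prom and prom[0].isdigit():
        prom = "_" + prom
    # Counters conventionally end in _total; avoid doubling the suffix.
    if is_counter and not prom.endswith("_total"):
        prom += "_total"
    return prom


print(to_prometheus_name("system.cpu.user"))
print(to_prometheus_name("myapp.requests", is_counter=True))
```

Running this over the metric names used in your dashboards gives a rename table to apply when rewriting queries.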
Team Training
| Audience | Training Focus | Time |
|---|---|---|
| Daily users (SRE, DevOps) | New query language, dashboard navigation | 3-5 days |
| On-call engineers | Alert triage, incident investigation workflows | 1-2 weeks |
| Application developers | OTel SDK basics, adding custom spans/metrics | 2-3 days |
| Management | New reporting dashboards, cost visibility | 1 day |
Timeline and Cost
| Metric | Estimate |
|---|---|
| Total migration time (50 servers) | 2-4 weeks |
| Parallel-run period (both systems) | 1-2 weeks |
| Cost savings after migration | 70-90% |
During the parallel-run period, you pay for both Datadog and the new platform simultaneously. Budget for 1.5-2x your current Datadog bill during those 1-2 weeks. After cutover, monthly savings begin immediately.
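The overlap budget is simple arithmetic. The sketch below assumes a monthly bill prorated to weeks at 52 weeks per year; the 5,200 figure is a made-up bill chosen so the numbers come out round, and the multiplier and duration come from the ranges above.

```python
WEEKS_PER_YEAR = 52


def parallel_run_cost(monthly_datadog_bill, weeks, multiplier):
    """Return (total spend, extra spend) for the parallel-run period.

    multiplier is the combined spend relative to the Datadog bill alone,
    for example 1.5 to 2.0.
    """
    weekly_bill = monthly_datadog_bill * 12 / WEEKS_PER_YEAR
    baseline = weekly_bill * weeks  # What Datadog alone would cost.
    total = baseline * multiplier   # Datadog plus the new platform.
    return total, total - baseline


# Hypothetical bill of 5,200 per month: best case and worst case.
low_total, low_extra = parallel_run_cost(5200, weeks=1, multiplier=1.5)
high_total, high_extra = parallel_run_cost(5200, weeks=2, multiplier=2.0)
print(f"Extra spend during overlap: {low_extra:.2f} to {high_extra:.2f}")
```

The extra spend is the one-time cost of running both systems; compare it with your expected monthly savings to see how quickly the overlap pays for itself.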