Migrating from Datadog: The OpenTelemetry Playbook (2026)

OpenTelemetry is the key to migrating from Datadog. Instrument with OTel once, send to any backend. No big-bang cutover needed.

Migration Overview

The strategy is simple: deploy the OpenTelemetry Collector alongside the Datadog agent, run both in parallel, validate data on the new platform, and cut over when confident. This approach keeps risk low because you maintain full visibility throughout the transition. The five steps below take 2-4 weeks for a typical 50-server environment.

1. Assess Current Usage

Before migrating, audit which Datadog features you actually use; most teams use only 30-40% of the features they pay for. Identify what you need to replace versus what you can drop.

  • Infrastructure monitoring - nearly always needed. Replace with Prometheus or OTel metrics.
  • APM / distributed tracing - needed if you run microservices. Replace with OTel traces.
  • Log management - needed for debugging. Replace with Loki, OpenSearch, or vendor log ingest.
  • RUM / synthetics - evaluate whether they are actually used. Many teams enabled these but rarely check them.
  • Security monitoring - evaluate separately. It may need a dedicated SIEM (Splunk, Elastic).
  • Custom metrics - audit which are actually queried. Exclude unused metrics to simplify migration.

2. Instrument with OpenTelemetry

Replace the Datadog agent with the OpenTelemetry Collector. Run both in parallel during the transition. Cover metrics, traces, and logs separately since each has different migration considerations.

Metrics

Deploy the OTel Collector with the hostmetrics receiver for infrastructure metrics. For application metrics, add OTel SDK instrumentation or keep existing Prometheus client libraries (the Collector's prometheus receiver can scrape them). Both approaches produce Prometheus-compatible metrics.
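
As a sketch, a minimal Collector metrics config might look like the following. The scrape target and backend endpoint are placeholders; adjust them for your environment.

```yaml
receivers:
  hostmetrics:                 # infrastructure metrics (CPU, memory, disk, network)
    collection_interval: 60s
    scrapers:
      cpu:
      memory:
      disk:
      filesystem:
      network:
  prometheus:                  # scrape existing Prometheus client libraries
    config:
      scrape_configs:
        - job_name: "app"      # hypothetical app exposing /metrics
          static_configs:
            - targets: ["localhost:8080"]

processors:
  batch:

exporters:
  otlphttp:
    endpoint: https://otlp.example.com   # placeholder backend endpoint

service:
  pipelines:
    metrics:
      receivers: [hostmetrics, prometheus]
      processors: [batch]
      exporters: [otlphttp]
```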

Traces

Replace dd-trace libraries with the OTel SDK for your language (opentelemetry-python, opentelemetry-js, opentelemetry-java, etc.). The SDK sends traces to the Collector via OTLP. This is the step that touches the most application code, but it produces vendor-neutral instrumentation.
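
On the Collector side, a traces pipeline that accepts OTLP from the SDKs could look like this sketch (the backend endpoint is a placeholder):

```yaml
receivers:
  otlp:                        # receives spans from OTel SDKs
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:                       # batch spans before export

exporters:
  otlphttp:
    endpoint: https://otlp.example.com   # placeholder backend endpoint

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]
```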

Logs

Configure the OTel Collector's filelog receiver to tail log files, or use the OTLP log exporter in your application. On Kubernetes, the Collector can tail container logs with the filelog receiver and enrich them with pod metadata via the k8sattributes processor. Logs require the fewest application code changes.
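
A minimal filelog pipeline might look like this sketch; the log path and endpoint are hypothetical:

```yaml
receivers:
  filelog:
    include: [/var/log/myapp/*.log]   # hypothetical log path
    start_at: end                     # tail new lines only

processors:
  batch:

exporters:
  otlphttp:
    endpoint: https://otlp.example.com   # placeholder backend endpoint

service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [batch]
      exporters: [otlphttp]
```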

3. Choose Your Backend

  • Grafana Cloud - best for cost. Managed Prometheus + Loki + Tempo. Usage-based pricing. Open standards.
  • New Relic - best for simplicity. Closest UX to Datadog. 100 GB free. Data-ingest pricing.
  • Self-hosted - best for control. Prometheus + Grafana + Loki + Tempo. Zero software cost, but SRE time required.
  • Dynatrace - best for enterprise. Davis AI auto-discovery. Zero-config OneAgent. Full stack included.
4. Migrate Dashboards and Alerts

Dashboard Migration

Grafana has a Datadog dashboard converter that handles basic translations. For complex dashboards, manual recreation is needed. Prioritize your top 5-10 critical dashboards first. Rebuild others over the following weeks as the team settles into the new platform.

Alert Rule Checklist

  • Export all Datadog monitors via API (GET /api/v1/monitor)
  • Map each monitor to the equivalent query syntax (DQL to PromQL or NRQL)
  • Recreate notification channels (Slack, PagerDuty, email)
  • Test every critical alert during the parallel-run period
  • Verify on-call routing works end-to-end

Metric Naming Conventions

Datadog uses dot-separated names (system.cpu.user). Prometheus uses underscores with _total suffix for counters (system_cpu_user_total). The OTel Collector can transform metric names during export. Configure the metricstransform processor to handle the mapping automatically.
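For example, a metricstransform rule renaming one Datadog-style metric might look like this sketch (an exact-match rename; note that many backends' Prometheus exporters also sanitize dots to underscores on their own):

```yaml
processors:
  metricstransform:
    transforms:
      - include: system.cpu.user
        match_type: strict      # exact name match
        action: update          # rename in place
        new_name: system_cpu_user
```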

5. Team Training

| Audience | Training Focus | Time |
|---|---|---|
| Daily users (SRE, DevOps) | New query language, dashboard navigation | 3-5 days |
| On-call engineers | Alert triage, incident investigation workflows | 1-2 weeks |
| Application developers | OTel SDK basics, adding custom spans/metrics | 2-3 days |
| Management | New reporting dashboards, cost visibility | 1 day |

Timeline and Cost

  • 2-4 weeks - total migration time (50 servers)
  • 1-2 weeks - parallel-run period (both systems running)
  • 70-90% - cost savings after migration

During the parallel-run period, you pay for both Datadog and the new platform simultaneously. Budget for 1.5-2x your current Datadog bill during those 1-2 weeks. After cutover, monthly savings begin immediately.

Frequently Asked Questions

How long does a Datadog migration take?
For a 50-server environment, plan 2-4 weeks. Week 1: Instrument with OpenTelemetry alongside the Datadog agent (parallel run). Week 2: Recreate critical dashboards and alerts on the new platform. Week 3: Train the team and validate data parity. Week 4: Cut over and decommission the Datadog agent. Larger environments (200+ servers) may take 4-6 weeks.
Can I migrate gradually without downtime?
Yes. The recommended approach is a parallel run: keep the Datadog agent running while you deploy the OpenTelemetry Collector alongside it. Both systems collect the same data simultaneously. This lets you validate the new platform before cutting over. You pay for both systems during the overlap period (typically 1-2 weeks), but you maintain full visibility throughout.
What is the OpenTelemetry Collector?
The OTel Collector is a vendor-neutral agent that receives, processes, and exports telemetry data (metrics, traces, logs). It replaces the Datadog agent. You configure receivers (where data comes from), processors (filtering, sampling, enrichment), and exporters (where data goes). The same Collector can send to Grafana Cloud, New Relic, Dynatrace, SigNoz, or any OTLP-compatible backend.
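That receiver/processor/exporter structure also makes the parallel run straightforward: a single pipeline can fan out to more than one backend at once. A sketch with two hypothetical OTLP endpoints:

```yaml
receivers:
  otlp:
    protocols:
      grpc:

processors:
  batch:

exporters:
  otlphttp/primary:
    endpoint: https://otlp.primary.example    # placeholder
  otlphttp/secondary:
    endpoint: https://otlp.secondary.example  # placeholder

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/primary, otlphttp/secondary]   # same data to both backends
```
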
Will I lose any data during migration?
Not if you run in parallel. The Datadog agent and OTel Collector run simultaneously during the transition period. Historical data in Datadog remains accessible until your contract expires. Export critical dashboards and reports before the contract ends. Most platforms do not provide data export APIs for raw telemetry, so plan accordingly.