DNS Configuration & Hosting Cutover
Executive Summary
Execute a controlled DNS configuration and hosting cutover with zero service interruption. This playbook coordinates the DNS lead, database owner, and on-call SRE through authoritative zone preparation, TTL reduction, staging synchronisation, traffic routing, and post-switch validation. The outcome is a deterministic switch where resolvers that honour the lowered TTL adopt the new origin within minutes, mail routing stays intact, search signals are preserved, and a rehearsed rollback path stands ready if any threshold is breached.
A hosting cutover is the single highest-risk hour of most technical migrations because it touches every layer at once: the authoritative DNS zone, the recursive resolver caches you do not control, the CDN edge, the origin servers, the database, and the mail flow. The failure modes compound — a forgotten SOA serial increment hides a stale secondary, an unlowered TTL strands a fraction of users on a dead IP for a day, and an unaligned auto-increment offset corrupts rows the moment writes resume. The discipline that prevents all of this is sequencing. Each of the four working phases below has a single owner, a clear entry condition, an explicit exit gate, and a numeric rollback trigger, so no one improvises mid-switch. Treat the timeline as a contract: TTL reduction completes before sync, sync proves parity before the record swap, and the swap is never declared done until propagation tracking confirms global adoption.
The four phases map onto four detailed runbooks. TTL Optimization Strategies shrinks the cache window so the switch — and any reversal — takes minutes rather than days. Staging to Production Sync makes the new origin byte-identical to what was tested and captures the final transactional delta. Zero-Downtime Cutover Plans chooses and drives the switching mechanism — weighted DNS or blue-green — so traffic shifts gradually with an instant reversal at every step. DNS Propagation Tracking proves the change reached every region before the legacy origin is retired. When a phase breaches its threshold, the Migration Rollback Playbooks define the reversal, and the Search Console Handover protects search equity once the dust settles.
Prerequisites
- Full administrative access to authoritative DNS providers, registrar consoles, and CDN dashboards.
- Read/write database backups, GTID positions, and static asset snapshots captured immediately before the window.
- Staging environment matching production topology: OS kernel, runtime versions, schema, and TLS chains.
- Audited SPF, DKIM, and DMARC records so mail routing survives the IP change.
- Baseline organic traffic, indexation, and origin latency metrics for validation comparison.
- Defined rollback triggers and an incident communication channel agreed with stakeholders.
Step-by-Step Execution
1. Reduce TTL and Prepare Authoritative Zones
Audit every authoritative zone for orphaned A, AAAA, CNAME, MX, and TXT records and delete stale entries before propagation begins. Reduce cache retention windows using the staged schedule in TTL Optimization Strategies 48–72 hours before execution so global resolver caches expire in time. Validate nameserver delegation chains and confirm secondary DNS is fully synchronised.
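The timing constraint behind a staged reduction can be sketched as follows. This is a minimal illustration, not the schedule from TTL Optimization Strategies: the function name `staged_ttl_deadlines`, the example TTL steps, and the example cutover time are all assumptions.

```python
from datetime import datetime, timedelta

def staged_ttl_deadlines(cutover, ttls):
    """Return the latest publish time for each lowered TTL.

    ttls is a descending list such as [86400, 3600, 300]; ttls[0] is the
    TTL currently in the zone. A lower TTL only takes effect once every
    cache holding the previous, longer TTL has expired, so each step must
    be published at least one "previous TTL" before the next deadline.
    """
    deadline = cutover
    plan = []
    # Walk backwards from the cutover to the first reduction.
    for i in range(len(ttls) - 1, 0, -1):
        publish_at = deadline - timedelta(seconds=ttls[i - 1])
        plan.append((publish_at, ttls[i]))
        deadline = publish_at
    plan.reverse()
    return plan

if __name__ == "__main__":
    cutover = datetime(2026, 6, 19, 2, 0)  # example window (assumption)
    for when, ttl in staged_ttl_deadlines(cutover, [86400, 3600, 300]):
        print(when.isoformat(), "publish TTL", ttl)
```

The times returned are the latest possible ones. Starting 48–72 hours ahead leaves margin for resolvers that clamp or overrun TTLs.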
2. Synchronise Staging and Production Data
Run automated database replication between legacy and target hosts, then verify static assets and file permissions. Enforce the parity protocol in Staging to Production Sync to eliminate configuration drift, schema mismatches, and auto-increment collisions. Run synthetic transactions against staging IPs and confirm SSL/TLS chains and WAF rules match production.
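One way to catch auto-increment collisions before writes resume is to compare the highest key on the legacy host with the next key the target will issue. The sketch below assumes you have already collected both figures per table (for example from your database's metadata); the function name and the input shape are hypothetical.

```python
from typing import Dict, List

def auto_increment_collisions(legacy_max_id: Dict[str, int],
                              target_next_id: Dict[str, int]) -> List[str]:
    """Return tables where the target could reissue a key the legacy host already used.

    legacy_max_id:  table name -> highest primary key on the legacy host
    target_next_id: table name -> next auto-increment value on the target
    A table missing from the target is flagged too, since parity is unproven.
    """
    risky = []
    for table, max_id in sorted(legacy_max_id.items()):
        next_id = target_next_id.get(table)
        if next_id is None or next_id <= max_id:
            risky.append(table)
    return risky
```

An empty result is a necessary condition for the exit gate, not a sufficient one; row-level parity still needs its own check.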
3. Stage the Cutover Routing Plan
Choose the switching mechanism — weighted DNS, blue-green, or geographic load balancing — using Zero-Downtime Cutover Plans. Pre-stage the new records pointing at staging IPs for a dry run, and reconcile the redirect map from URL Mapping & Redirect Architecture so legacy paths resolve cleanly on the new origin.
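For a weighted-DNS switch, the "gradual shift with instant reversal" rule can be expressed as a small state function. This is a sketch under assumptions: the ramp percentages and the name `next_weight` are illustrative, and the health signal is whatever gate your cutover plan defines.

```python
RAMP = (5, 25, 50, 100)  # hypothetical share of traffic sent to the new origin, in percent

def next_weight(current: int, healthy: bool, ramp=RAMP) -> int:
    """Return the next traffic weight for the new origin.

    Any unhealthy reading sends all traffic back to the legacy origin (0);
    otherwise advance one step up the ramp, holding at the final step.
    """
    if not healthy:
        return 0
    for step in ramp:
        if step > current:
            return step
    return current
```

Keeping the reversal as a single return value makes the abort path as cheap to execute as any forward step.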
4. Swap Authoritative Records
Lock source writes, finalise the incremental data sync, then update authoritative A/AAAA records to the new origin IPs during an off-peak window. Increment the SOA serial on every change and confirm secondary nameservers adopt it before you trust the switch. Watch real-time query resolution and NXDOMAIN spikes from the first second, because the cheapest moment to abort is before any meaningful traffic has reached the new origin.
5. Track Global Propagation
Confirm resolver cache adoption across regions with DNS Propagation Tracking before decommissioning anything. Do not retire legacy infrastructure until worldwide record adoption is verified and CDN edges pull from the new origin rather than cached legacy IPs.
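Adoption across regions can be reduced to a single number once you have collected each resolver's answers (for example with dig against public resolvers). The sketch below assumes that collection has already happened; the function name and input shape are hypothetical.

```python
from typing import Dict, List, Set

def adoption_ratio(answers: Dict[str, List[str]], new_ips: Set[str]) -> float:
    """Fraction of resolvers whose answers contain only new-origin addresses.

    answers: resolver label -> addresses it returned for the record
    A resolver that returned nothing, or any legacy address, is not counted.
    """
    if not answers:
        return 0.0
    adopted = sum(
        1 for ips in answers.values()
        if ips and all(ip in new_ips for ip in ips)
    )
    return adopted / len(answers)
```

Treat anything below 1.0 as a reason to keep the legacy origin serving.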
6. Integrate the Edge and Verify Search Signals
Reconfigure CDN origin pull endpoints to the new host IPs, and purge stale edge objects once propagation of the DNS update is confirmed rather than the moment the record changes. Validate cache-control headers, canonical tags, and robots.txt accessibility. Submit updated sitemaps and begin the Search Console Handover so crawl budget tracks the new property without organic visibility loss.
Technical Configs
These fragments cover the layers a cutover touches in order: the zone metadata that governs caching and validation, the reverse-DNS and health-check plumbing that keeps mail and failover honest, and the API call that performs the swap. Adapt the placeholders to your provider but keep the sequence — verify FCrDNS and health checks before the swap, not after.
SOA record values — increment the serial on every zone change:
; serial = YYYYMMDDNN (e.g., 2026061901) — bump or secondaries ignore the change
; refresh = 3600
; retry = 900
; expire = 604800
; minimum = 300 ; controls negative (NXDOMAIN) cache duration
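The YYYYMMDDNN convention is easy to get wrong by hand during a cutover. A minimal sketch of the bump logic follows, assuming the convention shown above with NN starting at 01; the function name is hypothetical.

```python
from datetime import date

def next_serial(current: int, today: date) -> int:
    """Return the next SOA serial in YYYYMMDDNN form.

    The result is always greater than `current`, because secondaries
    only transfer a zone whose serial has increased.
    """
    base = int(today.strftime("%Y%m%d")) * 100
    if current < base:
        return base + 1  # first change of the day: NN = 01
    # Already changed today (or the serial runs ahead of the date): add one.
    return current + 1
```

If more than 99 changes land in one day the serial simply runs ahead of the calendar, which is harmless as long as it keeps increasing.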
DNSSEC rollover parameters (illustrative policy sketch; translate the keys into your BIND or Knot signing-policy syntax):
# Keep KSK active while the ZSK rolls to avoid SERVFAIL on validating resolvers
dnssec_rollover:
  algorithm: ECDSAP256SHA256
  key_signing_key: active
  zone_signing_key: rollover_pending
  signature_validity: 30d
Forward-confirmed reverse DNS check (dig):
# FCrDNS must agree both directions or mail filters will penalise the new origin
NEW_IP="203.0.113.10"
dig -x "$NEW_IP" +short # should return new-origin.example.com.
dig new-origin.example.com +short # should return $NEW_IP
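The same two-way check can be scripted so it runs as a gate before the swap. This is a sketch using Python's standard `socket` module and the system resolver; it covers IPv4 only, and the function name and injectable lookup parameters are assumptions added for testability.

```python
import socket

def fcrdns_ok(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex):
    """Return True when the PTR name for `ip` resolves forward to `ip` again.

    `reverse` and `forward` default to the system resolver and can be
    replaced with stubs in tests.
    """
    try:
        hostname, _aliases, _addrs = reverse(ip)
        _name, _fwd_aliases, addresses = forward(hostname)
    except OSError:  # socket.herror and socket.gaierror are both OSError subclasses
        return False
    return ip in addresses
```

Calling `fcrdns_ok("203.0.113.10")` against live DNS mirrors the two dig commands above; a False result is a reason to hold the swap until the PTR record is fixed.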
Route53 health check (JSON for aws route53 create-health-check --health-check-config). It drives automatic failover when the new origin stops answering /healthz:
{
  "Type": "HTTP",
  "RequestInterval": 10,
  "FailureThreshold": 3,
  "ResourcePath": "/healthz",
  "FullyQualifiedDomainName": "origin.example.com"
}
Cloudflare API: swap an A record to the new origin (dig-verifiable):
# PATCH the record, then confirm propagation with dig before purging caches
curl -s -X PATCH "https://api.cloudflare.com/client/v4/zones/{zone_id}/dns_records/{record_id}" \
  -H "Authorization: Bearer {token}" \
  -H "Content-Type: application/json" \
  -d '{"content":"203.0.113.10","ttl":60}'
Validation & Rollback
Confirm the switch empirically before standing down. Keep the rollback path warm until global adoption is proven.
Post-Cutover Validation Checklist:
- dig @8.8.8.8 example.com +short and dig @1.1.1.1 example.com +short both return the new IP.
- openssl s_client -connect example.com:443 -servername example.com </dev/null completes the TLS handshake without errors.
Common Pitfalls:
- Leaving TTL at the default 86400 s, stretching propagation to 24–48 hours.
- Modifying mail routing before DNS logs confirm 100% resolution to the new origin.
- Forgetting to increment the SOA serial, so secondaries silently keep stale data.
- Purging CDN caches before propagation completes, forcing re-fetch from the old origin.
- Trusting a single resolver instead of checking adoption across regions.
Rollback Protocol:
- Revert authoritative A/AAAA records to legacy IPs and increment the SOA serial again; lowering it back to the prior value would make secondaries ignore the reverted zone.
- Restore previous CDN origin configuration and invalidate edge caches for affected paths.
- Confirm HTTP 200 on legacy paths and resolver agreement on the old IP.
- Notify stakeholders, follow the Migration Rollback Playbooks, and document failure vectors.
Trigger rollback when NXDOMAIN rates exceed 5% for more than 15 minutes, database replication lag passes the agreed ceiling, email deliverability drops from SPF/DKIM misalignment, or enterprise ISP caching creates localised traffic blackholes.
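The NXDOMAIN trigger is the easiest of these to automate. The sketch below assumes you can sample (timestamp, NXDOMAIN count, total queries) from your authoritative logs at a regular interval; the function name and sample format are hypothetical, while the 5% and 15-minute defaults come from the rule above.

```python
def nxdomain_breach(samples, threshold=0.05, window_s=15 * 60):
    """Return True if the NXDOMAIN rate stays above `threshold` for more than `window_s`.

    samples: iterable of (unix_ts, nxdomain_count, total_queries), sorted by time.
    A single sample at or below the threshold resets the clock.
    """
    breach_start = None
    for ts, nxdomain, total in samples:
        rate = nxdomain / total if total else 0.0
        if rate > threshold:
            if breach_start is None:
                breach_start = ts
            if ts - breach_start > window_s:
                return True
        else:
            breach_start = None
    return False
```

Requiring a sustained breach rather than a single spike keeps one noisy minute from triggering a reversal that is itself disruptive.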
FAQ
How long before cutover should TTL values be reduced? Reduce TTLs to 60–300 seconds 48–72 hours before execution so global resolver caches expire and the new IP is adopted within minutes of the authoritative update; stage the reduction rather than dropping straight to 60 s.
How do we maintain email service continuity during DNS switching? Keep identical MX, SPF, DKIM, and DMARC records across both zones until propagation is verified, and do not touch mail routing until DNS query logs confirm 100% resolution to the new infrastructure.
What metrics indicate a successful DNS cutover? A query failure rate below 1% in authoritative logs, consistent HTTP 200/301 responses from the new origin, zero SSL handshake errors, and resolver alignment confirmed across multiple regions.
How should the CDN cache be handled during the transition? Use origin shield routing, purge critical paths immediately after the DNS update completes propagation, validate cache-control headers, and keep a dual-origin fallback until edge caches fully repopulate.
Can I cut over without lowering TTL first? You can, but expect 24–48 hours of split traffic while cached records expire; for a controlled switch always run the staged TTL reduction described in TTL Optimization Strategies first.
Related
- TTL Optimization Strategies
- Staging to Production Sync
- DNS Propagation Tracking
- Zero-Downtime Cutover Plans
- Migration Rollback Playbooks