Every Tuesday · Free Forever

TheRunbook.

One real incident. One deep post-mortem.
One set of queries you can run right now.
No padding. No sponsored content. No "have you checked the docs?"

24,000+ DBAs subscribed

Issue #141 published this week

All four databases covered on rotation

Recent Issues

What lands in your inbox every week

Real incidents, real queries. Four recent Tuesdays.

ISSUE #141 · PostgreSQL

The Autovacuum That Ate Our IOPS Budget — Fixed in 8 Minutes

A high-write OLTP table had dead tuple bloat exceeding 40%. Autovacuum was running, but cost_delay was throttling it to uselessness. The pg_stat_user_tables query that exposed it and the three-line config fix.

⏱ 9 min read · Apr 1, 2026
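To give a feel for the numbers behind the Issue #141 summary, here is a minimal sketch. It is not the query or the config fix from the issue itself. The column names n_live_tup and n_dead_tup come from pg_stat_user_tables; the function names, the LIMIT, and the sample values are assumptions made for illustration, and the throttling estimate is a rough ceiling that ignores the time the worker spends doing actual I/O.

```python
# Illustrative sketch only; not the query or settings from Issue #141.

# n_live_tup, n_dead_tup and last_autovacuum are columns of pg_stat_user_tables.
DEAD_TUPLE_QUERY = """
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;
"""


def dead_tuple_ratio(n_live_tup: int, n_dead_tup: int) -> float:
    """Fraction of a table's tuples that are dead (0.0 for an empty table)."""
    total = n_live_tup + n_dead_tup
    return n_dead_tup / total if total else 0.0


def vacuum_page_ceiling(cost_limit: int, cost_delay_ms: float, cost_per_page: int) -> float:
    """Rough upper bound on pages vacuumed per second under cost-based delay.

    The worker accrues cost_per_page for each page and sleeps for
    cost_delay_ms whenever the accrued cost reaches cost_limit. Work time
    is ignored here, so real throughput will be lower than this figure.
    """
    if cost_delay_ms < 0:
        raise ValueError("negative delay is not modelled in this sketch")
    if cost_delay_ms == 0:
        return float("inf")  # a delay of 0 disables cost-based throttling
    pages_per_round = cost_limit / cost_per_page
    rounds_per_second = 1000.0 / cost_delay_ms
    return pages_per_round * rounds_per_second
```

With a cost limit of 200, a 20 ms delay, and a per-page cost of 10 (values chosen for the example), the ceiling is 1,000 pages per second, which is about 8 MB/s at the default 8 kB page size. Cutting the delay to 2 ms raises the ceiling tenfold.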

ISSUE #140 · SQL Server

Parameter Sniffing Killed Our Monday Morning Deploy — Twice

A stored procedure ran in 40ms on dev and 22 seconds on prod. The culprit wasn't the query or the index — it was a cached plan compiled for a rare parameter value. The full diagnosis and when to reach for OPTION(RECOMPILE).

⏱ 11 min read · Mar 25, 2026

ISSUE #139 · MySQL

InnoDB Gap Locks: The Invisible Deadlock Cause Nobody Checks First

Two transactions, two different rows — still deadlocked. This is the gap lock scenario that bites teams using REPEATABLE READ with range queries. Full lock graph and what we actually did.

⏱ 10 min read · Mar 18, 2026

ISSUE #138 · Oracle

Undo Contention Diagnosed in V$WAITSTAT — ORA-01555 Finally Explained

ORA-01555 Snapshot Too Old was firing on long-running reports during peak hours. Reading V$UNDOSTAT, sizing UNDO_RETENTION correctly, and the index-organized table trick that cut undo generation by 60%.

⏱ 8 min read · Mar 11, 2026
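As a rough illustration of the undo sizing arithmetic mentioned in the Issue #138 summary, the sketch below applies the commonly cited rule of thumb: undo space is roughly UNDO_RETENTION multiplied by the peak undo block rate and the database block size. It is not the method from the issue. UNDOBLKS is a column of V$UNDOSTAT; the function names and sample figures are assumptions, and the result leaves out any safety margin.

```python
import math


def undo_blocks_per_second(undoblks: int, interval_seconds: float) -> float:
    """Undo block generation rate for one V$UNDOSTAT row.

    undoblks is the UNDOBLKS value for the row; interval_seconds is the
    row's END_TIME minus BEGIN_TIME, expressed in seconds.
    """
    return undoblks / interval_seconds


def required_undo_bytes(undo_retention_s: int, peak_blocks_per_s: float, db_block_size: int) -> int:
    """Rule-of-thumb undo tablespace size: retention x block rate x block size."""
    return math.ceil(undo_retention_s * peak_blocks_per_s * db_block_size)


# Hypothetical peak row: 60,000 undo blocks over a 600-second interval.
peak_rate = undo_blocks_per_second(60_000, 600)
needed = required_undo_bytes(900, peak_rate, 8192)
```

For these sample values the rate is 100 blocks per second and the estimate is 737,280,000 bytes, a little over 700 MiB. The retention value would normally be chosen to cover the longest-running query that needs a consistent read.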

What's Inside Every Issue

Same six sections, every week.

Consistent format so you know what you're getting — and can skip straight to what you need.

🔥

Incident of the Week

One real post-mortem. What broke, what the timeline looked like, what the first wrong diagnosis was, and what actually fixed it.

🔍

Query of the Week

One diagnostic or maintenance query — fully annotated. Copy it straight into your runbook. All four databases on rotation.

📖

Detailed Analysis

A longer explainer on one internal concept. MVCC, WAL internals, cost-based optimizer mechanics — the stuff that makes you dangerous in an incident.

⚙️

Config Corner

One configuration parameter, explained properly. What it does, what happens when it's wrong, and the sensible default vs production-tuned value.

🗞️

Community Picks

The three best things published in the Postgres, MySQL, and SQL Server communities that week. Curated — not scraped.

✅

Config Corner

One configuration parameter explained properly. What it controls, what breaks when it's wrong, and the gap between the default value and what you actually want in production.

What Subscribers Say

This is the one newsletter I actually read end-to-end. The incident breakdowns are exactly the kind of thing that makes you a better DBA — not because it happened to you, but because now you know what to look for when it does.

Principal DBA, fintech company · PostgreSQL · 11 years

The Query of the Week alone is worth the subscription. I've added at least 20 of them to our internal runbooks. My team calls them "TQ queries" without even realising where they came from.

Database Platform Lead, e-commerce · MySQL / Aurora · 8 years

I forwarded Issue #127 to three colleagues. Our on-call runbook now quotes it verbatim. The depth of the incident breakdowns is what convinced me to copy these queries straight into our runbooks.

Senior DBA, healthcare SaaS · SQL Server · 14 years

SQL Server Consulting

SQL SERVER PERFORMANCE · PRODUCTION · CONSULTING

Query Performance Specialist

12 years · Banking & Financial Services · Critical Infrastructure

The problems that reach me are consistent: a query that performs correctly in test and collapses under production load, a plan that changed overnight with no code change, a blocking chain where every session looks like the blocker. These are not random failures — they have patterns, and those patterns are in the execution plan.

Every diagnosis starts at the execution plan. The plan does not lie: it shows exactly what SQL Server decided to do and why. After 12 years of reading plans across hundreds of production instances, I recognise the failure mode within minutes, not hours.

"The execution plan is the only honest account of what SQL Server actually did. Everything else is a theory."

Areas of Specialisation

→

Query plan regression & parameter sniffing

Plans that worked yesterday, failing today. The execution plan shows exactly why.

→

Blocking chains & deadlock diagnosis

Extended Events, sys.dm_exec_requests, lock escalation. Finding the root blocker, not just the victims.

→

Index strategy & audit

Identifying unused indexes that are costing write performance, and missing ones that are costing reads.

→

TempDB contention & sort spill analysis

Sort spills, hash spills, version store growth. The hidden performance killers that don't show up in CPU metrics.

→

Wait stats & DMV-based root cause analysis

PAGEIOLATCH, CXPACKET, LCK_M_X — reading wait statistics as a diagnostic language, not just metrics.

→

Plan cache management & SET option diagnosis

ARITHABORT mismatch, plan proliferation, the query that runs in 40ms from SSMS and 22 seconds from the app.
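To make the "root blocker, not just the victims" idea from the blocking chains item concrete, here is a minimal sketch. It assumes you have already pulled session_id and blocking_session_id pairs from sys.dm_exec_requests into a dictionary; the function name and the sample session numbers are hypothetical. A value of 0 means the session is not blocked, and negative values, which have special meanings, are reported as-is.

```python
from typing import Dict, List


def find_root_blockers(blocking: Dict[int, int]) -> Dict[int, List[int]]:
    """Group blocked sessions under the session at the head of their chain.

    blocking maps session_id -> blocking_session_id (0 = not blocked).
    A blocker that has no entry of its own, for example an idle session
    holding an open transaction, is treated as a root.
    """
    victims_by_root: Dict[int, List[int]] = {}
    for session, blocker in blocking.items():
        if blocker == 0:
            continue  # this session is not waiting on anyone
        seen = {session}
        current = blocker
        # Walk up the chain until we reach a session that is not blocked.
        while blocking.get(current, 0) != 0 and current not in seen:
            seen.add(current)
            current = blocking[current]
        if current in seen:
            continue  # cycle: a deadlock in progress, so there is no single root
        victims_by_root.setdefault(current, []).append(session)
    return victims_by_root


# Hypothetical snapshot: 52 waits on 51, while 53 and 54 wait on 52.
snapshot = {51: 0, 52: 51, 53: 52, 54: 52, 60: 0}
roots = find_root_blockers(snapshot)
```

In this snapshot session 52 looks like the blocker for 53 and 54, but walking the chain shows that all three are waiting behind session 51.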

Who This Is For

▸ Engineering teams with a performance incident they cannot explain

▸ Companies running SQL Server without a dedicated DBA

▸ Organisations where queries worked fine until they didn't

▸ Teams preparing for a high-traffic event and needing a pre-flight review

Get in Touch

📞

+91 96331 31861

Call directly · India time (IST)

💬 WHATSAPP →

querytuning.org · SQL Server performance · Production systems
Response within 24 hours

TheRunbook.

Join 24,000+ DBAs who read it every Tuesday.