Two background jobs. Different products. Different order IDs. No shared rows. Still deadlocking. The error log was recording deadlocks between transactions that, by every logical inspection, should have had no conflict at all.
The Alert
Application error: Deadlock found when trying to get lock; try restarting transaction. Two inventory update jobs, running concurrently on different product IDs, deadlocking roughly once every two minutes during peak processing. The application retried on deadlock — but retries were failing too, compounding the load.
First Hypothesis: Shared Row Conflict
Two transactions deadlock by holding locks the other needs. First assumption: they were updating the same row. We pulled the last deadlock graph from SHOW ENGINE INNODB STATUS.
SHOW ENGINE INNODB STATUS\G
-- Look for the LATEST DETECTED DEADLOCK section
-- It shows:
-- Transaction 1: what lock it holds, what it is waiting for
-- Transaction 2: what lock it holds, what it is waiting for
-- Which transaction was chosen as the victimThe deadlock graph showed: Transaction 1 held a gap lock on the range (product_id=1044, order_id > 5000). Transaction 2 held a gap lock on the range (product_id=1891, order_id > 5000). They were waiting for each other's gap locks on an INSERT into the order_items table. Neither row existed yet — they were trying to insert new rows in the same index gap.
The Discovery
The order_items table had a composite index on (product_id, order_id). When Transaction 1 ran a SELECT ... WHERE product_id = 1044 AND order_id > 5000 under REPEATABLE READ, InnoDB placed a gap lock on the entire range of order IDs above 5000 for product 1044. When Transaction 2 inserted a row with product_id=1044 (a different product, but the same high order_id range due to the index structure), it conflicted with Transaction 1's gap lock.
SELECT r.trx_id AS waiting_trx, r.trx_mysql_thread_id AS waiting_thread, r.trx_query AS waiting_query, b.trx_id AS blocking_trx, b.trx_mysql_thread_id AS blocking_thread, b.trx_query AS blocking_query FROM information_schema.innodb_lock_waits w JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id;
Incident Timeline
| Time | Event |
|---|---|
| Ongoing for months | Intermittent deadlocks between inventory jobs. Retried silently. |
| Scale event | Job concurrency increased from 2 to 8 workers. Deadlock rate 30x. |
| Investigation start | SHOW ENGINE INNODB STATUS. Gap lock deadlock identified. |
| +2 hours | Transaction isolation changed to READ COMMITTED on the job sessions |
| +2 hours | Deadlocks stop immediately. Jobs complete without conflict. |
Root Cause
REPEATABLE READ (MySQL's default isolation level) uses gap locks to prevent phantom reads — rows that appear between reads within the same transaction. Gap locks block inserts into the locked range. Two concurrent transactions scanning different ranges of the same index can produce overlapping gap locks that each block the other's inserts, causing a deadlock on rows that do not yet exist.
The Fix
-- Set on the session before the job runs: SET SESSION transaction_isolation = 'READ-COMMITTED'; -- READ COMMITTED does not use gap locks. -- It only locks the rows it actually reads, not the ranges between them. -- Phantoms are possible but inventory update jobs do not require -- REPEATABLE READ semantics. -- Verify the current isolation level: SELECT @@transaction_isolation;
Prevention
Background jobs that perform INSERT/UPDATE without requiring REPEATABLE READ semantics now explicitly set READ COMMITTED at the session level. The distinction is documented: REPEATABLE READ is used for financial transactions where phantom protection matters. READ COMMITTED is used for inventory, logging, and audit writes where it does not.