AWR report flagged "undo header" as the top wait event during an end-of-quarter batch. Every transaction was waiting to write to the undo tablespace header. The batch was running 14 concurrent sessions, all trying to acquire undo segments simultaneously, competing for a fixed number of active undo segments in a system configured for a much lighter concurrent load.
The Alert
Batch processing took 4.5 hours instead of the expected 90 minutes. AWR showed "undo header" waits averaging 140ms per event — enormous for a header acquisition that should take microseconds. 14 batch sessions, each needing a dedicated undo segment, were competing for 10 active undo segments.
The Discovery
-- Current wait events: SELECT event, total_waits, total_timeouts, time_waited, average_wait FROM v$system_event WHERE event LIKE '%undo%' ORDER BY time_waited DESC; -- Undo segment activity: SELECT usn, xacts AS active_transactions, waits, gets, ROUND(waits/gets*100, 4) AS contention_pct FROM v$rollstat ORDER BY waits DESC; -- How many undo segments are configured: SELECT COUNT(*) AS online_segments FROM dba_rollback_segs WHERE status = 'ONLINE';
10 online undo segments. 14 concurrent batch sessions. Each session needs an exclusive undo segment to start a transaction. Sessions 11-14 had to wait for one of the first 10 to complete and release their segment. The average_wait of 140ms was the serialization delay of that queue.
Incident Timeline
| Time | Event |
|---|---|
| Quarter-end 00:00 | Batch begins. 14 concurrent sessions start. |
| 00:00 +5min | 10 sessions acquire undo segments. 4 sessions wait. |
| 00:00-04:30 | Persistent queuing throughout batch. 4.5 hour runtime. |
| AWR review next morning | "undo header" identified as top wait event |
| +1 hour investigation | 10 online segments vs 14 concurrent sessions confirmed |
| +2 hours | ALTER SYSTEM SET undo_management to expand active segments |
| Next quarter | Batch runtime: 95 minutes. Zero undo header waits. |
Root Cause
Oracle's automatic undo management allocates undo segments dynamically up to the number configured by the undo tablespace. The active segment count was capped at 10. With 14 concurrent transactions, 4 sessions had to wait for an undo segment to become available. The wait time accumulated across thousands of transactions over 4.5 hours.
The Fix
-- Check current undo segment configuration: SHOW PARAMETER undo; -- Increase undo tablespace to accommodate more active segments: -- Oracle auto-manages segment count based on tablespace size -- Adding space allows more segments to activate ALTER TABLESPACE undotbs1 ADD DATAFILE '/u01/app/oracle/oradata/undo02.dbf' SIZE 4096M AUTOEXTEND ON NEXT 512M; -- Verify active segment count after batch: SELECT COUNT(*) FROM v$rollstat WHERE xacts > 0;
Prevention
Undo contention is now checked as part of every AWR review. The rule: active undo segments must exceed peak concurrent DML session count by at least 20%. For quarter-end batches with 14 sessions, minimum 17 segments. The undo tablespace was sized to ensure this is always available.
Diagnosing Undo Contention vs Other Wait Events
Not all "undo header" waits mean the same thing. Contention on undo allocation headers happens when there are not enough undo segments for the concurrent transaction load. A different symptom — contention within a single undo segment — points to a hot undo segment being overloaded. The diagnostic queries distinguish them.
-- View wait breakdown at the undo segment level: SELECT usn, xacts AS active_transactions, wraps, extends, shrinks, waits, gets, CASE WHEN gets > 0 THEN ROUND(waits/gets * 100, 4) ELSE 0 END AS contention_pct FROM v$rollstat ORDER BY contention_pct DESC; -- Interpretation: -- Many segments with waits/gets > 1% = not enough segments (add space to undo tablespace) -- One segment with high contention, others clean = hot segment (investigate that workload)
SELECT tablespace_name, status, SUM(blocks * block_size) / 1024 / 1024 AS size_mb, COUNT(*) AS extent_count FROM dba_undo_extents GROUP BY tablespace_name, status ORDER BY tablespace_name, status; -- Status ACTIVE: currently in use by open transactions -- Status UNEXPIRED: retained for read consistency (cannot be reused until undo_retention expires) -- Status EXPIRED: available for reuse -- If EXPIRED space is nearly zero: undo tablespace is full → add space