ORA-01555 Snapshot Too Old: The Error That Came Back Every Tuesday at

Eleven Tuesdays in a row, at 14:07, the same report failed with ORA-01555: snapshot too old. Each time, the database looked healthy. Undo was available. The session count was normal. By the time anyone investigated, the evidence had been overwritten by the undo retention mechanism itself. Week twelve, we caught it live.

The Alert

ORA-01555 means Oracle could not reconstruct a consistent read image of a block because the undo data needed had been overwritten. The report ran for 22 minutes. During those 22 minutes, a batch job that ran every Tuesday at 14:00 was generating undo at 40x the normal rate. The report's consistent read view needed undo data from 14:00. The batch job had overwritten it by 14:22.

First Hypothesis: Undo Tablespace Too Small

The first hypothesis was that the undo tablespace was undersized. We checked V$UNDOSTAT to see undo generation and retention.

Analyse undo generation rate and retention

Oracle

SELECT
  begin_time,
  end_time,
  undoblks,                  -- undo blocks generated in this period
  txncount,                  -- transaction count
  maxquerylen,               -- longest query in seconds
  ssolderrcnt,               -- ORA-01555 errors in this period
  nospaceerrcnt,             -- ORA-30036 (undo space full)
  activeblks,                -- active undo blocks
  unexpiredblks              -- unexpired (retained) undo blocks
FROM   v$undostat
ORDER BY begin_time DESC
FETCH FIRST 24 ROWS ONLY;

The data showed a clear spike: at 14:00 every Tuesday, undoblks jumped from ~200 per 10-minute period to ~8,400. The batch job — a financial reconciliation process — was performing 400,000 DELETE + INSERT pairs. The maxquerylen during the spike was 1,340 seconds — the report. The undo needed for the report's consistent read was being overwritten by the batch job's DELETE operations.

Incident Timeline

Time	Event
14:00 (every Tue)	Financial reconciliation batch begins. 400K DELETEs + INSERTs.
14:07	Long-running report starts. Needs consistent view from 14:00.
14:07-14:22	Batch overwrites undo blocks needed by the report's read view.
14:22-14:29	Report queries blocks whose undo has been overwritten.
14:29	ORA-01555: snapshot too old. Report fails.
Week 12	V$UNDOSTAT queried live during the batch. Spike pattern confirmed.
Week 12 + 1 hour	UNDO_RETENTION raised from 900s to 3600s
Week 13	Report completes. No ORA-01555.

Root Cause

UNDO_RETENTION was set to 900 seconds (15 minutes). The report ran for 22 minutes. The batch job generated enough undo volume to overwrite 15+ minutes of undo history in under 8 minutes. The report needed undo data from its start time, which was more than 900 seconds old by the time it encountered the overwritten blocks.

The Fix

Increase UNDO_RETENTION and enable guaranteed retention

Oracle

-- Raise retention beyond the longest expected query duration:
ALTER SYSTEM SET undo_retention = 3600; -- 1 hour

-- Enable guaranteed retention on the undo tablespace
-- (prevents undo from being overwritten even under space pressure):
ALTER TABLESPACE undotbs1 RETENTION GUARANTEE;

-- Monitor undo adequacy:
SELECT
  maxquerylen,
  ssolderrcnt,
  ROUND(undoblks * 8192 / 1024/1024, 2) AS undo_mb_per_period
FROM   v$undostat
ORDER BY begin_time DESC
FETCH FIRST 6 ROWS ONLY;

Prevention

UNDO_RETENTION is now set to 2x the expected maximum query duration. The batch job was rescheduled to 02:00 on Wednesday — outside of business hours, away from long-running reports. V$UNDOSTAT is monitored every 10 minutes for ssolderrcnt increments — any non-zero value fires an alert before the next failure.

Priya Venkataraman

Oracle DBA · QueryTuning

Priya has managed Oracle databases for financial services companies for 13 years, focusing on MVCC behaviour, undo management, and the category of problems that only appear at scale during peak business windows.

ORA-01555 Snapshot Too Old: The Error That Came Back Every Tuesday at 2 PM