Row-level locks are cheap. SQL Server acquires them in the thousands without concern. But at some point — 5,000 locks by default — SQL Server decides that managing thousands of row locks is more expensive than taking one table lock. That conversion is lock escalation. When it happens during a write-heavy operation, every other reader and writer on the table waits for the table lock to clear. A 10-second bulk update becomes a 10-second full table lock.
How Lock Escalation Works
SQL Server tracks lock memory in the lock manager. The threshold for escalation is controlled by the LOCK_ESCALATION setting on each table and, at the server level, by trace flag 1224. When a single statement acquires more than 5,000 locks (row or page locks) against a single table, SQL Server attempts to escalate to a table-level lock (TAB lock).
Escalation happens at the statement boundary — not the transaction boundary. A transaction with multiple statements can trigger escalation on any individual statement if that statement touches enough rows.
The lock manager does not distinguish between shared, update, and exclusive locks when counting toward the escalation threshold. A SELECT that acquires 5,000 shared row locks will escalate to a shared table lock, which blocks any exclusive table lock attempts from concurrent writers.
Detecting Lock Escalation
CREATE EVENT SESSION [LockEscalation] ON SERVER ADD EVENT sqlserver.lock_escalation( WHERE sqlserver.database_name = 'YourDatabase' ) ADD TARGET package0.ring_buffer(SET max_memory = 4096) WITH (STARTUP_STATE = ON); ALTER EVENT SESSION [LockEscalation] ON SERVER STATE = START; -- Query the ring buffer for recent escalations: SELECT xdr.value('(@timestamp)', 'datetime2') AS escalation_time, xdr.value('(data[@name="object_id"]/value)[1]', 'int') AS object_id, xdr.value('(data[@name="escalated_lock_count"]/value)[1]', 'bigint') AS lock_count, OBJECT_NAME(xdr.value('(data[@name="object_id"]/value)[1]', 'int')) AS table_name FROM ( SELECT CAST(target_data AS XML) AS TargetData FROM sys.dm_xe_session_targets t JOIN sys.dm_xe_sessions s ON s.address = t.event_session_address WHERE s.name = 'LockEscalation' ) x CROSS APPLY TargetData.nodes('//RingBufferTarget/event') r(xdr) ORDER BY escalation_time DESC;
SELECT OBJECT_NAME(p.object_id) AS table_name, p.lock_escalation_desc, i.rows, SUM(p.reserved_page_count) * 8 / 1024 AS size_mb FROM sys.partitions p JOIN sys.dm_db_partition_stats i ON i.object_id = p.object_id WHERE p.lock_escalation_desc <> 'TABLE' GROUP BY p.object_id, p.lock_escalation_desc, i.rows ORDER BY i.rows DESC;
SELECT t.name, t.lock_escalation_desc, CASE t.lock_escalation_desc WHEN 'TABLE' THEN 'Default — escalates to table lock at 5000 rows' WHEN 'AUTO' THEN 'Partitioned tables escalate to partition, not table' WHEN 'DISABLE' THEN 'Escalation disabled — no table locks from this table' END AS behaviour FROM sys.tables t ORDER BY t.name;
Three Options to Prevent Escalation
DISABLE_LOCK_ESCALATION prevents escalation on a specific table. Disabling it globally via trace flags puts lock manager memory under pressure on large-scale operations. The per-table approach is always preferred.
-- Prevents this table from ever taking a table lock via escalation. -- Row and page locks are held instead. -- Only appropriate if the table is accessed by high-concurrency OLTP -- and bulk operations against it are broken into small batches. ALTER TABLE dbo.Orders SET (LOCK_ESCALATION = DISABLE); -- Revert to default (allows escalation — safer for bulk operations): ALTER TABLE dbo.Orders SET (LOCK_ESCALATION = TABLE);
-- Instead of: UPDATE Orders SET Status = 'Archived' WHERE OrderDate < '2023-01-01' -- (potentially millions of rows = guaranteed escalation) -- Do this instead: DECLARE @batch_size INT = 4000; -- stay under 5000 lock threshold DECLARE @rows_affected INT = 1; WHILE @rows_affected > 0 BEGIN UPDATE TOP (@batch_size) dbo.Orders SET Status = 'Archived' WHERE OrderDate < '2023-01-01' AND Status <> 'Archived'; -- idempotent condition SET @rows_affected = @@ROWCOUNT; IF @rows_affected > 0 WAITFOR DELAY '00:00:00.100'; -- 100ms pause to release row locks END
-- Read Committed Snapshot Isolation removes shared locks on reads. -- Readers never block writers, writers never block readers. -- Escalation to table lock only affects concurrent writers, not readers. -- This is the right solution when readers are the primary victim. ALTER DATABASE [YourDatabase] SET READ_COMMITTED_SNAPSHOT ON; -- Verify: SELECT name, is_read_committed_snapshot_on FROM sys.databases WHERE name = 'YourDatabase';
The Partitioned Table Special Case
On partitioned tables, the default TABLE escalation is almost always wrong. A write that touches 5,000 rows in one partition should not lock the entire table. Setting LOCK_ESCALATION = AUTO on partitioned tables escalates to the partition level instead — blocking only concurrent access to the same partition, not the entire table.
-- For partitioned tables: escalate to partition, not table ALTER TABLE dbo.OrdersPartitioned SET (LOCK_ESCALATION = AUTO); -- Verify partition count and current lock escalation: SELECT t.name, t.lock_escalation_desc, COUNT(p.partition_number) AS partition_count FROM sys.tables t JOIN sys.partitions p ON p.object_id = t.object_id GROUP BY t.name, t.lock_escalation_desc HAVING COUNT(p.partition_number) > 1;
Lock escalation is not a bug. It is a resource management mechanism that becomes a problem when the table is shared between high-concurrency OLTP reads and bulk write operations. The correct response depends on which side is causing the escalation: batch writers get broken into smaller chunks, high-read tables get RCSI, partitioned tables get AUTO.