A partial index indexes a subset of rows — those satisfying a WHERE clause. For a table with 50 million rows where 99.5% are in a terminal state and 0.5% are in active processing, a conventional index on the status column is almost useless: the planner avoids it because the selectivity of "active" rows is too low to justify the seek overhead for most queries. A partial index on status = 'active' is 200x smaller and the planner can use it for every query against the active subset.
When a Partial Index Is the Right Tool
A partial index is worth considering when all of the following are true: the table is large (millions of rows), a significant fraction of rows will never appear in queries after they reach a certain state (completed orders, processed events, archived records), and the active minority is the performance-critical subset — the rows that every application query actually reads and writes.
In these situations, a conventional index on the state column includes every historical row. The index is large, the buffer pool fills with index pages for rows nobody queries, and cache efficiency for the active rows drops.
Diagnosing Whether a Partial Index Will Help
-- How many rows are in each state? SELECT status, count(*) AS row_count, round(100.0 * count(*) / sum(count(*)) OVER (), 2) AS pct FROM job_queue GROUP BY status ORDER BY row_count DESC; -- Result that calls for a partial index: -- status | row_count | pct -- ------------+-----------+-------- -- completed | 49700000 | 99.40 -- failed | 245000 | 0.49 -- active | 55000 | 0.11 -- pending | 1200 | 0.00
-- Compare existing index size vs what a partial index would be: SELECT indexrelname AS index_name, pg_size_pretty(pg_relation_size(indexrelid)) AS index_size, idx_scan, idx_tup_read, idx_tup_fetch FROM pg_stat_user_indexes WHERE relname = 'job_queue' ORDER BY pg_relation_size(indexrelid) DESC;
-- Partial index: only active and pending rows -- The WHERE clause matches your most common query filter CREATE INDEX CONCURRENTLY idx_job_queue_active ON job_queue (created_at, job_id) WHERE status IN ('active', 'pending'); -- CONCURRENTLY builds without taking an exclusive lock -- Safe to run on a live table — no application downtime -- How small is this index? SELECT pg_size_pretty(pg_relation_size('idx_job_queue_active'));
Verifying the Planner Uses It
EXPLAIN (ANALYZE, BUFFERS) SELECT job_id, payload, created_at FROM job_queue WHERE status = 'active' ORDER BY created_at LIMIT 100; -- Expected output: -- Index Scan using idx_job_queue_active on job_queue -- (actual rows=87 loops=1) -- Buffers: shared hit=12 read=3 -- The "shared hit=12 read=3" is the signal: -- Only 15 buffer accesses to return 87 rows from a 50M row table -- Without the partial index: Sequential Scan, 50M row reads
-- This query WILL use the partial index: SELECT * FROM job_queue WHERE status = 'active'; SELECT * FROM job_queue WHERE status = 'pending'; -- This query will NOT use it (OR condition not in partial index predicate): SELECT * FROM job_queue WHERE status = 'active' OR status = 'completed'; -- This query will NOT use it (no status filter — can't restrict to partial index rows): SELECT * FROM job_queue WHERE created_at > now() - INTERVAL '1 hour';
The Unique Partial Index Pattern
One of the most powerful uses of partial indexes is enforcing uniqueness only on the active rows — not the entire table. A soft-delete pattern (is_deleted = false) needs unique emails among active users, but allows multiple deleted records with the same email.
-- Unique email among non-deleted users only: CREATE UNIQUE INDEX idx_users_email_active ON users (email) WHERE is_deleted = false; -- This allows: -- INSERT INTO users (email, is_deleted) VALUES ('a@b.com', true); -- INSERT INTO users (email, is_deleted) VALUES ('a@b.com', true); -- (two deleted records with same email — allowed) -- -- But prevents: -- INSERT INTO users (email, is_deleted) VALUES ('a@b.com', false); -- INSERT INTO users (email, is_deleted) VALUES ('a@b.com', false); -- (duplicate active email — ERROR: duplicate key)
Maintenance Considerations
Partial indexes maintain themselves automatically as rows are inserted and deleted. When a row's status changes from 'active' to 'completed', PostgreSQL removes it from the partial index automatically. There is no manual maintenance required.
Autovacuum handles dead tuple cleanup in partial indexes the same as regular indexes. REINDEX CONCURRENTLY is available if the index becomes bloated. The main consideration is that ANALYZE needs to see enough rows matching the partial index predicate to build an accurate histogram — on tables where the active set is very small (under 1,000 rows), the planner may occasionally make poor decisions due to statistics imprecision.