2,000 Idle Connections Consuming an Entire CPU Core

The application had scaled to 12 instances, each with its own connection pool configured to hold 200 connections. That was 2,400 connections to PostgreSQL. Most were idle. PostgreSQL was spending more CPU time managing idle connection state than executing queries. A single CPU core was permanently occupied doing nothing useful.

The Alert

CPU at 15% on a lightly loaded server. Database response times were slow despite the low CPU and no active workload. The symptom: CPU was busy but queries were not running. Something was consuming CPU that was not query execution.

The Discovery

Count connections by state

PostgreSQL

SELECT
  state,
  count(*) AS count,
  max(now() - state_change) AS max_idle_duration,
  client_addr
FROM   pg_stat_activity
WHERE  pid <> pg_backend_pid()
GROUP BY state, client_addr
ORDER BY count DESC;

2,047 connections in state idle. Some had been idle for over 4 hours. Each idle PostgreSQL connection is a forked backend process maintaining session state, checking for client disconnects, and processing keepalive packets. 2,000 backends doing nothing consumed a full CPU core in aggregate overhead.

Incident Timeline

Time	Event
3 months ago	Application scaled from 4 to 12 instances. Pool size not recalculated.
Ongoing	2,400 connections held open. Most idle at any given time.
Investigation day	15% CPU on idle server investigated
+1 hour	2,047 idle connections identified as the source
+2 hours	PgBouncer deployed in transaction pooling mode
+3 hours	Application connections reduced to 25 real PostgreSQL connections. CPU: 2%

Root Cause

PostgreSQL uses one backend process per connection. Each backend consumes memory and CPU regardless of whether it is executing queries. At 2,000 idle connections, the aggregate overhead of connection management exceeded the cost of the actual query workload.

The Fix

PgBouncer in transaction mode — the standard solution

PostgreSQL

# pgbouncer.ini — transaction pooling for OLTP workloads
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
pool_mode = transaction
max_client_conn = 2000      # accept 2000 app connections
default_pool_size = 25      # maintain only 25 real PostgreSQL connections
server_idle_timeout = 600   # close idle server connections after 10 min
client_idle_timeout = 0     # keep app connections open (app manages this)

Prevention

The rule: max PostgreSQL connections = (CPU cores × 2) + disk spindles, or roughly 50-100 for most servers. Application connection pool size is calculated as total_postgres_connections ÷ application_instances. PgBouncer is now the standard entry point for all application connections.

Before You Kill Connections: Confirm the Scope

Killing connections without understanding the source is temporary relief. The pool fills again within minutes if the root cause — idle connections held by application pool configuration — is not fixed. Before taking action, measure the full scope.

Connection inventory: who is connected, in what state, for how long

PostgreSQL

SELECT
  state,
  client_addr,
  application_name,
  count(*) AS connection_count,
  max(extract(epoch from (now() - state_change)))::int AS max_idle_sec,
  avg(extract(epoch from (now() - state_change)))::int AS avg_idle_sec
FROM   pg_stat_activity
WHERE  pid <> pg_backend_pid()
GROUP BY state, client_addr, application_name
ORDER BY connection_count DESC;

Check max_connections and how close you are to the limit

PostgreSQL

SELECT
  current_setting('max_connections')                         AS max_connections,
  count(*) AS current_connections,
  current_setting('max_connections')::int - count(*) AS slots_remaining,
  current_setting('superuser_reserved_connections')           AS superuser_reserved
FROM   pg_stat_activity;

-- superuser_reserved connections (default: 3) are always available to superusers
-- even if regular connections are exhausted.
-- Connect as superuser if regular login is failing due to exhaustion.

Terminate only idle connections if the limit is hit and you cannot restart

PostgreSQL

-- Emergency: terminate idle connections held more than 10 minutes
-- This is temporary relief — the pool refills without a config change
SELECT
  pg_terminate_backend(pid),
  client_addr,
  application_name,
  state,
  extract(epoch from (now() - state_change))::int AS idle_sec
FROM   pg_stat_activity
WHERE  state = 'idle'
  AND  state_change < now() - INTERVAL '10 minutes'
  AND  pid <> pg_backend_pid();

Marcus Reid

Staff DBA · PostgreSQL · 14 years · QueryTuning

Marcus has been managing large PostgreSQL deployments for 14 years. He writes about connection management, PgBouncer configuration, and PostgreSQL operational patterns.

2,000 Idle Connections Were Consuming an Entire CPU Core Before We Found It

The Alert

The Discovery

Incident Timeline

Root Cause

The Fix

Prevention

Before You Kill Connections: Confirm the Scope

The Same Problem on Other Engines