At 11:47 on a Wednesday, the MySQL data directory filled its volume and MySQL stopped accepting writes. The cause: 847GB of binary logs accumulated over 9 months because expire_logs_days had been left at its default of 0 at installation, which means binary logs are never purged automatically. Nobody had noticed because the volume held 1TB and queries had never been slow.
The Alert
Monitoring: /var/lib/mysql at 100%. MySQL error log: ERROR 28: Out of disk space. The data directory had 847GB of binary logs and 153GB of actual database files.
The Discovery
    -- List all binary logs and their sizes:
    SHOW BINARY LOGS;

    -- Check the current expiry setting:
    SELECT @@expire_logs_days;            -- MySQL 5.7
    SELECT @@binlog_expire_logs_seconds;  -- MySQL 8.0

    -- Check which logs are safe to purge (every file before the replica's current file).
    -- On the replica, read Relay_Master_Log_File:
    SHOW SLAVE STATUS\G

    -- On the primary, purge the logs the replica has consumed
    -- (replace the file name with the replica's current file):
    PURGE BINARY LOGS TO 'binlog.003421';
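expire_logs_days was 0. SHOW BINARY LOGS returned 1,847 binary log files spanning 9 months. The replica was reading from binlog.003421, so every file before it could be purged immediately. binlog.003421 itself had to stay, because PURGE BINARY LOGS TO removes only the files that precede the one named.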
Incident Timeline
| Time | Event |
|---|---|
| 9 months ago | MySQL installed. expire_logs_days not set — defaults to 0 (never purge). |
| 9 months | 847GB of binary logs accumulate silently. |
| Wed 11:47 | Volume at 100%. MySQL stops accepting writes. |
| 12:00 | On-call identifies binary logs as the cause. SHOW BINARY LOGS run. |
| 12:05 | Replica position checked. Safe purge point identified. |
| 12:08 | PURGE BINARY LOGS TO 'binlog.003421'. 600GB freed immediately. |
| 12:09 | MySQL resumes writes. Incident resolved. |
| 12:15 | expire_logs_days = 7 set globally and in my.cnf |
Root Cause
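In MySQL 5.7 and earlier, expire_logs_days defaults to 0, which disables automatic purging, so binary logs accumulate indefinitely. (MySQL 8.0 introduces binlog_expire_logs_seconds, which defaults to 2592000 seconds, or 30 days.) On a production server with 1TB of storage the growth is not immediately visible. After 9 months of moderate write volume, the logs had consumed 85% of total disk space.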
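As a rough check on these numbers, the sketch below turns the incident figures into an average daily growth rate and an approximate footprint under a retention window. It assumes that 9 months is about 270 days and that writes were spread evenly across the period; neither assumption comes from the incident data, and the function names are illustrative.

```python
# Back-of-the-envelope binlog growth estimate from the incident figures.
# Assumptions (not from the incident data): 9 months ~ 270 days, and the
# write volume was roughly constant over that period.


def daily_binlog_growth_gb(total_gb: float, days: int) -> float:
    """Average binary log growth per day, in GB."""
    if days <= 0:
        raise ValueError("days must be positive")
    return total_gb / days


def retained_gb(total_gb: float, days: int, retention_days: int) -> float:
    """Approximate steady-state binlog footprint for a retention window."""
    return daily_binlog_growth_gb(total_gb, days) * retention_days


if __name__ == "__main__":
    rate = daily_binlog_growth_gb(847, 270)
    print(f"average growth: {rate:.1f} GB/day")
    print(f"7-day retention footprint: {retained_gb(847, 270, 7):.0f} GB")
```

Under those assumptions the logs grew by about 3 GB per day, so a 7-day window would hold roughly 22 GB instead of 847 GB.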
The Fix
    -- Immediate purge (check the replica position first):
    SHOW SLAVE STATUS\G                    -- on the replica: get Relay_Master_Log_File
    PURGE BINARY LOGS TO 'binlog.003421';  -- on the primary

    -- Set expiry permanently (MySQL 5.7):
    SET GLOBAL expire_logs_days = 7;
    -- In my.cnf: expire_logs_days = 7

    -- MySQL 8.0:
    SET GLOBAL binlog_expire_logs_seconds = 604800;  -- 7 days in seconds

    -- Monitor binary log space usage.
    -- information_schema.files does not list binary logs, so sum the
    -- File_size column (bytes) of this output on the client side:
    SHOW BINARY LOGS;
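The choice between the two expiry variables can be scripted. The following is a minimal sketch, assuming the version string comes from `SELECT VERSION()` on a MySQL (not MariaDB) server; `retention_statement` is a hypothetical helper, not part of any MySQL tooling.

```python
def retention_statement(server_version: str, days: int) -> str:
    """Build the SET GLOBAL statement for a binlog retention of `days` days.

    MySQL 8.0 and later use binlog_expire_logs_seconds; earlier versions
    use expire_logs_days. Assumes a MySQL version string such as
    '5.7.44-log' or '8.0.36'.
    """
    if days < 0:
        raise ValueError("days must not be negative")
    major = int(server_version.split(".")[0])
    if major >= 8:
        return f"SET GLOBAL binlog_expire_logs_seconds = {days * 86400};"
    return f"SET GLOBAL expire_logs_days = {days};"
```

SET GLOBAL changes only the running server, so the same value still needs to go into my.cnf to survive a restart.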
Prevention
Setting expire_logs_days is now part of the installation checklist: 7 days minimum, 14 days if point-in-time recovery (PITR) is required. Disk usage monitoring now alerts at 70% capacity instead of 95%, which leaves time to investigate and purge before an outage.
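The 70% threshold can be checked with a few lines of standard-library Python. This is a sketch only: the function names are illustrative, how the result reaches an alerting system is left out, and the percentage may differ slightly from what `df` reports because filesystems can reserve blocks.

```python
import shutil


def usage_percent(path: str) -> float:
    """Percentage of the volume containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def should_alert(percent_used: float, threshold: float = 70.0) -> bool:
    """True once usage reaches the alert threshold (70% by default)."""
    return percent_used >= threshold
```

For the volume in this incident the call would be `should_alert(usage_percent("/var/lib/mysql"))`.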
Before Purging: Verify It Is Safe
Binary logs cannot be recovered after purging. The critical pre-flight check is to confirm the replication position of every downstream consumer — replicas, binlog-based ETL, change data capture systems — before running any PURGE command.
    -- Goal: find the oldest log file that any replica is still reading.

    -- On each REPLICA:
    SHOW SLAVE STATUS\G
    -- Record Relay_Master_Log_File and Exec_Master_Log_Pos.
    -- The file named in Relay_Master_Log_File MUST NOT be purged.

    -- On the PRIMARY: compare the Log_name column against the replica positions.
    -- Never purge a log that any replica has NOT yet consumed.
    -- The oldest replica position is the purge boundary.
    SHOW BINARY LOGS;

    -- Purge everything the replicas have consumed:
    -- PURGE BINARY LOGS TO 'binlog.003421';
    -- This leaves binlog.003421 and all newer files intact.
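Picking the boundary across several consumers is a minimum over file sequence numbers. The sketch below assumes the Relay_Master_Log_File values (or the equivalent position from each ETL or CDC consumer) have already been collected by hand or by a script; `binlog_sequence` and `purge_statement` are hypothetical helper names. It compares the numeric suffix instead of the raw string so the ordering still holds if the suffix ever grows past six digits.

```python
from typing import Iterable


def binlog_sequence(log_name: str) -> int:
    """Numeric suffix of a binlog file name, e.g. 'binlog.003421' -> 3421."""
    return int(log_name.rsplit(".", 1)[1])


def purge_statement(consumer_files: Iterable[str]) -> str:
    """Build a PURGE statement bounded by the oldest file still in use.

    `consumer_files` holds the primary log file each downstream consumer
    is currently reading. PURGE BINARY LOGS TO removes only files that
    precede the named one, so the oldest file in use is kept.
    """
    files = list(consumer_files)
    if not files:
        # With no known positions there is no safe boundary.
        raise ValueError("no consumer positions supplied")
    oldest = min(files, key=binlog_sequence)
    return f"PURGE BINARY LOGS TO '{oldest}';"
```

The function only builds the statement. Running it is left as a deliberate manual step, since a purge cannot be undone.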
    -- Current binary log total size:
    -- MySQL has no information_schema table that lists binary logs.
    -- Run SHOW BINARY LOGS and sum the File_size column (bytes) on the client side;
    -- the first and last Log_name values are the oldest and newest files.
    SHOW BINARY LOGS;
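The client-side sum is simple to script. This sketch assumes the rows of SHOW BINARY LOGS have already been fetched with whatever MySQL client library is in use, as tuples whose first two fields are Log_name and File_size (MySQL 8.0 adds a third Encrypted column, which is ignored here). `summarize_binlogs` is an illustrative name.

```python
from typing import Any, Dict, Sequence


def summarize_binlogs(rows: Sequence[Sequence[Any]]) -> Dict[str, Any]:
    """Summarize SHOW BINARY LOGS rows: file count, total size, oldest, newest.

    Each row starts with (Log_name, File_size in bytes). The server returns
    the rows in file order, so the first row is the oldest log.
    """
    if not rows:
        return {"log_file_count": 0, "total_gb": 0.0,
                "oldest_log": None, "newest_log": None}
    total_bytes = sum(int(row[1]) for row in rows)
    return {
        "log_file_count": len(rows),
        "total_gb": round(total_bytes / 1024 ** 3, 2),
        "oldest_log": rows[0][0],
        "newest_log": rows[-1][0],
    }
```

Feeding `total_gb` into the same monitoring that watches disk usage would have surfaced the growth long before the volume filled.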