PostgreSQL Progress Reporting: Track Long-Running Operations in Real Time

Mafiree May 18, 2026 ·26 writeups ·joined Jan 2026

5 min read

The Problem

Every DBA managing PostgreSQL has asked the same question mid-operation: "Is it done yet?" Vacuuming, index creation, bulk data loads, and base backups can run for hours, and without visibility, they feel completely opaque. Before progress reporting views existed, the only options were log parsing, rough signals from pg_stat_activity, or simply waiting — creating real uncertainty during maintenance windows and migrations.

PostgreSQL Progress reporting solves this with built-in system views that expose the live, real-time state of long-running operations. They are in-memory, require no extra configuration, and are lightweight enough that querying them has no meaningful performance impact.

Operational Benefits

Progress reporting views enable DBAs to:

Detect bottlenecks by identifying which specific phase is consuming the most time
Spot stuck operations caused by lock contention, I/O saturation, or waiting transactions — visible immediately through the phase column
Estimate completion by computing percentages from fields like blocks_done vs. blocks_total
Automate alerting by integrating these SQL-queryable views into monitoring scripts and pipelines
Manage maintenance windows confidently by monitoring operations live as they run
Communicate reliable ETAs to stakeholders backed by real data rather than guesswork

The Six Progress Views (PostgreSQL 18)

1. pg_stat_progress_vacuum — Use this during any autovacuum or manual VACUUM, especially after bulk loads. The phase field cycles through scanning heap, vacuuming indexes, and cleanup. Track heap_blks_scanned vs. heap_blks_total for completion percentage, num_dead_tuples for bloat being reclaimed, and index_vacuum_count for index passes completed.

2. pg_stat_progress_analyze — Most useful when autoanalyze runs longer than expected or after large bulk loads. The phase field shows whether it is acquiring sample rows or inherited sample rows. Use sample_blks_scanned vs. sample_blks_total for progress, ext_stats_computed for multi-column statistics, and child_tables_done for partitioned tables.

3. pg_stat_progress_create_index — Critical for monitoring index builds on large tables, especially in CONCURRENTLY mode. The build moves through eight distinct phases in order: initializing → scanning table → sorting live tuples → loading tuples in tree → index validation (scanning index, then table) → waiting for old snapshots → waiting for readers before marking dead. Use blocks_done vs. blocks_total during scanning, tuples_done vs. tuples_total during sorting, and partitions_done for partitioned table indexes. If the build appears stuck, the phase column pinpoints whether it is waiting on locks, I/O, or active transactions.

4. pg_stat_progress_cluster — CLUSTER physically rewrites an entire table in index order, making it a heavy locking operation. This view tracks it through three phases: sequential heap scanning, index scanning heap, and writing new heap. Monitor heap_tuples_written vs. heap_tuples_scanned for row-level rewrite progress, heap_blks_scanned for block-level progress, and index_rebuild_count for indexes rebuilt so far.

5. pg_stat_progress_copy — Invaluable during ETL jobs and migrations for bulk COPY FROM/TO operations. A real-world sample in the blog shows 100,073,472 bytes processed out of 137,777,792 total, with 3,652,000 rows loaded — approximately 72.6% complete. Use bytes_processed vs. bytes_total for completion percentage, tuples_processed for total rows loaded, tuples_excluded / tuples_skipped to catch data quality issues mid-load, and the type field to identify whether the source is FILE, PIPE, PROGRAM, or STDIN.

6. pg_stat_progress_basebackup — Base backups pass through six phases: initializing → waiting for checkpoint to finish → estimating backup size → streaming database files → waiting for WAL archiving to finish → transferring WAL files. A prolonged pause on "waiting for checkpoint to finish" signals checkpoint pressure. Note that backup_total remains NULL until size estimation completes. The tablespaces_streamed field is relevant for multi-tablespace databases.

The Bottom Line

Together, these six views turn long-running PostgreSQL operations from unpredictable black boxes into fully observable, manageable workflows. DBAs get phase-level insight at any moment — enabling faster troubleshooting, better planning, accurate stakeholder communication, and stronger monitoring automation, all at zero additional configuration cost.

Business