Sizes and row counts without breaking production

performance
troubleshooting

The naive way to answer “how big is it?” — SELECT count(*) FROM t — does a sequential scan of the whole table. On a multi-GB table that locks a worker, blows out the buffer cache, and can take minutes. PostgreSQL already tracks all of this in the catalog; read from there instead.

Database size

SELECT pg_size_pretty(pg_database_size(current_database())) AS size;

All databases on the cluster, biggest first:

SELECT datname,
       pg_size_pretty(pg_database_size(datname)) AS size
  FROM pg_database
 WHERE datistemplate = false
 ORDER BY pg_database_size(datname) DESC;

Top tables by total size

pg_total_relation_size includes the table, its TOAST, and all indexes — the number that actually matches what you see on disk.

SELECT n.nspname                                              AS schema,
       c.relname                                              AS table,
       pg_size_pretty(pg_total_relation_size(c.oid))          AS total,
       pg_size_pretty(pg_relation_size(c.oid))                AS heap,
       pg_size_pretty(pg_indexes_size(c.oid))                 AS indexes,
       pg_size_pretty(pg_total_relation_size(c.oid)
                      - pg_relation_size(c.oid)
                      - pg_indexes_size(c.oid))               AS toast
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
 WHERE c.relkind = 'r'
   AND n.nspname NOT IN ('pg_catalog', 'information_schema')
 ORDER BY pg_total_relation_size(c.oid) DESC
 LIMIT 20;

This reads catalog metadata only — no table scans, safe on production.

Row count estimates (no scan)

pg_class.reltuples is the planner’s row estimate, refreshed by ANALYZE and autovacuum. Accurate to a few percent on healthy tables and free.

SELECT n.nspname        AS schema,
       c.relname        AS table,
       c.reltuples::bigint AS estimated_rows
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
 WHERE c.relkind = 'r'
   AND n.nspname = '${{ SCHEMA }}'
 ORDER BY c.reltuples DESC;

For a single table:

SELECT reltuples::bigint AS estimated_rows
  FROM pg_class
 WHERE oid = '${{ SCHEMA }}.${{ TABLE }}'::regclass;

Need an exact count and the table is small? SELECT count(*) is fine. Big table and you need exact? Run it off-hours, or keep a maintained counter table.

Index bloat (and how to fix it)

Index pages don’t get reclaimed in place — heavy update/delete workloads can leave indexes much larger than the underlying data. Find the worst offenders:

SELECT n.nspname                                     AS schema,
       c.relname                                     AS index,
       i.indrelid::regclass                          AS table,
       pg_size_pretty(pg_relation_size(c.oid))       AS size,
       s.idx_scan                                    AS scans
  FROM pg_class c
  JOIN pg_index i      ON i.indexrelid = c.oid
  JOIN pg_namespace n  ON n.oid = c.relnamespace
  LEFT JOIN pg_stat_user_indexes s ON s.indexrelid = c.oid
 WHERE c.relkind = 'i'
   AND n.nspname NOT IN ('pg_catalog', 'information_schema')
 ORDER BY pg_relation_size(c.oid) DESC
 LIMIT 20;

scans = 0 and a non-trivial size means the index is dead weight — drop it. Otherwise, rebuild concurrently (no exclusive lock, requires PostgreSQL 12+):

REINDEX INDEX CONCURRENTLY ${{ SCHEMA }}.<index_name>;

Or all indexes on a table:

REINDEX TABLE CONCURRENTLY ${{ SCHEMA }}.${{ TABLE }};

Plain REINDEX (without CONCURRENTLY) is faster but takes an ACCESS EXCLUSIVE lock — only use during maintenance windows.

Live vs. dead tuples (bloat hint)

SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 1) AS dead_pct,
       last_autovacuum,
       last_autoanalyze
  FROM pg_stat_user_tables
 ORDER BY n_dead_tup DESC
 LIMIT 20;

pg_stat_user_tables is a stats view — reads from shared memory, not the tables. High dead_pct on a hot table means autovacuum is falling behind.

One table, all the numbers

SELECT pg_size_pretty(pg_total_relation_size('${{ SCHEMA }}.${{ TABLE }}'))  AS total,
       pg_size_pretty(pg_relation_size('${{ SCHEMA }}.${{ TABLE }}'))        AS heap,
       pg_size_pretty(pg_indexes_size('${{ SCHEMA }}.${{ TABLE }}'))         AS indexes,
       (SELECT reltuples::bigint
          FROM pg_class
         WHERE oid = '${{ SCHEMA }}.${{ TABLE }}'::regclass)                 AS rows_est;

Why this is “polite”

Catalog reads only — pg_class, pg_namespace, pg_stat_* are tiny system tables/views cached in memory.
No heap I/O, no shared-buffer churn, no locks on user tables.
Safe to run from a read-only role on a primary at peak load.

The trade-off is that row counts are estimates. If you need exact, you pay for it.