Sizes and row counts without breaking production
The naive way to answer “how big is it?” — SELECT count(*) FROM t — does a sequential
scan of the whole table. On a multi-GB table that locks a worker, blows out the buffer
cache, and can take minutes. PostgreSQL already tracks all of this in the catalog; read
from there instead.
Database size
Section titled “Database size”SELECT pg_size_pretty(pg_database_size(current_database())) AS size;All databases on the cluster, biggest first:
SELECT datname, pg_size_pretty(pg_database_size(datname)) AS size FROM pg_database WHERE datistemplate = false ORDER BY pg_database_size(datname) DESC;Top tables by total size
Section titled “Top tables by total size”pg_total_relation_size includes the table, its TOAST, and all indexes — the number
that actually matches what you see on disk.
SELECT n.nspname AS schema, c.relname AS table, pg_size_pretty(pg_total_relation_size(c.oid)) AS total, pg_size_pretty(pg_relation_size(c.oid)) AS heap, pg_size_pretty(pg_indexes_size(c.oid)) AS indexes, pg_size_pretty(pg_total_relation_size(c.oid) - pg_relation_size(c.oid) - pg_indexes_size(c.oid)) AS toast FROM pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace WHERE c.relkind = 'r' AND n.nspname NOT IN ('pg_catalog', 'information_schema') ORDER BY pg_total_relation_size(c.oid) DESC LIMIT 20;This reads catalog metadata only — no table scans, safe on production.
Row count estimates (no scan)
Section titled “Row count estimates (no scan)”pg_class.reltuples is the planner’s row estimate, refreshed by ANALYZE and
autovacuum. Accurate to a few percent on healthy tables and free.
SELECT n.nspname AS schema, c.relname AS table, c.reltuples::bigint AS estimated_rows FROM pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace WHERE c.relkind = 'r' AND n.nspname = '${{ SCHEMA }}' ORDER BY c.reltuples DESC;For a single table:
SELECT reltuples::bigint AS estimated_rows FROM pg_class WHERE oid = '${{ SCHEMA }}.${{ TABLE }}'::regclass;Need an exact count and the table is small? SELECT count(*) is fine. Big table and
you need exact? Run it off-hours, or keep a maintained counter table.
Index bloat (and how to fix it)
Section titled “Index bloat (and how to fix it)”Index pages don’t get reclaimed in place — heavy update/delete workloads can leave indexes much larger than the underlying data. Find the worst offenders:
SELECT n.nspname AS schema, c.relname AS index, i.indrelid::regclass AS table, pg_size_pretty(pg_relation_size(c.oid)) AS size, s.idx_scan AS scans FROM pg_class c JOIN pg_index i ON i.indexrelid = c.oid JOIN pg_namespace n ON n.oid = c.relnamespace LEFT JOIN pg_stat_user_indexes s ON s.indexrelid = c.oid WHERE c.relkind = 'i' AND n.nspname NOT IN ('pg_catalog', 'information_schema') ORDER BY pg_relation_size(c.oid) DESC LIMIT 20;scans = 0 and a non-trivial size means the index is dead weight — drop it.
Otherwise, rebuild concurrently (no exclusive lock, requires PostgreSQL 12+):
REINDEX INDEX CONCURRENTLY ${{ SCHEMA }}.<index_name>;Or all indexes on a table:
REINDEX TABLE CONCURRENTLY ${{ SCHEMA }}.${{ TABLE }};Plain REINDEX (without CONCURRENTLY) is faster but takes an
ACCESS EXCLUSIVE lock — only use during maintenance windows.
Live vs. dead tuples (bloat hint)
Section titled “Live vs. dead tuples (bloat hint)”SELECT relname, n_live_tup, n_dead_tup, round(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 1) AS dead_pct, last_autovacuum, last_autoanalyze FROM pg_stat_user_tables ORDER BY n_dead_tup DESC LIMIT 20;pg_stat_user_tables is a stats view — reads from shared memory, not the tables.
High dead_pct on a hot table means autovacuum is falling behind.
One table, all the numbers
Section titled “One table, all the numbers”SELECT pg_size_pretty(pg_total_relation_size('${{ SCHEMA }}.${{ TABLE }}')) AS total, pg_size_pretty(pg_relation_size('${{ SCHEMA }}.${{ TABLE }}')) AS heap, pg_size_pretty(pg_indexes_size('${{ SCHEMA }}.${{ TABLE }}')) AS indexes, (SELECT reltuples::bigint FROM pg_class WHERE oid = '${{ SCHEMA }}.${{ TABLE }}'::regclass) AS rows_est;Why this is “polite”
Section titled “Why this is “polite””- Catalog reads only —
pg_class,pg_namespace,pg_stat_*are tiny system tables/views cached in memory. - No heap I/O, no shared-buffer churn, no locks on user tables.
- Safe to run from a read-only role on a primary at peak load.
The trade-off is that row counts are estimates. If you need exact, you pay for it.