We recently optimized a PostgreSQL database handling 50M+ records. Query times dropped from 45 seconds to 200ms. Here's exactly how we did it.
The Problem
A client's reporting dashboard was unusable:
- User activity report: 45 seconds
- Sales analytics: 38 seconds
- Dashboard loads: 60+ seconds
- Database CPU: 95% constantly
Step 1: Identify Slow Queries
Enable slow query logging:
# postgresql.conf
log_min_duration_statement = 1000   # Log queries taking longer than 1s
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d '
-- Check the slowest queries (on Postgres 13+ the columns are total_exec_time / mean_exec_time)
SELECT query, calls, total_time, mean_time
FROM pg_stat_statements
ORDER BY mean_time DESC
LIMIT 10;
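pg_stat_statements is an extension, so it has to be loaded before that view returns anything; a minimal setup sketch:
-- Requires this line in postgresql.conf (and a restart):
--   shared_preload_libraries = 'pg_stat_statements'
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;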
Step 2: Add Missing Indexes
Problem query: Finding user orders by date range
-- Before: 45 seconds (full table scan)
SELECT * FROM orders
WHERE user_id = 123
AND created_at BETWEEN '2025-01-01' AND '2025-01-31';
-- Add composite index
CREATE INDEX idx_orders_user_created
ON orders(user_id, created_at);
-- After: 180ms (index scan)
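It's worth confirming the planner actually picks up the new index; a quick check with EXPLAIN (the plan should show an Index Scan on idx_orders_user_created instead of a Seq Scan):
-- Verify the index is used
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders
WHERE user_id = 123
  AND created_at BETWEEN '2025-01-01' AND '2025-01-31';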
Index Best Practices
- Index columns used in WHERE, JOIN, and ORDER BY
- Composite indexes: most selective column first
- Use INCLUDE for covering indexes (Postgres 11+)
- Partial indexes for common filtered queries (examples of both below)
- Don't over-index—each index slows writes
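To make the last two points concrete, here's a rough sketch; the index names and the status/total columns are hypothetical, not from the client's actual schema:
-- Covering index: the query can be answered from the index alone
CREATE INDEX idx_orders_user_created_covering
ON orders (user_id, created_at) INCLUDE (status, total);

-- Partial index: only index the rows a hot query actually filters on
CREATE INDEX idx_orders_pending
ON orders (created_at) WHERE status = 'pending';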
Step 3: Optimize Query Structure
Example: Avoid N+1 Queries
// Bad: N+1 queries (1 + 1000 queries)
for (const user of users) {
  const orders = await db.query(
    'SELECT * FROM orders WHERE user_id = $1',
    [user.id]
  );
}
-- Good: Single query with JOIN
SELECT u.*, json_agg(o.*) as orders
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
GROUP BY u.id;
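One subtlety with the aggregated version: for users with no orders, json_agg over a LEFT JOIN produces [null] rather than an empty array. A FILTER clause cleans that up (a small tweak, assuming orders.id is the primary key):
SELECT u.*,
       COALESCE(json_agg(o.*) FILTER (WHERE o.id IS NOT NULL), '[]'::json) AS orders
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
GROUP BY u.id;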
Example: Use CTEs for Complex Queries
-- Before: Nested subqueries (slow)
SELECT *
FROM orders
WHERE user_id IN (
  SELECT id FROM users
  WHERE created_at > (
    SELECT created_at FROM users
    WHERE email = 'admin@example.com'
  )
);
-- After: CTE (readable and optimized)
WITH admin_date AS (
  SELECT created_at
  FROM users
  WHERE email = 'admin@example.com'
),
recent_users AS (
  SELECT id
  FROM users
  WHERE created_at > (SELECT created_at FROM admin_date)
)
SELECT * FROM orders
WHERE user_id IN (SELECT id FROM recent_users);
Step 4: Database Configuration
Memory Settings
# postgresql.conf (for a 16GB RAM server)
shared_buffers = 4GB          # 25% of RAM
effective_cache_size = 12GB   # 75% of RAM
work_mem = 50MB               # Per sort/hash operation
maintenance_work_mem = 1GB    # For VACUUM, CREATE INDEX
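These can go straight into postgresql.conf, or be applied with ALTER SYSTEM as sketched below; shared_buffers needs a restart, the rest take effect on reload:
-- Run as a superuser; settings are written to postgresql.auto.conf
ALTER SYSTEM SET shared_buffers = '4GB';          -- requires a restart to apply
ALTER SYSTEM SET effective_cache_size = '12GB';
ALTER SYSTEM SET work_mem = '50MB';
ALTER SYSTEM SET maintenance_work_mem = '1GB';
SELECT pg_reload_conf();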
Connection Pooling
; pgbouncer.ini — put PgBouncer in front of Postgres for connection pooling
max_client_conn = 1000
default_pool_size = 25
pool_mode = transaction   ; most efficient, but session state (SET, prepared statements) is not preserved across transactions
Step 5: Partitioning Large Tables
For tables with 10M+ rows, partition by date or ID:
-- Create partitioned table
CREATE TABLE orders (
  id BIGSERIAL,
  user_id INT,
  created_at TIMESTAMP
) PARTITION BY RANGE (created_at);
-- Create monthly partitions
CREATE TABLE orders_2025_01
  PARTITION OF orders
  FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
CREATE TABLE orders_2025_02
  PARTITION OF orders
  FOR VALUES FROM ('2025-02-01') TO ('2025-03-01');
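A quick sanity check that date-filtered queries are actually pruned to a single partition:
-- The plan should only scan orders_2025_01
EXPLAIN SELECT * FROM orders
WHERE created_at >= '2025-01-15' AND created_at < '2025-01-16';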
Results
Performance Improvements:
- User activity report: 45s → 200ms (99.5% faster)
- Sales analytics: 38s → 150ms (99.6% faster)
- Dashboard loads: 60s → 800ms (98.7% faster)
- Database CPU: 95% → 25% (74% reduction)
- Concurrent users supported: 50 → 500 (10x)
Monitoring & Maintenance
- pg_stat_statements: Track query performance
- EXPLAIN ANALYZE: Understand query execution plans
- Auto-vacuum: Keep statistics up to date (see the check below)
- Regular REINDEX: Prevent index bloat
- Monitor slow queries: Set up alerts for queries > 1s
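A quick way to confirm autovacuum is keeping up, sketched against pg_stat_user_tables:
-- Tables with the most dead tuples, and when autovacuum last touched them
SELECT relname, n_dead_tup, last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;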