EXPLAIN & EXPLAIN ANALYZE
Knowing how to read query execution plans is one of the most valuable skills for database performance tuning. PostgreSQL's EXPLAIN command reveals exactly how the database engine plans to execute your query, and EXPLAIN ANALYZE shows what actually happened when it ran.
EXPLAIN: The Estimated Plan
EXPLAIN shows the query plan without executing the query. It displays the planner's estimates for cost, row counts, and chosen strategies.
EXPLAIN SELECT * FROM sales WHERE region = 'North';
Sample output:
Seq Scan on sales (cost=0.00..1.62 rows=10 width=68)
Filter: (region = 'North'::text)
Reading the Output
Each line represents a plan node. Let's break down the key parts:
- Seq Scan on sales -- The operation type and table
- cost=0.00..1.62 -- Estimated cost (startup cost..total cost) in arbitrary units
- rows=10 -- Estimated number of rows this node will return
- width=68 -- Estimated average size of each row in bytes
- Filter: -- The condition applied during the scan
Understanding Cost
Cost is measured in arbitrary units (based on seq_page_cost = 1.0). The two numbers represent:
- Startup cost (before the first row is produced): 0.00
- Total cost (to produce all rows): 1.62
Lower cost generally means a faster query, but costs are only meaningful for comparing plans for the same query.
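The cost units come from configurable planner parameters, which you can inspect directly (defaults shown assume a stock PostgreSQL configuration):

```sql
-- Cost of one sequential page read; the baseline unit (default 1.0)
SHOW seq_page_cost;

-- Cost of one non-sequential (random) page read (default 4.0)
SHOW random_page_cost;

-- CPU cost of processing one row (default 0.01)
SHOW cpu_tuple_cost;
```

The ratio between these parameters is what drives choices like sequential scan vs. index scan, which is why tuning random_page_cost on SSD-backed systems can change plans.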
EXPLAIN ANALYZE: The Real Execution
EXPLAIN ANALYZE actually executes the query and reports real timing alongside the estimates.
EXPLAIN ANALYZE SELECT * FROM sales WHERE region = 'North';
Sample output:
Seq Scan on sales (cost=0.00..1.62 rows=10 width=68)
(actual time=0.015..0.025 rows=10 loops=1)
Filter: (region = 'North'::text)
Rows Removed by Filter: 40
Planning Time: 0.085 ms
Execution Time: 0.045 ms
Additional Information
- actual time=0.015..0.025 -- Real time in milliseconds (startup..total)
- rows=10 -- Actual number of rows produced
- loops=1 -- How many times this node was executed (the reported times and rows are per-loop averages, so multiply by loops for totals)
- Rows Removed by Filter: 40 -- How many rows were read but did not match
- Planning Time -- Time spent creating the plan
- Execution Time -- Total execution time
Comparing Estimates to Actuals
If the estimated row count differs greatly from the actual row count, the planner may be making suboptimal choices. This often indicates stale statistics, which can be refreshed with ANALYZE.
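For example, estimates often drift after a large batch insert or delete. A quick sanity check is to refresh statistics and re-run the query (using the sales table from the examples above):

```sql
-- Refresh the planner's statistics for this table
ANALYZE sales;

-- Re-run the query: estimated rows should now track actual rows more closely
EXPLAIN ANALYZE SELECT * FROM sales WHERE region = 'North';
```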
Common Scan Types
Seq Scan (Sequential Scan)
Reads every row in the table. This is the simplest and sometimes fastest approach for small tables or queries that return a large percentage of rows.
EXPLAIN SELECT * FROM sales;
Seq Scan on sales (cost=0.00..1.50 rows=50 width=68)
Index Scan
Uses an index to find matching rows, then fetches the full row from the table. Most efficient for highly selective queries on indexed columns.
-- Assuming an index exists on region
EXPLAIN SELECT * FROM sales WHERE region = 'North';
Index Scan using idx_sales_region on sales (cost=0.14..8.16 rows=10 width=68)
Index Cond: (region = 'North'::text)
Index Only Scan
Even better than Index Scan -- all needed columns are in the index, so the table does not need to be visited at all.
-- With a covering index on (region) INCLUDE (amount)
EXPLAIN SELECT region, amount FROM sales WHERE region = 'North';
Index Only Scan using idx_sales_covering on sales (cost=0.14..4.16 rows=10 width=36)
Index Cond: (region = 'North'::text)
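The covering index assumed in this example could be created as follows (the index name is illustrative; the INCLUDE clause requires PostgreSQL 11 or later):

```sql
-- The key column (region) supports the WHERE clause;
-- INCLUDE (amount) stores amount in the index leaf pages,
-- so the query never has to visit the table heap
CREATE INDEX idx_sales_covering ON sales (region) INCLUDE (amount);
```

Note that index-only scans also depend on the table's visibility map: if the table has many recently modified pages, the planner may fall back to a regular index scan until a VACUUM runs.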
Bitmap Index Scan + Bitmap Heap Scan
A two-step process for moderate selectivity. First, the index is scanned to create a bitmap of matching pages. Then, those pages are fetched in sequential order.
Bitmap Heap Scan on sales (cost=4.18..14.20 rows=10 width=68)
Recheck Cond: (region = 'North'::text)
-> Bitmap Index Scan on idx_sales_region (cost=0.00..4.18 rows=10 width=0)
Index Cond: (region = 'North'::text)
This is efficient when multiple rows match but they are scattered across many table pages.
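A common trigger for a bitmap plan is an OR across indexed conditions, where PostgreSQL can combine two bitmaps with a BitmapOr node before a single heap pass. Whether the planner actually chooses this depends on table size and statistics:

```sql
-- May produce a BitmapOr over two Bitmap Index Scans,
-- followed by a single Bitmap Heap Scan over the matching pages
EXPLAIN SELECT * FROM sales
WHERE region = 'North' OR region = 'South';
```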
Common Join Strategies
When your query involves joins, EXPLAIN reveals the join algorithm:
Nested Loop
For each row in the outer table, scan the inner table. Best for small datasets or when the inner table has an index.
Nested Loop (cost=0.00..15.25 rows=10 width=136)
-> Seq Scan on orders (cost=0.00..1.10 rows=10 width=68)
-> Index Scan using idx_sales_product on sales (cost=0.14..1.40 rows=1 width=68)
Index Cond: (product_id = orders.product_id)
Hash Join
Builds a hash table from one table, then probes it with rows from the other. Efficient for medium-to-large joins without indexes.
Hash Join (cost=1.12..3.25 rows=10 width=136)
Hash Cond: (sales.product_id = orders.product_id)
-> Seq Scan on sales (cost=0.00..1.50 rows=50 width=68)
-> Hash (cost=1.10..1.10 rows=10 width=68)
-> Seq Scan on orders (cost=0.00..1.10 rows=10 width=68)
Merge Join
Both inputs are sorted, then merged together. Efficient when data is already sorted or indexes provide sorted access.
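An illustrative plan shape (a sketch, not actual output from the sample tables; idx_orders_product is a hypothetical index) might look like:

```
Merge Join
  Merge Cond: (sales.product_id = orders.product_id)
  -> Index Scan using idx_sales_product on sales
  -> Index Scan using idx_orders_product on orders
```

Because both inputs arrive in sorted order, each is read only once, which makes Merge Join attractive for large pre-sorted inputs.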
EXPLAIN Options
EXPLAIN (FORMAT JSON)
Get the plan in JSON format for programmatic analysis:
EXPLAIN (FORMAT JSON) SELECT * FROM sales WHERE region = 'North';
EXPLAIN (BUFFERS)
Shows how many shared buffer blocks were read (combine with ANALYZE):
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM sales WHERE region = 'North';
Seq Scan on sales (cost=0.00..1.62 rows=10 width=68)
(actual time=0.015..0.025 rows=10 loops=1)
Filter: (region = 'North'::text)
Buffers: shared hit=1
Planning Time: 0.085 ms
Execution Time: 0.045 ms
- shared hit=1 -- One block was found in the shared buffer cache
- shared read=N -- Blocks read from disk
EXPLAIN (VERBOSE)
Shows additional details like output column lists:
EXPLAIN (VERBOSE) SELECT region, SUM(amount) FROM sales GROUP BY region;
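These options can be combined in a single parenthesized list. For instance, to capture as much detail as possible in machine-readable form:

```sql
-- ANALYZE executes the query and records real timings;
-- VERBOSE adds output column lists; BUFFERS adds block counts;
-- FORMAT JSON makes the plan easy to parse programmatically
EXPLAIN (ANALYZE, VERBOSE, BUFFERS, FORMAT JSON)
SELECT * FROM sales WHERE region = 'North';
```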
Performance Optimization Workflow
Here is a systematic approach to optimizing slow queries:
Step 1: Get the Current Plan
EXPLAIN ANALYZE SELECT * FROM sales
WHERE region = 'North' AND sale_date > '2025-06-01'
ORDER BY amount DESC;
Step 2: Identify the Bottleneck
Look for:
- Seq Scan on large tables -- Consider adding an index
- High actual time on a specific node -- Focus optimization there
- Rows estimate vs actual mismatch -- Run `ANALYZE tablename` to update statistics
- Sort operations -- Consider an index that provides sorted access
- Filter with many rows removed -- The filter is not selective; consider a different approach
Step 3: Create an Index
CREATE INDEX idx_sales_region_date ON sales (region, sale_date);
Step 4: Re-run EXPLAIN ANALYZE
EXPLAIN ANALYZE SELECT * FROM sales
WHERE region = 'North' AND sale_date > '2025-06-01'
ORDER BY amount DESC;
Compare the new plan against the old one. Did the scan type change? Did the execution time drop?
Step 5: Update Statistics
If estimates are significantly off:
ANALYZE sales;
This updates the planner's statistics about data distribution.
Common Performance Pitfalls
| Problem | Symptom | Solution |
|---|---|---|
| Missing index | Seq Scan on large table | Create appropriate index |
| Stale statistics | Bad row estimates | Run ANALYZE |
| Function on indexed column | Index not used | Create expression index |
| Too many indexes | Slow writes | Drop unused indexes |
| SELECT * when few columns needed | Large width, no index-only scan | Select only needed columns |
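The "function on indexed column" pitfall deserves a concrete sketch. A predicate like `lower(region) = 'north'` cannot use a plain index on region, but an expression index matches it exactly (the index name is illustrative):

```sql
-- A plain index on region is useless for this predicate:
--   WHERE lower(region) = 'north'
-- An expression index on lower(region) makes it indexable:
CREATE INDEX idx_sales_region_lower ON sales (lower(region));
```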
Try It Yourself
- Run `EXPLAIN ANALYZE SELECT * FROM sales WHERE region = 'North';` and note the scan type and execution time.
- Run `EXPLAIN ANALYZE SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY SUM(amount) DESC;` and identify each node in the plan.
- Run `EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM sales WHERE amount > 200;` and see how many buffer blocks were accessed.
- Compare plans: run EXPLAIN for `SELECT * FROM sales ORDER BY sale_date` vs `SELECT * FROM sales ORDER BY amount`. Do they use different strategies?