EXPLAIN & EXPLAIN ANALYZE
Knowing how to read query execution plans is one of the most valuable skills for database performance tuning. PostgreSQL's EXPLAIN command reveals exactly how the database engine plans to execute your query, and EXPLAIN ANALYZE shows what actually happened when it ran.
EXPLAIN: The Estimated Plan
EXPLAIN shows the query plan without executing the query. It displays the planner's estimates for cost, row counts, and chosen strategies.
EXPLAIN SELECT * FROM sales WHERE region = 'North';
Sample output:
Seq Scan on sales (cost=0.00..1.62 rows=10 width=68)
Filter: (region = 'North'::text)
Reading the Output
Each line represents a plan node. Let's break down the key parts:
- Seq Scan on sales -- The operation type and table
- cost=0.00..1.62 -- Estimated cost (startup cost..total cost) in arbitrary units
- rows=10 -- Estimated number of rows this node will return
- width=68 -- Estimated average size of each row in bytes
- Filter: -- The condition applied during the scan
Understanding Cost
Cost is measured in arbitrary units (based on seq_page_cost = 1.0). The two numbers represent:
- Startup cost (before the first row is produced): 0.00
- Total cost (to produce all rows): 1.62
Lower cost generally means a faster query, but costs are only meaningful for comparing plans for the same query.
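The cost units come from configurable planner parameters, which you can inspect directly (defaults shown assume a stock PostgreSQL configuration):

```sql
-- Cost of one sequential page read; the baseline unit (default 1.0)
SHOW seq_page_cost;

-- Cost of one non-sequential (random) page read (default 4.0)
SHOW random_page_cost;

-- CPU cost of processing one row (default 0.01)
SHOW cpu_tuple_cost;
```

The ratio between these parameters is what drives choices like sequential scan vs. index scan, which is why tuning random_page_cost on SSD-backed systems can change plans.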
EXPLAIN ANALYZE: The Real Execution
EXPLAIN ANALYZE actually executes the query and reports real timing alongside the estimates.
EXPLAIN ANALYZE SELECT * FROM sales WHERE region = 'North';
Sample output:
Seq Scan on sales (cost=0.00..1.62 rows=10 width=68)
(actual time=0.015..0.025 rows=10 loops=1)
Filter: (region = 'North'::text)
Rows Removed by Filter: 40
Planning Time: 0.085 ms
Execution Time: 0.045 ms
Additional Information
- actual time=0.015..0.025 -- Real time in milliseconds (startup..total)
- rows=10 -- Actual number of rows produced
- loops=1 -- How many times this node was executed (the reported times and rows are per-loop averages, so multiply by loops for totals)
- Rows Removed by Filter: 40 -- How many rows were read but did not match
- Planning Time -- Time spent creating the plan
- Execution Time -- Total execution time
Comparing Estimates to Actuals
If the estimated row count differs greatly from the actual row count, the planner may be making suboptimal choices. This often indicates stale statistics, which can be refreshed with ANALYZE.
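For example, estimates often drift after a large batch insert or delete. A quick sanity check is to refresh statistics and re-run the query (using the sales table from the examples above):

```sql
-- Refresh the planner's statistics for this table
ANALYZE sales;

-- Re-run the query: estimated rows should now track actual rows more closely
EXPLAIN ANALYZE SELECT * FROM sales WHERE region = 'North';
```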
Common Scan Types
Seq Scan (Sequential Scan)
Reads every row in the table. This is the simplest and sometimes fastest approach for small tables or queries that return a large percentage of rows.
EXPLAIN SELECT * FROM sales;
Seq Scan on sales (cost=0.00..1.50 rows=50 width=68)
Index Scan
Uses an index to find matching rows, then fetches the full row from the table. Most efficient for highly selective queries on indexed columns.
-- Assuming an index exists on region
EXPLAIN SELECT * FROM sales WHERE region = 'North';
Index Scan using idx_sales_region on sales (cost=0.14..8.16 rows=10 width=68)
Index Cond: (region = 'North'::text)
Index Only Scan
Even better than Index Scan -- all needed columns are in the index, so the table does not need to be visited at all.
-- With a covering index on (region) INCLUDE (amount)
EXPLAIN SELECT region, amount FROM sales WHERE region = 'North';
Index Only Scan using idx_sales_covering on sales (cost=0.14..4.16 rows=10 width=36)
Index Cond: (region = 'North'::text)
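The covering index assumed in this example could be created as follows (the index name is illustrative; the INCLUDE clause requires PostgreSQL 11 or later):

```sql
-- The key column (region) supports the WHERE clause;
-- INCLUDE (amount) stores amount in the index leaf pages,
-- so the query never has to visit the table heap
CREATE INDEX idx_sales_covering ON sales (region) INCLUDE (amount);
```

Note that index-only scans also depend on the table's visibility map: if the table has many recently modified pages, the planner may fall back to a regular index scan until a VACUUM runs.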
Bitmap Index Scan + Bitmap Heap Scan
A two-step process for moderate selectivity. First, the index is scanned to create a bitmap of matching pages. Then, those pages are fetched in sequential order.
Bitmap Heap Scan on sales (cost=4.18..14.20 rows=10 width=68)
Recheck Cond: (region = 'North'::text)
-> Bitmap Index Scan on idx_sales_region (cost=0.00..4.18 rows=10 width=0)
Index Cond: (region = 'North'::text)
This is efficient when multiple rows match but they are scattered across many table pages.
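A common trigger for a bitmap plan is an OR across indexed conditions, where PostgreSQL can combine two bitmaps with a BitmapOr node before a single heap pass. Whether the planner actually chooses this depends on table size and statistics:

```sql
-- May produce a BitmapOr over two Bitmap Index Scans,
-- followed by a single Bitmap Heap Scan over the matching pages
EXPLAIN SELECT * FROM sales
WHERE region = 'North' OR region = 'South';
```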
Common Join Strategies
When your query involves joins, EXPLAIN reveals the join algorithm:
Nested Loop
For each row in the outer table, scan the inner table. Best for small datasets or when the inner table has an index.
Nested Loop (cost=0.00..15.25 rows=10 width=136)
-> Seq Scan on orders (cost=0.00..1.10 rows=10 width=68)
-> Index Scan using idx_sales_product on sales (cost=0.14..1.40 rows=1 width=68)
Index Cond: (product_id = orders.product_id)
Hash Join
Builds a hash table from one table, then probes it with rows from the other. Efficient for medium-to-large joins without indexes.
Hash Join (cost=1.12..3.25 rows=10 width=136)
Hash Cond: (sales.product_id = orders.product_id)
-> Seq Scan on sales (cost=0.00..1.50 rows=50 width=68)
-> Hash (cost=1.10..1.10 rows=10 width=68)
-> Seq Scan on orders (cost=0.00..1.10 rows=10 width=68)
Merge Join
Both inputs are sorted, then merged together. Efficient when data is already sorted or indexes provide sorted access.
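An illustrative plan shape (a sketch, not actual output from the sample tables; idx_orders_product is a hypothetical index) might look like:

```
Merge Join
  Merge Cond: (sales.product_id = orders.product_id)
  -> Index Scan using idx_sales_product on sales
  -> Index Scan using idx_orders_product on orders
```

Because both inputs arrive in sorted order, each is read only once, which makes Merge Join attractive for large pre-sorted inputs.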
EXPLAIN Options
EXPLAIN (FORMAT JSON)
Get the plan in JSON format for programmatic analysis:
EXPLAIN (FORMAT JSON) SELECT * FROM sales WHERE region = 'North';
EXPLAIN (BUFFERS)
Shows how many shared buffer blocks were read (combine with ANALYZE):
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM sales WHERE region = 'North';
Seq Scan on sales (cost=0.00..1.62 rows=10 width=68)
(actual time=0.015..0.025 rows=10 loops=1)
Filter: (region = 'North'::text)
Buffers: shared hit=1
Planning Time: 0.085 ms
Execution Time: 0.045 ms
- shared hit=1 -- One block was found in the shared buffer cache
- shared read=N -- Blocks read from disk
EXPLAIN (VERBOSE)
Shows additional details like output column lists:
EXPLAIN (VERBOSE) SELECT region, SUM(amount) FROM sales GROUP BY region;
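These options can be combined in a single parenthesized list. For instance, to capture as much detail as possible in machine-readable form:

```sql
-- ANALYZE executes the query and records real timings;
-- VERBOSE adds output column lists; BUFFERS adds block counts;
-- FORMAT JSON makes the plan easy to parse programmatically
EXPLAIN (ANALYZE, VERBOSE, BUFFERS, FORMAT JSON)
SELECT * FROM sales WHERE region = 'North';
```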
Performance Optimization Workflow
Here is a systematic approach to optimizing slow queries:
Step 1: Get the Current Plan
EXPLAIN ANALYZE SELECT * FROM sales
WHERE region = 'North' AND sale_date > '2025-06-01'
ORDER BY amount DESC;
Step 2: Identify the Bottleneck
Look for:
- Seq Scan on large tables -- Consider adding an index
- High actual time on a specific node -- Focus optimization there
- Rows estimate vs actual mismatch -- Run `ANALYZE tablename` to update statistics
- Sort operations -- Consider an index that provides sorted access
- Filter with many rows removed -- The filter is not selective; consider a different approach
Step 3: Create an Index
CREATE INDEX idx_sales_region_date ON sales (region, sale_date);
Step 4: Re-run EXPLAIN ANALYZE
EXPLAIN ANALYZE SELECT * FROM sales
WHERE region = 'North' AND sale_date > '2025-06-01'
ORDER BY amount DESC;
Compare the new plan against the old one. Did the scan type change? Did the execution time drop?
Step 5: Update Statistics
If estimates are significantly off:
ANALYZE sales;
This updates the planner's statistics about data distribution.
Common Performance Pitfalls
| Problem | Symptom | Solution |
|---|---|---|
| Missing index | Seq Scan on large table | Create appropriate index |
| Stale statistics | Bad row estimates | Run ANALYZE |
| Function on indexed column | Index not used | Create expression index |
| Too many indexes | Slow writes | Drop unused indexes |
| SELECT * when few columns needed | Large width, no index-only scan | Select only needed columns |
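The "function on indexed column" pitfall deserves a concrete sketch. A predicate like `lower(region) = 'north'` cannot use a plain index on region, but an expression index matches it exactly (the index name is illustrative):

```sql
-- A plain index on region is useless for this predicate:
--   WHERE lower(region) = 'north'
-- An expression index on lower(region) makes it indexable:
CREATE INDEX idx_sales_region_lower ON sales (lower(region));
```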
Try It Yourself
- Run `EXPLAIN ANALYZE SELECT * FROM sales WHERE region = 'North';` and note the scan type and execution time.
- Run `EXPLAIN ANALYZE SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY SUM(amount) DESC;` and identify each node in the plan.
- Run `EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM sales WHERE amount > 200;` and see how many buffer blocks were accessed.
- Compare plans: run EXPLAIN for `SELECT * FROM sales ORDER BY sale_date` vs `SELECT * FROM sales ORDER BY amount`. Do they use different strategies?