Prev 2/2 Next

Materialized Views

While regular views are virtual tables that re-execute their query every time, materialized views physically store the query results on disk. This makes reads extremely fast but means the data can become stale.

The Problem Materialized Views Solve

Consider a complex dashboard query that joins several large tables, performs aggregations, and takes 5 seconds to run:

-- This runs every time someone loads the dashboard
SELECT
  region,
  EXTRACT(MONTH FROM sale_date) AS month,
  COUNT(*) AS num_sales,
  SUM(amount) AS total_revenue,
  AVG(amount) AS avg_sale,
  COUNT(DISTINCT customer_id) AS unique_customers,
  SUM(CASE WHEN payment_method = 'credit_card' THEN amount ELSE 0 END) AS credit_card_revenue
FROM sales
GROUP BY region, EXTRACT(MONTH FROM sale_date)
ORDER BY region, month;

With a regular view, this query runs every time someone selects from it. With a materialized view, the results are computed once and cached.

Creating a Materialized View

Basic Syntax

CREATE MATERIALIZED VIEW view_name AS
SELECT ...
FROM ...
WHERE ...
[WITH [NO] DATA];

Example

CREATE MATERIALIZED VIEW mv_regional_sales AS
SELECT
  region,
  EXTRACT(MONTH FROM sale_date) AS month,
  COUNT(*) AS num_sales,
  SUM(amount) AS total_revenue,
  AVG(amount) AS avg_sale,
  COUNT(DISTINCT customer_id) AS unique_customers
FROM sales
GROUP BY region, EXTRACT(MONTH FROM sale_date)
ORDER BY region, month;

Now querying is instant:

-- This reads from the cached results -- no re-computation
SELECT * FROM mv_regional_sales
WHERE region = 'North';

WITH NO DATA

You can create the materialized view structure without populating it immediately:

CREATE MATERIALIZED VIEW mv_sales_summary AS
SELECT region, SUM(amount) AS total
FROM sales
GROUP BY region
WITH NO DATA;

The view exists but cannot be queried until you refresh it:

-- Error: materialized view "mv_sales_summary" has not been populated
SELECT * FROM mv_sales_summary;

-- Populate it
REFRESH MATERIALIZED VIEW mv_sales_summary;

-- Now it works
SELECT * FROM mv_sales_summary;

This is useful when the initial query is expensive and you want to populate it during off-peak hours.

Refreshing Materialized Views

Since materialized views store a snapshot of data, they become stale when the underlying tables change. You must manually refresh them.

Basic Refresh

REFRESH MATERIALIZED VIEW mv_regional_sales;

This:

Locks the materialized view (no reads allowed during refresh)
Re-executes the underlying query
Replaces the stored data with the new results
Releases the lock

Concurrent Refresh

To allow reads during the refresh, use CONCURRENTLY:

REFRESH MATERIALIZED VIEW CONCURRENTLY mv_regional_sales;

Requirements for concurrent refresh:

The materialized view must have a unique index
The refresh takes longer but does not block readers
If no unique index exists, you get an error

-- Create a unique index to enable concurrent refresh
CREATE UNIQUE INDEX idx_mv_regional_sales
ON mv_regional_sales (region, month);

-- Now concurrent refresh works
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_regional_sales;

When to Refresh

Common refresh strategies:

1. Scheduled Refresh (Most Common)

Use a cron job or scheduler to refresh at regular intervals:

-- In a cron job (every hour):
0 * * * * psql -c "REFRESH MATERIALIZED VIEW mv_regional_sales;"

-- Or every night at 2 AM:
0 2 * * * psql -c "REFRESH MATERIALIZED VIEW mv_regional_sales;"

2. After Data Loads

Refresh after batch imports or ETL processes:

-- In your ETL script:
COPY sales FROM '/data/new_sales.csv' CSV HEADER;
REFRESH MATERIALIZED VIEW mv_regional_sales;

3. On Demand

Let users or applications trigger a refresh when they need fresh data:

-- Application code
REFRESH MATERIALIZED VIEW mv_regional_sales;
SELECT * FROM mv_regional_sales;

4. Using Triggers (Advanced)

You can create a trigger function that refreshes the materialized view when the underlying table changes. However, this is generally not recommended for frequently updated tables because the refresh can be expensive.

Indexing Materialized Views

Since materialized views store data physically, you can create indexes on them just like regular tables. This can make queries against large materialized views even faster.

-- Index for quick region lookups
CREATE INDEX idx_mv_region ON mv_regional_sales (region);

-- Index for month-based queries
CREATE INDEX idx_mv_month ON mv_regional_sales (month);

-- Unique index (also required for CONCURRENTLY refresh)
CREATE UNIQUE INDEX idx_mv_region_month
ON mv_regional_sales (region, month);

Dropping Materialized Views

DROP MATERIALIZED VIEW mv_regional_sales;

-- Safe drop
DROP MATERIALIZED VIEW IF EXISTS mv_regional_sales;

-- With cascade
DROP MATERIALIZED VIEW mv_regional_sales CASCADE;

Checking Materialized View Status

You can check when a materialized view was last refreshed by looking at system catalogs:

-- See all materialized views
SELECT
  schemaname,
  matviewname,
  ispopulated
FROM pg_matviews
WHERE schemaname = 'public';

The ispopulated column tells you whether the view has data (true) or was created with WITH NO DATA and hasn't been refreshed yet (false).

Views vs Materialized Views: Decision Guide

Consideration	Use Regular View	Use Materialized View
Data freshness	Real-time needed	Slightly stale is OK
Query complexity	Simple, fast queries	Complex, expensive queries
Read frequency	Occasional reads	Frequent reads
Write frequency	Underlying data changes often	Data changes less often
Storage	No extra storage	Requires disk space
Indexes	Not applicable	Can create indexes
Insert/Update through	Possible (simple views)	Not possible

Practical Example: Analytics Dashboard

Here is a real-world pattern for a dashboard backed by materialized views:

-- Create the materialized view for the dashboard
CREATE MATERIALIZED VIEW mv_dashboard_stats AS
SELECT
  region,
  DATE_TRUNC('month', sale_date) AS month,
  COUNT(*) AS total_sales,
  SUM(amount) AS revenue,
  AVG(amount) AS avg_order_value,
  COUNT(DISTINCT customer_id) AS unique_customers,
  SUM(amount) / COUNT(DISTINCT customer_id) AS revenue_per_customer,
  MODE() WITHIN GROUP (ORDER BY payment_method) AS most_common_payment
FROM sales
GROUP BY region, DATE_TRUNC('month', sale_date);

-- Create indexes for common query patterns
CREATE UNIQUE INDEX ON mv_dashboard_stats (region, month);
CREATE INDEX ON mv_dashboard_stats (month);
CREATE INDEX ON mv_dashboard_stats (revenue DESC);

-- The dashboard queries are now instant
SELECT * FROM mv_dashboard_stats
WHERE month >= '2025-01-01'
ORDER BY revenue DESC;

-- Refresh nightly
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_dashboard_stats;

Materialized View Patterns

Aggregation Cache

Pre-compute expensive aggregations:

CREATE MATERIALIZED VIEW mv_product_stats AS
SELECT
  product_id,
  COUNT(*) AS times_sold,
  SUM(quantity) AS total_quantity,
  SUM(total_price) AS total_revenue,
  AVG(total_price) AS avg_price
FROM orders
GROUP BY product_id;

Denormalized Join Cache

Pre-compute frequently needed joins:

CREATE MATERIALIZED VIEW mv_order_details AS
SELECT
  o.product_id,
  o.customer_name,
  o.quantity,
  o.total_price,
  o.status,
  o.order_date,
  s.region,
  s.payment_method
FROM orders o
LEFT JOIN sales s ON o.product_id = s.product_id;

Time-Series Rollup

Pre-aggregate time-series data to coarser time periods:

CREATE MATERIALIZED VIEW mv_daily_sales AS
SELECT
  sale_date,
  COUNT(*) AS num_transactions,
  SUM(amount) AS daily_total,
  AVG(amount) AS daily_avg,
  MIN(amount) AS daily_min,
  MAX(amount) AS daily_max
FROM sales
GROUP BY sale_date
ORDER BY sale_date;

Try It Yourself

Think about the following scenarios:

Design a materialized view that pre-computes monthly revenue by region. What refresh strategy would you use for a dashboard that is viewed 100 times a day but only needs data accurate to within 1 hour?
Consider a regular view vs a materialized view for a query that joins 5 tables with 10 million rows each. Which would you choose and why?
If you needed to support concurrent reads during a refresh, what additional structure must you add to the materialized view?

Take Quiz Next Lesson