We can't find the internet
Attempting to reconnect
Something went wrong!
Hang in there while we get back on track
Materialized Views
While regular views are virtual tables that re-execute their query every time, materialized views physically store the query results on disk. This makes reads extremely fast but means the data can become stale.
The Problem Materialized Views Solve
Consider a complex dashboard query that joins several large tables, performs aggregations, and takes 5 seconds to run:
-- This runs every time someone loads the dashboard
SELECT
region,
EXTRACT(MONTH FROM sale_date) AS month,
COUNT(*) AS num_sales,
SUM(amount) AS total_revenue,
AVG(amount) AS avg_sale,
COUNT(DISTINCT customer_id) AS unique_customers,
SUM(CASE WHEN payment_method = 'credit_card' THEN amount ELSE 0 END) AS credit_card_revenue
FROM sales
GROUP BY region, EXTRACT(MONTH FROM sale_date)
ORDER BY region, month;
With a regular view, this query runs every time someone selects from it. With a materialized view, the results are computed once and cached.
Creating a Materialized View
Basic Syntax
CREATE MATERIALIZED VIEW view_name AS
SELECT ...
FROM ...
WHERE ...
[WITH [NO] DATA];
Example
CREATE MATERIALIZED VIEW mv_regional_sales AS
SELECT
region,
EXTRACT(MONTH FROM sale_date) AS month,
COUNT(*) AS num_sales,
SUM(amount) AS total_revenue,
AVG(amount) AS avg_sale,
COUNT(DISTINCT customer_id) AS unique_customers
FROM sales
GROUP BY region, EXTRACT(MONTH FROM sale_date)
ORDER BY region, month;
Now querying is instant:
-- This reads from the cached results -- no re-computation
SELECT * FROM mv_regional_sales
WHERE region = 'North';
WITH NO DATA
You can create the materialized view structure without populating it immediately:
CREATE MATERIALIZED VIEW mv_sales_summary AS
SELECT region, SUM(amount) AS total
FROM sales
GROUP BY region
WITH NO DATA;
The view exists but cannot be queried until you refresh it:
-- Error: materialized view "mv_sales_summary" has not been populated
SELECT * FROM mv_sales_summary;
-- Populate it
REFRESH MATERIALIZED VIEW mv_sales_summary;
-- Now it works
SELECT * FROM mv_sales_summary;
This is useful when the initial query is expensive and you want to populate it during off-peak hours.
Refreshing Materialized Views
Since materialized views store a snapshot of data, they become stale when the underlying tables change. You must manually refresh them.
Basic Refresh
REFRESH MATERIALIZED VIEW mv_regional_sales;
This:
- Locks the materialized view (no reads allowed during refresh)
- Re-executes the underlying query
- Replaces the stored data with the new results
- Releases the lock
Concurrent Refresh
To allow reads during the refresh, use CONCURRENTLY:
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_regional_sales;
Requirements for concurrent refresh:
- The materialized view must have a unique index
- The refresh takes longer but does not block readers
- If no unique index exists, you get an error
-- Create a unique index to enable concurrent refresh
CREATE UNIQUE INDEX idx_mv_regional_sales
ON mv_regional_sales (region, month);
-- Now concurrent refresh works
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_regional_sales;
When to Refresh
Common refresh strategies:
1. Scheduled Refresh (Most Common)
Use a cron job or scheduler to refresh at regular intervals:
-- In a cron job (every hour):
0 * * * * psql -c "REFRESH MATERIALIZED VIEW mv_regional_sales;"
-- Or every night at 2 AM:
0 2 * * * psql -c "REFRESH MATERIALIZED VIEW mv_regional_sales;"
2. After Data Loads
Refresh after batch imports or ETL processes:
-- In your ETL script:
COPY sales FROM '/data/new_sales.csv' CSV HEADER;
REFRESH MATERIALIZED VIEW mv_regional_sales;
3. On Demand
Let users or applications trigger a refresh when they need fresh data:
-- Application code
REFRESH MATERIALIZED VIEW mv_regional_sales;
SELECT * FROM mv_regional_sales;
4. Using Triggers (Advanced)
You can create a trigger function that refreshes the materialized view when the underlying table changes. However, this is generally not recommended for frequently updated tables because the refresh can be expensive.
Indexing Materialized Views
Since materialized views store data physically, you can create indexes on them just like regular tables. This can make queries against large materialized views even faster.
-- Index for quick region lookups
CREATE INDEX idx_mv_region ON mv_regional_sales (region);
-- Index for month-based queries
CREATE INDEX idx_mv_month ON mv_regional_sales (month);
-- Unique index (also required for CONCURRENTLY refresh)
CREATE UNIQUE INDEX idx_mv_region_month
ON mv_regional_sales (region, month);
Dropping Materialized Views
DROP MATERIALIZED VIEW mv_regional_sales;
-- Safe drop
DROP MATERIALIZED VIEW IF EXISTS mv_regional_sales;
-- With cascade
DROP MATERIALIZED VIEW mv_regional_sales CASCADE;
Checking Materialized View Status
You can check when a materialized view was last refreshed by looking at system catalogs:
-- See all materialized views
SELECT
schemaname,
matviewname,
ispopulated
FROM pg_matviews
WHERE schemaname = 'public';
The ispopulated column tells you whether the view has data (true) or was created with WITH NO DATA and hasn't been refreshed yet (false).
Views vs Materialized Views: Decision Guide
| Consideration | Use Regular View | Use Materialized View |
|---|---|---|
| Data freshness | Real-time needed | Slightly stale is OK |
| Query complexity | Simple, fast queries | Complex, expensive queries |
| Read frequency | Occasional reads | Frequent reads |
| Write frequency | Underlying data changes often | Data changes less often |
| Storage | No extra storage | Requires disk space |
| Indexes | Not applicable | Can create indexes |
| Insert/Update through | Possible (simple views) | Not possible |
Practical Example: Analytics Dashboard
Here is a real-world pattern for a dashboard backed by materialized views:
-- Create the materialized view for the dashboard
CREATE MATERIALIZED VIEW mv_dashboard_stats AS
SELECT
region,
DATE_TRUNC('month', sale_date) AS month,
COUNT(*) AS total_sales,
SUM(amount) AS revenue,
AVG(amount) AS avg_order_value,
COUNT(DISTINCT customer_id) AS unique_customers,
SUM(amount) / COUNT(DISTINCT customer_id) AS revenue_per_customer,
MODE() WITHIN GROUP (ORDER BY payment_method) AS most_common_payment
FROM sales
GROUP BY region, DATE_TRUNC('month', sale_date);
-- Create indexes for common query patterns
CREATE UNIQUE INDEX ON mv_dashboard_stats (region, month);
CREATE INDEX ON mv_dashboard_stats (month);
CREATE INDEX ON mv_dashboard_stats (revenue DESC);
-- The dashboard queries are now instant
SELECT * FROM mv_dashboard_stats
WHERE month >= '2025-01-01'
ORDER BY revenue DESC;
-- Refresh nightly
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_dashboard_stats;
Materialized View Patterns
Aggregation Cache
Pre-compute expensive aggregations:
CREATE MATERIALIZED VIEW mv_product_stats AS
SELECT
product_id,
COUNT(*) AS times_sold,
SUM(quantity) AS total_quantity,
SUM(total_price) AS total_revenue,
AVG(total_price) AS avg_price
FROM orders
GROUP BY product_id;
Denormalized Join Cache
Pre-compute frequently needed joins:
CREATE MATERIALIZED VIEW mv_order_details AS
SELECT
o.product_id,
o.customer_name,
o.quantity,
o.total_price,
o.status,
o.order_date,
s.region,
s.payment_method
FROM orders o
LEFT JOIN sales s ON o.product_id = s.product_id;
Time-Series Rollup
Pre-aggregate time-series data to coarser time periods:
CREATE MATERIALIZED VIEW mv_daily_sales AS
SELECT
sale_date,
COUNT(*) AS num_transactions,
SUM(amount) AS daily_total,
AVG(amount) AS daily_avg,
MIN(amount) AS daily_min,
MAX(amount) AS daily_max
FROM sales
GROUP BY sale_date
ORDER BY sale_date;
Try It Yourself
Think about the following scenarios:
-
Design a materialized view that pre-computes monthly revenue by region. What refresh strategy would you use for a dashboard that is viewed 100 times a day but only needs data accurate to within 1 hour?
-
Consider a regular view vs a materialized view for a query that joins 5 tables with 10 million rows each. Which would you choose and why?
-
If you needed to support concurrent reads during a refresh, what additional structure must you add to the materialized view?