FILTER and Conditional Aggregation

Standard GROUP BY with SUM() and COUNT() is powerful, but sometimes you need to aggregate different subsets of data in a single query. PostgreSQL's FILTER clause and conditional aggregation techniques let you compute multiple conditional summaries in one pass through the data.

The FILTER Clause

The FILTER clause restricts which rows are fed into an aggregate function:

aggregate_function(expression) FILTER (WHERE condition)

It is logically equivalent to aggregating only the rows that match the condition, but you can apply different filters to different aggregates in the same query.

Basic Example

SELECT
  COUNT(*) AS total_sales,
  COUNT(*) FILTER (WHERE region = 'North') AS north_sales,
  COUNT(*) FILTER (WHERE region = 'South') AS south_sales,
  COUNT(*) FILTER (WHERE region = 'East')  AS east_sales,
  COUNT(*) FILTER (WHERE region = 'West')  AS west_sales
FROM sales;

This computes five counts in one table scan -- far more efficient than running five separate queries.

SUM with FILTER

SELECT
  SUM(amount) AS total_revenue,
  SUM(amount) FILTER (WHERE payment_method = 'credit_card') AS credit_card_revenue,
  SUM(amount) FILTER (WHERE payment_method = 'cash')        AS cash_revenue,
  SUM(amount) FILTER (WHERE payment_method = 'bank_transfer') AS bank_transfer_revenue
FROM sales;

AVG with FILTER

SELECT
  region,
  AVG(amount) AS avg_all,
  AVG(amount) FILTER (WHERE amount > 100) AS avg_large_sales,
  AVG(amount) FILTER (WHERE amount <= 100) AS avg_small_sales
FROM sales
GROUP BY region
ORDER BY region;

FILTER with Date Conditions

SELECT
  COUNT(*) FILTER (WHERE sale_date >= '2025-01-01' AND sale_date < '2025-04-01') AS q1_count,
  COUNT(*) FILTER (WHERE sale_date >= '2025-04-01' AND sale_date < '2025-07-01') AS q2_count,
  SUM(amount) FILTER (WHERE sale_date >= '2025-01-01' AND sale_date < '2025-04-01') AS q1_revenue,
  SUM(amount) FILTER (WHERE sale_date >= '2025-04-01' AND sale_date < '2025-07-01') AS q2_revenue
FROM sales;

Conditional Aggregation with CASE

Before FILTER was added (PostgreSQL 9.4), the standard technique used CASE inside aggregate functions. You will still see this pattern in older code and in databases that do not support FILTER:

SELECT
  COUNT(CASE WHEN region = 'North' THEN 1 END) AS north_sales,
  COUNT(CASE WHEN region = 'South' THEN 1 END) AS south_sales,
  SUM(CASE WHEN payment_method = 'credit_card' THEN amount ELSE 0 END) AS credit_revenue,
  SUM(CASE WHEN payment_method = 'cash' THEN amount ELSE 0 END) AS cash_revenue
FROM sales;

FILTER vs. CASE Comparison

-- FILTER (cleaner)
COUNT(*) FILTER (WHERE region = 'North')

-- CASE equivalent (more verbose)
COUNT(CASE WHEN region = 'North' THEN 1 END)

-- FILTER with SUM
SUM(amount) FILTER (WHERE region = 'North')

-- CASE with SUM (need ELSE 0 to avoid NULL)
SUM(CASE WHEN region = 'North' THEN amount ELSE 0 END)

The FILTER syntax is more readable and less error-prone. Use it when you can.

Boolean Aggregates: bool_and() and bool_or()

These aggregates work on boolean expressions:

SELECT
  department,
  bool_and(is_active) AS all_active,    -- TRUE if every employee is active
  bool_or(is_active)  AS any_active,    -- TRUE if at least one is active
  COUNT(*) FILTER (WHERE is_active) AS active_count
FROM employees
GROUP BY department;
department all_active any_active active_count
Engineering true true 3
Marketing false true 1
Sales true true 2

Practical Uses

-- Check if all orders for each customer are completed
SELECT
  customer_name,
  bool_and(status = 'completed') AS all_completed,
  bool_or(status = 'pending')    AS has_pending
FROM orders
GROUP BY customer_name;

array_agg() -- Collect Values into Arrays

array_agg() gathers values from multiple rows into a PostgreSQL array:

SELECT
  department,
  array_agg(name) AS employees
FROM employees
GROUP BY department;
department employees
Engineering {Alice,Carlos,Diana}
Marketing {Bob,Eve}

With DISTINCT and ORDER BY

SELECT
  region,
  array_agg(DISTINCT payment_method ORDER BY payment_method) AS payment_methods_used
FROM sales
GROUP BY region;
region payment_methods_used
North {bank_transfer,cash,credit_card}
South {cash,credit_card}

Filtering NULL values

-- Exclude NULLs from the array
SELECT array_agg(name) FILTER (WHERE name IS NOT NULL)
FROM employees;

string_agg() -- Concatenate Values

string_agg() joins text values with a delimiter:

SELECT
  department,
  string_agg(name, ', ' ORDER BY name) AS employee_list
FROM employees
GROUP BY department;
department employee_list
Engineering Alice, Carlos, Diana
Marketing Bob, Eve

Building CSV or Readable Lists

-- Comma-separated list of unique regions
SELECT string_agg(DISTINCT region, ', ' ORDER BY region) AS regions
FROM sales;
-- Result: 'East, North, South, West'

-- Newline-separated for reports
SELECT string_agg(name || ': $' || salary::TEXT, E'\n' ORDER BY salary DESC)
FROM employees;

string_agg with FILTER

SELECT
  department,
  string_agg(name, ', ') FILTER (WHERE is_active) AS active_employees,
  string_agg(name, ', ') FILTER (WHERE NOT is_active) AS inactive_employees
FROM employees
GROUP BY department;

Combining Techniques

These techniques are most powerful when combined:

SELECT
  region,
  COUNT(*) AS total_sales,
  SUM(amount) AS total_revenue,
  ROUND(AVG(amount), 2) AS avg_sale,
  COUNT(*) FILTER (WHERE amount > 200) AS large_sales,
  ROUND(
    100.0 * COUNT(*) FILTER (WHERE payment_method = 'credit_card') / COUNT(*),
    1
  ) AS credit_card_pct,
  string_agg(DISTINCT payment_method, ', ' ORDER BY payment_method) AS methods_used,
  MIN(sale_date) AS first_sale,
  MAX(sale_date) AS last_sale
FROM sales
GROUP BY region
ORDER BY total_revenue DESC;

Pivot-Style Reports

Use FILTER to create a pivot table without a PIVOT keyword:

SELECT
  customer_name,
  COUNT(*) FILTER (WHERE status = 'completed') AS completed,
  COUNT(*) FILTER (WHERE status = 'pending')   AS pending,
  COUNT(*) FILTER (WHERE status = 'shipped')   AS shipped,
  COUNT(*) FILTER (WHERE status = 'cancelled') AS cancelled,
  SUM(total_price) AS total_spent
FROM orders
GROUP BY customer_name
ORDER BY total_spent DESC;

Try It Yourself

Run these queries against the sandbox tables:

  1. For each payment_method in sales, count how many sales had an amount over 100 and how many had an amount of 100 or less. Use FILTER.

  2. Create a summary of the orders table that shows, for each status, the count, total value, and a comma-separated list of customer names.

  3. For each region in sales, calculate what percentage of total revenue came from each payment method.

  4. Use array_agg() to get a list of all distinct products ordered by each customer in the orders table.

Summary

Technique Syntax Purpose
FILTER agg(x) FILTER (WHERE cond) Conditional aggregation (clean syntax)
CASE aggregation SUM(CASE WHEN ... THEN x END) Conditional aggregation (portable)
bool_and bool_and(expr) TRUE if all rows are TRUE
bool_or bool_or(expr) TRUE if any row is TRUE
array_agg array_agg(x ORDER BY y) Collect values into an array
string_agg string_agg(x, delim ORDER BY y) Concatenate values with a delimiter
DISTINCT in agg array_agg(DISTINCT x) Remove duplicates within aggregation
Take Quiz Next Lesson
SQL Editor Ctrl+Enter to run
Results
Run a query to see results here
Tables: