We can't find the internet
Attempting to reconnect
Something went wrong!
Hang in there while we get back on track
GROUP BY
In the previous lesson, you learned aggregate functions that summarize an entire table into a single row. But what if you want to summarize data per category? For example, the average salary per department, or the total revenue per customer? That is exactly what GROUP BY does.
How GROUP BY Works
GROUP BY divides your rows into groups based on one or more columns. Then, aggregate functions are applied to each group separately, producing one result row per group.
SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department;
Here is what happens step by step:
- PostgreSQL scans all rows in the
employeestable - It groups rows by their
departmentvalue (all "Engineering" rows together, all "Sales" rows together, etc.) - For each group, it applies
COUNT(*)to count how many rows are in that group - It returns one row per group
The GROUP BY Rule
When you use GROUP BY, there is an important rule to follow:
Every column in your
SELECTlist must either be in theGROUP BYclause or wrapped in an aggregate function.
This query is valid:
SELECT department, AVG(salary)
FROM employees
GROUP BY department;
This query would cause an error:
-- ERROR: column "name" must appear in GROUP BY clause
-- or be used in an aggregate function
SELECT department, name, AVG(salary)
FROM employees
GROUP BY department;
Why? Because each group has one department value but potentially many name values. PostgreSQL would not know which name to show for the group.
Common GROUP BY Patterns
Count per Category
SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department;
Sum per Category
SELECT customer_name, SUM(total_price) AS total_spent
FROM orders
GROUP BY customer_name;
Average per Category
SELECT department, ROUND(AVG(salary), 2) AS avg_salary
FROM employees
GROUP BY department;
Min and Max per Category
SELECT category,
MIN(price) AS cheapest,
MAX(price) AS most_expensive
FROM products
GROUP BY category;
Multiple Aggregates in One Query
You can use several aggregate functions together with GROUP BY:
SELECT department,
COUNT(*) AS num_employees,
ROUND(AVG(salary), 2) AS avg_salary,
MIN(salary) AS min_salary,
MAX(salary) AS max_salary,
SUM(salary) AS total_payroll
FROM employees
GROUP BY department;
This gives you a complete salary breakdown for each department in one query.
GROUP BY with WHERE
You can filter rows before they are grouped by using WHERE. The WHERE clause runs first, then GROUP BY groups the surviving rows:
SELECT department, COUNT(*) AS active_count
FROM employees
WHERE is_active = true
GROUP BY department;
This counts only active employees per department. Inactive employees are excluded before grouping happens.
Another example:
SELECT customer_name, COUNT(*) AS order_count
FROM orders
WHERE status = 'completed'
GROUP BY customer_name;
Grouping by Multiple Columns
You can group by more than one column. Each unique combination of values becomes its own group:
SELECT department, is_active, COUNT(*) AS num_employees
FROM employees
GROUP BY department, is_active;
This produces one row for each combination, such as "Engineering + active", "Engineering + inactive", "Sales + active", etc.
Here is another example with orders:
SELECT status, customer_name, COUNT(*) AS order_count
FROM orders
GROUP BY status, customer_name;
GROUP BY with ORDER BY
You can sort grouped results using ORDER BY. This is very common when you want to see the "top" or "bottom" groups:
SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department
ORDER BY employee_count DESC;
This lists departments from the most employees to the fewest.
SELECT customer_name, SUM(total_price) AS total_spent
FROM orders
GROUP BY customer_name
ORDER BY total_spent DESC;
This ranks customers by their total spending.
Execution Order
Understanding the order in which SQL clauses execute helps avoid confusion:
FROM-- Choose the tableWHERE-- Filter individual rowsGROUP BY-- Group the remaining rowsSELECT-- Choose columns and apply aggregatesORDER BY-- Sort the final result
This is why you cannot use an aggregate in a WHERE clause -- aggregation has not happened yet when WHERE runs. (You will learn how to filter groups in the next lesson on HAVING.)
Try It Yourself
- Find the number of products in each category.
SELECT category, COUNT(*) AS product_count
FROM products
GROUP BY category;
- Find the total quantity ordered for each product status.
SELECT status, SUM(quantity) AS total_quantity
FROM orders
GROUP BY status;
- Find the average price of products in each category, sorted from highest to lowest average price.
SELECT category, ROUND(AVG(price), 2) AS avg_price
FROM products
GROUP BY category
ORDER BY avg_price DESC;
- Count the number of orders each customer has placed and also show their total spending, sorted by total spending in descending order.
SELECT customer_name,
COUNT(*) AS order_count,
SUM(total_price) AS total_spent
FROM orders
GROUP BY customer_name
ORDER BY total_spent DESC;