X Xerobit

SQL GROUP BY and HAVING — Aggregate Data with Filters

GROUP BY groups rows into summary rows; HAVING filters the groups. Learn how to use GROUP BY with COUNT, SUM, AVG, MIN, MAX, when to use HAVING vs WHERE, and common pitfalls.

Mian Ali Khalid · · 4 min read
Use the tool
SQL Formatter
Format and beautify SQL queries. 12 dialect options (PostgreSQL, MySQL, SQLite, MSSQL, BigQuery, Snowflake, and more). Keyword casing control.
Open SQL Formatter →

GROUP BY collapses multiple rows into summary groups. HAVING filters those groups — it’s the WHERE clause for aggregate results.

Format your SQL with the SQL Formatter.

Basic GROUP BY

-- Count orders per customer:
SELECT
    customer_id,
    COUNT(*) AS order_count
FROM orders
GROUP BY customer_id;

-- Group by multiple columns:
SELECT
    year,
    month,
    category,
    SUM(revenue) AS total_revenue
FROM sales
GROUP BY year, month, category
ORDER BY year, month, category;

-- Group by expression:
SELECT
    DATE_TRUNC('month', created_at) AS month,
    COUNT(*) AS signups
FROM users
GROUP BY DATE_TRUNC('month', created_at)
ORDER BY month;

Aggregate functions

SELECT
    department,
    COUNT(*)              AS headcount,
    COUNT(DISTINCT title) AS unique_titles,
    AVG(salary)           AS avg_salary,
    MIN(salary)           AS min_salary,
    MAX(salary)           AS max_salary,
    SUM(salary)           AS payroll,
    ROUND(AVG(salary), 2) AS avg_salary_rounded
FROM employees
WHERE status = 'active'          -- WHERE filters BEFORE grouping
GROUP BY department
ORDER BY avg_salary DESC;

HAVING: filter groups

-- Customers with more than 5 orders:
SELECT
    customer_id,
    COUNT(*) AS order_count
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5;

-- Departments where average salary > 80,000:
SELECT
    department,
    ROUND(AVG(salary), 0) AS avg_salary
FROM employees
WHERE status = 'active'
GROUP BY department
HAVING AVG(salary) > 80000
ORDER BY avg_salary DESC;

WHERE vs HAVING

-- WHERE filters rows BEFORE aggregation (faster — reduces rows being aggregated):
SELECT department, COUNT(*) AS count
FROM employees
WHERE hire_date > '2020-01-01'  -- Filter individual rows first
GROUP BY department;

-- HAVING filters groups AFTER aggregation:
SELECT department, COUNT(*) AS count
FROM employees
GROUP BY department
HAVING COUNT(*) > 10;           -- Filter by aggregate result

-- Both together:
SELECT department, COUNT(*) AS active_recent_count
FROM employees
WHERE status = 'active'
  AND hire_date > '2020-01-01'  -- Row-level filter (use WHERE)
GROUP BY department
HAVING COUNT(*) >= 5;            -- Group-level filter (use HAVING)

Rule: If the filter doesn’t involve an aggregate function, put it in WHERE. If it does, use HAVING.

Common GROUP BY patterns

Top N per group

-- Top 3 products by revenue per category:
WITH ranked AS (
    SELECT
        category,
        product_name,
        SUM(revenue) AS total_revenue,
        RANK() OVER (PARTITION BY category ORDER BY SUM(revenue) DESC) AS rank
    FROM sales
    GROUP BY category, product_name
)
SELECT category, product_name, total_revenue
FROM ranked
WHERE rank <= 3;

Running totals and group stats

-- Monthly revenue with cumulative total:
SELECT
    DATE_TRUNC('month', order_date)::date AS month,
    SUM(amount)                            AS monthly_revenue,
    SUM(SUM(amount)) OVER (
        ORDER BY DATE_TRUNC('month', order_date)
    )                                      AS cumulative_revenue
FROM orders
GROUP BY DATE_TRUNC('month', order_date)
ORDER BY month;

Pivot-style aggregation with FILTER

-- Count by status in one query (PostgreSQL):
SELECT
    COUNT(*) FILTER (WHERE status = 'active')   AS active_count,
    COUNT(*) FILTER (WHERE status = 'inactive') AS inactive_count,
    COUNT(*) FILTER (WHERE status = 'pending')  AS pending_count
FROM users;

-- Equivalent with CASE:
SELECT
    SUM(CASE WHEN status = 'active'   THEN 1 ELSE 0 END) AS active,
    SUM(CASE WHEN status = 'inactive' THEN 1 ELSE 0 END) AS inactive,
    SUM(CASE WHEN status = 'pending'  THEN 1 ELSE 0 END) AS pending
FROM users;

Common pitfalls

-- ❌ Selecting a non-aggregated column not in GROUP BY:
SELECT department, name, COUNT(*) FROM employees GROUP BY department;
-- Error: 'name' must appear in GROUP BY or be used in an aggregate

-- ✅ Either add to GROUP BY:
SELECT department, name, COUNT(*) FROM employees GROUP BY department, name;

-- ✅ Or use an aggregate function:
SELECT department, MAX(name) AS sample_name, COUNT(*) FROM employees GROUP BY department;

-- ❌ Using HAVING for a non-aggregate filter (works but slow):
SELECT department FROM employees GROUP BY department HAVING department = 'Engineering';

-- ✅ Use WHERE instead:
SELECT department FROM employees WHERE department = 'Engineering' GROUP BY department;

Related posts

Related tool

SQL Formatter

Format and beautify SQL queries. 12 dialect options (PostgreSQL, MySQL, SQLite, MSSQL, BigQuery, Snowflake, and more). Keyword casing control.

Written by Mian Ali Khalid. Part of the Dev Productivity pillar.