SQL Style Guide: The Things That Actually Matter
Keyword casing, indentation, naming conventions, CTEs vs subqueries — the practical style rules that make SQL readable and maintainable across a team.
SQL is the one language where the same query can be written a dozen different ways and produce identical results. That freedom is a footgun. Over time, an undisciplined codebase accumulates queries where you have to mentally parse the structure before you can even start thinking about the logic. Style rules aren’t aesthetic preferences — they’re the difference between a query you can debug at 2am and one that makes you guess.
This is the practical guide: the conventions that have real payoff, the debates worth having, and what automated formatters can and can’t do for you.
Keyword casing: UPPER vs lower
This is the oldest debate in SQL style. Two camps:
UPPER for keywords — SELECT, FROM, WHERE, JOIN. The traditional style, used in virtually every SQL textbook and the official Mozilla SQL Style Guide. It visually separates the SQL machinery from your column and table names, which are typically lowercase.
Lowercase everywhere — preferred by some modern teams (including parts of the GitLab Data Team SQL Style Guide). Arguments: case-insensitive anyway, less shouty, consistent with the rest of your codebase.
The honest answer: either is fine, but UPPER keywords with lowercase identifiers is the more universally readable choice. The visual contrast lets your eye immediately locate the clause boundaries. Scan this:
-- Hard to scan
select u.id, u.email, count(o.id) as order_count
from user u left join order o on u.id = o.user_id
where u.created_at > '2024-01-01'
group by u.id, u.email
-- Clause boundaries are obvious
SELECT
u.id,
u.email,
COUNT(o.id) AS order_count
FROM user AS u
LEFT JOIN orders AS o ON u.id = o.user_id
WHERE u.created_at > '2024-01-01'
GROUP BY u.id, u.email
The formatted version takes up more vertical space. That’s a feature, not a problem.
Whatever you pick: enforce it consistently. Mixed casing (selecting vs SELECT in the same file) is the worst of both worlds. This is exactly what an automated formatter handles — paste in a query with inconsistent keyword casing, get back normalized output. The SQL Formatter handles this normalization automatically.
Indentation and clause layout
There are two dominant indentation styles: river style and leading comma. You will encounter both, and both camps argue with conviction.
River style (right-align keywords)
SELECT u.id,
u.email,
COUNT(o.id) AS order_count
FROM users AS u
LEFT JOIN orders AS o ON u.id = o.user_id
WHERE u.created_at > '2024-01-01'
GROUP BY u.id, u.email
Keywords right-aligned to a fixed column, data left-aligned after them. This creates the “river” of whitespace. It’s visually elegant and used in Oracle shops. The downside: adding a long clause name (CROSS JOIN, GROUP BY) forces you to re-align everything else.
Left-aligned keywords, indented body
SELECT
u.id,
u.email,
COUNT(o.id) AS order_count
FROM users AS u
LEFT JOIN orders AS o
ON u.id = o.user_id
WHERE u.created_at > '2024-01-01'
GROUP BY
u.id,
u.email
Each SQL clause starts at column 0, content indented 4 spaces. This is the style used by the Mozilla and GitLab guides, and it’s what most formatters default to. Easier to maintain, easier to diff in version control, no fiddly column alignment. Use this unless your codebase already commits to river style.
Leading comma vs trailing comma
Trailing comma (standard in most languages):
SELECT
u.id,
u.email,
COUNT(o.id) AS order_count
Leading comma:
SELECT
u.id
, u.email
, COUNT(o.id) AS order_count
Leading comma is the GitLab style. The argument: commenting out the last column doesn’t break the query (no trailing comma on a removed line). The counterargument: it looks weird, your IDE likely complains, and most formatters default to trailing. Unless your team already uses leading commas, trailing commas are the default recommendation.
Multi-table JOINs
One JOIN condition per line, ON clause indented under the JOIN:
SELECT
o.id,
o.total,
u.email,
p.name AS product_name
FROM orders AS o
INNER JOIN users AS u
ON o.user_id = u.id
INNER JOIN order_items AS oi
ON o.id = oi.order_id
INNER JOIN products AS p
ON oi.product_id = p.id
WHERE o.created_at > '2024-01-01'
Never put the ON clause on the same line as JOIN when there are three or more tables — it becomes a horizontal scroll nightmare.
Naming conventions
Use snake_case for tables and columns
user_id, created_at, order_total. Always. camelCase (userId, createdAt) is a JavaScript convention, not SQL. PascalCase (UserId) is common in SQL Server shops but creates friction when queries leave the database layer. snake_case is universal.
Reserved words that will bite you
Never name a table or column with a reserved word. The classic offenders:
-- These will cause parser errors or require quoting in most databases
user -- reserved in PostgreSQL, MySQL, SQL Server
order -- reserved in every major database
name -- reserved in some parsers
date -- reserved in Oracle, MySQL
value -- reserved in Oracle
status -- not reserved but overloaded
Common fix: prefix with a domain noun. user becomes account or app_user. order becomes purchase or sales_order. name becomes full_name, display_name, or product_name. This is annoying to discover after your schema is live — check your column names before you ship a migration.
Plural vs singular table names
This one genuinely splits engineering teams.
Plural (users, orders, products): the table is a collection of rows, collections are plural, end of story. Most Rails apps default to plural via ActiveRecord conventions.
Singular (user, order, product): the table represents the entity type, not the collection. When you write FROM user WHERE id = 1, you’re querying the user concept. Some ORMs (notably SQLAlchemy with explicit table names, Prisma) align better with singular. SQL joins also read more naturally: JOIN user ON order.user_id = user.id reads like a sentence.
Pick one and enforce it. Mixed schemas (users table with a product join) are a cognitive tax on every developer who touches the codebase. The Mozilla SQL Style Guide specifies lowercase snake_case and doesn’t mandate plural/singular — pick your convention and document it in your team’s style guide.
Explicit column names, never SELECT *
-- Never ship this to production
SELECT * FROM orders WHERE user_id = 1;
-- Always explicit
SELECT id, user_id, total, created_at FROM orders WHERE user_id = 1;
SELECT * in application code is a correctness hazard — a new column in the schema can break downstream deserialization. It’s also a performance hazard on wide tables with text or jsonb columns you don’t need.
Alias conventions
Use AS explicitly. Some dialects allow dropping it; don’t:
-- Implicit AS — legal in most dialects but easy to misread
SELECT COUNT(id) total FROM orders;
-- Explicit AS — unambiguous
SELECT COUNT(id) AS total FROM orders;
For table aliases, use meaningful short names, not single letters. a and b tell you nothing. u for users, o for orders, oi for order_items — still short, but readable without scrolling back to the FROM clause.
-- Bad: what is b?
SELECT a.id, b.name
FROM orders a
JOIN products b ON a.product_id = b.id;
-- Good: readable at a glance
SELECT o.id, p.name
FROM orders AS o
JOIN products AS p ON o.product_id = p.id;
For CTEs (see below), use full descriptive names. recent_orders, active_users, monthly_totals — the name should make the CTE’s purpose clear without reading its body.
WHERE clause formatting
Multi-condition WHERE clauses: one condition per line, boolean operators at the start:
WHERE
u.created_at > '2024-01-01'
AND u.status = 'active'
AND u.country_code IN ('US', 'CA', 'GB')
AND o.total > 100
AND/OR at the start of the line (not the end) mirrors the way you read logic: each line is a complete condition, and the operator tells you how it connects to the previous one. Ending lines with AND means the operator can get lost when skimming.
For complex boolean logic, group with parentheses even when precedence rules make them optional:
WHERE
(u.status = 'active' OR u.status = 'trial')
AND u.created_at > '2024-01-01'
AND binds tighter than OR — but writing out the parentheses removes ambiguity for anyone reading the query later, including you.
CTEs vs subqueries
Common Table Expressions (CTEs) introduced with WITH beat nested subqueries for readability at essentially every query length above trivial.
Before — nested subquery:
SELECT
u.email,
order_summary.total_orders,
order_summary.lifetime_value
FROM users AS u
INNER JOIN (
SELECT
user_id,
COUNT(id) AS total_orders,
SUM(total) AS lifetime_value
FROM orders
WHERE status = 'completed'
GROUP BY user_id
) AS order_summary ON u.id = order_summary.user_id
WHERE order_summary.lifetime_value > 500;
After — CTE:
WITH completed_orders AS (
SELECT
user_id,
COUNT(id) AS total_orders,
SUM(total) AS lifetime_value
FROM orders
WHERE status = 'completed'
GROUP BY user_id
)
SELECT
u.email,
co.total_orders,
co.lifetime_value
FROM users AS u
INNER JOIN completed_orders AS co ON u.id = co.user_id
WHERE co.lifetime_value > 500;
The CTE version is longer by one line but dramatically easier to read. The subquery logic is named, separated, and readable on its own terms before you get to the outer query. When you have three levels of nesting, CTEs go from “nice to have” to “mandatory for sanity.”
CTEs also compose cleanly:
WITH completed_orders AS (
SELECT user_id, COUNT(id) AS total_orders, SUM(total) AS lifetime_value
FROM orders
WHERE status = 'completed'
GROUP BY user_id
),
high_value_users AS (
SELECT user_id
FROM completed_orders
WHERE lifetime_value > 500
)
SELECT u.email, u.created_at
FROM users AS u
INNER JOIN high_value_users AS hvu ON u.id = hvu.user_id;
Each CTE is a named step. The final SELECT reads like English.
Caveat: CTEs in MySQL (before 8.0) and some older PostgreSQL versions don’t optimize through them the same way subqueries do — the planner may materialize a CTE into a temp table regardless of whether the outer query filters it down. In modern PostgreSQL (12+) and MySQL (8.0+), this is largely a non-issue. If you’re on an older version and have a performance-critical query, benchmark both forms.
What SQL formatters actually do — and what they can’t
A SQL formatter handles the mechanical layer: keyword case normalization, consistent indentation, clause alignment, whitespace cleanup. Paste in a 200-line query written by three different people across six months and get back something that at least has consistent structure.
The SQL Formatter handles keyword normalization and indentation for the common dialects (PostgreSQL, MySQL, T-SQL, SQLite).
What formatters cannot do:
- Rename bad columns.
SELECT a, b, c FROM tformatted is stillSELECT a, b, c FROM t. Column names are semantics; formatting is syntax. - Fix structural problems. A 12-level nested subquery formatted is a 12-level nested subquery with nice indentation. The fix is refactoring to CTEs.
- Catch logic errors. The formatter doesn’t know that your WHERE clause is backwards or that you’re joining on the wrong column.
- Enforce naming conventions. A formatter will cheerfully indent a query with
tbl_UsersandorderIDcolumns.
Formatters are a floor, not a ceiling. They eliminate the mechanical noise so code review can focus on the actual logic.
Further reading
- Comparing JSON Structurally — structural comparison concepts that transfer to thinking about query output
- YAML vs JSON: Which to Use When — related format discipline for config and data files
- Mozilla SQL Style Guide — the full spec, well-reasoned
- GitLab Data Team SQL Style Guide — production-tested, leading-comma style
- SQL Formatter tool — paste and normalize any query instantly
Related posts
- SQL CTEs (Common Table Expressions) — WITH Clause Guide — Common Table Expressions (CTEs) use the WITH clause to create named temporary re…
- SQL Formatter Online — Format and Beautify SQL Queries — A SQL formatter adds consistent indentation and line breaks to SQL queries, maki…
- SQL GROUP BY and HAVING — Aggregate Data with Filters — GROUP BY groups rows into summary rows; HAVING filters the groups. Learn how to …
- Comparing JSON Structurally (Not Just as Strings) — Two JSON documents can be byte-different and semantically identical. Or byte-ide…
- YAML vs JSON: Which to Use When (and Why It Matters) — JSON is for machines, YAML is for humans, and choosing the wrong one quietly cos…
Related tool
Format and beautify SQL queries. 12 dialect options (PostgreSQL, MySQL, SQLite, MSSQL, BigQuery, Snowflake, and more). Keyword casing control.
Written by Mian Ali Khalid. Part of the Dev Productivity pillar.