X Xerobit

SQL Indexes Guide — B-Tree, Partial, Composite, and When Not to Index

Database indexes speed up queries but slow down writes. Learn how B-tree indexes work, when to add composite and partial indexes, how to identify missing indexes with EXPLAIN,...

Mian Ali Khalid · · 5 min read
Use the tool
SQL Formatter
Format and beautify SQL queries. 12 dialect options (PostgreSQL, MySQL, SQLite, MSSQL, BigQuery, Snowflake, and more). Keyword casing control.
Open SQL Formatter →

Indexes trade write performance for read performance. Adding the right indexes can reduce a 10-second query to 1ms. Adding the wrong ones slows inserts and wastes storage.

Format and analyze your SQL with the SQL Formatter.

How B-tree indexes work

-- Without index: PostgreSQL scans every row (seq scan)
SELECT * FROM users WHERE email = 'alice@example.com';
-- Reads 1,000,000 rows to find 1 match

-- With B-tree index: binary search (O log n)
CREATE INDEX idx_users_email ON users (email);
-- Now reads ~20 rows to find 1 match in 1M rows

A B-tree index is a sorted tree structure. For WHERE email = 'alice@example.com', the database walks the tree (20 steps for 1M rows) instead of scanning every row.

When indexes help

-- ✅ Equality lookups:
WHERE email = 'alice@example.com'
WHERE user_id = 12345

-- ✅ Range queries:
WHERE created_at > '2026-01-01'
WHERE price BETWEEN 10 AND 100

-- ✅ ORDER BY (avoids sort step):
ORDER BY created_at DESC

-- ✅ JOIN columns:
SELECT * FROM orders o JOIN users u ON o.user_id = u.id
-- Index on orders.user_id speeds up the join

-- ✅ High cardinality columns (many distinct values):
-- email: ~100% unique → excellent for indexing
-- boolean: 2 values → poor candidate

When indexes DON’T help

-- ❌ Low-cardinality columns (< 1% selectivity):
WHERE is_active = true     -- 99% of rows are active
WHERE status IN ('published', 'draft')  -- Most rows match

-- ❌ Function on the indexed column:
WHERE LOWER(email) = 'alice@example.com'  -- Index not used!
-- Fix: CREATE INDEX ON users (LOWER(email));

-- ❌ LIKE with leading wildcard:
WHERE name LIKE '%smith%'   -- Can't use B-tree
-- Fix: Use full-text search for this pattern

-- ❌ Small tables (< 1000 rows):
-- Sequential scan is faster — optimizer may ignore index

-- ❌ Very high write/read ratio:
-- Index maintenance on every INSERT/UPDATE/DELETE

EXPLAIN ANALYZE — diagnose query plans

-- PostgreSQL: show query plan with timing:
EXPLAIN ANALYZE SELECT * FROM orders WHERE user_id = 123;

-- Look for:
-- "Seq Scan" → no index used (often a problem)
-- "Index Scan" → good, using an index
-- "Bitmap Heap Scan" → reading multiple index hits, then heap
-- "Hash Join" / "Nested Loop" → different join strategies
-- "Sort" → missing index to support ORDER BY

-- Rows vs actual rows — large differences indicate stale statistics:
-- Run: ANALYZE users;

Composite indexes

-- Index on multiple columns — order matters!
CREATE INDEX idx_orders_user_date ON orders (user_id, created_at);

-- This index helps:
WHERE user_id = 123                          -- ✅ Uses first column
WHERE user_id = 123 AND created_at > '2026-01-01'  -- ✅ Uses both columns
ORDER BY user_id, created_at                 -- ✅ Uses index order

-- This index doesn't help:
WHERE created_at > '2026-01-01'             -- ❌ Second column only
WHERE created_at > '2026-01-01' AND user_id = 123  -- ❌ Wrong order

-- Rule: order columns by selectivity (most selective first),
-- or by how they're used in your most important queries.

Partial indexes

-- Index only a subset of rows (PostgreSQL):
-- Index only active users — saves space, faster maintenance:
CREATE INDEX idx_users_email_active
    ON users (email)
    WHERE is_active = true;

-- This index only helps when the query includes: is_active = true
SELECT * FROM users WHERE email = 'alice@example.com' AND is_active = true;

-- Partial index for non-null values:
CREATE INDEX idx_orders_coupon
    ON orders (coupon_code)
    WHERE coupon_code IS NOT NULL;
-- 90% of orders have no coupon — index stays small

Covering indexes

-- Include all columns needed by the query — eliminates heap lookup:
CREATE INDEX idx_users_email_covering
    ON users (email)
    INCLUDE (id, name, created_at);

-- This query is satisfied entirely by the index (no table access):
SELECT id, name, created_at FROM users WHERE email = 'alice@example.com';
-- PostgreSQL: "Index Only Scan" — fastest possible

Index maintenance

-- Check index bloat (PostgreSQL):
SELECT
    schemaname,
    tablename,
    indexname,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC;

-- Rebuild bloated index:
REINDEX INDEX idx_users_email;

-- Find unused indexes:
SELECT indexname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0;
-- Indexes never used since last stats reset — candidates for removal

Related posts

Related tool

SQL Formatter

Format and beautify SQL queries. 12 dialect options (PostgreSQL, MySQL, SQLite, MSSQL, BigQuery, Snowflake, and more). Keyword casing control.

Written by Mian Ali Khalid. Part of the Dev Productivity pillar.