Catastrophic Backtracking: The Regex That Kills Your Server
One regex with nested quantifiers can reduce your server to 100% CPU in milliseconds. This is how catastrophic backtracking works, how to spot it, and how to fix it.
On July 2, 2019, Cloudflare’s global network went down for 27 minutes. Not from a DDoS attack. Not from a hardware failure. From a single regular expression deployed in a WAF rule that caused catastrophic backtracking on every request, pinning CPU at 100% across all Cloudflare edge nodes simultaneously.
The regex contained .*.*=.*. Nested quantifiers. Ambiguous matching paths. Given certain inputs, the engine explored an exponential number of possibilities before failing.
This is what catastrophic backtracking looks like in production.
How regex engines work
Most production regex engines — Python’s re, JavaScript’s built-in regex, Java’s java.util.regex, .NET, Go’s old regexp/syntax — use a Non-deterministic Finite Automaton (NFA) with backtracking. The engine tries each possible path through the pattern. If a path fails, it backtracks and tries a different path.
For most patterns on most inputs this is fine — the engine quickly finds a match or determines there isn’t one. The problem arises when:
- The pattern has multiple ways to match the same prefix of the string, AND
- The overall match ultimately fails (or the match is at the end)
In that case, the engine exhausts every possible combination of paths before concluding “no match.” For certain pattern structures, the number of paths grows exponentially with input length.
The classic example
Pattern: (a+)+$
Input: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaab"
The outer + can repeat the group 1 time, 2 times, 3 times, … up to 29 times. For each choice of outer repetitions, the inner a+ can split the as in different ways. The $ anchor means the engine must confirm the whole string matches — and it doesn’t, because of the trailing b.
So the engine tries: outer loops 29 times, inner each loop 1 char → fails. Outer loops 28 times (inner splits differently) → fails. And so on. The number of paths is 2^(n-1) where n is the length of the a sequence.
For 30 as followed by b, that’s roughly 500 million paths to try. For 40 as, it’s 500 billion. The engine doesn’t finish — it saturates a CPU core until the process is killed or the request times out.
Test this right now in the Regex Tester: paste (a+)+$ as the pattern and aaaaaaaaaaaaaaaaaaaaab as the input. Watch the browser tab hang.
Vulnerable pattern structures
These are the patterns that commonly cause catastrophic backtracking:
Nested quantifiers
(a+)+
(a*)*
(a|a?)+
Any pattern where a quantified group contains another quantified expression can create exponential paths.
Alternation with common prefixes
(cat|catch)+$
(com|com\w+)+$
When alternation alternatives can match the same characters and the overall match fails, the engine tries both branches at every position.
Overlapping alternatives
(\d+|\w+)+$
\d+ and \w+ can match the same digits. Exponential path explosion when applied repeatedly.
The ReDoS pattern
The general form of a ReDoS-vulnerable regex: a pattern that is ambiguous — it can match the same input in multiple ways — combined with a suffix that forces the match to fail. The “evil input” is a long string that matches the prefix of the pattern but fails the suffix.
Real incidents
Cloudflare (2019): The WAF rule (?:(?:\"|'|\]|\}|\\|\d|(?:nan|infinity|true|false|null|undefined|symbol|math)|\|-|+)+[)];?((?:\s|-|~|!|{}||||+).(?:.=.)))— simplified, it contained..=.`. On certain request bodies this triggered exponential backtracking and consumed 100% CPU across all edge nodes. Revenue loss estimated at millions.
Stack Overflow (2016): A regex in the markdown parser ^[\s]+|[\s]+$ with a specific 20-character input caused 5 seconds of CPU time per request, effectively taking the site down during a traffic spike.
npm package ua-parser-js (2021): A user agent regex exhibited ReDoS under crafted strings, enabling attackers to DoS any Node.js service that parsed user agent strings.
How to detect vulnerable regexes
Manual analysis
Look for:
- Nested quantifiers:
(X+)+,(X*)*,(X+)* - Alternation with overlapping options inside a quantified group:
(a|ab)+ - Multiple quantified terms that can match the same characters:
(.+)*
If your pattern has any of these structures AND the overall match can fail, it is potentially vulnerable.
Automated tools
safe-regex (Node.js) — detects vulnerable patterns statically. Integrates into CI.
npx safe-regex '(a+)+'
# → VULNERABLE
vuln-regex-detector — more comprehensive static analysis.
rxxr2 — academic tool for ReDoS detection.
The Regex Tester shows match time per execution — if a match takes more than a few milliseconds on a short string, that’s a red flag worth investigating.
How to fix catastrophic backtracking
1. Rewrite to remove ambiguity
The goal: ensure the pattern has exactly one way to match any given string.
# Vulnerable
(a+)+
# Fixed — make the inner match possessive or atomic
(?>a+)+ # atomic group (no backtracking into the group)
a++ # possessive quantifier (no backtracking after matching a+)
Possessive quantifiers (a++, \w++) and atomic groups ((?>...)) are supported in Java, PHP (PCRE), .NET, and Ruby. They are NOT supported in JavaScript or Python’s re module.
2. Use an input length limit
If you can’t rewrite the pattern, cap input length before running the regex:
if (userInput.length > 1000) {
return { error: 'Input too long' };
}
const result = userInput.match(pattern);
This doesn’t fix the vulnerability, but it limits the blast radius. A 1,000-character input might cause milliseconds of backtracking; a 100,000-character input could cause seconds.
3. Use a linear-time engine
RE2 (Google’s regex library) uses a Deterministic Finite Automaton (DFA) that guarantees linear time — O(n) in input length — at the cost of not supporting backreferences or lookaheads. For most validation use cases, this trade-off is worth it.
- Python:
google-re2package - Node.js:
re2package (bindings to the C++ RE2 library) - Go: the
regexppackage uses RE2 semantics natively
// Node.js — drop-in RE2 replacement
const RE2 = require('re2');
const re = new RE2('(a+)+');
re.test('aaaaaaaaaaaaaab'); // Always terminates in linear time
4. Apply a timeout
As a last resort, run the regex in a worker with a timeout:
const { Worker, isMainThread, parentPort } = require('worker_threads');
// Spin up a worker, post the string, kill it after 100ms if no response
Not elegant, but it prevents a vulnerable pattern from pinning the main thread.
Summary
Catastrophic backtracking happens when a pattern can match a string in exponentially many ways, and the match ultimately fails. The structural warning signs are nested quantifiers and overlapping alternation. The fix — in order of preference — is to rewrite the pattern to remove ambiguity, switch to a linear-time engine like RE2, or impose an input length limit. The Regex Tester shows per-match timing, making slow patterns visible before they reach production.
Further reading
Related posts
- The 2026 Regex Cheatsheet (PCRE, JS, Python — Side by Side) — A dense reference for modern regex: anchors, character classes, quantifiers, loo…
- JavaScript Regex Flags — g, i, m, s, u, and v Explained — JavaScript regex flags change how patterns match. Learn when to use global /g, c…
- Regex Email Validation — Patterns, Edge Cases, and Best Practices — Email validation regex needs to balance strictness with real-world email formats…
Related tool
Test regular expressions with live match highlighting and explanation.
Written by Mian Ali Khalid. Part of the Dev Productivity pillar.