URL Encoding: The 7 Bugs That Break Your API
Every API has at least one URL-encoding bug. Here are the seven I see most, with the symptom, the root cause, and the fix for each.
URL encoding is one of those topics every developer thinks they understand until production breaks at 2 AM. After enough debugging sessions, you start recognizing the same seven bugs over and over. Here’s the field guide — symptom first, then root cause, then fix.
For the underlying spec, see Percent Encoding and RFC 3986 Explained.
Bug #1: Double-encoding (the %2520 mystery)
Symptom: Your URLs in logs show %2520 where you expected %20. Search queries fail because %20 is being treated as literal text.
Example:
Expected: /search?q=hello%20world
Logged: /search?q=hello%2520world
Root cause: The URL is being encoded twice. The first encoding turned space → %20. The second encoding saw % and turned it into %25, leaving %2520.
Where it happens:
- Frontend encodes user input, then a framework re-encodes the whole URL.
- A redirect chain where each redirect re-percent-encodes the target URL.
- A reverse proxy that decodes/re-encodes incorrectly.
- Manually concatenating an already-encoded value into another encoded URL.
Fix: Find the second encoder and skip it. Trace the request through every layer. The pattern: search your codebase for encodeURI and encodeURIComponent calls, identify which one is acting on already-encoded input, and remove that call.
Quick diagnosis: if you see %25xx patterns in URLs (where xx is two hex digits), you’re double-encoding. The %25 is “literal percent.”
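The failure, and a quick diagnostic, can be reproduced in a few lines of plain JavaScript. The `looksDoubleEncoded` helper is a heuristic I'm sketching here, not a library function: a decode that changes the string but still leaves %-sequences behind is a strong hint (not proof) that the value was encoded twice.

```javascript
// One encode: space → %20. A second encode sees the % and produces %25.
const once = encodeURIComponent("hello world");   // "hello%20world"
const twice = encodeURIComponent(once);           // "hello%2520world"

// Heuristic check: decoding a double-encoded value yields a *different*
// string that still contains percent-escapes.
function looksDoubleEncoded(value) {
  const decoded = decodeURIComponent(value);
  return decoded !== value && /%[0-9A-Fa-f]{2}/.test(decoded);
}

looksDoubleEncoded("hello%2520world"); // true
looksDoubleEncoded("hello%20world");   // false
```

Note the heuristic can false-positive if a user legitimately typed a percent-escape, so treat it as a debugging aid, not validation.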
Bug #2: Using encodeURI for query values
Symptom: A query parameter contains characters that look fine, but the server parses the URL with extra parameters or wrong values.
Example:
// 🚫 Wrong
const userInput = "name=Ada&role=admin";
const url = `/api/lookup?q=${encodeURI(userInput)}`;
// → /api/lookup?q=name=Ada&role=admin
// Server parses: { q: "name=Ada", role: "admin" } — TWO params!
Root cause: encodeURI preserves URL syntax characters (&, =). When user input contains those, they’re left unencoded and get parsed as if they were query syntax.
Fix: Use encodeURIComponent for values. It encodes everything that has URL-syntax meaning.
// ✅ Correct
const url = `/api/lookup?q=${encodeURIComponent(userInput)}`;
// → /api/lookup?q=name%3DAda%26role%3Dadmin
// Server parses: { q: "name=Ada&role=admin" } — ONE param
Better: Use URLSearchParams and let it handle the encoding:
const params = new URLSearchParams({ q: userInput });
const url = `/api/lookup?${params}`;
The mnemonic: encodeURI is for whole URLs. encodeURIComponent is for individual URL components, i.e. the values you place inside a URL. When in doubt, reach for encodeURIComponent.
Bug #3: Plus sign confusion (space vs literal)
Symptom: Your search query "C++" arrives server-side as "C  " (a C followed by two spaces). Or your form submission for "1+2" gets stored as "1 2".
Root cause: + has two meanings in URLs depending on context:
- In `application/x-www-form-urlencoded` (form bodies, and by convention most query strings): `+` is a space.
- Everywhere else (path segments, strict RFC 3986): `+` is a literal plus.
The failure modes mirror each other. If your code encodes spaces as `+` (what `URLSearchParams` produces) but the consumer parses strictly per RFC 3986, the `+` survives as a literal plus. And if a raw, unencoded `+` reaches a form-style parser (say, "C++" concatenated into a query by hand), the parser decodes it to a space.
Fix: Match the encoding to the consumer:
| Consumer | Use |
|---|---|
| Reading a typical query string | + = space (form-encoded convention) |
| Reading a URL path | + = literal + |
| Sending to an HTML form-style endpoint | Use URLSearchParams (produces form encoding) |
| Sending to an API expecting RFC 3986 | Use encodeURIComponent for each value |
If you need a literal + in form-encoded query, encode it explicitly: %2B.
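The two conventions can be compared side by side in plain JavaScript, using nothing beyond the standard `URLSearchParams` and `encodeURIComponent`:

```javascript
// Form encoding (URLSearchParams): space → "+", literal "+" → "%2B"
const form = new URLSearchParams({ q: "C++ and spaces" }).toString();
// → "q=C%2B%2B+and+spaces"

// RFC 3986-style (encodeURIComponent): space → "%20", literal "+" → "%2B"
const strict = "q=" + encodeURIComponent("C++ and spaces");
// → "q=C%2B%2B%20and%20spaces"
```

Both forms round-trip safely through a matching decoder; garbage appears only when the producer and consumer disagree.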
Bug #4: Trailing-slash inconsistency
Symptom: /api/users/42 returns 200; /api/users/42/ returns 404 (or vice versa). Or analytics shows them as separate URLs and you can’t aggregate.
Root cause: A trailing slash changes the URL's identity. RFC 3986 treats /foo and /foo/ as distinct URIs and leaves their relationship entirely to the server.
Most frameworks have an opinion:
- Express: treats `/foo` and `/foo/` as the same route by default (strict routing off); Fastify distinguishes them unless `ignoreTrailingSlash` is enabled.
- Some Apache rewrite rules: redirect one form to the other.
- Static site generators: `/foo/` becomes a directory with `index.html`; `/foo` would need a redirect.
Fix:
- Pick a convention and enforce it everywhere. Either always with trailing slashes or never.
- Add a 301 redirect from the non-canonical form to the canonical form.
- Set the canonical URL in your
<link rel="canonical">tags. - Reference the canonical form in internal links.
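The redirect step can be sketched as a small framework-agnostic helper. `trailingSlashRedirect` is a hypothetical function of my own naming, assuming a "no trailing slash" convention; it returns the 301 target, or null when the path is already canonical:

```javascript
// Canonicalize to "no trailing slash" (the root "/" stays untouched).
// Returns the 301 redirect target, or null if already canonical.
function trailingSlashRedirect(pathname) {
  if (pathname.length > 1 && pathname.endsWith("/")) {
    return pathname.replace(/\/+$/, "") || "/";
  }
  return null;
}

trailingSlashRedirect("/api/users/42/"); // "/api/users/42" → send 301
trailingSlashRedirect("/api/users/42");  // null → serve normally
trailingSlashRedirect("/");              // null → root is canonical
```

In an Express-style app you would call this in a middleware before routing and issue `res.redirect(301, target)` when it returns non-null.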
This isn’t strictly an encoding bug, but it surfaces in URL handling and is the same shape of problem.
Bug #5: Fragment encoding mismatch
Symptom: Deep links that carry route state in the #fragment break in some clients but work in others.
Example:
https://app.example.com/page#section?filter=active
The ? after # is part of the fragment, not a query separator. But some parsers treat # as a query stop and try to parse the rest as a query string.
Root cause: The first # ends the parseable part of the URL. Everything after it is the fragment: the browser doesn't parse it further and never sends it to the server. But hash-based SPA routers overload the fragment to carry route state, including query-like syntax.
Fix:
- For SPA hash routing, use a clear convention like `#!` or `#/` to signal "this is a route, not a static fragment."
- Always `encodeURIComponent` user input that goes into the fragment.
- Don't rely on fragment-as-query parsing; it's not standard.
- Modern SPAs should prefer the History API (path-based routing) over hash routing.
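You can see the parser stop at `#` with the standard `URL` object; if you deliberately pack query-like state into the hash, you have to split it out yourself (the `hashQuery` parsing below is a convention, not anything the platform does for you):

```javascript
const u = new URL("https://app.example.com/page#section?filter=active");
u.pathname; // "/page"
u.search;   // "" — the ? lives inside the fragment, so there is no query
u.hash;     // "#section?filter=active"

// Hand-rolled hash-state parsing: everything after the first "?" in the hash.
const hashQuery = new URLSearchParams(u.hash.split("?")[1] ?? "");
hashQuery.get("filter"); // "active"
```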
Bug #6: Multi-byte UTF-8 character mishandling
Symptom: International characters in URLs render as `Ã©` instead of `é`, or as literal `?` characters.
Root cause: A layer in the chain decoded the URL as Latin-1 / Windows-1252 instead of UTF-8. Each UTF-8 multi-byte sequence got interpreted as multiple Latin-1 characters.
Example: `é` is the UTF-8 byte pair `0xC3 0xA9`. Decoded as Latin-1, those bytes become `Ã©`, two visible characters.
Where it happens:
- Old PHP servers without the `mbstring` extension.
- A frontend that encodes text as UTF-8 but a backend that reads it as Latin-1.
- File-system operations that use the OS default encoding (historically not UTF-8 on Windows).
- Database connections that don't specify a UTF-8 charset.
Fix:
- Standardize on UTF-8 everywhere: the HTTP `Content-Type` charset, database connection charset, file-reading code, file-system encoding.
- Validate at every boundary. If the input should be UTF-8 but isn't, reject it loudly rather than silently mojibake-ing.
- For URLs specifically: ensure your server sees `Content-Type: ...; charset=utf-8` on POST bodies and decodes the path and query as UTF-8.
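The misdecode is easy to replay in Node (this uses Node's `Buffer`, so it won't run in a browser): encode `é` as UTF-8 bytes, then deliberately decode those bytes as Latin-1.

```javascript
// "é" as UTF-8 is the two bytes 0xC3 0xA9.
const utf8Bytes = Buffer.from("é", "utf8");     // <Buffer c3 a9>

// Decoding each byte as a Latin-1 character reproduces the mojibake.
const mojibake = utf8Bytes.toString("latin1");  // "Ã©"

// Round-tripping with the correct charset recovers the original.
const fixed = Buffer.from(mojibake, "latin1").toString("utf8"); // "é"
```

The last line also shows why one-off mojibake is sometimes recoverable: as long as no layer dropped bytes, re-encoding as Latin-1 and re-decoding as UTF-8 undoes the damage.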
Bug #7: Inconsistent strict vs lenient parsing
Symptom: Some URLs work in your tests but break in production, or vice versa. Same URL behaves differently in different parts of the system.
Root cause: “Strict” RFC 3986 parsers reject characters that “lenient” parsers (most browsers) accept. JavaScript’s URL object follows the WHATWG URL Standard, which is more forgiving than RFC 3986. Other runtimes, notably Java’s java.net.URI, parse much closer to strict RFC 3986.
Example: a URL with a raw space throws URISyntaxException from Java’s java.net.URI, while the browser’s new URL() quietly repairs it to %20.
Fix:
- Always run user input through `encodeURIComponent` (or your language's equivalent) before assembling URLs.
- Don't rely on lenient parsing as a feature; code as if every parser is strict.
- Add input validation to reject out-of-spec characters before they get to URL assembly.
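The lenient side of the mismatch is observable directly in JavaScript: the WHATWG parser accepts and repairs a raw space, so the same string that a strict parser rejects sails through here.

```javascript
// WHATWG URL (browsers, Node) is lenient: it repairs the raw space.
const lenient = new URL("https://example.com/a b");
lenient.pathname; // "/a%20b"

// Defensive version: encode every untrusted segment yourself, so even a
// strict RFC 3986 parser downstream accepts the assembled URL.
const segment = encodeURIComponent("a b"); // "a%20b"
const assembled = new URL(`/${segment}`, "https://example.com");
```

Relying on the repair step is exactly the "lenient parsing as a feature" trap: the encoded form works everywhere, the raw form only works where the parser happens to forgive it.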
A protective workflow
Whenever I’m building URL handling, I follow this:
// 1. Treat user input as raw, untrusted data
const raw = userInput;
// 2. If assembling strings by hand, encode at the URL boundary, never deeper
const safe = encodeURIComponent(raw);
// 3. Use URL-aware tools to assemble
const params = new URLSearchParams({ q: raw }); // URLSearchParams encodes for you
const url = new URL('/api/search', baseUrl);
url.search = params.toString();
// 4. Convert to string only at the last moment
fetch(url.toString());
Three rules:
- Never manually concatenate URL pieces with `+`. Use `URL` and `URLSearchParams`.
- Encode at the boundary. Don't spread encode/decode calls through your codebase.
- Pick one convention for trailing slashes, query parameter ordering, and fragment usage. Document it.
Quick-fix lookup table
| Symptom | Likely bug | Fix |
|---|---|---|
| `%2520` in URLs | Double encoding | Find and remove the duplicate encoding call |
| Query splits into multiple params | `encodeURI` used for a value | Switch to `encodeURIComponent` |
| `+` shows as literal `+` | Form vs URL encoding mismatch | Match the encoder to the consumer; encode literal `+` as `%2B` |
| `Ã©` instead of `é` | UTF-8 / Latin-1 mismatch | Force UTF-8 in Content-Type and DB |
| 404 with trailing slash | Inconsistent slash convention | Add 301 redirect, pick one canonical |
| Hash-route + query confusion | Parser treating `#` content as query | Encode user input in fragment, use a proper SPA router |
| URL works in browser, fails on server | Lenient vs strict parsing | Always `encodeURIComponent` user input |
Bottom line
URL encoding is consistently one of the most-debugged topics in web dev because the rules are simple but the boundaries between systems are subtle. The fix for 90% of these bugs is the same: trust nothing, encode at the boundary with the right function, and use URLSearchParams whenever building query strings.
Further reading
- Percent Encoding and RFC 3986 Explained
- URL Encoder / Decoder tool — debug specific encoding issues
- WHATWG URL Standard — what browsers actually do
- RFC 3986 — the strict spec
Related posts
- Percent Encoding and RFC 3986 Explained — Why is `+` sometimes a space and sometimes a literal plus? Why does `%2520` show…
- Base64: How It Actually Works Under the Hood — Base64 is everywhere — in JWTs, data URLs, email attachments. This is the byte-l…
Related tool
Percent-encode and decode URLs per RFC 3986.
Written by Mian Ali Khalid. Part of the Encoding & Crypto pillar.