iKit
Technical · 10 min read ·

Why Your URL Has Plus Signs: Form Encoding Explained (2026)

Why does your URL have plus signs instead of spaces? Form encoding explained — when + means space, when it means literal +, and the bug it causes.


You paste a URL and see + instead of spaces. Or a webhook lands with name=Alice+Walker and your handler stores literally Alice+Walker. Both are caused by the same 30-year-old quirk: HTML form submissions encode spaces as +, while RFC 3986 URLs encode them as %20. Picking the wrong decoder for the wrong half of a URL is the most reliable way to break a request in 2026 — and the fix is narrower than most teams think.

TL;DR

  • + means "space" only inside application/x-www-form-urlencoded data (query strings and form request bodies), never in the path or fragment.
  • %20 is always a space, in every part of a URL.
  • A literal + in form-encoded data must be written as %2B, or the decoder reads it as a space.
  • URLSearchParams is the only built-in JavaScript API that round-trips + correctly.
  • Decoders that treat + as space when applied to a path will silently corrupt filenames, IDs, and S3 keys.

Where the Plus Sign Came From

HTML 2.0 and the 1995 forms specification

When the original HTML form submission spec landed in 1995, its authors made a small but lasting decision. The new MIME type application/x-www-form-urlencoded would represent spaces using the + character rather than the percent-encoded %20. The usual explanation is pragmatic: a form value of "first name last name" turned into first+name+last+name instead of first%20name%20last%20name, saving two bytes per space (six bytes for this value) on the slow modems of the era. Multiply that across every form on every dial-up site and the savings looked meaningful.

RFC 3986 path encoding (2005)

A decade later, RFC 3986 standardised URI syntax for the modern web. It defined a clean grammar for the components of a URL — scheme, authority, path, query, fragment — and assigned a single canonical encoding for spaces in any component: %20. RFC 3986 doesn't recognise + as a space. Inside a path, a + is just a +. The spec is short, well-organised, and worth a 30-minute read if you maintain anything that handles URLs.

Why the two specs never reconciled

You'd expect one to back down. Neither did. application/x-www-form-urlencoded is owned by WHATWG, which now maintains the URL Standard. RFC 3986 is owned by the IETF. Both are still authoritative, both ship in every browser, and the result is that a single character has two meanings depending on which half of the URL it sits in. Both meanings are correct.

When + Means Space and When It Means +

Most URL bugs come from applying the wrong rule in the wrong place. The table below is the rule of thumb worth memorising:

| Where the + appears | Encoding rule | What + decodes to |
|---|---|---|
| Path segment (/users/+name) | RFC 3986 | A literal + |
| Query string (?name=Alice+Walker) | form-urlencoded | A space |
| Fragment (#section+two) | RFC 3986 | A literal + |
| Cookie value | varies (server) | Usually literal + |

Query strings

Browsers, and most form libraries and HTTP clients, encode form data with + for space. If you read alice+dev@example.com from a browser-submitted form, the decoder must treat + as space and produce alice dev@example.com. To send a literal +, the client has to encode it as %2B. The browser's built-in URLSearchParams API does this automatically, and passing a URLSearchParams object as the body of fetch() sends it as application/x-www-form-urlencoded. (A FormData body is sent as multipart/form-data instead, where the + rule does not apply.)
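As a minimal sketch of the decoding side, the snippet below parses the same address twice, once with a bare + and once with %2B. The address and variable names are illustrative assumptions; URLSearchParams is the only API involved.

```javascript
// Two form-encoded query strings that differ only in how the plus is written.
const bare = new URLSearchParams("email=alice+dev%40example.com");
const escaped = new URLSearchParams("email=alice%2Bdev%40example.com");

// A bare "+" is decoded as a space; "%2B" is decoded as a literal plus.
const bareEmail = bare.get("email");
const escapedEmail = escaped.get("email");

console.log(JSON.stringify(bareEmail));    // "alice dev@example.com"
console.log(JSON.stringify(escapedEmail)); // "alice+dev@example.com"
```

Only the second form survives as a deliverable address, which is why the sender has to write %2B.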

URL paths

In the path, + is just a +. A URL like /files/draft+v2.pdf refers to a file literally named draft+v2.pdf. Decoding that segment as form-urlencoded turns it into draft v2.pdf, which is a different file. S3 bucket keys, GitHub commit links, raw file URLs on Dropbox, and Google Drive paths all rely on this rule. If your CDN serves the wrong file when a user pastes a link with a + in it, the bug is almost certainly here.
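A short sketch of the failure mode, assuming a hypothetical CDN host: the same path segment is decoded once with plain percent-decoding and once with form-style decoding.

```javascript
// Hypothetical CDN URL; the file is literally named "draft+v2.pdf".
const url = new URL("https://cdn.example.com/files/draft+v2.pdf");
const lastSegment = url.pathname.split("/").pop();

// Correct for a path: percent-decoding only, "+" is left alone.
const pathDecoded = decodeURIComponent(lastSegment);

// Wrong for a path: form-style decoding turns "+" into a space first.
const formDecoded = decodeURIComponent(lastSegment.replace(/\+/g, " "));

console.log(pathDecoded); // draft+v2.pdf
console.log(formDecoded); // draft v2.pdf (a different file)
```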

Fragments and reserved zones

Fragments (the part after #) follow path rules. In the authority component (userinfo, host, port), + never stands for a space either, and hostnames and ports in practice never contain one. The dual-meaning problem is confined almost entirely to the query string. If you make a habit of using URLSearchParams to read and write that one component, you'll avoid almost every variant of this bug.
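One way to apply that habit when building a URL is to write each component through the API that owns its rules. This is a sketch under assumed values (the host, filename, and note text are made up), not a prescribed pattern.

```javascript
const url = new URL("https://example.com");

// Path segment: percent-encode the value; "+" becomes %2B and space becomes %20.
url.pathname = "/files/" + encodeURIComponent("draft+v2 final.pdf");

// Query string: searchParams applies form encoding (space -> "+", "+" -> %2B).
url.searchParams.set("note", "1+1 = 2");

// Fragment: "+" has no special meaning here and is kept as written.
url.hash = "section+two";

console.log(url.search);                    // ?note=1%2B1+%3D+2
console.log(url.searchParams.get("note"));  // 1+1 = 2
console.log(decodeURIComponent(url.pathname)); // /files/draft+v2 final.pdf
console.log(url.hash);                      // #section+two
```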

The Code That Gets It Wrong (and Right)

encodeURIComponent vs encodeURI

JavaScript's two built-in URL encoders disagree on whether to escape +:

encodeURI("a + b");
// "a%20+%20b"     ← + left alone

encodeURIComponent("a + b");
// "a%20%2B%20b"   ← + escaped to %2B

Use encodeURIComponent for any value you're inserting into a URL. The encodeURI function exists for whole-URL escaping only, and even that use case is brittle: it leaves +, & and = untouched, so a value containing any of them changes meaning once a form decoder reads it, and it re-escapes the % of any escape already present, turning %20 into %2520.
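To make the difference concrete, here is a sketch that puts the same search term into a query string both ways and then reads it back. The endpoint and the term are assumptions chosen to contain both + and &.

```javascript
const term = "C++ & Java";
const base = "https://example.com/search?q="; // hypothetical endpoint

// encodeURI escapes the spaces but leaves "+" and "&" in place.
const broken = encodeURI(base + term);

// encodeURIComponent escapes the value so it stays a single parameter.
const safe = base + encodeURIComponent(term);

const brokenValue = new URL(broken).searchParams.get("q");
const safeValue = new URL(safe).searchParams.get("q");

console.log(broken); // https://example.com/search?q=C++%20&%20Java
console.log(safe);   // https://example.com/search?q=C%2B%2B%20%26%20Java
console.log(JSON.stringify(brokenValue)); // "C   " (pluses became spaces, "&" split the value)
console.log(JSON.stringify(safeValue));   // "C++ & Java"
```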

URLSearchParams: the right answer

For query strings, the only built-in API that round-trips + correctly is URLSearchParams:

const p = new URLSearchParams();
p.set("email", "alice+dev@example.com");
p.toString();
// "email=alice%2Bdev%40example.com"

const back = new URLSearchParams(p.toString());
back.get("email");
// "alice+dev@example.com"

It encodes + as %2B on the way out and decodes both %2B and bare + (as space) on the way in. If you've ever hand-written a query parser using split("&") and decodeURIComponent, you've shipped this bug at least once. Replacing those parsers with new URL(input).searchParams is one of the cheapest reliability wins available in any JavaScript codebase.
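The hand-written parser described above might look roughly like the sketch below. naiveParse is a hypothetical reconstruction of that pattern, and the webhook URL is made up; the point is only to compare its output with searchParams.

```javascript
// Hypothetical hand-rolled parser: percent-decodes, but forgets that "+" means space.
function naiveParse(query) {
  const out = {};
  for (const pair of query.replace(/^\?/, "").split("&")) {
    if (pair === "") continue;
    const [key, value = ""] = pair.split("=");
    out[decodeURIComponent(key)] = decodeURIComponent(value);
  }
  return out;
}

const input = "https://example.com/hook?name=Alice+Walker&tag=a%2Bb";

const naive = naiveParse(new URL(input).search);
const params = new URL(input).searchParams;

console.log(naive.name);         // Alice+Walker  (the bug)
console.log(params.get("name")); // Alice Walker
console.log(naive.tag, params.get("tag")); // a+b a+b  (both agree on %2B)
```

The two parsers agree on %2B and disagree only on the bare +, which is why the bug tends to hide until a value with a space shows up.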

Server-side parsers

Server frameworks vary by language. The behaviour to verify in your stack:

  • Node Express: req.query uses qs by default, which decodes + as space in query strings — correct.
  • PHP $_GET and $_POST: decode + as space — correct.
  • Python urllib.parse.parse_qs: decodes + as space — correct.
  • Python urllib.parse.unquote: leaves + as +. This is the trap — use unquote_plus if the string came from a form.
  • Go url.QueryUnescape: decodes + as space. url.PathUnescape leaves it alone — and that distinction is the whole point.
  • Java URLDecoder.decode: decodes + as space. URLDecoder has no path-aware mode; for a path segment, either parse with java.net.URI and call getPath(), which leaves + alone, or swap + for %2B before decoding.

Most language standard libraries provide both flavours. Pick the right one for the part of the URL you're parsing.
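For comparison, the two flavours can be sketched in JavaScript as a pair of small helpers. The names decodeQueryComponent and decodePathSegment are hypothetical, chosen to echo the query/path split in the list above; they are not part of any standard library.

```javascript
// Query flavour: "+" is a space, then percent-decode.
function decodeQueryComponent(s) {
  return decodeURIComponent(s.replace(/\+/g, " "));
}

// Path flavour: percent-decode only, "+" stays a "+".
function decodePathSegment(s) {
  return decodeURIComponent(s);
}

console.log(decodeQueryComponent("Alice+Walker%2B1"));  // Alice Walker+1
console.log(decodePathSegment("draft+v2%20final.pdf")); // draft+v2 final.pdf
```

The "+" replacement has to happen before percent-decoding; done afterwards, it would also convert the literal plus that %2B just produced.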

Debugging the 400 Bad Request

When a request fails because of a +/%20 mix-up, the symptoms cluster into three patterns. Here's how to diagnose each in under a minute.

Symptom 1: two layers disagree about what was searched for

You type red shoes into a search box, the URL becomes ?q=red%20shoes (browser auto-encoded), and the server reads q="red shoes" correctly. Then someone adds a tracking layer and writes the URL by hand as ?q=red+shoes. Now the server gets q="red shoes" from its form decoder — fine. But your custom analytics layer reads the raw query string with decodeURIComponent and gets q="red+shoes" (literal +). The two layers disagree about what was searched for, your reports drift, and nobody notices for a quarter.

The fix is to normalise on URLSearchParams (or the language equivalent) at the edge of every system that touches the query string. Never call decodeURIComponent on a raw query value.
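A sketch of that normalisation, assuming the parameter is called q as in the example above. searchTerm is a hypothetical helper; both spellings of the query produce the same value, while the raw decodeURIComponent call does not.

```javascript
// Hypothetical edge helper: every layer reads the query through the same parser.
function searchTerm(rawQuery) {
  return new URLSearchParams(rawQuery).get("q");
}

const fromPlus = searchTerm("?q=red+shoes");
const fromPercent = searchTerm("?q=red%20shoes");

// What the analytics layer in the story did instead.
const analyticsValue = decodeURIComponent("red+shoes");

console.log(fromPlus === fromPercent); // true, both are "red shoes"
console.log(analyticsValue);           // red+shoes
```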

Symptom 2: API receives "Hello+World" as the literal string

You POST to a JSON API and the body is {"name": "Hello World"}. It works. Then someone changes the client to use application/x-www-form-urlencoded and sets the body to name=Hello+World. The API stores Hello+World as the literal name because the server was built around its JSON parser and the form parser was never wired in, so nothing translates the + back into a space.

The fix is to read Content-Type and route the body to the matching parser. Most server frameworks do this automatically once the relevant body parser is enabled; the bugs tend to come from middleware that strips, overrides, or defaults the header to application/json when the request actually carries a different MIME type.
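The routing itself is small. The sketch below assumes the raw body is already available as a string; parseBody is a hypothetical function, not a framework API, and it handles only the two types discussed here.

```javascript
// Hypothetical dispatcher: pick the parser from the Content-Type header.
function parseBody(contentType, rawBody) {
  // Drop parameters such as "; charset=utf-8" before comparing.
  const mime = (contentType || "").split(";")[0].trim().toLowerCase();

  if (mime === "application/json") {
    return JSON.parse(rawBody);
  }
  if (mime === "application/x-www-form-urlencoded") {
    // Form decoding: "+" becomes a space, %XX escapes are decoded.
    // Repeated keys keep only the last value in this simplified version.
    return Object.fromEntries(new URLSearchParams(rawBody));
  }
  throw new Error(`Unsupported Content-Type: ${mime || "(missing)"}`);
}

const fromForm = parseBody("application/x-www-form-urlencoded; charset=UTF-8", "name=Hello+World");
const fromJson = parseBody("application/json", '{"name": "Hello World"}');

console.log(fromForm.name); // Hello World
console.log(fromJson.name); // Hello World
```

Throwing on an unknown or missing type is a deliberate choice in this sketch: silently defaulting to one parser is the behaviour that produced the bug.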

Symptom 3: double-encoding (%252B everywhere)

A signed S3 URL contains +. Your code calls encodeURIComponent(url) before passing it to a redirect, which turns the + into %2B. A second layer, typically a redirect helper or HTTP client that encodes its own arguments, then percent-encodes the % to %25, leaving the final value as %252B. The S3 server decodes once, sees %2B, and rejects the signature because the canonical string no longer matches.

The fix is to never re-encode an already-encoded URL. If you must wrap a URL inside another URL — a redirect, a callback, a webhook target — encode it exactly once, and decode it exactly once on the receiving side. The iKit URL Encoder shows you each stage of encoding in a side-by-side panel, which is the easiest way to spot a double-encode without writing assertions.
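A sketch of the encode-once, decode-once rule, with a cheap check for a value that was encoded one time too many. The URLs and the looksDoubleEncoded helper are assumptions for illustration; the check is a heuristic and will also flag values that legitimately contain a literal "%25".

```javascript
// Hypothetical signed URL that is already percent-encoded.
const target = "https://bucket.example.com/key?sig=abc%2Bdef";

// Wrap it inside another URL: encode exactly once.
const once = encodeURIComponent(target);
const redirect = "https://app.example.com/go?to=" + once;

// Receiving side: searchParams decodes exactly once and recovers the original.
const received = new URL(redirect).searchParams.get("to");

// The bug: a second layer encodes again, so one decode is no longer enough.
const twice = encodeURIComponent(once);
const afterOneDecode = decodeURIComponent(twice);

// Heuristic: an escaped percent sign followed by two hex digits.
function looksDoubleEncoded(value) {
  return /%25[0-9a-f]{2}/i.test(value);
}

console.log(received === target);                // true
console.log(looksDoubleEncoded(received));       // false
console.log(looksDoubleEncoded(afterOneDecode)); // true
```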

Edge Cases Worth Knowing

A handful of one-off rules every team eventually trips on:

  • Email addresses with +: alice+dev@example.com is a valid email used widely for inbox tagging. The literal + must be encoded as %2B in any form submission, or the email-validation library on the server treats it as alice dev@example.com and rejects it as malformed. This is by far the most common real-world incarnation of the bug.
  • Base64 in URLs: standard Base64 uses + and /, both of which need encoding for URL use. Use Base64-URL (- and _ instead) when embedding binary data in a URL; the iKit Base64 tool emits the URL-safe variant on request. Skipping this step is how tokens encoded with standard Base64 occasionally end up %2B-corrupted in cookie jars. (JWTs are specified to use Base64-URL, so a + or %2B inside one means something re-encoded it along the way.)
  • Plus signs in JSON inside URLs: if you serialise a JSON payload into a query parameter, URLSearchParams will correctly escape any + characters in the JSON. Hand-rolled string concatenation will not. Validate the round-trip in the iKit JSON Decoder before shipping anything that nests JSON inside a query string.
  • Phone numbers: international phone numbers in E.164 format begin with + (e.g. +15551234567). When passed through a query string unescaped, the + becomes a space and your validator now sees " 15551234567", with a leading space where the + should be, which is no longer a valid E.164 number. Encoding it as %2B15551234567 keeps the format intact.
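Two of these cases can be checked in a few lines. The sketch below round-trips a phone number and an email through URLSearchParams, then converts a standard Base64 string to the URL-safe alphabet. toBase64Url is a hypothetical helper, and "+/8=" is simply the standard Base64 encoding of the two bytes 0xFB 0xFF, picked because it contains both problem characters.

```javascript
// Encoding side: URLSearchParams escapes the literal plus signs.
const sent = new URLSearchParams({
  phone: "+15551234567",
  email: "alice+dev@example.com",
}).toString();

// Decoding side: the escaped form round-trips, the unescaped form does not.
const roundTripped = new URLSearchParams(sent).get("phone");
const unescaped = new URLSearchParams("phone=+15551234567").get("phone");

// Hypothetical helper: standard Base64 -> Base64-URL (swap "+" and "/", drop padding).
function toBase64Url(b64) {
  return b64.replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

console.log(sent);                      // phone=%2B15551234567&email=alice%2Bdev%40example.com
console.log(roundTripped);              // +15551234567
console.log(JSON.stringify(unescaped)); // " 15551234567"
console.log(toBase64Url("+/8="));       // -_8
```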

Quick Diagnostic Snippets

Two one-liners worth bookmarking. Open a browser DevTools console and try the first; the second runs in any shell with curl installed.

// What does this query string actually contain?
[...new URLSearchParams(location.search)]
  .forEach(([k, v]) => console.log(k, "=>", v));

# How does curl encode a + when you pass --data-urlencode?
curl -G --data-urlencode "name=alice+dev" \
     https://httpbin.org/get
# httpbin echoes back: "name": "alice+dev"
# (Yes: the + survived because curl encoded it as %2B.)

If your server is reading alice dev from this curl call, the bug is in your decoder, not in the client.

A Quick Reference

Print this and stick it next to your monitor:

| Symbol | In path | In query (form-urlencoded) | In JS encodeURIComponent |
|---|---|---|---|
| Space | %20 | + or %20 | %20 |
| + | literal + | %2B | %2B |
| %20 | space | space | always emitted |
| & | literal & | parameter separator | %26 |

The path column follows RFC 3986. The query column follows the WHATWG URL Standard. Neither is wrong: they apply to different parts of the URL, and the only mistake is to apply them to the wrong half.
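The encoding side of the reference can be reproduced in a console. This sketch prints, for each symbol, what encodeURIComponent emits and what URLSearchParams emits as a form-encoded value; the parameter name v is arbitrary.

```javascript
const rows = [" ", "+", "&"].map((ch) => ({
  symbol: JSON.stringify(ch),
  // Percent-encoding as used for path segments and individual values.
  encodeURIComponent: encodeURIComponent(ch),
  // Form encoding: serialise "v=<value>" and strip the "v=" prefix.
  formEncoded: new URLSearchParams({ v: ch }).toString().slice(2),
}));

console.table(rows);
// space -> "%20" and "+"
// "+"   -> "%2B" and "%2B"
// "&"   -> "%26" and "%26"
```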

Specifications Worth Reading

The single most useful thirty minutes you can spend on this topic is reading section 2.1 of RFC 3986 on percent-encoding, then skimming the WHATWG URL Standard on form-urlencoded parsing. They're short, readable, and they're the documents your HTTP libraries are implementing under the hood. Almost every URL bug in production is a violation of one specific paragraph of one of these two docs — usually the same one.
