URLs permit only a limited set of unescaped ASCII characters; everything else is percent-encoded as `%HH`, one escape per byte. Non-ASCII text is first encoded as UTF-8, so a single character can become several percent escapes.
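To make the byte-level view concrete, here is a minimal sketch that percent-encodes every UTF-8 byte of a string and compares the result with the built-in `encodeURIComponent`. The helper name `percentEncodeAllBytes` is hypothetical and exists only for illustration; real code should rely on the built-in, which leaves safe ASCII characters unescaped.

```javascript
// Illustrative helper (hypothetical name): escape EVERY byte of the UTF-8 form.
// Unlike encodeURIComponent, it does not leave letters or digits as-is.
function percentEncodeAllBytes(text) {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes of the string
  return Array.from(
    bytes,
    (b) => "%" + b.toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

const manual = percentEncodeAllBytes("é"); // "%C3%A9" (two bytes, two escapes)
const builtin = encodeURIComponent("é");   // "%C3%A9"
console.log(manual, builtin);
```

For non-ASCII input the two results agree, because both start from the UTF-8 bytes of the character.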
Forms
`application/x-www-form-urlencoded`: spaces become `+` in payloads. `multipart/form-data` handles binary uploads separately.
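As a rough illustration of the form encoding, the standard `URLSearchParams` class serializes name/value pairs in `application/x-www-form-urlencoded` style. The field names `q` and `lang` below are made up for the example.

```javascript
// Field names and values are assumptions chosen for illustration.
const fields = new URLSearchParams({ q: "HTML & CSS", lang: "en" });

// Serialized form payload: spaces become "+", "&" inside a value becomes "%26".
const body = fields.toString(); // "q=HTML+%26+CSS&lang=en"

// Parsing the payload restores the original value.
const roundTrip = new URLSearchParams(body).get("q"); // "HTML & CSS"
console.log(body, roundTrip);
```

Such a string would typically be sent as a request body with a `Content-Type: application/x-www-form-urlencoded` header.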
JavaScript helpers
`encodeURIComponent` for query parameter values. `encodeURI` for entire URIs; it leaves reserved characters such as `/`, `?`, `&`, and `=` unescaped.
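The difference between the two helpers shows up when a value itself contains reserved characters. The sample value and the `example.com` URL below are assumptions for illustration.

```javascript
// A parameter value that contains reserved characters.
const value = "a&b=c/d";

// encodeURIComponent escapes the reserved characters, so the value stays one unit.
const asComponent = encodeURIComponent(value); // "a%26b%3Dc%2Fd"

// encodeURI only fixes characters that are never legal in a URI (like the space)
// and leaves "&", "=", "/" and "?" alone, so it is not safe for single values.
const asUri = encodeURI("https://example.com/my docs/?q=" + value);
// "https://example.com/my%20docs/?q=a&b=c/d"

console.log(asComponent);
console.log(asUri);
```

In the second result the `&` still acts as a parameter separator, which is why `encodeURI` suits whole URIs and not individual values.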
Server frameworks
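Use the standard library's URL and query-string builders. Manual string concatenation risks injection bugs, because an unescaped `&` or `=` in user input can change the structure of the query.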
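The following sketch contrasts naive concatenation with the standard `URL` API, as available in Node.js and browsers. The `example.com` endpoint and the `admin` parameter are invented for the demonstration; other server stacks have their own equivalent builders.

```javascript
// Hypothetical untrusted input that tries to smuggle in a second parameter.
const userInput = "cats&admin=true";

// Naive concatenation: the "&" in the input starts a new parameter.
const naive = "https://example.com/search?q=" + userInput;
const naiveParams = new URL(naive).searchParams;
const injected = naiveParams.get("admin"); // "true" (injected by the input)

// Library-built URL: the input stays inside the value of q.
const safe = new URL("https://example.com/search");
safe.searchParams.set("q", userInput);
const safeUrl = safe.toString();
// "https://example.com/search?q=cats%26admin%3Dtrue"

console.log(injected, safeUrl);
```

With the library-built URL, parsing the query yields a single `q` parameter whose value is the full input string.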
IRIs
Internationalized domain names apply Punycode in DNS; paths may include Unicode when properly encoded.
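Both halves of this can be observed with the standard `URL` parser, which converts an internationalized host to its Punycode form and percent-encodes non-ASCII path characters as UTF-8. The host `bücher.example` and the path are illustrative, and the sketch assumes a runtime with IDNA support (browsers and typical Node.js builds).

```javascript
// Illustrative IRI with a non-ASCII host and path.
const iri = new URL("https://bücher.example/über");

const host = iri.hostname; // "xn--bcher-kva.example" (Punycode form used in DNS)
const path = iri.pathname; // "/%C3%BCber" (UTF-8 bytes, percent-encoded)

// Decoding the path recovers the Unicode text for display.
const displayPath = decodeURIComponent(path); // "/über"

console.log(host, path, displayPath);
```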
Double-encoding bugs
A value is sometimes encoded once client-side and again server-side. Watch for `%2520`-style sequences in logs: `%25` is an encoded `%`, so `%2520` is `%20` encoded a second time.
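A small sketch of how the symptom arises and how one might flag it in logs. The function name `looksDoubleEncoded` is hypothetical, and the check is only a heuristic: a value whose intended text literally contains `%20` would also match.

```javascript
const once = encodeURIComponent("a b"); // "a%20b"
const twice = encodeURIComponent(once); // "a%2520b" (the "%" itself was escaped)

// Heuristic (hypothetical helper): "%25" followed by two hex digits suggests
// that an existing percent escape was encoded a second time.
function looksDoubleEncoded(s) {
  return /%25[0-9A-Fa-f]{2}/.test(s);
}

console.log(looksDoubleEncoded(once));  // false
console.log(looksDoubleEncoded(twice)); // true

// Decoding once undoes only the outer layer.
const peeled = decodeURIComponent(twice); // "a%20b"
```

The usual fix is to decide which layer owns encoding and do it exactly once there, not to decode repeatedly on the receiving side.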
Encode table (query strings)
| Character | Meaning | Encoded |
|---|---|---|
| space | in forms often + | %20 or + |
| & | separator | %26 |
| = | name=value | %3D |
| ? | query start | %3F |
| é | UTF-8 bytes | %C3%A9 |
Example — constructing a search URL
<!-- User typed: HTML & CSS ? -->
<a href="/search?q=HTML%20%26%20CSS%20%3F">Run search</a>
In JavaScript, prefer `encodeURIComponent` for each parameter value; never hand-roll escaping for untrusted text.
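As a minimal sketch, the link in the example could be generated like this. The function name `searchHref` is an assumption; the `/search` path and `q` parameter come from the example.

```javascript
// Hypothetical helper that builds the href for the search link.
function searchHref(query) {
  // Only the value is encoded; the "?" and "=" that structure the URL are not.
  return "/search?q=" + encodeURIComponent(query);
}

const href = searchHref("HTML & CSS ?");
console.log(href); // "/search?q=HTML%20%26%20CSS%20%3F"
```

When the URL is placed into markup, the attribute value still needs normal HTML escaping on top of the percent-encoding.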
Important interview questions and answers
- Q: What is the safest default character encoding for modern HTML?
A: UTF-8, declared early with `<meta charset="utf-8">` and matched by server `Content-Type` headers.
- Q: When are HTML entities still useful in UTF-8 pages?
A: For reserved characters (`&`, `<`) and contexts where explicit escaping avoids parser ambiguity.
- Q: What is the key difference between HTML5 parsing and XHTML parsing?
A: HTML5 recovers from many errors; XHTML (XML) treats many parse errors as fatal.
Tip: Encode query values. Spaces become `%20`, or `+` in form-encoded data.