What is HTML Encoding?
HTML encoding converts special characters into their HTML entity equivalents. This prevents characters like < and > from being interpreted as HTML tags, allowing you to safely display code and special characters on web pages.
Common HTML Entities
| Character | Entity Name | Entity Number |
|---|---|---|
| < | `&lt;` | `&#60;` |
| > | `&gt;` | `&#62;` |
| & | `&amp;` | `&#38;` |
| " | `&quot;` | `&#34;` |
| ' | `&apos;` | `&#39;` |
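As a rough illustration of how the five mappings in the table could be applied, here is a minimal Python sketch. The names `ENTITY_MAP` and `encode_html` are hypothetical and chosen for this example only; real projects would normally rely on a library function instead.

```python
# Hypothetical helper: maps the five characters from the table to named entities.
ENTITY_MAP = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&apos;",
}


def encode_html(text: str) -> str:
    # Each input character is looked up exactly once, so the "&" that starts
    # an entity we just produced is never encoded a second time.
    return "".join(ENTITY_MAP.get(ch, ch) for ch in text)


print(encode_html('<a href="x">Tom & Jerry\'s</a>'))
# &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&apos;s&lt;/a&gt;
```

Encoding character by character avoids the classic ordering bug of chained replacements, where replacing `&` after `<` would turn `&lt;` into `&amp;lt;`.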
🔒 Security: XSS Prevention
HTML encoding is critical for security. If user input is written into a page unencoded, the browser treats any markup it contains as part of the page, including script tags. Always encode user-provided content before displaying it to prevent Cross-Site Scripting (XSS) attacks.
When to Use HTML Encoding
Displaying Code Examples
To show <div> on a webpage rather than having the browser interpret it as an actual div element, encode the angle brackets.
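A small sketch of this case, using Python's standard `html.escape`. The function name `code_block` and the sample snippet are assumptions made for illustration. Passing `quote=False` leaves quotation marks alone, which is sufficient for text placed between tags (not inside an attribute value).

```python
import html


def code_block(snippet: str) -> str:
    # Hypothetical helper: encode &, < and > so the browser shows the
    # snippet as text, then wrap it in <pre><code> for display.
    encoded = html.escape(snippet, quote=False)
    return f"<pre><code>{encoded}</code></pre>"


print(code_block('<div class="note">Hello</div>'))
# <pre><code>&lt;div class="note"&gt;Hello&lt;/div&gt;</code></pre>
```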
User Input Display
When showing user-submitted content (comments, usernames, etc.), always encode to prevent script injection attacks.
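The following is a minimal sketch of encoding user-submitted values before placing them in markup, assuming a hypothetical `render_comment` function and an invented page fragment. It uses `html.escape` from the Python standard library, which by default also encodes both quote characters.

```python
import html


def render_comment(username: str, comment: str) -> str:
    # Hypothetical renderer: every user-controlled value is encoded
    # before it is interpolated into the HTML fragment.
    safe_user = html.escape(username)
    safe_comment = html.escape(comment)
    return f'<p class="comment"><b>{safe_user}</b>: {safe_comment}</p>'


print(render_comment("mallory", '<script>alert("xss")</script>'))
# <p class="comment"><b>mallory</b>: &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;</p>
```

Note that `html.escape` writes the apostrophe as the hexadecimal numeric entity `&#x27;` rather than `&apos;`; both decode to the same character.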
Special Symbols
Characters like © (copyright, `&copy;`), ™ (trademark, `&trade;`), and the non-breaking space (`&nbsp;`) have HTML entities for reliable cross-browser display.
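One way to produce entities for such symbols is Python's built-in `xmlcharrefreplace` error handler, which rewrites every non-ASCII character as a decimal numeric entity. This is only a sketch; `to_numeric_entities` is a hypothetical name, and the function does not encode `<`, `>`, or `&`, so it would complement rather than replace the encoding of those characters.

```python
def to_numeric_entities(text: str) -> str:
    # Characters outside ASCII cannot be encoded, so the error handler
    # substitutes a numeric character reference such as &#169; for each one.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_numeric_entities("© 2024 Acme™"))
# &#169; 2024 Acme&#8482;
```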
Frequently Asked Questions
Should I encode everything?
No. Common characters (letters, numbers) don't need encoding. Focus on the critical five (< > & " ') and on any special symbols that might not display correctly.
What's the difference between named and numeric entities?
Named entities (`&lt;`) are readable. Numeric entities (`&#60;`) work for any Unicode character. Both render identically in browsers.
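The equivalence can be checked with a short sketch using `html.unescape` from the Python standard library, which decodes entities back to characters. The variable names are illustrative, and the hexadecimal form `&#x3C;` is included as a third spelling of the same character.

```python
import html

# Named, decimal, and hexadecimal spellings of the less-than sign.
forms = ["&lt;", "&#60;", "&#x3C;"]
decoded = [html.unescape(form) for form in forms]

print(decoded)
# ['<', '<', '<']
```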
Does this replace sanitization libraries?
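No. Encoding is one part of security: it makes text display literally instead of being interpreted as markup. When you need to accept and render HTML from users, use a dedicated sanitization library such as DOMPurify, which handles edge cases and malformed input.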