HTML Escape / Unescape
HTML Conversion
Understanding HTML Escaping and Unescaping
HTML escaping is the process of converting special characters (such as angle brackets, ampersands, and quotes) into their corresponding HTML entities (e.g., &lt;, &gt;, &amp;, &quot;, and &#39; or &apos;). This is a crucial security measure, especially when displaying user-generated content on a web page.
The primary reason for escaping HTML is to prevent Cross-Site Scripting (XSS) attacks. If unescaped user input containing malicious scripts (e.g., script tags with executable code) is rendered directly in a browser, the script could execute, leading to session hijacking, data theft, or defacement of the website.
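As a minimal sketch of how escaping neutralizes such input, the example below uses Python's standard-library `html.escape`. The `user_input` value is a made-up payload for illustration, and Python is chosen here only as an example language; the same idea applies to any templating or server-side stack.

```python
import html

# Hypothetical untrusted input containing a script tag.
user_input = '<script>alert("xss")</script>'

# quote=True (the default) also escapes double and single quotes,
# which matters when the value ends up inside an HTML attribute.
escaped = html.escape(user_input, quote=True)

print(escaped)
# &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;
```

A browser renders the escaped string as visible text rather than parsing it as markup, so the script never runs. Note that `html.escape` writes the apostrophe in hexadecimal form (`&#x27;`), which is equivalent to `&#39;`.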
Conversely, HTML unescaping is the process of converting these HTML entities back into their original characters. This is useful when you need to retrieve the original text from an escaped string, for example, when processing data that was previously escaped for safe storage or transmission.
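The reverse direction can be sketched with Python's standard-library `html.unescape`. The `stored` string is an invented example of previously escaped text, not data from any real system.

```python
import html

# Hypothetical text that was escaped before storage.
stored = "Tom &amp; Jerry &lt;b&gt;say&lt;/b&gt; &quot;hi&quot; &#39;there&#39;"

# html.unescape converts named and numeric entities back to characters.
original = html.unescape(stored)
print(original)
# Tom & Jerry <b>say</b> "hi" 'there'

# Escaping followed by unescaping returns the starting text.
sample = 'a < b && "c" > \'d\''
assert html.unescape(html.escape(sample)) == sample
```

Unescaped text is raw again, so it should be escaped once more before being written back into a page.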
Common HTML Entities
- Less than symbol (<) becomes &lt;
- Greater than symbol (>) becomes &gt;
- Ampersand (&) becomes &amp;
- Double quote (") becomes &quot;
- Single quote / apostrophe (') becomes &#39; or &apos;
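The mapping above can be written out by hand as a small lookup, shown here as an illustrative sketch rather than a replacement for a library routine. The names `ENTITIES` and `escape_html` are hypothetical. The one detail that matters is ordering: the ampersand has to be replaced first, otherwise the `&` in entities produced by earlier replacements would itself be escaped.

```python
# Replacement pairs for the five characters listed above.
# The ampersand comes first so that entities produced by later
# replacements are not escaped a second time.
ENTITIES = [
    ("&", "&amp;"),
    ("<", "&lt;"),
    (">", "&gt;"),
    ('"', "&quot;"),
    ("'", "&#39;"),
]

def escape_html(text: str) -> str:
    """Replace each special character with its HTML entity."""
    for char, entity in ENTITIES:
        text = text.replace(char, entity)
    return text

print(escape_html("a < b & c"))
# a &lt; b &amp; c
```

If the ampersand were handled last instead, `<` would first become `&lt;` and then `&amp;lt;`, which displays as the literal text `&lt;` instead of `<`.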