A tiny HTML sanitizer that reduces markup to a constrained subset defined by a rule list. The sanitizer applies rules recursively until output converges.
This project was created with the assistance of AI tools. Contributions and review are welcome to validate behavior, security posture, and documentation.
npm install html-allowlist

This package is ESM-only.
import { sanitize } from "html-allowlist";
const html = "<p><a href=\"https://example.com\">ok</a><a>nope</a></p>";
const rules = ["p", "a", "a|href"];
const output = sanitize(html, rules, { allowCommonAttributes: true });

If you sanitize repeatedly with the same rules, compile them once:
import { compileRules, sanitizeWithPolicy } from "html-allowlist";
const policy = compileRules(rules, { allowCommonAttributes: true });
const output = sanitizeWithPolicy(html, policy);

sanitize(html: string, rules: string[], config?: SanitizerConfig): string
Returns a cleaned HTML document string (including <html>, <head>, and <body>). The sanitizer runs multiple passes until the output stops changing or a maximum pass count is reached.

compileRules(rules: string[], config?: SanitizerConfig): CompiledPolicy
Parses and normalizes rules once, returning a reusable policy object.

sanitizeWithPolicy(html: string, policy: CompiledPolicy): string
Sanitizes using a precompiled policy. Output matches sanitize for the same rules and config.
allowCommonAttributes?: boolean (default: false)
- Allows conservative default attributes for some tags plus global class/id.

allowJavaScript?: boolean (default: false)
- When false, removes on* attributes and blocks javascript: and data: URLs.
- When true, script-related constructs still require explicit rules.

maxPasses?: number (default: 10)
- Maximum number of recursive passes before stopping.
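
The multi-pass behavior controlled by maxPasses can be pictured as a fixed-point loop. The sketch below is not the library's implementation; runUntilStable and the toy removeScriptTags pass are hypothetical names used only to show why a single pass may not be enough and how a pass limit bounds the work.

```typescript
// Hypothetical helper (not part of html-allowlist): applies `pass`
// repeatedly until the output stops changing or `maxPasses` is reached.
function runUntilStable(
  input: string,
  pass: (html: string) => string,
  maxPasses = 10,
): { output: string; passes: number } {
  let current = input;
  let passes = 0;
  while (passes < maxPasses) {
    const next = pass(current);
    passes += 1;
    if (next === current) break; // converged: this pass changed nothing
    current = next;
  }
  return { output: current, passes };
}

// Toy pass: deletes literal "<script>" substrings. Removing one match
// can expose a new one, which is why a single pass is insufficient here.
const removeScriptTags = (s: string): string => s.replace(/<script>/g, "");

const stable = runUntilStable("<scr<script>ipt>", removeScriptTags);
const capped = runUntilStable("<scr<script>ipt>", removeScriptTags, 1);
```

With the default limit the toy input converges to an empty string; with a limit of 1 the loop stops early and a reassembled tag is left behind, which is the trade-off a low maxPasses value would imply.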
An internal policy shape produced by compileRules and consumed by sanitizeWithPolicy. It is deterministic and safe to reuse across calls for the same rules and config.
Rules are strings; duplicates are meaningful.
A bare tag name allows one instance of that tag; repeating the name raises the limit by one each time.
Examples:
- "a" allows at most 1 <a> element.
- "a", "a" allows at most 2 <a> elements.
Matching is case-insensitive. Canonical form is lowercase.
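
One way to read the duplicate-counting rule is as a per-tag budget. The following is a rough sketch under that reading, not the package's parser; countTagRules is a hypothetical name, and treating every rule that contains "|" as a non-tag rule is an assumption.

```typescript
// Hypothetical helper (not part of html-allowlist): counts how many
// times each bare tag name appears in the rule list.
function countTagRules(rules: string[]): Map<string, number> {
  const budget = new Map<string, number>();
  for (const rule of rules) {
    // Assumption: rules containing "|" are attribute or style rules.
    if (rule.includes("|")) continue;
    const tag = rule.trim().toLowerCase(); // canonical form is lowercase
    if (tag === "") continue;
    budget.set(tag, (budget.get(tag) ?? 0) + 1);
  }
  return budget;
}

// "a" and "A" count toward the same tag; "a|href" is not a tag rule.
const tagBudget = countTagRules(["p", "a", "A", "a|href"]);
```

Here tagBudget maps "p" to 1 and "a" to 2, matching the examples above.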
Format:
tag|attr
Examples:
- "a|href" allows <a href="...">.
- "html|lang" allows <html lang="...">.
Attributes are only kept if the tag is allowed and the attribute is explicitly allowed (or permitted by allowCommonAttributes).
When enabled, the sanitizer allows a conservative set of attributes without extra rules:
- Global: class, id
- a: href, title, target, rel
- img: src, alt, title, width, height
- html: lang
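
The attribute decision described above (tag allowed, then explicit rule or common default) can be sketched as a single predicate. This is an illustration only: isAttributeAllowed and COMMON_ATTRIBUTES are hypothetical names, the "*" key for global attributes is an assumption, and the real package may order or implement these checks differently.

```typescript
// Defaults copied from the list above; "*" is an assumed key for
// attributes that apply to every allowed tag.
const COMMON_ATTRIBUTES = new Map<string, string[]>([
  ["*", ["class", "id"]],
  ["a", ["href", "title", "target", "rel"]],
  ["img", ["src", "alt", "title", "width", "height"]],
  ["html", ["lang"]],
]);

// Hypothetical predicate (not part of html-allowlist).
function isAttributeAllowed(
  tag: string,
  attr: string,
  rules: string[],
  allowCommonAttributes = false,
): boolean {
  const t = tag.toLowerCase();
  const a = attr.toLowerCase();
  const normalized = rules.map((r) => r.trim().toLowerCase());
  if (!normalized.includes(t)) return false; // the tag itself must be allowed
  if (normalized.includes(`${t}|${a}`)) return true; // explicit tag|attr rule
  if (!allowCommonAttributes) return false;
  const globals = COMMON_ATTRIBUTES.get("*") ?? [];
  const perTag = COMMON_ATTRIBUTES.get(t) ?? [];
  return globals.includes(a) || perTag.includes(a);
}
```

For example, under these assumptions isAttributeAllowed("a", "title", ["a"]) is false, while the same call with allowCommonAttributes set to true is true.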
Format:
style|<selectorOrTag>|<cssProperty>
Examples:
- "style|.header|margin"
- "style|div|background-color"
<style> tags are removed unless the style tag is allowed and at least one style|...|... rule exists. Declarations not matching allowed selector/property pairs are removed. The CSS filter parses styles and removes all at-rules (including @import) and any declarations that use url().
Inline style attributes are filtered using the same allowlist. To keep any inline styles, the tag must allow the style attribute (e.g. p|style) and there must be a matching style rule using either style|*|prop or style|tag|prop. Declarations that are not allowlisted are removed, and the attribute is dropped if nothing remains.
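
The inline-style rules above can be illustrated with a deliberately naive filter. This is a sketch, not the package's CSS filter: filterInlineStyle is a hypothetical name, and splitting on ";" and ":" is a simplification that a real CSS parser would not make.

```typescript
// Hypothetical helper (not part of html-allowlist): keeps only inline
// declarations permitted by style|*|prop or style|tag|prop rules.
// Returns "" when nothing survives, meaning the attribute would be dropped.
function filterInlineStyle(tag: string, style: string, rules: string[]): string {
  const t = tag.toLowerCase();
  const normalized = rules.map((r) => r.trim().toLowerCase());
  // The tag must allow the style attribute itself, e.g. "p|style".
  if (!normalized.includes(`${t}|style`)) return "";

  const allowedProps = new Set<string>();
  for (const rule of normalized) {
    const parts = rule.split("|").map((p) => p.trim());
    if (parts.length !== 3 || parts[0] !== "style") continue;
    const selector = parts[1] ?? "";
    const prop = parts[2] ?? "";
    if (selector === "*" || selector === t) allowedProps.add(prop);
  }

  const kept: string[] = [];
  for (const decl of style.split(";")) {
    const idx = decl.indexOf(":");
    if (idx < 0) continue; // not a "property: value" pair
    const prop = decl.slice(0, idx).trim().toLowerCase();
    const value = decl.slice(idx + 1).trim();
    if (!allowedProps.has(prop)) continue; // property not allowlisted
    if (/url\s*\(/i.test(value)) continue; // declarations using url() are removed
    kept.push(`${prop}: ${value}`);
  }
  return kept.join("; ");
}

const styleRules = ["p", "p|style", "style|p|color", "style|*|margin", "style|*|background"];
const filtered = filterInlineStyle(
  "p",
  "color: red; position: fixed; margin: 0; background: url(x.png)",
  styleRules,
);
```

In this example position is dropped because no rule allows it, and background is dropped because its value uses url() even though the property is allowlisted.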
- allowJavaScript defaults to false.
- Event handlers (on*) are stripped when allowJavaScript is false.
- javascript: and data: URLs are removed from href, src, xlink:href, action, formaction, poster, and srcset when allowJavaScript is false.
- <script> tags are removed when allowJavaScript is false.
- Output is always sanitized by DOMPurify using the configured allowlist to mitigate XSS in both browser and Node environments.
- When allowJavaScript is true, scripts and script-related attributes still require explicit rules to be preserved.
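
The attribute-level checks in the list above could look roughly like the following. This is a sketch and not the package's code: blockedByJavaScriptPolicy and hasBlockedScheme are hypothetical names, and the whitespace-stripping normalization is an assumption that errs on the side of over-blocking.

```typescript
// URL-bearing attributes named in the list above.
const URL_ATTRIBUTES = new Set([
  "href", "src", "xlink:href", "action", "formaction", "poster", "srcset",
]);

// Hypothetical helper: true if the value starts with javascript: or data:
// after removing whitespace and ASCII control characters and lowercasing.
function hasBlockedScheme(value: string): boolean {
  const compact = value.replace(/[\u0000-\u0020]/g, "").toLowerCase();
  return compact.startsWith("javascript:") || compact.startsWith("data:");
}

// Hypothetical predicate: should this attribute be removed by the
// JavaScript policy alone? Explicit allowlist rules apply separately.
function blockedByJavaScriptPolicy(
  name: string,
  value: string,
  allowJavaScript = false,
): boolean {
  if (allowJavaScript) return false;
  const n = name.toLowerCase();
  if (n.startsWith("on")) return true; // event handlers such as onclick
  if (!URL_ATTRIBUTES.has(n)) return false;
  if (n === "srcset") {
    // srcset holds comma-separated candidates; check each one.
    return value.split(",").some((candidate) => hasBlockedScheme(candidate));
  }
  return hasBlockedScheme(value);
}
```

Under these assumptions, blockedByJavaScriptPolicy("href", " JaVa\tScript:alert(1)") is true and blockedByJavaScriptPolicy("href", "https://example.com") is false.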
npm test

See SECURITY.md for reporting guidance and supported versions.
MIT