NYC
skills/letta-ai/skills/break-filter-js-from-html

break-filter-js-from-html

SKILL.md

Break Filter JS From HTML

Overview

This skill provides a systematic methodology for analyzing and bypassing HTML sanitization filters that attempt to prevent JavaScript execution. The focus is on understanding filter mechanics deeply before attempting bypasses, and on robust verification of solutions.

When to Use This Skill

  • Analyzing HTML sanitization filters to find bypass vectors
  • CTF challenges involving XSS filter evasion
  • Authorized security testing of web application input sanitization
  • Understanding parser differentials between server-side parsers and browsers

Phase 1: Environment and Filter Analysis

Before attempting any bypass, thoroughly understand the test environment and filter mechanics.

Environment Reconnaissance

  1. Identify all relevant file locations - Locate the filter implementation, test harness, and any configuration files
  2. Understand the test verification process - Determine how success is measured (browser alert, DOM inspection, etc.)
  3. Verify path dependencies - Check if tests expect files at specific paths; create symlinks or copies if needed
  4. Document the execution flow - Trace how input flows from your payload through the filter to the browser

Filter Mechanism Analysis

Examine the filter code to understand:

  1. Parsing library used - Different parsers (BeautifulSoup, DOMPurify, html-sanitizer, etc.) have different behaviors
  2. What elements are removed - Script tags, iframes, objects, embeds, etc.
  3. What attributes are stripped - Event handlers (on*), href with javascript:, etc.
  4. Processing order - Does the filter run once or recursively? Are there multiple passes?
  5. Output encoding - Is the output HTML-encoded, or passed through raw?

Create a Filter Output Test

Before running browser tests, create a quick method to see the filter's output directly:

# Example: Check what the filter outputs for a given input
echo '<script>alert(1)</script>' > /tmp/test.html && python filter.py /tmp/test.html && cat /tmp/test.html

This allows rapid iteration without slow browser-based testing.

Phase 2: Bypass Strategy Selection

Based on the filter analysis, select appropriate bypass strategies. Order these by likelihood of success given the specific filter.

Parser Differential Exploits

Parser differentials occur when the server-side filter parses HTML differently than browsers. This is often the most effective approach for library-based filters.

Key concept: The filter's parser may interpret certain HTML constructs differently than browsers, allowing tags that appear "safe" to the filter to execute JavaScript in browsers.

Elements that commonly cause parser differentials:

  • <noscript> - Parsed differently with/without JavaScript enabled
  • <template> - Content may not be parsed as HTML by some libraries
  • <textarea> and <title> - RCDATA parsing contexts
  • Comments and CDATA sections
  • Malformed or nested tags

Encoding and Obfuscation

  • HTML entity encoding (decimal, hex, named entities)
  • Unicode normalization issues
  • Double encoding
  • Null bytes and other special characters
  • Case variations (if filter is case-sensitive)

DOM Clobbering and Indirect Execution

  • Creating elements that shadow built-in properties
  • Exploiting existing JavaScript that reads from DOM
  • CSS-based attacks (if JavaScript reads computed styles)

Lesser-Known Vectors

  • SVG with embedded scripts or event handlers
  • MathML elements
  • XML processing instructions (if XHTML mode)
  • Data URIs in appropriate contexts

Phase 3: Systematic Testing

Testing Methodology

  1. Test filter output first - Before browser testing, verify the filter passes your payload through
  2. Use a minimal payload - Start with the simplest possible XSS (alert(1)) before complex payloads
  3. Document each attempt - Record what was tried, filter output, and browser result
  4. Understand failures - When a technique fails, determine if it was filtered or if the browser didn't execute it

Efficient Iteration Pattern

1. Hypothesize a bypass based on filter analysis
2. Test against filter directly (fast)
3. If filter passes payload through, test in browser
4. If browser doesn't execute, investigate why
5. If filter blocks, analyze how and adjust approach

Avoid These Inefficiencies

  • Running slow browser tests for payloads that don't survive the filter
  • Moving to new techniques without understanding why previous ones failed
  • Trying browser-incompatible techniques (e.g., deprecated HTML features)

Phase 4: Verification

Robust Solution Verification

A single passing test is insufficient. Verify solutions thoroughly:

  1. Run multiple times - Ensure the solution works consistently, not just once
  2. Test filter idempotency - Run the filtered output through the filter again to ensure it still works
  3. Check for timing issues - Browser-based tests may have race conditions
  4. Verify in isolation - Test the filtered HTML directly in a browser outside the test harness
  5. Document exact steps - Record the precise sequence to reproduce the successful bypass

Before Declaring Success

  • Confirm the test passes multiple consecutive runs
  • Verify no pending file modifications could invalidate the solution
  • Ensure the solution doesn't depend on test environment quirks
  • Check that the final state of all files is correct

Common Pitfalls

Environment Issues

  • Path mismatches - Test harnesses may expect files at specific locations different from where you found them
  • Stale state - Previous failed attempts may leave files in unexpected states
  • Permission issues - Filters may fail silently if they can't write output files

Analysis Mistakes

  • Assuming filter behavior - Always verify by reading the code; don't guess what's filtered
  • Ignoring processing order - A filter that removes <script> then <iframe> may be bypassed differently than one that does it in reverse
  • Missing recursive filtering - Some filters process until no more matches; others run once

Testing Mistakes

  • Browser-specific payloads - Techniques that work in one browser may fail in another
  • Deprecated HTML - Many classic XSS vectors no longer work in modern browsers
  • Premature optimization - Getting a complex payload through is worthless if a simpler one works

Verification Mistakes

  • Single test run - Flaky tests can pass once then fail
  • Modifying files after success - Any changes after a successful test may invalidate it
  • Ignoring test harness quirks - The test may measure success differently than expected
Weekly Installs
18
Repository
letta-ai/skills
First Seen
Jan 24, 2026
Installed on
claude-code12
codex11
gemini-cli11
opencode11
antigravity10
windsurf9