Deobfuscating JavaScript Malware
When to Use
- Investigating a phishing page with obfuscated JavaScript that harvests credentials or redirects victims
- Analyzing a web skimmer (Magecart-style) injected into an e-commerce site
- Deobfuscating a JavaScript dropper that downloads and executes second-stage malware
- Examining malicious email attachments containing HTML files with embedded obfuscated scripts
- Analyzing browser exploit kits that use heavy JavaScript obfuscation to hide exploit delivery
Do not use for JavaScript that is merely minified production code; use a standard beautifier instead.
Prerequisites
- Node.js 18+ installed for executing and debugging JavaScript in a controlled environment
- Python 3.8+ with the jsbeautifier library for code formatting
- Browser developer tools (Chrome DevTools) for controlled execution in an isolated browser
- CyberChef (https://gchq.github.io/CyberChef/) for encoding/decoding operations
- de4js or JStillery for automated JavaScript deobfuscation
- Isolated analysis VM with no access to production systems or sensitive data
Workflow
Step 1: Safely Extract and Examine the Obfuscated Script
Isolate the malicious JavaScript without executing it:
# Extract JavaScript from HTML file
python3 << 'PYEOF'
from html.parser import HTMLParser

class ScriptExtractor(HTMLParser):
    """Collect the inline body of every <script> element."""
    def __init__(self):
        super().__init__()
        self.in_script = False
        self.scripts = []
        self.current = ""

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
            self.current = ""

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False
            if self.current.strip():
                self.scripts.append(self.current)

    def handle_data(self, data):
        if self.in_script:
            self.current += data

# errors="replace" keeps extraction going if the page contains invalid bytes
with open("malicious_page.html", encoding="utf-8", errors="replace") as f:
    parser = ScriptExtractor()
    parser.feed(f.read())

for i, script in enumerate(parser.scripts):
    with open(f"script_{i}.js", "w", encoding="utf-8") as out:
        out.write(script)
    print(f"Extracted script_{i}.js ({len(script)} characters)")
PYEOF
# Beautify the extracted JavaScript
npx js-beautify script_0.js -o script_0_pretty.js
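Inline extraction misses scripts that are pulled in through a `src` attribute. As a minimal sketch, the hypothetical helper below (`list_external_scripts` and `ScriptSrcCollector` are illustrative names, not part of any tool listed here) records those URLs without fetching them, so they can be reviewed as possible additional layers:

```python
from html.parser import HTMLParser


class ScriptSrcCollector(HTMLParser):
    """Collect the src attribute of every <script> tag without fetching anything."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def list_external_scripts(html):
    """Return the src URLs of all <script> tags in the given HTML text."""
    collector = ScriptSrcCollector()
    collector.feed(html)
    return collector.sources


if __name__ == "__main__":
    page = '<script src="https://cdn.example/a.js"></script><script>var x = 1;</script>'
    for src in list_external_scripts(page):
        print("External script:", src)
```

Any URL reported this way should be retrieved only from the isolated analysis VM.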
Step 2: Identify Obfuscation Techniques
Categorize the obfuscation methods used:
Common JavaScript Obfuscation Techniques:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
String Encoding:
- Hex encoding: "\x68\x65\x6c\x6c\x6f" -> "hello"
- Unicode escapes: "\u0068\u0065\u006c\u006c\u006f" -> "hello"
- Base64: atob("aGVsbG8=") -> "hello"
- charCodeAt/fromCharCode: String.fromCharCode(104,101,108,108,111)
- Array-based lookup: var _0x1234 = ["hello","world"]; _0x1234[0]
Eval Chains:
- eval(atob("..."))
- eval(unescape("..."))
- new Function("return " + decoded)()
- document.write("<script>" + decoded + "</script>")
- setTimeout(decoded, 0)
Control Flow:
- Switch-case dispatcher with shuffled case order
- Opaque predicates (always-true/false conditions)
- Dead code insertion
- Variable name mangling (_0x4a3b, _0xab12)
Anti-Analysis:
- Debugger traps: setInterval(function(){debugger;}, 100)
- Console detection: overriding console.log
- Timing checks: performance.now() deltas
- DevTools detection: window.outerWidth - window.innerWidth > 160
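As a rough aid to the categorization above, the following sketch counts regex hits for several of the listed techniques. The function name `triage_obfuscation` and the pattern set are illustrative assumptions, not an exhaustive detector; a hit only suggests where to look first.

```python
import re

# Illustrative patterns only; each maps one technique from the list above to a regex.
PATTERNS = {
    "hex_escape": re.compile(r"\\x[0-9a-fA-F]{2}"),
    "unicode_escape": re.compile(r"\\u[0-9a-fA-F]{4}"),
    "base64_atob": re.compile(r"\batob\s*\("),
    "from_char_code": re.compile(r"String\.fromCharCode\s*\("),
    "eval_call": re.compile(r"\beval\s*\("),
    "function_constructor": re.compile(r"\bnew\s+Function\s*\("),
    "mangled_identifier": re.compile(r"\b_0x[0-9a-fA-F]+\b"),
    "debugger_statement": re.compile(r"\bdebugger\b"),
}


def triage_obfuscation(code):
    """Return {technique: hit_count} for every pattern that matched at least once."""
    counts = {}
    for name, pattern in PATTERNS.items():
        hits = len(pattern.findall(code))
        if hits:
            counts[name] = hits
    return counts


if __name__ == "__main__":
    sample = r'var _0x1234 = ["\x68\x69"]; eval(atob("YWxlcnQoMSk="));'
    print(triage_obfuscation(sample))
```

Regex matching cannot tell code from string contents, so treat the counts as a starting point for manual review rather than a classification.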
Step 3: Remove Anti-Analysis Protections
Neutralize anti-debugging and anti-analysis traps:
// Remove debugger traps before analysis
// Replace in the obfuscated script:
// Before:
setInterval(function() { debugger; }, 100);
// After (neutralized):
setInterval(function() { /* debugger removed */ }, 100);
// Neutralize DevTools detection
// Before:
if (window.outerWidth - window.innerWidth > 160) { window.location = "about:blank"; }
// After:
if (false) { window.location = "about:blank"; }
// Neutralize timing checks
// Override performance.now so every delta the script measures is 0
const originalNow = performance.now; // kept so the original can be restored later
performance.now = function() { return 0; };
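The manual replacements above can be applied to a whole file before it is opened in DevTools. This is a minimal sketch, assuming the traps appear in the literal forms shown in this step; `neutralize_traps` is a hypothetical helper name, and the regexes will also rewrite matching text inside string literals.

```python
import re

# Matches a `debugger` statement with an optional trailing semicolon.
DEBUGGER_STMT = re.compile(r"\bdebugger\b\s*;?")

# Matches the outerWidth/innerWidth (or outerHeight/innerHeight) size check shown above.
DEVTOOLS_CHECK = re.compile(
    r"window\.outer(Width|Height)\s*-\s*window\.inner\1\s*>\s*\d+"
)


def neutralize_traps(code):
    """Blank out debugger statements and force DevTools size checks to false."""
    code = DEBUGGER_STMT.sub("/* debugger removed */", code)
    code = DEVTOOLS_CHECK.sub("false", code)
    return code


if __name__ == "__main__":
    print(neutralize_traps("setInterval(function() { debugger; }, 100);"))
```

Obfuscated samples often build the `debugger` keyword dynamically (for example through the Function constructor), in which case a static rewrite like this will not find it.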
Step 4: Decode String Obfuscation Layers
Progressively decode encoded strings:
# Python script to decode common JS obfuscation patterns
import re
import base64

def decode_hex_strings(code):
    """Replace runs of \\xNN escapes with the ASCII characters they encode"""
    def hex_replace(match):
        hex_str = match.group(0)
        try:
            return bytes.fromhex(hex_str.replace("\\x", "")).decode("ascii")
        except (ValueError, UnicodeDecodeError):
            return hex_str  # non-ASCII bytes: leave the escapes in place
    return re.sub(r'(?:\\x[0-9a-fA-F]{2})+', hex_replace, code)

def decode_unicode_escapes(code):
    """Replace \\uNNNN sequences with characters"""
    def unicode_replace(match):
        return chr(int(match.group(1), 16))
    return re.sub(r'\\u([0-9a-fA-F]{4})', unicode_replace, code)

def decode_charcode_arrays(code):
    """Resolve String.fromCharCode calls with literal numeric arguments"""
    def charcode_replace(match):
        # Skip empty items so a trailing comma does not raise ValueError
        codes = [int(c) for c in match.group(1).split(",") if c.strip()]
        return '"' + "".join(chr(c) for c in codes) + '"'
    return re.sub(r'String\.fromCharCode\(([0-9,\s]+)\)', charcode_replace, code)

def decode_base64_strings(code):
    """Resolve atob() calls with static strings"""
    def atob_replace(match):
        try:
            decoded = base64.b64decode(match.group(1)).decode("utf-8")
            return f'"{decoded}"'
        except (ValueError, UnicodeDecodeError):
            return match.group(0)  # not valid base64 or not text: keep the call
    return re.sub(r'atob\(["\']([A-Za-z0-9+/=]+)["\']\)', atob_replace, code)

# Apply all decoders
with open("script_0.js", encoding="utf-8", errors="replace") as f:
    code = f.read()
code = decode_hex_strings(code)
code = decode_unicode_escapes(code)
code = decode_charcode_arrays(code)
code = decode_base64_strings(code)
with open("script_0_decoded.js", "w", encoding="utf-8") as f:
    f.write(code)
print("Decoded strings written to script_0_decoded.js")
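The array-based lookup pattern from Step 2 (`var _0x1234 = ["hello","world"]; _0x1234[0]`) can be inlined in a similar way. The sketch below is a simplified illustration: `inline_string_array` is a hypothetical helper, and it assumes a single un-rotated array whose entries are plain double-quoted strings that parse as JSON. Arrays that are rotated at runtime or accessed through a wrapper function need the rotation logic evaluated first.

```python
import json
import re

# Matches: var _0xNAME = [ ... ];  The bracket body is validated by json.loads below.
ARRAY_DEF = re.compile(r"var\s+(_0x[0-9a-fA-F]+)\s*=\s*(\[[^\]]*\])\s*;")


def inline_string_array(code):
    """Replace NAME[i] lookups with the literal stored at index i of NAME."""
    match = ARRAY_DEF.search(code)
    if not match:
        return code
    name, literal = match.group(1), match.group(2)
    try:
        strings = json.loads(literal)
    except json.JSONDecodeError:
        return code  # e.g. \x escapes or single quotes; leave the code untouched

    def lookup(m):
        token = m.group(1)
        index = int(token, 16) if token.lower().startswith("0x") else int(token)
        if 0 <= index < len(strings) and isinstance(strings[index], str):
            return json.dumps(strings[index])
        return m.group(0)

    index_ref = re.compile(re.escape(name) + r"\[(0x[0-9a-fA-F]+|\d+)\]")
    return index_ref.sub(lookup, code)


if __name__ == "__main__":
    sample = 'var _0x1234 = ["hello","world"]; console.log(_0x1234[0] + _0x1234[0x1]);'
    print(inline_string_array(sample))
```

Running the hex and Unicode decoders first makes it more likely that the array literal parses cleanly.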
Step 5: Resolve Eval Chains Safely
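Unwrap eval chains by intercepting each decoded layer before it executes. The outer decoder code still runs, so do this only inside the isolated analysis VM:
// Node.js script to resolve eval chains by intercepting eval, document.write, and setTimeout
// Run inside the isolated analysis VM: node deobfuscate.js
// Note: the Node.js vm module is not a security boundary, and new Function(...) is not intercepted here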
const vm = require('vm');
// Create sandboxed context with logging
const sandbox = {
eval: function(code) {
console.log("=== EVAL INTERCEPTED ===");
console.log(code.substring(0, 500));
console.log("========================");
return code; // Return the code instead of executing it
},
document: {
write: function(html) {
console.log("=== DOCUMENT.WRITE INTERCEPTED ===");
console.log(html.substring(0, 500));
},
getElementById: function() { return { innerHTML: "" }; }
},
window: { location: { href: "" } },
atob: function(s) { return Buffer.from(s, 'base64').toString(); },
unescape: unescape,
setTimeout: function(fn) { if (typeof fn === 'string') console.log("TIMEOUT CODE:", fn); },
console: console,
String: String,
Array: Array,
parseInt: parseInt,
RegExp: RegExp,
};
const context = vm.createContext(sandbox);
// Load and execute the obfuscated script in sandbox
const fs = require('fs');
const code = fs.readFileSync('script_0.js', 'utf8');
try {
vm.runInContext(code, context, { timeout: 5000 });
} catch(e) {
console.log("Execution error (expected):", e.message);
}
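Where the chain consists only of the `eval(atob("..."))` form from the Step 2 list, the layers can sometimes be peeled statically, with nothing executed at all. The following is a minimal sketch under that assumption; `peel_eval_layers` is a hypothetical helper and it stops at the first layer that does not match the pattern exactly.

```python
import base64
import binascii
import re

# Matches a script that is exactly eval(atob("<static base64>")) with optional semicolon.
EVAL_ATOB = re.compile(
    r'^\s*eval\(\s*atob\(\s*["\']([A-Za-z0-9+/=]+)["\']\s*\)\s*\)\s*;?\s*$'
)


def peel_eval_layers(code, max_layers=10):
    """Return (innermost_code, layers_removed) without executing anything."""
    layers = 0
    while layers < max_layers:
        match = EVAL_ATOB.match(code)
        if not match:
            break
        try:
            code = base64.b64decode(match.group(1), validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            break  # not valid base64 text; keep the last good layer
        layers += 1
    return code, layers


if __name__ == "__main__":
    inner = base64.b64encode(b"alert(1)").decode()
    wrapped = f'eval(atob("{inner}"));'
    print(peel_eval_layers(wrapped))
```

Chains that compute the encoded string at runtime will not match, and those still need the interception approach above.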
Step 6: Analyze the Deobfuscated Payload
Examine the revealed malicious logic:
Deobfuscated Malware Categories and IOC Extraction:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Credential Harvester:
- Form action URLs (exfiltration endpoints)
- XMLHttpRequest/fetch destinations
- Targeted input field names (username, password, cc_number)
Web Skimmer (Magecart):
- Payment form overlay injection
- Card data exfiltration URLs
- Keylogger event listeners (onkeypress, oninput)
Redirect Script:
- Destination URLs in location.href assignments
- Conditional redirects based on user-agent or referrer
- Cloaking logic (show benign content to bots)
Exploit Kit Landing:
- Browser/plugin version checks
- Exploit payload URLs
- Shellcode embedded as arrays or encoded strings
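Once the payload is readable, URL and domain indicators can be pulled out and defanged for reporting. This is a rough sketch: `extract_iocs` and `defang_host` are hypothetical helpers, only `http`/`https` URLs in plain text are found, and the defanging follows the `hxxps://` and `[.]` convention used in the Output Format section.

```python
import re
from urllib.parse import urlsplit

# Stops at whitespace, quotes, angle brackets, a closing paren, or a backslash.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>)\\]+")


def defang_host(netloc):
    """Bracket the last dot: analytics-cdn.com -> analytics-cdn[.]com."""
    head, sep, tail = netloc.rpartition(".")
    return f"{head}[.]{tail}" if sep else netloc


def extract_iocs(code):
    """Return defanged URLs and hosts found in already-deobfuscated code."""
    urls, domains = set(), set()
    for url in URL_PATTERN.findall(code):
        try:
            parts = urlsplit(url)
        except ValueError:
            continue  # malformed URL (for example a broken IPv6 literal)
        if not parts.hostname:
            continue
        scheme = parts.scheme.lower().replace("http", "hxxp")
        rest = url[len(parts.scheme) + 3 + len(parts.netloc):]
        urls.add(f"{scheme}://{defang_host(parts.netloc)}{rest}")
        domains.add(defang_host(parts.hostname))
    return {"urls": sorted(urls), "domains": sorted(domains)}


if __name__ == "__main__":
    payload = 'xhr.open("POST", "https://analytics-cdn.com/collect");'
    print(extract_iocs(payload))
```

Endpoints assembled from string fragments at runtime will not appear as a single literal, so the result should be cross-checked against the decoded string array.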
Key Concepts
| Term | Definition |
|---|---|
| Eval Chain | Nested layers of eval(), Function(), or document.write() calls that each decode one layer of obfuscation before passing to the next |
| String Array Rotation | Obfuscation technique storing all strings in a shuffled array and accessing them by computed index to hide string literals |
| Dead Code Insertion | Adding non-functional code blocks that never execute to increase analysis complexity and confuse pattern matching |
| Opaque Predicate | Conditional expression whose outcome is predetermined but difficult to determine statically; used to obscure control flow |
| Anti-Debugging | JavaScript techniques to detect and thwart browser DevTools or debugger usage including debugger statements and timing checks |
| Web Skimmer | Malicious JavaScript injected into e-commerce sites to steal payment card data from checkout forms (Magecart attack) |
Tools & Systems
- CyberChef: GCHQ's web-based tool for encoding/decoding transformations useful for unwinding multi-layer obfuscation
- de4js: Online JavaScript deobfuscator supporting common obfuscation tools (obfuscator.io, JScrambler)
- Node.js VM Module: JavaScript execution context with intercepted APIs for capturing decoded layers; the Node.js documentation states it is not a security mechanism, so use it only inside the isolated analysis VM
- Chrome DevTools: Browser developer tools for stepping through JavaScript execution with breakpoints and console access
- JSDetox: JavaScript malware analysis tool providing execution emulation and deobfuscation
Common Scenarios
Scenario: Deobfuscating a Magecart Web Skimmer
Context: A compromised e-commerce site has obfuscated JavaScript injected into its checkout page. The script needs deobfuscation to identify the data exfiltration endpoint and determine what customer data was stolen.
Approach:
- Extract the injected script from the page source (often appended to a legitimate JS file or loaded from an external domain)
- Beautify the code and identify the obfuscation technique (typically string array + rotation + hex encoding)
- Decode string encoding layers (hex -> Unicode -> base64) using the Python decoder script
- Resolve the string array by evaluating the array definition and rotation function
- Identify the form targeting logic (querySelector for payment form fields)
- Extract the exfiltration URL from the XMLHttpRequest or fetch call
- Document stolen data fields and exfiltration endpoint for incident response
Pitfalls:
- Executing obfuscated scripts on a connected system (the script may phone home during analysis)
- Not removing anti-debugging traps before using browser DevTools (infinite debugger loops)
- Missing additional obfuscation layers loaded dynamically from external URLs
- Overlooking base64-encoded inline images or data URIs that may contain additional scripts
Output Format
JAVASCRIPT MALWARE DEOBFUSCATION REPORT
=========================================
Source: checkout.js (injected into example-shop.com)
Obfuscation: obfuscator.io (string array + rotation + hex encoding)
Layers Removed: 3
OBFUSCATION TECHNIQUES IDENTIFIED
[1] String array with 247 entries, rotated by 0x1a3
[2] Hex-encoded string references (\x68\x65\x6c\x6c\x6f)
[3] Base64-wrapped eval chain (2 layers)
[4] Anti-debugging: setInterval debugger trap
DEOBFUSCATED FUNCTIONALITY
Type: Magecart Payment Card Skimmer
Target Forms: input[name*="card"], input[name*="cc_"]
Data Captured: Card number, expiration, CVV, cardholder name
Exfil Method: POST via XMLHttpRequest
Exfil URL: hxxps://analytics-cdn[.]com/collect
Exfil Format: JSON { "cn": card_number, "exp": expiry, "cv": cvv }
Trigger: Form submit event on checkout page
EXTRACTED IOCs
Domains: analytics-cdn[.]com
IPs: 185.220.101[.]42
URLs: hxxps://analytics-cdn[.]com/collect
hxxps://analytics-cdn[.]com/gate.js