address-clustering-attribution
Address clustering and attribution
Educational only. Do not assist with harassment, non-consensual doxxing, or sanctions evasion. High-stakes conclusions require legal and compliance process—not a vendor UI alone.
Why clustering exists
Addresses are pseudonymous. Clustering infers that multiple addresses are likely controlled by the same real-world actor, so analysts can reason about “wallets” rather than isolated strings.
All clustering is probabilistic—wrong merges and missed links happen.
UTXO chains (e.g. Bitcoin)
- Common input ownership heuristic (CIOH) — If two (or more) addresses appear as inputs to the same transaction, they are often assumed to be controlled by the same spender (standard wallet behavior). Exceptions exist (coinjoin, collaborative txs, some privacy techniques).
- Change outputs — Many spends send value to a recipient and “change” back to a new address controlled by the sender; change-detection heuristics try to link those outputs to the same wallet cluster over time.
Privacy practices (coinjoin, careful coin selection) weaken naive clustering.
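The common input ownership heuristic above can be sketched with a union-find structure. This is a minimal illustration, not any vendor's method: the transaction schema (`inputs` as a list of addresses, `outputs` as `(address, value)` pairs), the function names, and the crude "several equal-valued outputs" coinjoin filter are all assumptions made for the example.

```python
from collections import defaultdict


class UnionFind:
    """Disjoint-set structure used to merge addresses into clusters."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        # Path halving keeps lookups fast on long chains.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def looks_like_coinjoin(tx, min_equal_outputs=3):
    """Crude filter (an assumption): many equal-valued outputs hint at a collaborative tx."""
    counts = defaultdict(int)
    for _address, value in tx["outputs"]:
        counts[value] += 1
    return max(counts.values(), default=0) >= min_equal_outputs


def cluster_by_common_inputs(transactions):
    """Merge the input addresses of every transaction that passes the coinjoin filter."""
    uf = UnionFind()
    for tx in transactions:
        inputs = tx["inputs"]
        for addr in inputs:
            uf.find(addr)  # register every input address, even if it stays a singleton
        if looks_like_coinjoin(tx):
            continue  # do not merge inputs of a likely collaborative transaction
        for addr in inputs[1:]:
            uf.union(inputs[0], addr)
    clusters = defaultdict(set)
    for addr in list(uf.parent):
        clusters[uf.find(addr)].add(addr)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical data: the third transaction has three equal outputs and is skipped.
txs = [
    {"inputs": ["A", "B"], "outputs": [("X", 5), ("A2", 1)]},
    {"inputs": ["B", "C"], "outputs": [("Y", 2)]},
    {"inputs": ["D", "E", "F"], "outputs": [("P", 1), ("Q", 1), ("R", 1)]},
]
clusters = cluster_by_common_inputs(txs)
```

Because merges are transitive, A and C end up in one cluster through B even though they never co-spend. The same transitivity is why a single wrong merge can join two unrelated wallets, so any filter like the one above only reduces false merges and does not eliminate them.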
Account-based chains (e.g. EVM)
- Deposit–sweep patterns — Exchanges often give users unique deposit addresses; funds later sweep to hot/cold pools. Analytics tools model “user deposit → exchange cluster” links using repeated patterns and timing.
- Smart contract wallets — Multisig, account abstraction, and relayers can complicate “one EOA = one person” assumptions.
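A deposit–sweep link can be illustrated with a very small rule: an address is a deposit-address candidate if every transfer it sends goes to one already-attributed hot wallet, and it has done so repeatedly. This is a sketch under stated assumptions; the `(sender, receiver, amount)` tuple schema, the `hot_wallets` set, and the `min_sweeps` threshold are hypothetical, and real tools also weigh timing and amounts, which this example ignores.

```python
from collections import defaultdict


def find_deposit_candidates(transfers, hot_wallets, min_sweeps=2):
    """Return {address: hot_wallet} for addresses that only ever forward to one hot wallet.

    transfers: iterable of (sender, receiver, amount) tuples (hypothetical schema).
    hot_wallets: set of addresses already attributed to an exchange (assumed given).
    """
    destinations = defaultdict(list)
    for sender, receiver, _amount in transfers:
        destinations[sender].append(receiver)

    candidates = {}
    for addr, dests in destinations.items():
        if addr in hot_wallets:
            continue
        unique = set(dests)
        # Require a single destination, a known hot wallet, and repeated sweeps.
        if len(unique) == 1 and unique <= hot_wallets and len(dests) >= min_sweeps:
            candidates[addr] = dests[0]
    return candidates


# Hypothetical transfers: dep1 sweeps twice, dep2 only once, user1 pays several parties.
transfers = [
    ("user1", "dep1", 10), ("dep1", "HOT", 10),
    ("user2", "dep1", 4), ("dep1", "HOT", 4),
    ("user3", "dep2", 7), ("dep2", "HOT", 7),
    ("user1", "HOT", 1), ("user1", "friend", 2),
]
links = find_deposit_candidates(transfers, hot_wallets={"HOT"})
```

Here only `dep1` qualifies. A smart contract wallet or relayer that happens to forward to the same pool would also match, which is one reason such links should stay hypotheses.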
Entity, label, and tag (metadata layers)
| Layer | Typical meaning |
|---|---|
| Entity | A cluster of addresses grouped as one actor (person, fund, exchange)—often multichain |
| Label | Name for a specific address (e.g. named hot wallet); may include private user labels |
| Tag | Behavioral or risk descriptor (many per address); often produced by a mix of automation, analyst review, and community input |
Attribution maps a cluster to a real-world name using OSINT, subpoenas, exchange cooperation, or leaks—confidence varies.
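The three metadata layers and the attribution step can be modeled as a small data structure. The sketch below is illustrative only: the class names, fields, and sample values (`ExampleExchange`, `0xabc`, `bc1qxyz`) are hypothetical and do not reflect any particular tool's schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class AddressRecord:
    """One address with its optional label and any number of tags."""
    address: str
    chain: str
    label: Optional[str] = None                   # name for this specific address
    tags: Set[str] = field(default_factory=set)   # behavioral or risk descriptors


@dataclass
class Entity:
    """A cluster of addresses treated as one actor, possibly across chains."""
    name: str
    confidence: float                                  # 0.0 to 1.0, analyst-assigned (assumption)
    sources: List[str] = field(default_factory=list)   # where the attribution came from
    addresses: List[AddressRecord] = field(default_factory=list)

    def chains(self):
        """Return the sorted list of chains this entity appears on."""
        return sorted({record.chain for record in self.addresses})


exchange = Entity(name="ExampleExchange", confidence=0.8, sources=["public disclosure"])
exchange.addresses.append(
    AddressRecord("0xabc", "ethereum", label="Hot wallet 1", tags={"exchange", "high-volume"})
)
exchange.addresses.append(AddressRecord("bc1qxyz", "bitcoin"))
entity_chains = exchange.chains()
```

Keeping `confidence` and `sources` on the entity rather than on each address reflects that attribution is a claim about the cluster as a whole, while labels and tags describe individual addresses.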
Flow tracing concepts
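- Peel chains: A sequence of transactions, often in rapid succession, in which each hop sends a small amount to a recipient (the "peel") and forwards the remainder to a fresh address; sometimes used to obfuscate the origin of funds. Pattern-based detection is heuristic.
- Taint / proximity scoring: Some tools score how closely funds are connected to a flagged source along a transaction path; scoring definitions differ by vendor, and a score is not a legal verdict.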
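One simple way to picture proximity scoring is hop distance from a flagged address with a per-hop decay. Since definitions differ by vendor, the formula `decay ** hops`, the `max_hops` cutoff, and the edge-list input below are assumptions chosen for illustration and ignore amounts entirely.

```python
from collections import deque


def proximity_scores(edges, flagged, decay=0.5, max_hops=3):
    """Score addresses reachable from `flagged` as decay ** (shortest hop count).

    edges: iterable of (sender, receiver) pairs (hypothetical schema, amounts ignored).
    Addresses more than `max_hops` away are not scored at all.
    """
    graph = {}
    for sender, receiver in edges:
        graph.setdefault(sender, []).append(receiver)

    hops = {flagged: 0}
    queue = deque([flagged])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in graph.get(node, []):
            if nxt not in hops:  # breadth-first order gives the shortest hop count
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return {addr: decay ** h for addr, h in hops.items()}


# Hypothetical flow graph: "b" is reachable both directly and through "a".
edges = [("BAD", "a"), ("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("BAD", "b")]
scores = proximity_scores(edges, "BAD")
```

In this example `b` scores 0.5 because the shortest path wins, and `e` is left unscored because it sits four hops out. A value-weighted model would rank the same addresses differently, which is why scores from different tools are not directly comparable.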
Quality practices
- Treat clusters as hypotheses; seek independent corroboration for accusations or compliance actions.
- Stale or wrong labels and merges occur—especially after protocol upgrades or custodial restructuring.
- Mixers, privacy protocols, and cross-chain bridges break simple narratives; a trace that passes through them may be incomplete.
Related
For investigation workflow (OSINT steps), see crypto-investigation-compliance. For platform use cases (AML dashboards), see blockchain-analytics-operations. For bridge-linked and multi-chain unified clustering (graphs across chains, wrapped-asset normalization), see cross-chain-clustering-techniques-agent.