skills/natsufox/a-stockit/watchlist-import

watchlist-import

Installation
SKILL.md

Watchlist Import

Import a watchlist for: $ARGUMENTS

Overview

  • Implementation status: code-backed
  • Local entry script: <bundle-root>/watchlist-import/run.py
  • Primary purpose: normalize raw universe input into resolved and unresolved items without silently guessing ambiguous names
  • Research layer: universe formation (Stage 1: Strategy Design & Hypothesis Formation - Universe definition)
  • Workflow stage: stage 1 Strategy Design & Universe Formation
  • Local executor guarantee: parse raw text or supported files, normalize symbol-like items, preserve unresolved entries, and return a structured import summary

Use When

  • The user starts with pasted symbols, symbol-like text, or a watchlist file.
  • The caller needs symbol normalization before screening, dashboards, or repeat coverage.
  • The user wants unresolved names separated from recognized symbols instead of guessed mapping.
  • The user wants to build a universe from external sources (spreadsheets, text files, web scraping results).
  • The user wants to validate and clean a symbol list before analysis.

Do Not Use When

  • The user already has a clean symbol list and wants ranking. Use market-screen or decision-dashboard.
  • The user wants post-import grouping, review cadence, or repeated tracking. Use watchlist.
  • The user wants one-symbol analysis rather than universe construction.
  • The user wants portfolio construction with position sizing. Use decision-support or strategy-design.

Inputs

  • Direct positional items interpreted as raw symbol-like text.
  • Optional --text TEXT: explicit text payload as the watchlist source.
  • Optional --file PATH: load from a local text or .xlsx file.
  • Current implementation scope: text and .xlsx parsing only.
  • Universe-construction note:
    • import does not validate liquidity, sector fit, tradeability, or strategy fit
    • it only resolves symbol identity as far as the local parser can do so safely

Execution

Step 1: Define import source and objectives

Before importing, clarify the source and purpose:

Import source types:

  • Direct text input (pasted symbols, comma-separated, space-separated)
  • Text file (one symbol per line, or delimited)
  • Spreadsheet file (.xlsx, .csv with symbol column)
  • Mixed format (symbols + company names + other text)
  • Web scraping result (HTML table, JSON, etc.)

Import objectives:

  • Build initial universe for screening
  • Clean and validate existing watchlist
  • Merge multiple watchlists
  • Convert company names to symbols
  • Validate symbol format and existence

Expected output:

  • Clean symbol list (resolved symbols only)
  • Unresolved items list (for manual review)
  • Import statistics (total, resolved, unresolved, confidence)
  • Ready for downstream screening or analysis

Step 2: Parse and normalize input

Apply systematic parsing and normalization:

Text parsing:

  • Delimiter detection: Identify separators (comma, space, newline, tab, semicolon)
  • Symbol extraction: Extract symbol-like patterns (6-digit codes, alphanumeric with dots)
  • Company name detection: Identify Chinese company names (contains 公司, 股份, 集团, etc.)
  • Noise removal: Remove common non-symbol text (headers, labels, punctuation)

Symbol normalization:

  • A-share format: 6-digit code (e.g., 600519, 000001, 300750)
  • Board prefix: Add board prefix if missing (SH for 600xxx/601xxx/603xxx/688xxx, SZ for 000xxx/001xxx/002xxx/300xxx)
  • Case normalization: Uppercase for consistency
  • Padding: Pad short codes to 6 digits if needed (e.g., 1 → 000001)

Company name resolution:

  • Exact match: Look up company name in symbol database
  • Fuzzy match: Try partial matching for common abbreviations
  • Ambiguity handling: If multiple matches, mark as unresolved
  • No match: Keep in unresolved list for manual review

Spreadsheet parsing:

  • Column detection: Identify symbol column (by header or content pattern)
  • Multi-column support: Extract symbols from multiple columns if present
  • Row filtering: Skip header rows, empty rows, total rows
  • Data type handling: Convert numeric codes to strings with leading zeros

Step 3: Validate and categorize symbols

Validate each resolved symbol:

Symbol format validation:

  • 6-digit numeric code
  • Valid board prefix (SH or SZ)
  • Matches A-share symbol pattern
  • No invalid characters

Symbol existence validation:

  • Symbol exists in A-share market
  • Symbol is not delisted
  • Symbol has recent trading data
  • Symbol board matches prefix

Symbol categorization:

  • Main board: 600xxx, 601xxx, 603xxx (Shanghai), 000xxx, 001xxx (Shenzhen)
  • ChiNext: 300xxx (Shenzhen)
  • STAR Market: 688xxx (Shanghai)
  • Beijing Stock Exchange: 8xxxxx, 4xxxxx
  • ST/*ST: Special treatment stocks (higher risk)

Confidence scoring (0-100):

  • High confidence (90-100): Valid 6-digit code, exists in market, has recent data
  • Medium confidence (70-89): Valid format, exists in market, but stale data or suspended
  • Low confidence (50-69): Valid format, but cannot verify existence
  • Unresolved (0-49): Invalid format, ambiguous, or no match found

Step 4: Handle unresolved items

Process items that cannot be resolved:

Unresolved categories:

  • Invalid format: Not a valid A-share symbol format
  • Ambiguous: Multiple possible matches (e.g., company name matches multiple symbols)
  • Not found: Valid format but symbol does not exist
  • Delisted: Symbol was valid but is now delisted
  • Foreign: Non A-share symbol (H-share, US stock, etc.)

For each unresolved item:

  • Original text (as provided by user)
  • Reason for non-resolution (invalid format, ambiguous, not found, etc.)
  • Suggested actions (manual lookup, format correction, removal)
  • Possible matches (if ambiguous, list candidates)

Unresolved handling strategies:

  • Preserve for manual review: Keep in separate list for user to resolve
  • Suggest corrections: Provide likely corrections based on fuzzy matching
  • Flag for removal: Clearly invalid items that should be removed
  • Request clarification: Ambiguous items that need user input

Step 5: Generate import summary

Organize results into structured report:

Part 1: Import Summary

  • Import source (text, file, spreadsheet)
  • Import timestamp
  • Total items processed
  • Resolved symbols (count and %)
  • Unresolved items (count and %)
  • Average confidence score

Part 2: Resolved Symbols For each resolved symbol:

  • Symbol (with board prefix)
  • Company name (if available)
  • Board (Main/ChiNext/STAR/Beijing)
  • Confidence score (0-100)
  • Status (active/suspended/ST/*ST)
  • Latest price and date (if available)

Part 3: Resolved Symbol Statistics

  • Board distribution (% Main, ChiNext, STAR, Beijing)
  • Status distribution (% active, suspended, ST/*ST)
  • Confidence distribution (% high, medium, low)
  • Sector distribution (if available)
  • Market cap distribution (if available)

Part 4: Unresolved Items For each unresolved item:

  • Original text
  • Reason for non-resolution
  • Suggested action
  • Possible matches (if any)

Part 5: Unresolved Item Statistics

  • Unresolved by category (invalid format, ambiguous, not found, delisted, foreign)
  • Unresolved by source (which part of input had most issues)
  • Resolvability assessment (% that could be resolved with manual effort)

Part 6: Import Quality Assessment

  • Import quality score (0-100)
    • Based on resolution rate, confidence scores, and data completeness
  • Data completeness (% with company name, price, status)
  • Source quality (clean vs. noisy input)
  • Screening readiness (ready/needs cleanup/major issues)

Part 7: Recommended Next Steps

  • If high quality (>90% resolved, high confidence):
    • Ready for market-screen or decision-dashboard
  • If medium quality (70-90% resolved):
    • Review unresolved items, then proceed to screening
  • If low quality (<70% resolved):
    • Manual cleanup required before screening
    • Suggest using watchlist for ongoing management

Step 6: Run the local executor

python3 <bundle-root>/watchlist-import/run.py [--text TEXT] [--file PATH] [items...]

Step 7: Review result as universe-intake artifact

When delivering results, maintain proper framing:

Import interpretation:

  • This is symbol normalization and validation, not investment recommendation
  • Resolved symbols are format-valid and exist in market, not necessarily good investments
  • Confidence scores reflect data quality, not investment quality
  • Unresolved items require manual review, not automatic exclusion

Source transparency:

  • State import source (text, file, spreadsheet, mixed)
  • State source quality (clean, noisy, mixed format)
  • State whether source looked like pure symbols or mixed content
  • State any parsing assumptions made

Quality assessment:

  • State import quality score and what it means
  • State resolution rate and confidence distribution
  • State whether result is screening-ready or needs cleanup
  • State specific issues found (invalid formats, ambiguous names, etc.)

Ambiguity preservation:

  • Do not silently coerce ambiguous names into symbols
  • Keep unresolved items explicit with reasons
  • Provide suggested actions for each unresolved item
  • Do not guess or fabricate symbol mappings

Validation limitations:

  • Import validates format and existence, not tradeability
  • Import does not validate liquidity, listing status, or strategy fit
  • Import does not check for duplicates across different formats (e.g., 600519 vs. SH600519)
  • Import does not validate sector fit, market cap range, or other strategy criteria

Step 8: Route to appropriate next skill

Based on import quality, route to next step:

High quality import (>90% resolved):

  • market-screen: Rank symbols by technical strength
  • decision-dashboard: Generate batch decision summary
  • watchlist: Save for ongoing monitoring

Medium quality import (70-90% resolved):

  • Manual review of unresolved items first
  • Then proceed to market-screen or decision-dashboard
  • Consider using watchlist for ongoing cleanup

Low quality import (<70% resolved):

  • Manual cleanup required
  • Use watchlist for iterative refinement
  • Consider re-importing with cleaner source

For individual symbol analysis:

  • market-brief: Quick overview of specific symbols
  • analysis: Full thesis development for selected symbols

Output Contract

  • Minimum local executor output: human-readable text beginning with 导入结果.
  • Core fields: total item count, resolved symbol count, unresolved count, resolved entries, unresolved entries.
  • Side effects: stores a session-history entry for the import operation.
  • Caller-facing delivery standard:
    • Seven-part structure: Import summary, resolved symbols, resolved statistics, unresolved items, unresolved statistics, quality assessment, recommended next steps
    • Source transparency: State source form (text, file, spreadsheet, mixed) and quality (clean, noisy)
    • Resolution accounting: Distinguish resolved symbols from unresolved items with specific reasons
    • Confidence scoring: Provide confidence scores (0-100) for each resolved symbol
    • Ambiguity preservation: Keep unresolved entries explicit, do not silently coerce or guess
    • Quality assessment: Import quality score, resolution rate, confidence distribution, screening readiness
    • Validation scope: State what was validated (format, existence) and what was NOT (liquidity, tradeability, strategy fit)
    • Next steps guidance: Recommend appropriate downstream skills based on import quality
    • No investment claims: Resolved symbols are format-valid, not investment recommendations
  • Non-guarantees:
    • No liquidity, listing-status, survivorship, or strategy-fit validation is guaranteed here
    • No durable local watchlist store is created by this skill alone (use watchlist for persistence)
    • No duplicate detection across different symbol formats
    • No sector, market cap, or other strategy criteria validation

Failure Handling

  • Parse and argument errors: non-zero exit with a readable 命令错误 message.
  • File read or parse failures: readable failure text beginning with 执行失败:.
  • Unrecognized names: stay in the unresolved list instead of being guessed.
  • If the source format is unsupported, say so directly instead of partial silent import.
  • Ambiguous company names: list all possible matches in unresolved items, do not pick one arbitrarily.
  • Invalid symbol formats: keep in unresolved list with specific format error.
  • Delisted symbols: flag as delisted in unresolved list, do not silently include.

Key Rules

  • Treat import as preprocessing, not as recommendation.
  • Preserve ambiguity and unresolved rows explicitly.
  • Do not silently coerce names into symbols.
  • Use the import result as the intake gate before ranking, dashboarding, or repeated coverage.
  • Confidence scoring is mandatory. Every resolved symbol must have confidence score.
  • Unresolved categorization is mandatory. Every unresolved item must have reason.
  • Quality assessment is mandatory. Provide import quality score and screening readiness.
  • Source transparency is mandatory. State source type and quality.
  • Validation scope must be explicit. State what was validated and what was NOT.
  • No silent guessing. If ambiguous, keep unresolved with suggestions.
  • No fabricated mappings. Only resolve when confident.
  • Routing guidance must be specific. Recommend next skill based on import quality.

Composition

  • Natural upstream step for market-screen, decision-dashboard, and watchlist.
  • Can also prepare symbol lists for analysis and repeated one-symbol workflows.
  • Should be followed by watchlist for persistent storage and ongoing management.
  • High quality imports can proceed directly to market-screen or decision-dashboard.
  • Low quality imports should route to watchlist for iterative cleanup.
  • Resolved symbols can feed into any single-symbol skill (market-brief, analysis, etc.).
Weekly Installs
1
GitHub Stars
2
First Seen
Mar 20, 2026