linear-a-decipherment
Linear A Decipherment
Computational pipeline for analyzing Linear A inscriptions against Semitic roots, formalizing Cyrus H. Gordon's five-step decipherment methodology. Built on data from lashon-ha-kretan (1,701 inscriptions, 60 Gordon readings, 2,871 Proto-Semitic roots).
Base directory: ~/.claude/skills/linear-a-decipherment
Scholarly Disclaimer
All readings are hypothetical. Linear A remains officially undeciphered. Gordon's Semitic hypothesis is one of several competing frameworks. Include this disclaimer on every analytical output.
Confidence Taxonomy
Every proposed reading must be tagged with a confidence level:
| Level | Criteria | Example |
|---|---|---|
| CONFIRMED | Ideographic + phonetic + mathematical confirmation | KU-NI-SU (emmer wheat) |
| PROBABLE | Direct Gordon reading + external attestation | DA-KU-SE-NE (Hurrian name at Nuzi) |
| CANDIDATE | Gordon reading or strong Proto-Semitic match (d < 0.3) | New cognate from distance search |
| SPECULATIVE | Weak phonetic match or single-source evidence | Proto-Semitic match with d > 0.5 |
Reference File Protocol
Route questions to the right reference before answering:
Question about a specific reading or word?
→ Read references/gordon-lexicon.md
→ Run: uv run scripts/cognate_search.py "WORD"
Question about methodology or approach?
→ Read references/methodology.md
Question about sign values or the syllabary?
→ Read references/sign-values.md
Question about ML/computational approaches?
→ Read references/ml-approaches.md
Question about a specific inscription?
→ Run: uv run scripts/analyze.py single INSCRIPTION_NAME
Question about corpus statistics?
→ Run: uv run scripts/sign_analysis.py SUBCOMMAND
Data Dependencies
Source data from lashon-ha-kretan:
| File | Path | Contents |
|---|---|---|
| Inscriptions | ~/Desktop/Programming/lashon-ha-kretan/LinearAInscriptions.js |
~1,701 GORILA inscriptions |
| Lexicon | ~/Desktop/Programming/lashon-ha-kretan/semiticLexicon.js |
60 Gordon + 3 YasharMana + 7 scholarly readings |
| Proto-Semitic | ~/Desktop/Programming/lashon-ha-kretan/etymology/Semitic.json |
2,871 roots |
Extracted data cached in data/ (generated by corpus_extract.py --all):
data/corpus.json— Structured inscriptionsdata/gordon.json— Gordon + YasharMana lexicondata/semitic_roots.json— Proto-Semitic rootsdata/cognate_cache.json— Precomputed cognate scores (built bycognate_search.py --build-cache)
If data/ files are missing, run extraction first:
uv run ~/.claude/skills/linear-a-decipherment/scripts/corpus_extract.py --all
Workflows
1. Analyze a Single Inscription
Runs Gordon's 5-step pipeline on one inscription:
# Human-readable report
uv run ~/.claude/skills/linear-a-decipherment/scripts/analyze.py single HT88
# JSON output
uv run ~/.claude/skills/linear-a-decipherment/scripts/analyze.py single HT88 --format json
Steps performed: transliteration extraction, segmentation, consonantal skeleton for each word, cognate search (Gordon → YasharMana → Proto-Semitic cache), coverage summary.
2. Search Cognates for a Word
Find Semitic cognates for any Linear A transliteration:
# Full search with table output
uv run ~/.claude/skills/linear-a-decipherment/scripts/cognate_search.py "KI-RE-TA"
# Skeleton extraction only
uv run ~/.claude/skills/linear-a-decipherment/scripts/cognate_search.py "KI-RE-TA" --skeleton
# JSON with top 10 matches
uv run ~/.claude/skills/linear-a-decipherment/scripts/cognate_search.py "KI-RE-TA" --top 10 --format json
# Skip cache for live Proto-Semitic search
uv run ~/.claude/skills/linear-a-decipherment/scripts/cognate_search.py "KI-RE-TA" --no-cache
Pipeline: transliteration → skeleton (k-r-t) → Gordon direct → YasharMana → Proto-Semitic distance.
3. Find Unknown Words (Discovery Mode)
Identify frequently-occurring words with no known reading—best targets for new cognate proposals:
# Top 20 unknown words appearing 3+ times
uv run ~/.claude/skills/linear-a-decipherment/scripts/analyze.py batch --mode unknowns
# More restrictive: top 10 appearing 5+ times
uv run ~/.claude/skills/linear-a-decipherment/scripts/analyze.py batch --mode unknowns --min-count 5 --top 10
4. Find Promising Inscriptions
Inscriptions with the highest ratio of identified words—best for study:
uv run ~/.claude/skills/linear-a-decipherment/scripts/analyze.py batch --mode promising --top 15
5. Compare Libation Formulas
Group inscriptions containing the libation formula (JA-SA-SA-RA-ME pattern):
# List all libation inscriptions
uv run ~/.claude/skills/linear-a-decipherment/scripts/analyze.py batch --mode libation
# With skeleton alignment
uv run ~/.claude/skills/linear-a-decipherment/scripts/analyze.py batch --mode libation --alignment
6. Corpus Statistics
Statistical analysis of sign patterns:
# Sign frequency (top 30)
uv run ~/.claude/skills/linear-a-decipherment/scripts/sign_analysis.py frequency
# Word frequency with hapax legomena count
uv run ~/.claude/skills/linear-a-decipherment/scripts/sign_analysis.py words
# Sign co-occurrence within words
uv run ~/.claude/skills/linear-a-decipherment/scripts/sign_analysis.py cooccurrence --signs KI,RO,SA
# Positional distribution (initial/medial/final)
uv run ~/.claude/skills/linear-a-decipherment/scripts/sign_analysis.py position
# Site distribution (HT, ZA, PK, etc.)
uv run ~/.claude/skills/linear-a-decipherment/scripts/sign_analysis.py distribution
# JSON output for any subcommand
uv run ~/.claude/skills/linear-a-decipherment/scripts/sign_analysis.py frequency --format json
7. Generate Training Data
Prepare JSONL for ML fine-tuning:
# Preview first 3 entries
uv run ~/.claude/skills/linear-a-decipherment/scripts/finetune_prep.py gordon-pairs --preview 3
# Generate full JSONL
uv run ~/.claude/skills/linear-a-decipherment/scripts/finetune_prep.py gordon-pairs --output data/gordon_pairs.jsonl
v1 produces 63 chat-format pairs (Gordon + YasharMana). See references/ml-approaches.md for v2 augmentation strategy.
8. Reverse Root Search (Semitic Root → Corpus Words)
Given a Semitic consonantal root, find all Linear A words in the corpus whose skeletons match:
# Find corpus words matching root KNS (e.g., kiništu "gathering place")
uv run ~/.claude/skills/linear-a-decipherment/scripts/cognate_search.py --reverse kns
# Broader search with higher distance tolerance
uv run ~/.claude/skills/linear-a-decipherment/scripts/cognate_search.py --reverse kns --max-dist 0.5 -n 30
# JSON output for programmatic use
uv run ~/.claude/skills/linear-a-decipherment/scripts/cognate_search.py --reverse thm --format json
# Search for Baal-related words (b-'-l root)
uv run ~/.claude/skills/linear-a-decipherment/scripts/cognate_search.py --reverse bl
# Search for "give" root (y-t-n)
uv run ~/.claude/skills/linear-a-decipherment/scripts/cognate_search.py --reverse ytn
Pipeline: root consonants → weighted Levenshtein against all corpus word skeletons → ranked by distance, annotated with Gordon/YasharMana readings, occurrence counts, sites, and inscriptions.
9. Extract / Rebuild Corpus
Extract structured data from JS source files:
# Extract everything (inscriptions + lexicons + Proto-Semitic roots)
uv run ~/.claude/skills/linear-a-decipherment/scripts/corpus_extract.py --all
# Inscriptions only, filtered by site
uv run ~/.claude/skills/linear-a-decipherment/scripts/corpus_extract.py --site HT
# Include Gordon lexicon
uv run ~/.claude/skills/linear-a-decipherment/scripts/corpus_extract.py --with-gordon
# Build cognate cache (takes ~10 seconds)
uv run ~/.claude/skills/linear-a-decipherment/scripts/cognate_search.py --build-cache
Integration with Other Skills
| Skill | Usage |
|---|---|
rlama |
Create gordon-dossiers RAG collection from ~/Desktop/minoanmystery-astro/souls/minoan/dossiers/scholarly-sources/gordon/ |
ancient-near-east-research |
Sefaria for Hebrew cognate verification, CDLI for Akkadian parallels |
exa-search |
Search recent computational decipherment papers |
llama-cpp |
Local inference with fine-tuned decipherment models (v2) |
Architecture
~/.claude/skills/linear-a-decipherment/
├── SKILL.md # This file
├── lib/ # Shared Python library
│ ├── __init__.py
│ ├── types.py # Frozen dataclasses (Inscription, LexiconEntry, CognateMatch)
│ ├── js_parser.py # JS Map → Python dict extraction
│ ├── normalization.py # normalize(), lookup_in(), J/Y swap
│ ├── skeleton.py # SIGN_DECOMPOSITION, extract_skeleton()
│ └── phonetics.py # SEMITIC_DISTANCES, weighted_levenshtein()
├── scripts/
│ ├── corpus_extract.py # JS → JSON extraction
│ ├── cognate_search.py # Forward + reverse cognate search + cache builder
│ ├── sign_analysis.py # Corpus-wide sign statistics
│ ├── analyze.py # Gordon 5-step pipeline (single + batch)
│ └── finetune_prep.py # ML training data generation
├── references/
│ ├── gordon-lexicon.md # Complete 60+3+7 entry lexicon tables
│ ├── methodology.md # Gordon's methods, 5-step pipeline
│ ├── sign-values.md # Sign confidence levels (HIGH/MEDIUM/LOW)
│ └── ml-approaches.md # Computational decipherment survey (v2)
└── data/ # Generated (not committed)
├── corpus.json # 1,701 inscriptions
├── gordon.json # 60 Gordon + 3 YasharMana + 7 scholarly entries
├── semitic_roots.json # 2,871 Proto-Semitic roots
└── cognate_cache.json # Precomputed cognate scores
All scripts use uv run with PEP 723 inline metadata. Dependencies: stdlib only.
More from tdimino/claude-code-minoan
academic-research
Search academic papers, build literature reviews, and synthesize research findings — combines Exa MCP (research_paper category, arxiv filtering) with arxiv-mcp-server for paper discovery, download, and deep analysis. Triggers on academic paper, literature review, research synthesis, arxiv, find papers, scholarly search.
69travel-requirements-expert
Plan a trip, create an itinerary, or research a destination through a structured 5-phase workflow---discovery questions, Exa/Firecrawl research, expert detail gathering, and a day-by-day requirements spec. This skill should be used when a user says "plan a trip," "create an itinerary," "help me visit [place]," or needs travel research with specific venues, safety protocols, and dietary accommodations.
67twilio-api
Use this skill when working with Twilio communication APIs for SMS/MMS messaging, voice calls, phone number management, TwiML, webhook integration, two-way SMS conversations, bulk sending, or production deployment of telephony features. Includes official Twilio patterns, production code examples from Twilio-Aldea (provider-agnostic webhooks, signature validation, TwiML responses), and comprehensive TypeScript examples.
65figma-mcp
Convert Figma designs into production-ready code using MCP server tools. Use this skill when users provide Figma URLs, request design-to-code conversion, ask to implement Figma mockups, or need to extract design tokens and system values from Figma files. Works with frames, components, and entire design files to generate HTML, CSS, React, or other frontend code.
61firecrawl
Scrape web pages to clean markdown using Firecrawl v2 — handles JS-heavy pages, site crawls, URL mapping, document parsing (PDF/DOCX/XLSX), LLM-powered extraction, autonomous agent scraping, and post-scrape browser interaction (Interact API). Prefer over WebFetch for quality and completeness. Triggers on scrape URL, fetch page, crawl site, extract content, parse document, web to markdown, DeepWiki, Firecrawl.
51scrapling
Scrape pages locally with anti-bot bypass, TLS impersonation, and adaptive element tracking — no API keys, no cloud. Handles Cloudflare protection, CSS/XPath element extraction, and survives site redesigns. Complements firecrawl (cloud) with 100% local execution. Triggers on Cloudflare bypass, anti-bot scraping, stealth fetch, local scraping, Scrapling.
47