content-memory
SKILL.md
Content Memory Pipeline
Purpose: Take various content sources → convert to markdown → chunk them → refer to them for future use.
The pipeline is the core value. Folder layout, workspace sync, and integration with other context (e.g. Vesta 7) are secondary; those pieces will be part of context anyway.
Architecture
- memory/ (project root): Content and chunks. No converted/ or chunked/ subfolders; chunks go directly in topic folders. Markdown in <folder>/markdown/.
- workspace/ (optional): Source content for sync_and_chunk; copied to memory.
- Convert to markdown/: PDF/DOCX → .md written in <folder>/markdown/ for each folder.
- Chunk: Reads from source/<domain>/ or memory/<domain>/, writes to memory/<domain>/<topic>/.
When to Activate
- "Add content to memory", "refresh memory", "ingest for agent"
- "Sync workspace to memory", "convert and chunk"
Step 1: Convert to Markdown
Convert non-.md files to markdown in a markdown/ subfolder per folder:
python scripts/convert_to_markdown.py --source <path> [--memory <domain>]
python scripts/convert_to_markdown.py --from source/CBE/domain_journeys_approach
- Writes .md in <folder>/markdown/ (e.g. CB Domain/foo.pdf → CB Domain/markdown/foo.md)
- .md files are skipped
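The path rule above can be sketched in a few lines. This is an illustration only, not the code in convert_to_markdown.py: the function names `markdown_target` and `convert_file` are hypothetical, and the use of markitdown's `MarkItDown().convert(...)` is an assumption based on the Troubleshooting entry that lists markitdown as a dependency.

```python
from pathlib import Path
from typing import Optional


def markdown_target(source: Path) -> Optional[Path]:
    """Return the output path for a source file, or None if it is skipped."""
    if source.suffix.lower() == ".md":
        return None  # .md files are skipped
    # foo.pdf in <folder>/ becomes <folder>/markdown/foo.md
    return source.parent / "markdown" / (source.stem + ".md")


def convert_file(source: Path) -> Optional[Path]:
    """Convert one file and write the result next to it (hypothetical helper)."""
    target = markdown_target(source)
    if target is None:
        return None
    # Imported lazily so the path logic works without the third-party package.
    from markitdown import MarkItDown

    text = MarkItDown().convert(str(source)).text_content
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return target
```

With this rule, `markdown_target(Path("CB Domain/foo.pdf"))` gives `CB Domain/markdown/foo.md`, matching the example in the list above.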
Step 2: Chunk to Memory
Chunk markdown and write directly into memory topic folders:
python scripts/chunk_markdown.py --memory <domain> [--incremental]
- Reads from: source/<domain>/**/*.md or memory/<domain>/**/*.md.
- Writes to: memory/<domain>/<topic>/ (no chunked/ subfolder)
- --incremental: Only chunk new or modified files
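How --incremental decides that a file is "new or modified" is not stated here. One plausible approach, shown as a minimal sketch, is to keep a manifest of content hashes and re-chunk only files whose hash changed. The function name `changed_files` and the JSON manifest are assumptions for illustration; the actual script may compare modification times instead.

```python
import hashlib
import json
from pathlib import Path
from typing import Iterable, List


def changed_files(files: Iterable[Path], manifest_path: Path) -> List[Path]:
    """Return files that are new or modified since the last run, then update the manifest."""
    try:
        seen = json.loads(manifest_path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        seen = {}  # first run: every file counts as new

    changed = []
    for path in files:
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if seen.get(str(path)) != digest:
            changed.append(path)
            seen[str(path)] = digest

    manifest_path.write_text(json.dumps(seen, indent=2), encoding="utf-8")
    return changed
```

Hashing content rather than comparing timestamps avoids re-chunking files that were merely copied or touched, at the cost of reading every file on each run.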
Step 3: Sync Workspace (Convert + Copy + Chunk)
One command for workspace content:
python scripts/sync_and_chunk.py --workspace <topic> --memory <domain> [--incremental]
- Converts non-.md to <folder>/markdown/ (in workspace)
- Copies workspace/<topic> → memory/<domain>/<topic>
- Chunks to memory/<domain>/<topic>/
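The copy step in the middle of that sequence might look like the sketch below. The helper name `copy_topic` and the `root` parameter are hypothetical, and this is not taken from sync_and_chunk.py; it only illustrates the workspace/<topic> → memory/<domain>/<topic> mapping.

```python
import shutil
from pathlib import Path


def copy_topic(root: Path, topic: str, domain: str) -> Path:
    """Copy workspace/<topic> into memory/<domain>/<topic> and return the destination."""
    source = root / "workspace" / topic
    destination = root / "memory" / domain / topic
    # dirs_exist_ok (Python 3.8+) lets repeated syncs overwrite an existing copy.
    shutil.copytree(source, destination, dirs_exist_ok=True)
    return destination
```

Because conversion runs first, the copied tree already contains the markdown/ subfolders, so the chunk step can read everything it needs from memory/<domain>/<topic>/.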
Chunking Strategy
- Slide decks (<!-- Slide number: N -->): One chunk per slide
- Other docs (>200 lines): Split at # or ## boundaries
- Small files (<200 lines): Single chunk
Each chunk includes: <!-- Source: path | file://url -->
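The three rules and the source comment can be combined into one function. The following is a sketch under assumptions, not the logic of chunk_markdown.py: the names `chunk_markdown` and `split_at` are hypothetical, a file of exactly 200 lines is treated as small (the rules above leave that case open), and headings inside fenced code blocks are not special-cased.

```python
import re
from pathlib import Path
from typing import List

SLIDE_MARKER = re.compile(r"^<!-- Slide number: \d+ -->\s*$", re.MULTILINE)
HEADING = re.compile(r"^#{1,2} ", re.MULTILINE)  # matches "# " and "## " only
SMALL_FILE_LINES = 200


def split_at(text: str, pattern: "re.Pattern[str]") -> List[str]:
    """Split text so that each piece starts at a match of pattern."""
    starts = [match.start() for match in pattern.finditer(text)]
    if not starts:
        return [text]
    if starts[0] != 0:
        starts.insert(0, 0)  # keep any preamble before the first match
    bounds = starts + [len(text)]
    pieces = [text[a:b] for a, b in zip(bounds, bounds[1:])]
    return [piece for piece in pieces if piece.strip()]


def chunk_markdown(text: str, source: Path) -> List[str]:
    """Chunk by slide, by heading, or as a whole file, prefixing each chunk with its source."""
    if SLIDE_MARKER.search(text):
        parts = split_at(text, SLIDE_MARKER)
    elif len(text.splitlines()) > SMALL_FILE_LINES:
        parts = split_at(text, HEADING)
    else:
        parts = [text]
    header = f"<!-- Source: {source} | {source.resolve().as_uri()} -->\n"
    return [header + part for part in parts]
```

Checking for slide markers first means a short deck is still split per slide rather than kept as a single chunk.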
Key Behaviors
- Take content sources – PDF, PPTX, DOCX, etc. (or workspace content).
- Convert to markdown – Non-.md files → .md in <folder>/markdown/.
- Chunk – Split markdown into referable chunks (by slide, by heading, or whole file).
- Refer for future use – Chunks live where agents/context can find them; source attribution in each chunk.
- Incremental – Use --incremental to skip unchanged files.
Project-Specific Transformers
| Location | Scope |
|---|---|
| memory/<name>/transformers/ | Memory-specific |
| .content-memory/transformers/ | Workspace-level |
Each .py exports EXTENSIONS and convert(path: Path) -> str.
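As an illustration of that contract, a transformer module for CSV files could look like the sketch below. Only the two exported names, EXTENSIONS and convert(path: Path) -> str, come from the description above; the choice of CSV, the list type of EXTENSIONS, and the leading dot in ".csv" are assumptions.

```python
import csv
from pathlib import Path

# Assumed format: a list of lowercase suffixes including the dot.
EXTENSIONS = [".csv"]


def convert(path: Path) -> str:
    """Render a CSV file as a markdown table (pipe characters in cells are not escaped)."""
    with path.open(newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))
    if not rows:
        return ""
    header, *body = rows
    lines = [
        "| " + " | ".join(header) + " |",
        "|" + "|".join("---" for _ in header) + "|",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines) + "\n"
```

Saved as, say, memory/<name>/transformers/csv_table.py (a hypothetical filename), such a module would apply only to that memory; placed under .content-memory/transformers/ it would apply workspace-wide.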
Scripts
| Script | Purpose |
|---|---|
| convert_to_markdown.py | Convert to markdown/ |
| chunk_markdown.py | Chunk to memory |
| sync_and_chunk.py | Convert + copy + chunk (workspace) |
Run from workspace root. Set CONTENT_MEMORY_ROOT if needed.
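How the scripts resolve the root is not documented here. A reasonable guess, shown as a sketch, is that CONTENT_MEMORY_ROOT overrides the current working directory; the function name `content_memory_root` and the fallback to the current directory are assumptions.

```python
import os
from pathlib import Path


def content_memory_root() -> Path:
    """Return CONTENT_MEMORY_ROOT if set, otherwise the current working directory."""
    override = os.environ.get("CONTENT_MEMORY_ROOT")
    return Path(override).expanduser() if override else Path.cwd()
```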
Troubleshooting
| Issue | Fix |
|---|---|
| No markdown | Run convert; then chunk. Or sync workspace to memory first. |
| Missing markitdown | pip install "markitdown[all]" |
Repository
agilebydesign/a…n-skills
First Seen
Feb 26, 2026