architect-python-uv-takehome
Architect: Python + uv Take-Home
Use this skill for short, time-boxed Python projects such as take-home assignments, experiments, and proof-of-concepts. This skill is intentionally lighter than the production architect skills:
- no Docker unless the user explicitly asks for it
- no CI scaffolding unless the user explicitly asks for it
- no review gate, observability, or deployment requirements by default
- prefer direct CLI workflows and simple local validation
Use this when the prompt already defines the expected output shape, for example a solution/ directory with specific subfolders and artifacts.
This skill is optimized for reviewer speed:
- make the project easy to run in one pass
- keep dependencies low and familiar
- show correctness with deterministic checks
- use LLMs only where they add clear value
This skill is scaffolding, not the core solution. Use it to reduce setup time and standardize the workflow, but keep the assignment-specific reasoning, parsing rules, and validation logic clearly visible in the project itself.
Inputs
Collect these values:
PROJECT_ROOT: defaultsolutionPROJECT_NAME: human-readable project nameMODULE_NAME: snake_case Python package nameCLI_NAME: command name foruv run ...PYTHON_VERSION: default3.12USE_LLM:yesornoLLM_STYLE:direct-sdk|langchain(defaultdirect-sdk)HAS_FIXED_OUTPUT_SHAPE:yesornoOUTPUT_ARTIFACTS: required files the prompt expects
Selection Rules
- If the user needs a quick CLI or offline data-processing submission, use this skill instead of
architect-python-uv-batch. - If the prompt defines a specific deliverable layout, preserve that layout exactly.
- If the task is evaluation-heavy, keep checks local and deterministic before adding any LLM-based review.
- If the assignment includes both "generate output" and "explain differences," split those into deterministic generation plus a separate reporting layer.
- If the user needs a web API, database, Docker-first runtime, or CI pipeline, use a heavier architect skill instead.
Scaffold Workflow
- Create a lightweight Python project with
uv. - Match the prompt's required directory layout exactly.
- Add only the dependencies needed to complete the assignment.
- Build direct CLI entrypoints for the required tasks.
- Produce output artifacts in the expected paths.
- Add a minimal smoke test and deterministic validation path.
- Make the happy path runnable in two commands or fewer.
- Keep the generated scaffolding subordinate to the assignment-specific logic.
Default Project Shape
When the prompt specifies a solution/ directory, use:
solution/
README.md
pyproject.toml
requirements.txt
src/
{{MODULE_NAME}}/
extractor/
agent/
models/
utils/
cli.py
tests/
output/
If the prompt explicitly requires a different structure such as solution/src/extractor, follow the prompt instead of this default package layout.
For assignment prompts with fixed paths, a direct layout is also valid:
solution/
README.md
pyproject.toml
requirements.txt
src/
extractor/
agent/
common/
cli.py
tests/
output/
Dependency Defaults
Start with the smallest useful set:
uv init
uv add pydantic orjson jsonschema typer rich
uv add -d pytest
Add only as needed:
- markdown parsing:
markdown-it-py - tabular parsing:
pandas - LLM direct SDK: provider package such as
openai - diff/report helpers: stdlib first, add packages only if they remove meaningful complexity
If the prompt explicitly asks for requirements.txt or the reviewer is likely to expect it, export one after dependency setup:
uv export --format requirements-txt -o requirements.txt
CLI Pattern
Prefer a small CLI with explicit commands:
extract
validate
run-all
Expected behavior:
extract: read source inputs and write normalized output JSONvalidate: compare extracted output to schema and any provided reference filesrun-all: run extraction, then validation, then write final artifacts
Prefer exposing these through a script entrypoint so the reviewer can run:
uv run {{CLI_NAME}} run-all
Best-Fit Assignment Profile
This skill is a strong fit when the assignment looks like:
- parse a messy input document into a normalized JSON schema
- compare the generated JSON to a provided reference file
- produce a human-readable validation report
- optionally use an LLM to explain discrepancies, confidence, or edge cases
For these tasks, the strongest default architecture is:
- deterministic extractor first
- deterministic schema and reference comparison second
- optional LLM narrative layer last
Do not start with a fully agentic design when a simple pipeline is clearer and easier to verify.
Positioning Rule
When using this skill in a real submission:
- treat the skill as a setup accelerator, not the main artifact
- keep the submitted repository focused on the assignment, not on the skill system
- make sure the core extraction and validation logic is plainly readable without any knowledge of the skill
The reviewer should conclude:
- the skill saved setup time
- the implementation decisions are still yours
- the correctness claims come from deterministic checks, not from trust in scaffolding
The reviewer should not conclude:
- the solution is mostly generated boilerplate
- the workflow is more interesting than the result
- the candidate is hiding weak reasoning behind tool usage
Extraction Pattern
For document extraction tasks:
- identify the 2-6 sections that actually contain authoritative data
- parse those sections first instead of reading the entire corpus linearly
- normalize values into a stable schema
- record uncertainty when text is ambiguous or spread across multiple sections
- prefer deterministic parsing of known headings, tables, and labels before calling an LLM
- if an LLM is needed, send only the narrowest relevant excerpt rather than the whole document
For legal or contract-like markdown:
- prefer heading and article boundaries
- preserve source citations when practical (
section,heading,line snippet) - treat summary schedules and tables as corroborating evidence, not always the sole source of truth
- when multiple sections repeat the same term, prefer the most authoritative contractual definition and note conflicts explicitly
Extraction + Validation Layout
For fixed-layout assignments that ask for an extractor and a validator, prefer:
solution/
src/
extractor/
load_inputs.py
parse_document.py
normalize_terms.py
write_output.py
agent/
compare_terms.py
score_confidence.py
write_report.py
common/
schema_tools.py
types.py
cli.py
output/
extracted_terms.json
validation_report.md
This keeps the responsibilities obvious to a reviewer:
extractor/generates normalized dataagent/evaluates and explainscommon/holds shared schema and utility code
Keep the extractor and validator code concrete and assignment-specific. Do not bury domain parsing behind generic framework layers if that makes the logic harder to inspect.
Validation Pattern
Start with deterministic checks:
- Validate output JSON against the provided schema.
- Compare extracted output to the provided reference output.
- Produce a field-by-field discrepancy summary.
- Fail only on real structural or content mismatches, not formatting differences.
- Separate objective mismatches from subjective interpretation notes.
If an LLM is used, keep it advisory:
- use it to explain discrepancies
- use it to draft a narrative validation report
- do not let it replace the deterministic diff
- do not let it silently "fix" extracted values without showing the original mismatch
For reviewer-facing scoring, prefer:
- exact pass/fail for schema compliance
- explicit discrepancy counts for content mismatches
- optional confidence scores per field with brief reasoning
Minimal Eval Guidance
Prefer simple, assignment-aligned evals over production gating.
Required checks:
- schema compliance
- golden-file comparison against provided expected output
- required artifact existence
- at least one end-to-end smoke test
For extraction tasks, the most useful deterministic eval is a semantic diff:
- compare keys recursively
- report missing fields, unexpected fields, and changed values
- treat ordering as insignificant unless the schema makes order meaningful
- normalize trivial formatting differences before diffing (whitespace, case where appropriate, numeric string formatting)
- use
scripts/semantic_diff.py expected.json actual.jsonfor a fast machine-readable baseline diff
For fixed test input plus fixed expected output, the default grading path should be:
- Run the extractor on the provided input.
- Validate the resulting JSON against the schema.
- Compare it to the provided expected JSON.
- Emit a concise diff summary and an expanded markdown report.
This is usually enough to demonstrate real correctness in a take-home without adding heavy eval infrastructure.
If the prompt includes a provided reference output, that reference comparison should be the centerpiece of the validation story. Do not substitute a vague "AI evaluation" for an explicit diff against the expected output.
README Contract
Keep setup instructions short and local-first:
- install dependencies
- run the extractor
- run the validator
- run the all-in-one command
- explain where outputs are written
- document any environment variables for LLM usage
- include one copy-paste command block a reviewer can run without reading the code first
Do not add deployment, infrastructure, or CI instructions unless the user asked for them.
Walkthrough Guidance
If you explain your process in a walkthrough, demo, or follow-up interview, frame the skill usage like this:
- Use
npx skills add ...to install a lightweight scaffold so setup and project shape are handled quickly. - Adapt the scaffold to the assignment's required directory layout and deliverables.
- Implement the assignment-specific extraction logic directly in the repo.
- Run deterministic validation against the provided schema and expected output.
- Use any LLM-powered reporting only as a final explanatory layer.
In walkthroughs, emphasize:
- why the chosen structure fits the prompt
- what parts are deterministic
- what parts required interpretation
- how you verified correctness
In walkthroughs, do not emphasize:
- the skill mechanism itself as the clever part
- generic scaffolding over domain-specific logic
- "agent" behavior that is not backed by clear validation
Guardrails
- Match the prompt's required output paths exactly.
- Prefer readable code over abstractions that save little time.
- Keep modules small and easy to review during a time-boxed assessment.
- Use deterministic logic for core correctness checks.
- Surface ambiguous extraction decisions explicitly instead of hiding them.
- Keep LLM calls optional and bounded behind a small interface.
- Optimize for "easy to grade" over "architecturally impressive."
- Prefer transparent diffs and traceable reasoning over clever but opaque agent loops.
- Avoid making the final repository feel like a generic generated starter.
- Keep the submission credible even if the reviewer ignores the skill entirely.
Validation Checklist
Run the smallest useful verification set:
uv run pytest -q
uv run {{CLI_NAME}} extract
uv run {{CLI_NAME}} validate
uv run {{CLI_NAME}} run-all
Manual checks:
- expected output files exist
- extracted JSON validates against the target schema
- discrepancy report is readable and references concrete differences
- the README's quickstart works exactly as written
Decision Justification Rule
- Every non-trivial decision must include a concrete justification.
- Capture the alternatives considered and why they were rejected.
- State tradeoffs and residual risks for the chosen option.
- If justification is missing, treat the task as incomplete and surface it as a blocker.