Draw Thing

Local AI image generation and editing via draw-things-cli. Wraps the Draw Things inference stack for txt2img, img2img, upscaling, inpainting, ControlNet, LoRA, batch generation, and hi-res fix on macOS.

Scope: Local Draw Things image generation and editing only. NOT for UI implementation (frontend-designer), ad copy iteration (ad-creative), or broad vendor/tool research (research).

Canonical Vocabulary

Term	Meaning	NOT
txt2img	Text-to-image generation: prompt in, image out	img2img
img2img	Image-to-image: input image + prompt, modified image out	txt2img
upscale	Increase resolution while preserving/enhancing detail	img2img with high strength
inpaint	Replace content within a masked region	img2img (full image)
ControlNet	Structural guidance from a control image (edges, depth, pose)	LoRA (style/subject)
LoRA	Low-Rank Adaptation: small model add-on for style/subject	ControlNet (structure)
negative prompt	Text describing what to exclude; essential for SD 1.5, minimal for SDXL, unused for Flux	positive prompt
cfg_scale	Guidance scale: how literally the model follows the prompt	denoising strength
denoising strength	How much to change an input image (0.0 = none, 1.0 = complete redraw)	cfg_scale
sampler	Diffusion algorithm (DPM++ 2M Karras, Euler a, DDIM, etc.)	model
seed	Random number determining exact output; same seed with the same prompt, model, and settings = same image	prompt
batch	Generate multiple images in one run with different seeds	sequential runs
hi-res fix	Two-pass: generate at low res, then upscale with denoising	standalone upscaler

Dispatch

`$ARGUMENTS`	Mode	Action
`generate <prompt>` / `create <prompt>`	Generate	txt2img via CLI
`edit <path> <prompt>` / `transform <path>`	Edit	img2img with `--strength`
`upscale <path>` / `enhance <path>` / `superres <path>`	Upscale	`--upscaler` + `--upscaler-scale`
`inpaint <path> <mask> <prompt>`	Inpaint	img2img with mask input
`controlnet <control_path> <prompt>` / `cn <path> <prompt>`	ControlNet	`--controls` JSON
`lora <prompt> --lora <name>`	LoRA	`--loras` JSON
`batch <prompt>` / `variations <prompt>`	Batch	`--batch-count` variations
`model <name>`	Model info	Show recommended settings
`refine` / `iterate`	Refine	Re-run with adjusted params, locked seed
`gallery` / `recent`	Gallery	List recent outputs
(empty)	Help	Verify CLI, show modes, examples
Natural language image description	Auto: Generate	Detect prompt intent
Path to image + modification intent	Auto: Edit/Upscale	Detect intent from context

Auto-Detection Heuristic

Keywords: animate, video, motion, gif, mp4 -> Refuse: out of scope for v1.0
File path + "upscale/enhance/bigger/higher res/superres" -> Upscale
File path + mask path + descriptive prompt -> Inpaint
File path + "inpaint" keyword but NO mask path -> inform user a mask is required; offer to create one via ImageMagick or suggest Edit mode
File path + modification verb (change, edit, transform, restyle) -> Edit
Descriptive text with no file path -> Generate
Ambiguous -> ask which mode
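
The heuristic above can be sketched as a small shell function. This is only an illustration: detect_mode and its return labels are hypothetical names, and the plain substring matching is an assumption that will misfire on words such as "gift" or "emotion".

```shell
# Classify a request into a mode, following the heuristic order above.
# Usage: detect_mode "<request text>" [image_path] [mask_path]
# detect_mode is a hypothetical helper, not part of draw-things-cli.
detect_mode() {
  text=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  image=${2:-}
  mask=${3:-}

  # Video keywords are refused first, regardless of any paths.
  case "$text" in
    *animate*|*video*|*motion*|*gif*|*mp4*) echo refuse; return ;;
  esac

  if [ -z "$image" ]; then
    # No file path: descriptive text means txt2img; nothing at all means help.
    if [ -n "$text" ]; then echo generate; else echo help; fi
    return
  fi

  case "$text" in
    *upscale*|*enhance*|*bigger*|*"higher res"*|*superres*) echo upscale; return ;;
  esac

  # Image plus mask means inpaint.
  if [ -n "$mask" ]; then echo inpaint; return; fi

  case "$text" in
    *inpaint*) echo need-mask; return ;;   # mask required: inform the user
    *change*|*edit*|*transform*|*restyle*) echo edit; return ;;
  esac

  echo ask
}

detect_mode "restyle as watercolor" photo.png   # prints: edit
```

Checking the video keywords first keeps an out-of-scope request from being routed to Edit just because a file path is present.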

Prerequisite Protocol

Run this before any generation operation:

1. Check CLI: command -v draw-things-cli
2. If NOT found, show the install commands and STOP:

   brew tap drawthingsai/draw-things
   brew install --HEAD drawthingsai/draw-things/draw-things-cli

3. If found, verify: draw-things-cli generate --help
4. Detect model directory:
   - Default: ~/Library/Containers/com.liuliu.draw-things/Data/Documents/Models
   - Override: $DRAWTHINGS_MODELS_DIR

List available models if user needs guidance:

ls "${DRAWTHINGS_MODELS_DIR:-$HOME/Library/Containers/com.liuliu.draw-things/Data/Documents/Models}"/*.{ckpt,safetensors} 2>/dev/null
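
A minimal sketch of the prerequisite check as two reusable functions. The names models_dir and preflight are hypothetical; the paths and brew commands are the ones listed above.

```shell
# Resolve the models directory: the env override wins, else the default container path.
models_dir() {
  printf '%s\n' "${DRAWTHINGS_MODELS_DIR:-$HOME/Library/Containers/com.liuliu.draw-things/Data/Documents/Models}"
}

# Return 0 if the CLI is on PATH; otherwise print the install commands and return 1.
preflight() {
  if command -v draw-things-cli >/dev/null 2>&1; then
    return 0
  fi
  echo "draw-things-cli not found. Install with:" >&2
  echo "  brew tap drawthingsai/draw-things" >&2
  echo "  brew install --HEAD drawthingsai/draw-things/draw-things-cli" >&2
  return 1
}
```

A caller would typically run `preflight || exit 1` before assembling any generate command.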

Model Quick-Reference

Family	`--model`	Dims	Steps	CFG	Sampler	Prompt Style
Flux Schnell	`flux_1_schnell_q5p.ckpt`	1024x1024	4	1.0	`"Euler a"`	Natural language
Flux Dev	`flux_1_dev_q6p.ckpt`	1024x1024	30	1.0	`"Euler a"`	Natural language
Flux Klein	`flux_2_klein_4b_q6p.ckpt`	1024x1024	4	1.0	`"DPM++ 2M AYS"`	Natural language
Flux Klein 9B	`flux_2_klein_9b_q6p.ckpt`	1024x1024	8	1.0	`"DPM++ 2M AYS"`	Natural language
SDXL	`sd_xl_base_1.0.safetensors`	1024x1024	25	7.0	`"DPM++ 2M Karras"`	Tags + sentences
SD 1.5	`v1-5-pruned-emaonly.ckpt`	512x512	25	7.5	`"DPM++ 2M Karras"`	Comma-separated tags

Decision guide:

Need fast prototyping? -> Flux Schnell or Klein (4 steps, ~1-2s)
Need best quality? -> Flux Dev (30 steps) or an SDXL fine-tune such as Juggernaut XL
Need huge LoRA library? -> SD 1.5 (most mature ecosystem)
Need text in images? -> Flux (dramatically better text rendering)
Low memory / lightest model? -> SD 1.5 (4-6 GB)

For full model catalog with checkpoints, quantization guide, and SDXL resolutions, load references/model-catalog.md.
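
One way to make the quick-reference table usable from a script is a lookup function. In this sketch the family keys (flux-schnell, sdxl, sd15, and so on) are hypothetical shorthand; the values are copied from the table above.

```shell
# Print "model|width|height|steps|cfg|sampler" for a family key.
# A pipe separator is used because sampler names contain spaces.
model_defaults() {
  case "$1" in
    flux-schnell)  echo "flux_1_schnell_q5p.ckpt|1024|1024|4|1.0|Euler a" ;;
    flux-dev)      echo "flux_1_dev_q6p.ckpt|1024|1024|30|1.0|Euler a" ;;
    flux-klein)    echo "flux_2_klein_4b_q6p.ckpt|1024|1024|4|1.0|DPM++ 2M AYS" ;;
    flux-klein-9b) echo "flux_2_klein_9b_q6p.ckpt|1024|1024|8|1.0|DPM++ 2M AYS" ;;
    sdxl)          echo "sd_xl_base_1.0.safetensors|1024|1024|25|7.0|DPM++ 2M Karras" ;;
    sd15)          echo "v1-5-pruned-emaonly.ckpt|512|512|25|7.5|DPM++ 2M Karras" ;;
    *)             echo "unknown family: $1" >&2; return 1 ;;
  esac
}

# Example: split the record into named variables.
IFS='|' read -r model width height steps cfg sampler <<EOF
$(model_defaults sdxl)
EOF
echo "$model ${width}x${height} steps=$steps cfg=$cfg sampler=$sampler"
```

User-supplied values should still override whatever this lookup returns.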

Core Generation Protocols

Every mode follows this pattern:

1. Validate: file exists? CLI available? model downloaded?
2. Select defaults: from the model quick-ref table (user overrides take precedence)
3. Build CLI command: assemble all flags
4. Show command: display the full command to the user before running
5. Execute: run via Bash, capture output
6. Report: image path, seed used, dimensions

Flag verification note: The examples below reflect the approved research plan. If your local CLI help differs — especially around --image, --mask, --upscaler, or output-path flags — trust draw-things-cli generate --help over this file and adapt the command.

Mode: Generate (txt2img)

Build the command using model-appropriate defaults:

draw-things-cli generate \
  --model <model> \
  --prompt "<prompt>" \
  --negative-prompt "<negative>" \
  --width <W> --height <H> \
  --steps <N> \
  --guidance-scale <cfg> \
  --sampler "<sampler>" \
  --seed <seed or -1>

For Flux: omit --negative-prompt entirely (not supported). Write detailed natural language prompts.
For SD 1.5: include aggressive negative prompt. Use comma-separated tags. Load references/prompt-patterns.md for templates.
For SDXL: include short targeted negative prompt. Use descriptive sentences.
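
The per-model rules above can be folded into the command builder. The sketch below assumes the flags in the template above are accepted by the local CLI; build_generate and its family argument are hypothetical. It only prints the argument list (one per line) so the command can be shown before it is run.

```shell
# Usage: build_generate <family> <model> <prompt> <negative> <W> <H> <steps> <cfg> <sampler> <seed>
# <family> is a hypothetical label (flux, sdxl, sd15) used only to decide on the negative prompt.
build_generate() {
  family=$1; model=$2; prompt=$3; negative=$4
  width=$5; height=$6; steps=$7; cfg=$8; sampler=$9; seed=${10}

  # Build the argument list in the positional parameters to keep quoting intact.
  set -- draw-things-cli generate --model "$model" --prompt "$prompt"

  # Flux has no negative prompt support, so the flag is left out entirely.
  case "$family" in
    flux*) ;;
    *)
      if [ -n "$negative" ]; then
        set -- "$@" --negative-prompt "$negative"
      fi
      ;;
  esac

  set -- "$@" --width "$width" --height "$height" \
    --steps "$steps" --guidance-scale "$cfg" \
    --sampler "$sampler" --seed "$seed"

  printf '%s\n' "$@"
}

build_generate flux flux_1_schnell_q5p.ckpt "a lighthouse at dusk, 35mm photo" "" \
  1024 1024 4 1.0 "Euler a" -1
```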

Mode: Edit (img2img)

draw-things-cli generate \
  --model <model> \
  --image <input_path> \
  --prompt "<what to change>" \
  --strength 0.75 \
  --steps <N> --guidance-scale <cfg>

--strength controls how much to change: 0.3 = subtle, 0.5 = moderate, 0.75 = significant, 0.9 = near-complete redraw
If width/height are not specified, preserve the input image's dimensions

Mode: Upscale

draw-things-cli generate \
  --model <model> \
  --image <input_path> \
  --upscaler <upscaler_filename> \
  --upscaler-scale <2 or 4> \
  --strength 0.2 \
  --steps 30

Available upscalers:

Upscaler	Filename	Scale
Real-ESRGAN X2+	`realesrgan_x2plus_f16.ckpt`	2x
Real-ESRGAN X4+	`realesrgan_x4plus_f16.ckpt`	4x
Real-ESRGAN X4+ Anime	`realesrgan_x4plus_anime_6b_f16.ckpt`	4x
Remacri	`remacri_4x_f16.ckpt`	4x
4x UltraSharp	`4x_ultrasharp_f16.ckpt`	4x

Default upscaler: realesrgan_x4plus_f16.ckpt
Use --strength 0.2-0.4 for upscaling (preserve detail). Higher values alter the image.
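
Two small guards help keep upscaling safe: one checks that the strength stays in the recommended range, the other derives a fresh output path so the source is never overwritten. Both function names are hypothetical, and the `_x<scale>` suffix is only an illustrative naming choice.

```shell
# Return 0 when strength is within the 0.2-0.4 range recommended for upscaling.
upscale_strength_ok() {
  awk -v s="$1" 'BEGIN { exit !(s >= 0.2 && s <= 0.4) }'
}

# Derive a new output path in the output directory.
# Usage: upscaled_path <input_path> <scale>
# Assumes the input filename has an extension.
upscaled_path() {
  dir=${DRAWTHINGS_OUTPUT_DIR:-$HOME/Pictures/draw-thing}
  base=$(basename "$1")
  name=${base%.*}
  ext=${base##*.}
  printf '%s/%s_x%s.%s\n' "$dir" "$name" "$2" "$ext"
}

DRAWTHINGS_OUTPUT_DIR=/tmp/out upscaled_path /photos/portrait.png 4
# prints: /tmp/out/portrait_x4.png
```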

Mode: Inpaint

draw-things-cli generate \
  --model <model> \
  --image <input_path> \
  --mask <mask_path> \
  --prompt "<what to paint in masked area>" \
  --strength 0.75 \
  --mask-blur 4 \
  --preserve-original-after-inpaint true

Mask: white = area to repaint, black = keep original
Prompt should describe ONLY what goes in the masked area, not the full image
--mask-blur 4 default; increase if seams are visible
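
When the user has no mask, a rectangular one can be produced with ImageMagick. The sketch below only prints the command; it assumes ImageMagick 7 (the `magick` binary; version 6 uses `convert` instead), and mask_command is a hypothetical helper name. The output path is not quoted, so paths with spaces need extra care.

```shell
# Print an ImageMagick command that creates a rectangular inpaint mask:
# black canvas (keep original) with a white rectangle (area to repaint).
# Usage: mask_command <width> <height> <x1> <y1> <x2> <y2> <out_path>
mask_command() {
  printf 'magick -size %sx%s xc:black -fill white -draw "rectangle %s,%s %s,%s" %s\n' \
    "$1" "$2" "$3" "$4" "$5" "$6" "$7"
}

mask_command 1024 1024 256 256 768 768 mask.png
# prints: magick -size 1024x1024 xc:black -fill white -draw "rectangle 256,256 768,768" mask.png
```

The mask dimensions should match the input image so the white region lines up with the area to repaint.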

Mode: ControlNet

Load references/controlnet-guide.md for module details and weight recommendations.

draw-things-cli generate \
  --model <model> \
  --image <control_image_path> \
  --prompt "<prompt>" \
  --controls '[{"file": "<controlnet_model>", "weight": 0.6, "guidanceStart": 0.0, "guidanceEnd": 1.0, "controlMode": "Balanced"}]' \
  --width <W> --height <H>

Common modules: Canny (edges), Depth (spatial layout), Pose (human skeleton), Scribble (sketches), Tile (upscaling).
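
Hand-writing the --controls JSON is error-prone, so a helper can fill in the defaults shown in the command above. This is a sketch: controls_json is a hypothetical name, my_canny_controlnet.ckpt is a placeholder filename, and the function does not escape quotes or backslashes in its arguments.

```shell
# Build the single-element JSON array passed to --controls.
# Usage: controls_json <controlnet_file> [weight] [guidance_start] [guidance_end] [control_mode]
controls_json() {
  printf '[{"file": "%s", "weight": %s, "guidanceStart": %s, "guidanceEnd": %s, "controlMode": "%s"}]\n' \
    "$1" "${2:-0.6}" "${3:-0.0}" "${4:-1.0}" "${5:-Balanced}"
}

# Placeholder filename; substitute a ControlNet model that exists locally.
controls=$(controls_json my_canny_controlnet.ckpt 0.8)
printf '%s\n' "--controls '$controls'"
```

Wrapping the result in single quotes when it is placed on the command line keeps the shell from interpreting the braces and double quotes.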

Mode: LoRA

draw-things-cli generate \
  --model <model> \
  --prompt "<prompt>" \
  --loras '[{"file": "<lora_filename>", "weight": 0.8}]' \
  --width <W> --height <H>

Default weight: 0.6. Range: -1.5 to 2.5. Typical: 0.5-1.0.
Multiple LoRAs: add objects to the JSON array
Modes: "All" (default), "Base", "Refiner"
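
For multiple LoRAs, the JSON array can be assembled from "file:weight" pairs. In this sketch loras_json and the example filenames are hypothetical; a pair without a weight falls back to the 0.6 default noted above.

```shell
# Build the JSON array passed to --loras from "file:weight" pairs.
# Usage: loras_json <file[:weight]>...
loras_json() {
  json=""
  for pair in "$@"; do
    file=${pair%%:*}
    case "$pair" in
      *:*) weight=${pair#*:} ;;
      *)   weight=0.6 ;;
    esac
    # Append a comma separator only after the first object.
    json="${json}${json:+, }{\"file\": \"$file\", \"weight\": $weight}"
  done
  printf '[%s]\n' "$json"
}

loras_json "style_lora.ckpt:0.8" "detail_lora.ckpt"
# prints: [{"file": "style_lora.ckpt", "weight": 0.8}, {"file": "detail_lora.ckpt", "weight": 0.6}]
```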

Mode: Batch

draw-things-cli generate \
  --model <model> \
  --prompt "<prompt>" \
  --batch-count <N> \
  --seed <start_seed> \
  --width <W> --height <H>

--batch-count 4 generates 4 images with incrementing seeds
Use to explore variations, then pick the best seed for refinement
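
To report which seed produced which image, the expected seeds can be listed up front. This sketch assumes the seed increments by exactly 1 per image, which the text above implies but does not state; other seed modes may behave differently. batch_seeds is a hypothetical helper.

```shell
# List the seeds a batch is expected to use, one per line.
# Usage: batch_seeds <start_seed> <count>
batch_seeds() {
  start=$1; count=$2; i=0
  while [ "$i" -lt "$count" ]; do
    printf '%s\n' "$((start + i))"
    i=$((i + 1))
  done
}

batch_seeds 100 4
# prints 100, 101, 102, 103 on separate lines
```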

Mode: Refine

Re-run the previous generation with adjustments:

Lock the seed from the previous generation
Adjust one parameter at a time (prompt, cfg, steps, strength)
Compare results

Example — previous SDXL generate used seed 42, now increase guidance:

draw-things-cli generate \
  --model sd_xl_base_1.0.safetensors \
  --prompt "same prompt as before" \
  --seed 42 \
  --guidance-scale 9.0 \
  --steps 25 \
  --sampler "DPM++ 2M Karras" \
  --width 1024 --height 1024

If the previous generation is not visible in the current conversation, ask the user for: the seed, the prompt, and the model used.

Mode: Gallery

ls -lt "${DRAWTHINGS_OUTPUT_DIR:-$HOME/Pictures/draw-thing}/" | head -20

Mode: Model info

Load references/model-catalog.md. Display the requested model's recommended settings (dimensions, steps, CFG, sampler, prompt style). If the model name is not recognized, list available model families.

Prompt Engineering Quick-Reference

Model	Style	Example
Flux	Natural language sentences, subject-first, camera/lens terms	`"Portrait of a woman with auburn hair, studio headshot, 85mm lens, f/1.8, soft diffused light, neutral backdrop"`
SDXL	Descriptive sentences, Subject-Action-Location-Style	`"A majestic castle on a cliff overlooking the sea, golden hour lighting, dramatic clouds, highly detailed, masterpiece"`
SD 1.5	Comma-separated tags, most important first	`"castle, cliff, ocean, golden hour, dramatic sky, highly detailed, masterpiece, best quality, 8k"`

Flux has NO negative prompt support. Frame exclusions positively: "perfect hands with five fingers" not "no extra fingers".

For advanced prompt patterns, quality boosters, negative prompt templates, and weighting syntax, load references/prompt-patterns.md.

Iterative Refinement Workflow

1. Generate with a starting prompt and note the seed
2. Evaluate the result: what's good? what needs changing?
3. Lock the seed (--seed <value>) to isolate the effect of parameter changes
4. Adjust one thing at a time:
   - Prompt wording -> changes content/composition
   - --guidance-scale -> higher = more literal, lower = more creative
   - --steps -> more steps = more detail (diminishing returns past 30-40)
   - --strength (img2img) -> how much to change
5. Unlock the seed when satisfied with the parameters; generate variations with --seed -1
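
A locked-seed sweep over one parameter might look like the sketch below. It prints one command per guidance value rather than running anything; sweep_guidance is a hypothetical helper, and the remaining flags (steps, sampler, dimensions) are omitted for brevity and would need to be held constant as well.

```shell
# Print one command per guidance value with the seed held fixed, so any change
# between outputs comes from --guidance-scale alone.
# Usage: sweep_guidance <model> <prompt> <seed> <cfg>...
sweep_guidance() {
  model=$1; prompt=$2; seed=$3; shift 3
  for cfg in "$@"; do
    printf 'draw-things-cli generate --model %s --prompt "%s" --seed %s --guidance-scale %s\n' \
      "$model" "$prompt" "$seed" "$cfg"
  done
}

sweep_guidance sd_xl_base_1.0.safetensors "same prompt as before" 42 5.0 7.0 9.0
```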

Output Handling

Default output directory: ~/Pictures/draw-thing/
Create it if it doesn't exist: mkdir -p ~/Pictures/draw-thing
PNG files include embedded metadata (prompt, seed, model, parameters)
If draw-things-cli outputs to a different location, move/copy to the standard directory
Always report the output file path and seed to the user
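
A descriptive filename can be derived from the prompt and seed. The naming scheme below (lowercased slug, 40-character cap, `_seed<N>` suffix) is only an assumption for illustration, and output_path is a hypothetical helper.

```shell
# Build a descriptive, filesystem-safe output path from the prompt and seed.
# Usage: output_path <prompt> <seed>
output_path() {
  dir=${DRAWTHINGS_OUTPUT_DIR:-$HOME/Pictures/draw-thing}
  # Lowercase, collapse every run of non-alphanumerics into one dash, cap the length.
  slug=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9' '-' | cut -c1-40)
  # Trim a leading or trailing dash left over from punctuation or truncation.
  slug=${slug#-}; slug=${slug%-}
  printf '%s/%s_seed%s.png\n' "$dir" "$slug" "$2"
}

DRAWTHINGS_OUTPUT_DIR=/tmp/out output_path "A majestic castle, golden hour" 42
# prints: /tmp/out/a-majestic-castle-golden-hour_seed42.png
```

Including the seed in the name keeps results reproducible even if the embedded PNG metadata is stripped later.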

Error Recovery

Error	Likely Cause	Action
Model file not found	Wrong filename or missing download	List models dir, suggest correct name from model quick-ref
Process killed / OOM	Model too large for available memory	Suggest smaller model or quantized variant (e.g., q5p/q6p)
Unknown flag error	CLI version mismatch with this skill	Run `draw-things-cli generate --help`, adapt command
No output file	Silent failure or wrong output path	Check CLI stderr, verify output location
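
The "No output file" row can be caught automatically with a thin wrapper. This sketch runs any command, surfaces stderr on failure, and confirms the expected file exists afterwards; run_and_check is a hypothetical helper, and how the expected path is obtained depends on the local CLI's output behaviour.

```shell
# Run a command, show its stderr if it fails, and confirm the expected output file exists.
# Usage: run_and_check <expected_output_path> <command> [args...]
# Returns 1 if the command fails, 2 if it succeeds but the file is missing.
run_and_check() {
  expected=$1; shift
  log=$(mktemp)
  if ! "$@" 2>"$log"; then
    echo "command failed; stderr:" >&2
    cat "$log" >&2
    rm -f "$log"
    return 1
  fi
  rm -f "$log"
  if [ ! -f "$expected" ]; then
    echo "no output file at $expected; check the CLI's output location" >&2
    return 2
  fi
  printf '%s\n' "$expected"
}
```

Distinguishing the two return codes lets the caller suggest a different recovery action for a crash than for a silent failure.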

Reference Files

Load ONE reference at a time. Do not preload all references into context.

File	Content	Load When
`references/cli-reference.md`	Complete flag tables: 60+ flags, 19 samplers, 4 seed modes, JSON schemas	Building non-trivial commands, user asks about flags
`references/model-catalog.md`	Model variants, checkpoints, SDXL resolutions, quantization guide	User asks about models, `model` mode
`references/prompt-patterns.md`	Prompt engineering, quality boosters, negative templates, weighting	Complex prompts, quality issues
`references/controlnet-guide.md`	Modules, weights, scheduling, multi-ControlNet, JSON format	ControlNet mode
`references/workflow-recipes.md`	Multi-step recipes: character design, photo restoration, style transfer	Complex creative goals

Critical Rules

Always check CLI before any operation — command -v draw-things-cli
Always report the seed so results are reproducible
Model-appropriate dimensions: SD 1.5 -> 512x512, SDXL -> 1024x1024, Flux -> 1024x1024
Flux has NO negative prompt — omit --negative-prompt entirely; use detailed positive descriptions
Prompt style must match model: Flux = natural language, SD 1.5 = comma tags, SDXL = hybrid
Upscale preserves originals — always output to a new file, never overwrite
Default output: ~/Pictures/draw-thing/ with descriptive filenames
Show the full CLI command before running — transparency enables learning and debugging
Upscaling strength 0.2-0.4 (--strength): higher values alter the image instead of enhancing it
Single-quote JSON for --loras and --controls to prevent shell expansion
Refuse video requests — out of scope for v1.0; Draw Things supports it but workflows differ
Verify unknown flags — if unsure about a flag, run draw-things-cli generate --help first
