aios-jimeng-image-ops
AIOS Jimeng Image Ops
Overview
Use this runbook for stable image generation on Jimeng. It includes selectors, completion signals, policy-failure handling, and evidence requirements verified in a live run.
Preconditions
- Browser profile has a valid Jimeng login session.
- default profile should connect to the fingerprint browser via CDP (port 9222 by default).
- MCP tools available: chrome.launch_cdp, browser.connect_cdp, page.goto, page.click, page.type, page.extract_text, page.get_html, page.screenshot, browser.close.
- For page understanding, prefer page.extract_text first, then page.get_html for structure details.
- Use page.screenshot only when text/HTML evidence is insufficient, and prefer target-area capture over full-page screenshots when supported.
- Prompt is policy-safe (avoid risky terms and political, person-identifiable, or sensitive wording).
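The tool precondition can be checked before a run starts. The following is a minimal sketch, not part of the skill itself: the helper names `missing_tools` and `cdp_endpoint` are hypothetical, and the `127.0.0.1` host is an assumption (the runbook only states the default port).

```python
from typing import Iterable, List

# Tool names copied from the preconditions list above.
REQUIRED_TOOLS = (
    "chrome.launch_cdp",
    "browser.connect_cdp",
    "page.goto",
    "page.click",
    "page.type",
    "page.extract_text",
    "page.get_html",
    "page.screenshot",
    "browser.close",
)

DEFAULT_CDP_PORT = 9222  # default port stated in the runbook


def missing_tools(available: Iterable[str]) -> List[str]:
    """Return the required tools absent from `available`, in runbook order."""
    have = set(available)
    return [tool for tool in REQUIRED_TOOLS if tool not in have]


def cdp_endpoint(port: int = DEFAULT_CDP_PORT, host: str = "127.0.0.1") -> str:
    """Build the CDP base URL; the local host is an assumption."""
    return f"http://{host}:{port}"
```

If `missing_tools(...)` returns a non-empty list, it is probably safer to stop and report the gap than to start the flow with a partial tool set.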
Execution Flow (Updated 2026-03-13)
- Open generation page:
  - Navigate to home: https://jimeng.jianying.com/ai-tool/home/
  - Click the "图片生成" (Image Generation) button (selector: button.button-RNHVcx:has-text("图片生成"))
  - Or use direct URL: https://jimeng.jianying.com/ai-tool/generate?enter_from=ai_feature&from_page=explore&ai_feature_name=image
- Use page.extract_text / page.get_html to detect login/challenge markers (captcha, risk check, 2FA, login wall); if detected, ask the user to complete the human action, then resume.
- Confirm prompt box exists:
  - Selector: div.tiptap.ProseMirror (NOT a textarea; this is a rich text editor)
- Click prompt box to activate, then fill prompt.
- Model & Ratio Selection (Bottom Toolbar):
  - Model selector: div.lv-select:has-text("图片") (first one, position x:344)
  - Ratio selector: second div.lv-select (position x:458, shows "图片 4.1" or "图片 5.0")
  - Note: these may show clickable: false in the snapshot but are still interactive
- Generate Button:
  - Primary: button.lv-btn.lv-btn-primary (position x:976, y:647)
  - Note: the button may carry the collapsed-submit-button class; it is enabled only after a prompt is entered
- Poll page.extract_text (and page.get_html when needed) until completion:
  - Success markers: latest task has image tiles + 重新编辑 (Re-edit) and 再次生成 (Generate again)
  - Policy failure: 你输入的文字不符合平台规则,请修改后重试 ("Your input does not comply with platform rules; please revise and retry")
  - Timeout: still 生成中 (Generating) after budget
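The polling step above can be sketched as a small classifier plus a loop. This is an illustration under assumptions, not the skill's implementation: `extract_text` is a hypothetical callable standing in for a `page.extract_text` call, the 120-second budget and 3-second interval are placeholder values, and the classifier inspects text only, so it cannot confirm that image tiles are present. It also assumes the text passed in covers the latest task rather than older history.

```python
import time
from typing import Callable

# Marker strings taken from the execution flow above.
POLICY_FAILURE = "你输入的文字不符合平台规则"
PENDING = "生成中"
SUCCESS_MARKERS = ("重新编辑", "再次生成")


def classify(text: str) -> str:
    """Map extracted page text to a coarse generation state."""
    if POLICY_FAILURE in text:
        return "policy_failure"
    if PENDING in text:
        return "pending"
    if all(marker in text for marker in SUCCESS_MARKERS):
        return "success"
    return "unknown"


def poll_until_done(
    extract_text: Callable[[], str],          # hypothetical wrapper around page.extract_text
    budget_s: float = 120.0,                  # assumed budget; the runbook gives no number
    interval_s: float = 3.0,                  # assumed poll interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until success or policy failure, or return 'timeout' after the budget."""
    deadline = clock() + budget_s
    while True:
        state = classify(extract_text())
        if state in ("success", "policy_failure"):
            return state
        if clock() >= deadline:
            return "timeout"
        sleep(interval_s)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting; in a live run the defaults would be used.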
Error Handling
- Policy failure: rewrite prompt to neutral scene/style terms and retry once.
- Timeout: refresh generation page and retry with same prompt once.
- Selector failure: recapture snapshot and update selectors from visible controls.
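The retry rules above can be expressed as a small decision function. This is a sketch with hypothetical outcome and action names; the only facts taken from the runbook are the single retry for policy failures and timeouts, and the recapture step for selector failures.

```python
MAX_RETRIES = 1  # "retry once" for policy failures and timeouts


def next_action(outcome: str, retries_used: int) -> str:
    """Pick the next step for a run outcome (names here are illustrative)."""
    if outcome == "success":
        return "done"
    if outcome == "selector_failure":
        # The runbook sets no retry cap here: recapture and update selectors.
        return "recapture_snapshot_and_update_selectors"
    if retries_used >= MAX_RETRIES:
        return "stop_and_report"
    if outcome == "policy_failure":
        return "rewrite_prompt_and_retry"
    if outcome == "timeout":
        return "refresh_page_and_retry"
    return "stop_and_report"
```

For example, `next_action("timeout", 0)` asks for a page refresh and a retry with the same prompt, while a second timeout ends the run.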
Evidence Standard
- Keep at least one snapshot proving success or failure.
- Save one screenshot for run summary.
- Record prompt, outcome, and retry path in final doc.
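One way to keep the evidence requirements checkable is a small record type. The sketch below is an assumption about how a run might be logged; `RunRecord`, its field names, and the example file paths are hypothetical and not defined by the skill.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RunRecord:
    """Evidence collected for one generation run (hypothetical structure)."""
    prompt: str
    outcome: str                                         # e.g. "success", "policy_failure", "timeout"
    retry_path: List[str] = field(default_factory=list)  # may be empty if no retry happened
    snapshots: List[str] = field(default_factory=list)   # paths to text/HTML snapshots
    screenshot: Optional[str] = None                     # one screenshot for the run summary

    def missing_evidence(self) -> List[str]:
        """List the evidence items the standard requires but the record lacks."""
        missing = []
        if not self.prompt:
            missing.append("prompt")
        if not self.outcome:
            missing.append("outcome")
        if not self.snapshots:
            missing.append("snapshot")
        if not self.screenshot:
            missing.append("screenshot")
        return missing
```

A record such as `RunRecord(prompt="a quiet mountain lake", outcome="success", snapshots=["snap-1.txt"], screenshot="final.png")` would report nothing missing; the prompt text and file names here are made up for illustration.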
Resources
- references/run-report-2026-03-02.md: verified live-run record.
- references/selectors-and-errors.md: selector cookbook and failure taxonomy.