byted-airesearch-videoeval
Byted Airesearch Videoeval
Use this skill to submit and query long-running material evaluation tasks.
When to use
Use this skill when the user wants to:
- evaluate a video material or creative asset
- submit a material evaluation task and get back a task identifier
- check the status of an existing evaluation task later
- fetch the final detail/result of a previously created evaluation task
Do not use this skill for generic video upload requests.
Current workflow
- Validate the full input batch before any upload starts.
- Upload the local video files and capture the returned
attachment_idvalues. - Create a task with the uploaded attachment IDs.
- Return success immediately after the task is created.
- Ask the user to query task list or task detail later if they want progress or results.
This workflow is intentionally non-blocking. Do not poll automatically after task creation.
Mandatory behavior
- Do not expose the upload API as a standalone user-facing capability.
- Do not trigger this skill for generic requests such as “upload this video”, “store this file”, or “send this video”.
- Only call the upload API when the user explicitly intends to create a new material evaluation task.
- For new task creation, prefer
scripts/submit_evaluation_task.pyso validation, upload, and task creation stay in one controlled flow. - Treat
scripts/upload_video.pyas an internal helper used by the orchestration flow, not as the primary user entrypoint. The script itself rejects direct use unless it is called with the internal orchestration marker.
For multi-file submissions, use the orchestration entrypoint so the whole batch is validated before the first upload starts.
Submission limits
- A single task can include at most 10 videos.
- Non-enabled users have a rolling free quota of at most 10 submitted videos within the last 24 hours.
- The new task's video count is added to the number of videos already submitted in the last 24 hours. If the total exceeds 10, the service rejects the task and asks the user to contact Volcengine sales to enable access.
- Enabled users are not restricted by this rolling 24-hour free quota.
- Quota accounting is based on the actual number of videos submitted per task, with no deduplication.
- Any task created within the last 24 hours counts toward the rolling quota, including running tasks.
- Login-based access and API key access share the same quota pool.
- The skill enforces the per-task limit locally before upload starts. The rolling 24-hour quota is enforced by the service, and the skill should surface the service rejection with a clear explanation.
Authentication
The current APIs use API key authentication.
All API requests sent by this skill must include the header:
x-product-version: 20Authorization: bearer {API_KEY}
Preferred input methods:
--api-key "<api-key>"BYTED_AIRESEARCH_VIDEOEVAL_API_KEY
If no API key is available, ask the user to create or view one at:
https://console.volcengine.com/datatester/ai-research/audience/list?tab=apikey
Then ask the user to provide the API key before calling the API.
If the API key is missing, the scripts must fail immediately with a clear error that points the user to the API key page above.
Known API coverage
Upload attachment
- Endpoint:
POST https://console.volcengine.com/datatester/compass/api/v3/survey/attachment - Request:
multipart/form-data - File field:
file - Upload constraints:
- each upload request contains exactly one video file
- each video must be 50MB or smaller
- file format must be
mp4 - MIME type must be
video/mp4
- Output mapping:
- preserve the raw attachment payload
- map the attachment object's
idfield toattachment_id
The upload_video.py script is an internal helper for the create-task workflow. It is not the primary user-facing entrypoint.
Create task
- Endpoint:
POST https://console.volcengine.com/datatester/compass/api/v3/survey/task - Fixed request fields:
form_id: 0agent_id: 125audience_id: 3664529
- Optional request fields currently exposed:
promptlanguageis_typical_user_enabledtypical_user_counttypical_user_selection_modeis_report_enabledattachment_ids
- Create constraints:
- one task submission can include at most 10 videos
attachment_idsmust contain at most 10 items per request
The create step should send the uploaded attachment_id as a one-element attachment_ids array unless multiple attachment IDs are explicitly provided.
List and detail
Task detail is wired and can query an existing task directly:
- Endpoint:
GET https://console.volcengine.com/datatester/compass/api/v3/survey/task/{id} - Current auth: API key bearer token
- Output mapping:
- map the task object's
idfield totask_id - map the task object's
statusfield totask_status - parse
task.detail - keep only items where
key == video_structured_resultandsub_tab != null - expose a compact
summaryblock for downstream agent use
- map the task object's
Task list is wired and can query existing tasks directly:
- Endpoint:
GET https://console.volcengine.com/datatester/compass/api/v3/survey/task - Current auth: API key bearer token
- Current fixed query params:
page=1page_size=100agent_id=125
- Output mapping:
- map each task item to
task_id,name,status,created_at,updated_at - preserve pagination info in
data.page
- map each task item to
Commands
# Validate a whole batch before upload, then upload all files and create the task
python scripts/submit_evaluation_task.py \
--file /path/to/video-1.mp4 \
--file /path/to/video-2.mp4 \
--api-key "<api-key>"
# Single-video create flow
python scripts/submit_evaluation_task.py \
--file /path/to/video.mp4 \
--prompt "Evaluate this material for audience fit and content quality." \
--api-key "<api-key>"
# Query task list later
python scripts/list_evaluation_tasks.py
# Query task detail later
python scripts/get_evaluation_task_detail.py --task-id 12345
Response handling
All scripts emit JSON to stdout with the same top-level envelope:
statusmessagerequest_iddataerror
Important normalized fields:
- submit task:
data.task_id,data.task_status,data.submitted_video_count - list:
data.items - detail:
data.detail,data.summary
Final answer rules
- Use the structured JSON returned by the detail endpoint as the internal source of truth.
- Present the final answer in human-readable natural language.
- When the task is finished, prefer a concise report-style answer rather than a raw data dump.
- Do not dump raw JSON to the user.
- Do not expose internal field names such as
video_eval,video_user_report,distribution,field_desc, or similar implementation-oriented keys. - Apply the same rule to task list responses: use the list result as internal source data, but present the outcome as a natural-language summary rather than raw fields.
- For task list answers, it is acceptable to include the task ID, task name, status, created time, and updated time in human-readable prose, because those fields help the user choose a task for follow-up detail queries.
- When multiple videos are present, summarize them separately.
- For finished task detail results, prefer a readable report flow such as: task conclusion first, then one short section per video, then overall recommendations if the source data supports them.
- If the task is not finished yet, do not fabricate a report. Clearly state the current status and ask the user to check again later.
- Only provide raw structured data if the user explicitly asks for the raw result.
Practical guidance
- For new submissions, use the orchestration flow rather than exposing upload as a standalone step to the user.
- Validate the local file before upload. Reject non-MP4 files, files with non-
video/mp4MIME types, or files larger than 50MB with a direct and actionable error message. - Validate the task creation input before calling the API. Reject any request that contains more than 10 attachment IDs with a direct and actionable error message.
- For a multi-video submit flow, validate the full batch size before any upload starts. If the batch contains more than 10 files, fail immediately and do not upload anything.
- If the service rejects task creation because the rolling 24-hour free quota was exceeded, use this standard Chinese wording for the user-facing message:
免费版用户每24小时最多提交10个评估视频,如需购买请联系火山引擎销售人员 - After create succeeds, tell the user the task has been submitted successfully and can be checked later.
- Use task list when the user wants to browse or find historical tasks.
- Use task detail when the user already knows the task ID and wants the final result.
More from bytedance/agentkit-samples
byted-las-video-edit
Extracts and clips video segments from long videos using natural language descriptions. AI-powered smart video editing, video trimming, and video cutting powered by Volcengine LAS. Describe what you want — scenes, people, objects, actions, events — and get trimmed clips automatically. Video search and video content retrieval: find and locate specific people, objects, or scenes in footage. Supports reference images for person matching and object matching (search video by image). Two modes: simple (fast) and detail (thorough, optional ASR). Use this skill when the user wants to edit/clip/cut videos using natural language descriptions, extract highlights or key moments from videos, find specific people/objects/scenes in video footage (by text or reference image), compile highlight reels from long videos, trim video segments, or do AI-powered smart video editing.
163byted-las-pdf-parse-doubao
Parses and reads PDF documents into structured Markdown text using Volcengine LAS Doubao AI models. PDF parsing, PDF OCR, and document recognition — extracts text, headings, paragraphs, tables, charts, and layout structure from PDF files with high fidelity. Performs layout analysis including multi-column recognition and complex table extraction. Two modes: normal (fast, cost-effective everyday parsing) and detail (deep analysis for complex tables, charts, and multi-column layouts). Converts PDF to Markdown, PDF to text, and structured data. Digitizes scanned PDF documents and scanned images via OCR. Supports TOS paths, HTTP URLs, and local file upload. Async submit-poll workflow with batch processing support. Use this skill when the user wants to parse PDF files into Markdown/text, extract text/tables/charts from PDFs, convert PDF to Markdown format, do OCR on scanned documents, recognize PDF layout structure, digitize paper documents, process PDFs in batch, or extract structured data from PDF documents.
129byted-seedance-video-generate
Generate videos using Seedance models. Invoke when user wants to create videos from text prompts, images, or reference materials.
109byted-data-search
|
106byted-las-vlm-video
Analyzes and understands video content using Volcengine LAS Doubao vision-language models (VLM). Multimodal AI video analysis, video comprehension, and visual understanding of video clips and footage. Performs video question answering (video Q&A) — ask questions about what happens in a video and get AI answers. Scene recognition and scene description, object recognition and object detection, action recognition and action detection from video frames. Generates video descriptions, video captions, video summaries, video annotations, and content summarization. Visual frame analysis for identifying people, objects, actions, and events in video. Auto-compresses video to 50MB before inference. Synchronous single-call processing. Use this skill when the user wants to analyze or understand video content using VLM/AI, do video Q&A (ask questions about a video), describe what happens in a video, recognize objects/actions/scenes in video frames, generate video captions/descriptions/summaries, annotate or label video content, get AI-powered visual understanding of video clips, or perform multimodal video analysis with vision-language models.
97byted-text-to-speech
将文本合成为语音(TTS)。使用火山引擎豆包语音合成 API,支持流式合成、多种音色、语速/音调/音量调节、Markdown 过滤和 LaTeX 公式播报。当用户需要把文字转成语音、生成朗读音频、配音、旁白、播报,或提到「文字转语音」「TTS」「语音合成」「朗读」「配音」时使用本技能。
93