byted-tos-video-process
Bytedance TOS Video Process Skill
This skill provides essential video processing functions for video files stored in Bytedance's TOS (TeraObjectStore). It allows you to retrieve video metadata and perform single-frame or multi-frame snapshots directly using the Volcengine TOS SDK.
Quick Start
1. Client Initialization
The following Python snippet demonstrates how to initialize the TosClientV2 from environment variables.
import os
from typing import Optional

import tos
from tos.exceptions import TosClientError, TosServerError


def create_client() -> Optional[tos.TosClientV2]:
    """Initializes a TosClientV2 using AK/SK (and optional STS token) from environment variables.

    Returns None if a required environment variable is missing.
    """
    try:
        ak = os.getenv('TOS_ACCESS_KEY')
        sk = os.getenv('TOS_SECRET_KEY')
        endpoint = os.getenv('TOS_ENDPOINT')
        region = os.getenv('TOS_REGION')
        security_token = os.getenv('TOS_SECURITY_TOKEN')  # Optional, for STS

        if not all([ak, sk, endpoint, region]):
            raise ValueError("Required environment variables are missing (AK, SK, Endpoint, Region).")

        return tos.TosClientV2(
            ak=ak,
            sk=sk,
            endpoint=endpoint,
            region=region,
            security_token=security_token,
        )
    except ValueError as e:
        print(f"Error initializing client: {e}")
        # Initialization failed; callers must check for None before using the client
        return None


# Create the client
client = create_client()
2. Basic Workflow
# (Assumes 'client' is initialized and 'bucket_name', 'object_key' are set)

# 1. Get Video Info
try:
    response = client.get_object(bucket_name, object_key, process="video/info")
    info_data = response.read()
    print("Video Info:", info_data.decode('utf-8'))
except (TosClientError, TosServerError) as e:
    print(f"Error getting video info: {e}")

# 2. Take a Single Snapshot and save locally
try:
    client.get_object_to_file(
        bucket_name,
        object_key,
        "snapshot_1000ms.jpg",
        process="video/snapshot,t_1000,f_jpg,w_720"
    )
    print("Snapshot saved to snapshot_1000ms.jpg")
except (TosClientError, TosServerError) as e:
    print(f"Error taking snapshot: {e}")

# 3. Take a Snapshot and save back to TOS
try:
    response = client.get_object(
        bucket_name,
        object_key,
        process="video/snapshot,t_5000,f_jpg",
        save_bucket=bucket_name,
        save_object="processed/snapshot_5000ms.jpg"
    )
    save_result = response.read()
    print("Snapshot saved to TOS:", save_result.decode('utf-8'))
except (TosClientError, TosServerError) as e:
    print(f"Error saving snapshot to TOS: {e}")
Core Operations
All video processing is achieved via the process parameter in the get_object or get_object_to_file SDK methods.
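Because every operation is selected through a comma-separated process string, it can help to build that string in one place. The following is a minimal sketch, not part of the SDK: the helper name build_snapshot_rule is hypothetical, and it only covers the t_, w_, and f_ parameters that appear in this document's examples.

```python
from typing import Optional


def build_snapshot_rule(time_ms: int, width: Optional[int] = None, fmt: str = "jpg") -> str:
    """Compose a 'video/snapshot' process string from its parts.

    Hypothetical helper: it only formats the parameters used in this
    document (t_ for time in ms, w_ for width, f_ for output format).
    """
    parts = ["video/snapshot", f"t_{time_ms}"]
    if width is not None:
        parts.append(f"w_{width}")
    parts.append(f"f_{fmt}")
    return ",".join(parts)


# Produces the same rule used in the multi-snapshot example in this document
print(build_snapshot_rule(1000, width=720))  # video/snapshot,t_1000,w_720,f_jpg
```

The resulting string can be passed as the process argument of get_object or get_object_to_file.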
1. Get Video Info (videoInfo)
Retrieves metadata of a video file, such as resolution, duration, and format.
SDK Method: client.get_object(..., process="video/info")
# 'client', 'bucket_name', and 'object_key' must be defined
try:
    response = client.get_object(bucket_name, object_key, process="video/info")
    # The response body is a JSON string
    video_metadata = response.read().decode('utf-8')
    print(video_metadata)
except TosServerError as e:
    print(f"Server Error: {e.code} - {e.message}")
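Since the response body is a JSON string, it can be parsed into a dictionary before use. The sketch below assumes only that the body is UTF-8 encoded JSON; the helper name parse_video_info and the sample payload are placeholders for illustration, and the actual field names returned by the service are documented in REFERENCE.md rather than here.

```python
import json


def parse_video_info(raw: bytes) -> dict:
    """Decode a video/info response body and parse it as JSON."""
    return json.loads(raw.decode("utf-8"))


# Placeholder payload for illustration only; it is NOT the real response schema.
sample_body = b'{"example_field": "example_value"}'
info = parse_video_info(sample_body)

# Inspect which top-level keys are present before relying on any of them
print(sorted(info.keys()))
```

In real use, the bytes passed in would be the result of response.read() from the get_object call above.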
2. Take a Single Snapshot (videoSnapshot)
Captures a single frame from a video. It supports various parameters for customization and can either return the image data or save the result directly back to TOS.
SDK Method: client.get_object_to_file(..., process="video/snapshot,...") for local save.
SDK Method: client.get_object(..., process="video/snapshot,...", save_bucket=..., save_object=...) for saving to TOS.
# Example: Take a snapshot at 10 seconds, resize to a width of 720 pixels, and save locally
try:
    client.get_object_to_file(
        bucket_name,
        object_key,
        file_path="local_snapshot.jpg",
        process="video/snapshot,t_10000,w_720,f_jpg"
    )
    print("Snapshot saved successfully to local_snapshot.jpg")
except (TosClientError, TosServerError) as e:
    print(f"An error occurred: {e}")
3. Take Multiple Snapshots (videoSnapshots)
This is a client-side orchestration pattern: you loop through a series of timestamps and call the videoSnapshot operation once per timestamp. The script scripts/video_snapshots.py provides a reference implementation for parallel execution.
# (Assumes 'client', 'bucket_name', 'object_key' are set)
timestamps = [1000, 5000, 10000]  # In milliseconds

for i, ts in enumerate(timestamps):
    output_filename = f'snapshot_{i+1}_at_{ts}ms.jpg'
    process_rule = f"video/snapshot,t_{ts},w_720,f_jpg"
    try:
        client.get_object_to_file(
            bucket_name,
            object_key,
            output_filename,
            process=process_rule
        )
        print(f"Saved snapshot to {output_filename}")
    except (TosClientError, TosServerError) as e:
        print(f"Failed for timestamp {ts}: {e}")
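The sequential loop above can also be run in parallel with a thread pool. The following is a rough sketch and not the contents of scripts/video_snapshots.py: the function name take_snapshots and the max_workers default are hypothetical, and it assumes a single client instance can safely be shared across threads.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def take_snapshots(client, bucket_name, object_key, timestamps, max_workers=4):
    """Request one snapshot per timestamp in parallel.

    Returns two dicts keyed by timestamp: saved file names and raised errors.
    """

    def take_one(ts):
        output_filename = f"snapshot_at_{ts}ms.jpg"
        client.get_object_to_file(
            bucket_name,
            object_key,
            output_filename,
            process=f"video/snapshot,t_{ts},w_720,f_jpg",
        )
        return output_filename

    saved, failed = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its timestamp so failures can be reported
        futures = {pool.submit(take_one, ts): ts for ts in timestamps}
        for future in as_completed(futures):
            ts = futures[future]
            try:
                saved[ts] = future.result()
            except Exception as e:  # in real use, narrow to TosClientError / TosServerError
                failed[ts] = e
    return saved, failed
```

A call such as take_snapshots(client, bucket_name, object_key, [1000, 5000, 10000]) returns the saved file names and any per-timestamp errors, so one failed frame does not abort the rest.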
Authorization
Authentication is handled directly by the tos.TosClientV2 constructor. Provide credentials via environment variables.
Required Environment Variables
- TOS_ACCESS_KEY: Your Access Key ID.
- TOS_SECRET_KEY: Your Secret Access Key.
- TOS_ENDPOINT: The endpoint for the TOS service (e.g., https://tos-cn-beijing.volces.com).
- TOS_REGION: The region for the TOS service (e.g., cn-beijing).
Optional for STS
TOS_SECURITY_TOKEN: If using a temporary token (STS), provide the session token here. The client will automatically use it if present.
Best Practices
- Error Handling: Always wrap SDK calls in try...except blocks to handle TosClientError and TosServerError.
- Parameter Validation: Validate parameters like time, width, and height on the client side before making an API call to prevent unnecessary errors.
- Batch Operations: For videoSnapshots, use a thread pool (like ThreadPoolExecutor) to perform multiple snapshot requests in parallel for better performance. See scripts/video_snapshots.py for an example.
- Credentials Management: Use a secure method to manage and refresh credentials, especially when using short-lived STS tokens.
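As an illustration of the parameter validation practice, the sketch below rejects obviously invalid values before any request is sent. The function name validate_snapshot_params is hypothetical, and the checks are deliberately minimal assumptions (non-negative time, positive dimensions); the limits actually enforced by the service are not stated in this document.

```python
def validate_snapshot_params(time_ms, width=None, height=None):
    """Raise ValueError if a snapshot parameter is clearly invalid.

    Assumed rules (not taken from the service documentation):
    time must be a non-negative integer in milliseconds, and width/height,
    when given, must be positive integers.
    """
    # bool is a subclass of int, so it is excluded explicitly
    if isinstance(time_ms, bool) or not isinstance(time_ms, int) or time_ms < 0:
        raise ValueError(f"time must be a non-negative integer (ms), got {time_ms!r}")
    for name, value in (("width", width), ("height", height)):
        if value is None:
            continue
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")


# Passes silently for values like those used in this document's examples
validate_snapshot_params(10000, width=720)
```

Calling this before client.get_object_to_file turns a malformed request into a local ValueError instead of a round trip that ends in a server error.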
Additional Resources
- For detailed parameters of each operation, see REFERENCE.md.
- For common end-to-end examples, see WORKFLOWS.md.
- For executable Python examples, see the scripts/ directory.