tos-file-access
TOS File Access
This skill provides utilities for uploading files and directories to Volcano Engine TOS (Torch Object Storage) and downloading files from URLs.
Overview
TOS (Torch Object Storage) is Volcano Engine's object storage service, similar to AWS S3. This skill enables:
- Upload: Upload Agent-generated files or entire directories to TOS and get shareable signed URLs (for files) or TOS paths (for directories)
- Download: Download files from URLs to local storage for Agent processing
Typical Workflows
Pre-Agent Execution: Download User Files
When users provide file URLs (TOS or external), download them before processing:
# Download single file
python scripts/file_download.py https://example.com/data.csv
# Download multiple files
python scripts/file_download.py https://example.com/data.csv https://example.com/config.json
# Specify save directory and filenames
python scripts/file_download.py https://example.com/data.csv --save-dir /workspace --filenames dataset.csv
Post-Agent Execution: Upload Output Files or Directories
After generating files or directories (videos, charts, reports, output folders, etc.), upload them to TOS for user access:
# Upload single file (auto-detected)
python scripts/tos_upload.py /path/to/output.mp4 --bucket my-bucket
# Upload entire directory (auto-detected)
python scripts/tos_upload.py /path/to/output_folder --bucket my-bucket
# Upload with custom region and expiration
python scripts/tos_upload.py /path/to/report.pdf --bucket my-bucket --region cn-beijing --expires 86400
Scripts
scripts/file_download.py
Download files from URLs to local storage.
Usage:
python scripts/file_download.py <url1> [url2 ...] [--save-dir DIR] [--filenames NAME1 NAME2 ...]
Arguments:
urls: One or more URLs to download (positional, required)--save-dir: Save directory (optional, defaults to/tmp)--filenames: Custom filenames for downloaded files (optional, must match number of URLs)
Examples:
# Download single file to /tmp
python scripts/file_download.py https://tos-cn-beijing.volces.com/bucket/file.pdf
# Download to specific directory
python scripts/file_download.py https://example.com/data.json --save-dir /workspace/data
# Download multiple files with custom names
python scripts/file_download.py \
https://example.com/file1.pdf \
https://example.com/file2.jpg \
--save-dir /workspace \
--filenames document.pdf image.jpg
Returns: Prints absolute paths of downloaded files (one per line)
scripts/tos_upload.py
Upload files or directories to TOS and generate signed access URLs (for files) or TOS paths (for directories).
Key Features:
- Auto-detection: Automatically detects whether the path is a file or directory
- Session-based paths: Uses
TOOL_USER_SESSION_IDenvironment variable to organize uploads - Preserves structure: For directories, maintains the full directory structure in TOS
- Automatic bucket creation: Creates bucket if it doesn't exist (with private ACL)
Usage:
python scripts/tos_upload.py <path> --bucket BUCKET [--region REGION] [--expires SECONDS]
Arguments:
path: Local file or directory path to upload (positional, required)--bucket: TOS bucket name (required)--region: TOS region (optional, defaults tocn-beijing)--expires: Signed URL expiration in seconds (optional, defaults to 604800 = 7 days, only applies to file uploads)
Upload Structure:
- File:
upload/{session_prefix}/{filename}- Example:
upload/skill_agent_veadk_default_user_tmp-session-20251210150057/video.mp4
- Example:
- Directory:
upload/{session_prefix}/{directory_name}/{relative_path}- Example:
upload/skill_agent_veadk_default_user_tmp-session-20251210150057/output_folder/file1.txt
- Example:
Session Prefix:
- If
TOOL_USER_SESSION_IDis set, uses that value as prefix - Otherwise, falls back to timestamp format
YYYYMMDD_HHMMSS
Authentication: Requires one of:
- Environment variables:
VOLCENGINE_ACCESS_KEYandVOLCENGINE_SECRET_KEY - VeFaaS IAM Role (automatic credential retrieval)
Examples:
# Upload single file (auto-detected)
python scripts/tos_upload.py /workspace/output.mp4 --bucket my-bucket
# Upload entire directory (auto-detected)
python scripts/tos_upload.py /workspace/results_folder --bucket my-bucket
# Upload to different region with 1-day expiration
python scripts/tos_upload.py /workspace/report.pdf \
--bucket my-reports \
--region cn-beijing \
--expires 86400
# Upload directory with all options
python scripts/tos_upload.py /workspace/output_dir \
--bucket data-storage \
--region cn-beijing
Returns:
- For files: Prints a signed URL that can be shared with users (valid for specified duration)
- For directories: Prints a TOS path in format
tos://bucket-name/path/to/directory
Output Examples:
# File upload output
============================================================
✅ Upload Successful!
============================================================
Signed URL:
https://my-bucket.tos-cn-beijing.volces.com/upload/skill_agent_xxx/video.mp4?X-Tos-Signature=...
============================================================
# Directory upload output
============================================================
✅ Upload Successful!
============================================================
TOS Path:
tos://my-bucket/upload/skill_agent_xxx/output_folder
============================================================
Environment Variables
VOLCENGINE_ACCESS_KEY: Volcano Engine access key for TOS authenticationVOLCENGINE_SECRET_KEY: Volcano Engine secret key for TOS authenticationTOOL_USER_SESSION_ID: Session ID used to generate organized upload paths (optional, falls back to timestamp)
Common Use Cases
-
Video Processing: Download source video → process → upload result
# Download python scripts/file_download.py https://example.com/input.mp4 --save-dir /workspace # Process video (your code here) # Upload result python scripts/tos_upload.py /workspace/output.mp4 --bucket video-storage -
Data Analysis: Download datasets → analyze → upload charts/reports folder
# Download data python scripts/file_download.py https://example.com/data.csv --save-dir /workspace # Analyze and generate charts (your code here) # Upload entire results folder python scripts/tos_upload.py /workspace/analysis_results --bucket analysis-bucket -
Batch Processing: Process multiple files and upload output directory
# Process files (your code here, generates /workspace/batch_output/) # Upload entire output directory python scripts/tos_upload.py /workspace/batch_output --bucket batch-results -
Project Export: Upload a complete project directory structure
# Upload project directory with subdirectories python scripts/tos_upload.py /workspace/my_project --bucket project-storage # Result: tos://project-storage/upload/{session_id}/my_project/...
Error Handling
- IF the script raises the error
VOLCENGINE_ACCESS_KEY and VOLCENGINE_SECRET_KEY are not provided or IAM Role is not configured., inform the user that they need to provide theVOLCENGINE_ACCESS_KEYandVOLCENGINE_SECRET_KEYenvironment variable. Write it to the environment variable file in the workspace. If the file already exists, append it to the end. Ensure the environment variable format is correct, make the environment variable effective, and retry the task that just failed.
Notes
- Auto-detection: No need to specify whether uploading a file or directory - the script detects automatically
- Session organization: Files and directories are automatically organized by session ID for easy tracking
- Signed URLs (files only): Valid for 7 days by default (adjustable via
--expires) - TOS Paths (directories): Returned as
tos://bucket/pathformat for reference - Structure preservation: Directory uploads maintain the complete folder structure in TOS
- No timestamp in filenames: Original filenames are preserved (session prefix provides uniqueness)
- Auto-bucket creation: Bucket is automatically created with private ACL if it doesn't exist
- Auto-deduplication: Downloads automatically rename files if they already exist
- IAM Role support: Scripts automatically retrieve credentials from VeFaaS IAM when available
- Error handling: Scripts print clear error messages for network, permission, or file issues
- Bucket requirement: Bucket name must be specified via
--bucketparameter (required)