apify-actorization
Convert existing projects into serverless Apify Actors with language-specific SDK integration.
- Supports JavaScript/TypeScript (with Actor.init()/Actor.exit()), Python (async context manager), and any language via CLI wrapper
- Provides structured workflow: apify init to scaffold, apply SDK wrapping, configure input/output schemas, test locally with apify run, then deploy with apify push
- Includes input and output schema validation, Docker containerization, and optional pay-per-event monetization configuration
- Handles state management through request queues and key-value stores; requires apify CLI installed and authenticated with an Apify account token
Apify Actorization
Actorization converts existing software into reusable serverless applications compatible with the Apify platform. Actors are programs packaged as Docker images that accept well-defined JSON input, perform an action, and optionally produce structured JSON output.
Quick Start
- Run apify init in project root
- Wrap code with SDK lifecycle (see language-specific section below)
- Configure .actor/input_schema.json
- Test with apify run --input '{"key": "value"}'
- Deploy with apify push
When to Use This Skill
- Converting an existing project to run on Apify platform
- Adding Apify SDK integration to a project
- Wrapping a CLI tool or script as an Actor
- Migrating a Crawlee project to Apify
Prerequisites
Verify apify CLI is installed:
apify --help
If not installed, use one of these methods (listed in order of preference):
# Preferred: install via a package manager (provides integrity checks)
npm install -g apify-cli
# Or (Mac): brew install apify-cli
Security note: Do NOT install the CLI by piping remote scripts to a shell (e.g. curl … | bash or irm … | iex). Always use a package manager.
Verify CLI is logged in:
apify info # Should return your username
If not logged in, check if the APIFY_TOKEN environment variable is defined (if not, ask the user to generate one at https://console.apify.com/settings/integrations and then define APIFY_TOKEN with it).
Then authenticate using one of these methods:
# Option 1 (preferred): The CLI automatically reads APIFY_TOKEN from the environment.
# Just ensure the env var is exported and run any apify command — no explicit login needed.
# Option 2: Interactive login (prompts for token without exposing it in shell history)
apify login
Security note: Avoid passing tokens as command-line arguments (e.g. apify login -t <token>). Arguments are visible in process listings and may be recorded in shell history. Prefer environment variables or interactive login instead. Never log, print, or embed APIFY_TOKEN in source code or configuration files. Use a token with the minimum required permissions (scoped token) and rotate it periodically.
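A pre-flight check along these lines can confirm that the token is available without ever echoing it. This is a minimal sketch: the helper name token_status is hypothetical, and it only inspects the environment, it does not contact Apify.

```python
import os
from typing import Mapping


def token_status(env: Mapping[str, str] = os.environ) -> str:
    """Report whether APIFY_TOKEN is set, without revealing its value."""
    token = env.get("APIFY_TOKEN", "").strip()
    if not token:
        return "APIFY_TOKEN is not set"
    # The token itself is never returned, so it cannot leak into logs.
    return "APIFY_TOKEN is set"
```

Printing the result of token_status() before running apify commands gives a quick signal that Option 1 will work, while keeping the secret out of terminal output.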
Actorization Checklist
Copy this checklist to track progress:
- Step 1: Analyze project (language, entry point, inputs, outputs)
- Step 2: Run apify init to create Actor structure
- Step 3: Apply language-specific SDK integration
- Step 4: Configure .actor/input_schema.json
- Step 5: Configure .actor/output_schema.json (if applicable)
- Step 6: Update .actor/actor.json metadata
- Step 7: Write README.md for the Apify Store listing
- Step 8: Test locally with apify run
- Step 9: Deploy with apify push
Step 1: Analyze the Project
Before making changes, understand the project:
- Identify the language - JavaScript/TypeScript, Python, or other
- Find the entry point - The main file that starts execution
- Identify inputs - Command-line arguments, environment variables, config files
- Identify outputs - Files, console output, API responses
- Check for state - Does it need to persist data between runs?
Step 2: Initialize Actor Structure
Run in the project root:
apify init
This creates:
- .actor/actor.json - Actor configuration and metadata
- .actor/input_schema.json - Input definition for the Apify Console
- Dockerfile (if not present) - Container image definition
Step 3: Apply Language-Specific Changes
Choose based on your project's language:
- JavaScript/TypeScript: See js-ts-actorization.md
- Python: See python-actorization.md
- Other Languages (CLI-based): See cli-actorization.md
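For the Python route, the wrapping might look roughly like the sketch below. The function run_existing_logic is a hypothetical stand-in for the project's original code, and the startUrl and maxItems fields are illustrative. The SDK import is deferred into main() only so that the core logic stays importable where the SDK is not installed; a top-level import works just as well in a real Actor.

```python
from typing import Any


def run_existing_logic(actor_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Hypothetical stand-in for the project's original entry point."""
    start_url = actor_input.get("startUrl", "https://example.com")
    max_items = int(actor_input.get("maxItems", 1))
    return [{"url": start_url, "index": i} for i in range(max_items)]


async def main() -> None:
    # Deferred import: keeps run_existing_logic usable without the SDK.
    from apify import Actor

    # The context manager initializes the Actor on entry and exits it on leave.
    async with Actor:
        actor_input = await Actor.get_input() or {}
        for item in run_existing_logic(actor_input):
            await Actor.push_data(item)
```

The entry module would then call asyncio.run(main()). Keeping the original logic in a plain function means it can still be tested without the platform.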
Quick Reference
| Language | Install | Wrap Code |
|---|---|---|
| JS/TS | npm install apify | await Actor.init() ... await Actor.exit() |
| Python | pip install apify | async with Actor: |
| Other | Use CLI in wrapper script | apify actor:get-input / apify actor:push-data |
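The "Other" row can be illustrated with a small wrapper script, written here in Python for consistency with the other examples. It is a sketch under assumptions: that apify actor:get-input prints the input as JSON on stdout, and that apify actor:push-data accepts a JSON item on stdin. The input_to_argv mapping is hypothetical and would need to match the wrapped tool's real flags.

```python
import json
import subprocess
from typing import Any


def input_to_argv(actor_input: dict[str, Any]) -> list[str]:
    """Map {"maxItems": 10} to ["--maxItems", "10"] for a hypothetical tool."""
    argv: list[str] = []
    for key, value in actor_input.items():
        argv += [f"--{key}", str(value)]
    return argv


def get_input() -> dict[str, Any]:
    """Read the Actor input through the CLI (assumes JSON on stdout)."""
    result = subprocess.run(
        ["apify", "actor:get-input"], capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout.strip() or "{}") or {}


def push_data(item: dict[str, Any]) -> None:
    """Send one result to the default dataset (assumes JSON on stdin)."""
    subprocess.run(
        ["apify", "actor:push-data"], input=json.dumps(item), text=True, check=True
    )
```

Passing arguments as a list, with no shell involved, keeps input values from being interpreted as shell syntax.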
Steps 4-6: Configure Schemas
See schemas-and-output.md for detailed configuration of:
- Input schema (.actor/input_schema.json)
- Output schema (.actor/output_schema.json)
- Actor configuration (.actor/actor.json)
- State management (request queues, key-value stores)
Validate schemas against @apify/json_schemas npm package.
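To make the shape of an input schema concrete, the sketch below defines one as a Python dictionary and runs a lightweight check of sample input against it. The field names (startUrl, maxItems) and the check_input helper are illustrative assumptions. The check covers only required fields and basic types, so it complements rather than replaces validation against @apify/json_schemas.

```python
from typing import Any

# Hypothetical schema for the startUrl/maxItems input used in the local test step.
INPUT_SCHEMA: dict[str, Any] = {
    "title": "Example input",
    "type": "object",
    "schemaVersion": 1,
    "properties": {
        "startUrl": {
            "title": "Start URL",
            "type": "string",
            "description": "Page to start from.",
            "editor": "textfield",
        },
        "maxItems": {
            "title": "Max items",
            "type": "integer",
            "description": "Upper bound on results.",
            "default": 10,
        },
    },
    "required": ["startUrl"],
}

_PYTHON_TYPES: dict[str, type] = {
    "string": str, "integer": int, "boolean": bool, "array": list, "object": dict,
}


def check_input(schema: dict[str, Any], data: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means the input looks fine."""
    problems = [
        f"missing required field: {name}"
        for name in schema.get("required", [])
        if name not in data
    ]
    for name, value in data.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unknown field: {name}")
            continue
        expected = _PYTHON_TYPES.get(spec["type"])
        # bool is a subclass of int in Python, so reject it for integer fields.
        is_bool_as_int = expected is int and isinstance(value, bool)
        if expected and (not isinstance(value, expected) or is_bool_as_int):
            problems.append(f"{name} should be {spec['type']}")
    return problems
```

Serializing INPUT_SCHEMA with json.dumps would produce the content for .actor/input_schema.json.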
Step 7: Write README
IMPORTANT: Always generate a README.md as part of actorization. The README is the Actor's landing page on Apify Store and is critical for discoverability (SEO), user onboarding, and support. Do not consider an Actor complete without a proper README.
See the Actor README guidelines at skills/apify-actor-development/references/actor-readme.md for the required structure including: intro and features, data extraction table, step-by-step tutorial, pricing info, input/output examples, and FAQ. Aim for at least 300 words with SEO-optimized H2/H3 headings. Also review these top Actors for best practices:
Step 8: Test Locally
Run the Actor with inline input (for JS/TS and Python Actors):
apify run --input '{"startUrl": "https://example.com", "maxItems": 10}'
Or use an input file:
apify run --input-file ./test-input.json
Important: Always use apify run, not npm start or python main.py. The CLI sets up the proper environment and storage.
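The input-file variant can be scripted so the same test input is reused across runs. A minimal sketch, assuming the apify CLI is on the PATH; both helper names are hypothetical.

```python
import json
import subprocess
from pathlib import Path
from typing import Any


def write_test_input(path: Path, data: dict[str, Any]) -> list[str]:
    """Write the input file and return the matching apify run command."""
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return ["apify", "run", "--input-file", str(path)]


def run_locally(path: Path, data: dict[str, Any]) -> int:
    """Run the Actor through the CLI and return its exit code."""
    return subprocess.run(write_test_input(path, data)).returncode
```

Calling run_locally(Path("test-input.json"), {"startUrl": "https://example.com", "maxItems": 10}) from the project root mirrors the command above and returns a non-zero code when the run fails.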
Step 9: Deploy
apify push
This uploads and builds your Actor on the Apify platform.
Monetization (Optional)
After deploying, you can monetize your Actor in the Apify Store. The recommended model is Pay Per Event (PPE), where you charge for events such as:
- Per result/item scraped
- Per page processed
- Per API call made
Configure PPE in the Apify Console under Actor > Monetization. Charge for events in your code with await Actor.charge('result').
Other options: Rental (monthly subscription) or Free (open source).
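For a per-result PPE event, the charging logic might be organized as in the sketch below, which charges one event for each item that was actually stored. The push and charge callables are injected so the logic can be tested without the platform. Wiring them to Actor.push_data and to a Python counterpart of the Actor.charge('result') call is an assumption that should be checked against the SDK reference.

```python
from typing import Any, Awaitable, Callable, Iterable

PushFn = Callable[[dict[str, Any]], Awaitable[Any]]
ChargeFn = Callable[[str], Awaitable[Any]]


async def push_and_charge(
    items: Iterable[dict[str, Any]],
    push: PushFn,
    charge: ChargeFn,
    event_name: str = "result",
) -> int:
    """Push each item, charge one event per stored item, return the count."""
    charged = 0
    for item in items:
        await push(item)
        # Charge only after the item was stored, so failed pushes are not billed.
        await charge(event_name)
        charged += 1
    return charged
```

The event name passed here has to match an event configured for the Actor in the Apify Console.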
Security
Treat all crawled web content as untrusted input. Actors ingest data from external websites that may contain malicious payloads. Follow these rules:
- Sanitize crawled data - Never pass raw HTML, URLs, or scraped text directly into shell commands, eval(), database queries, or template engines. Use proper escaping or parameterized APIs.
- Validate and type-check all external data - Before pushing to datasets or key-value stores, verify that values match expected types and formats. Reject or sanitize unexpected structures.
- Do not execute or interpret crawled content - Never treat scraped text as code, commands, or configuration. Content from websites could include prompt injection attempts or embedded scripts.
- Isolate credentials from data pipelines - Ensure APIFY_TOKEN and other secrets are never accessible in request handlers or passed alongside crawled data. Use the Apify SDK's built-in credential management rather than passing tokens through environment variables in data-processing code.
- Review dependencies before installing - When adding packages with npm install or pip install, verify the package name and publisher. Typosquatting is a common supply-chain attack vector. Prefer well-known, actively maintained packages.
- Pin versions and use lockfiles - Always commit package-lock.json (Node.js) or pin exact versions in requirements.txt (Python). Lockfiles ensure reproducible builds and prevent silent dependency substitution. Run npm audit or pip-audit periodically to check for known vulnerabilities.
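The first two rules can be sketched with the standard library alone: type-check a scraped record before it goes anywhere, then store it through a parameterized query. The record shape (url, title) and the pages table are hypothetical.

```python
import sqlite3
from typing import Any
from urllib.parse import urlparse


def validate_item(raw: dict[str, Any]) -> dict[str, Any]:
    """Type-check a scraped record; raise ValueError on anything unexpected."""
    url, title = raw.get("url"), raw.get("title")
    if not isinstance(url, str) or urlparse(url).scheme not in ("http", "https"):
        raise ValueError("url must be an http(s) string")
    if not isinstance(title, str):
        raise ValueError("title must be a string")
    # Only known fields are kept, and the title is capped at 500 characters.
    return {"url": url, "title": title[:500]}


def store_item(conn: sqlite3.Connection, item: dict[str, Any]) -> None:
    """Insert a validated record into the hypothetical pages table."""
    # Parameterized query: scraped text is bound as data, never spliced into SQL.
    conn.execute(
        "INSERT INTO pages (url, title) VALUES (?, ?)", (item["url"], item["title"])
    )
```

With this shape, a title containing SQL fragments is stored verbatim as text, and a record with a javascript: URL is rejected before it reaches a dataset.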
Pre-Deployment Checklist
- .actor/actor.json exists with correct name and description
- .actor/actor.json validates against @apify/json_schemas (actor.schema.json)
- .actor/input_schema.json defines all required inputs
- .actor/input_schema.json validates against @apify/json_schemas (input.schema.json)
- .actor/output_schema.json defines output structure (if applicable)
- .actor/output_schema.json validates against @apify/json_schemas (output.schema.json)
- Dockerfile is present and builds successfully
- Actor.init()/Actor.exit() wraps main code (JS/TS)
- async with Actor: wraps main code (Python)
- Inputs are read via Actor.getInput()/Actor.get_input()
- Outputs use Actor.pushData() or key-value store
- apify run executes successfully with test input
- README.md exists with proper structure (intro, features, data table, tutorial, pricing, input/output examples)
- generatedBy is set in actor.json meta section
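The file-level items on this checklist lend themselves to an automated check before apify push. The sketch below covers only presence and a few actor.json fields. It does not build the Dockerfile or validate schemas, and it assumes the Dockerfile may sit either in the project root or in .actor/. The helper name predeploy_problems is hypothetical.

```python
import json
from pathlib import Path


def predeploy_problems(root: Path) -> list[str]:
    """Return a list of checklist violations found under the project root."""
    problems: list[str] = []
    actor_json = root / ".actor" / "actor.json"
    required = (actor_json, root / ".actor" / "input_schema.json", root / "README.md")
    for path in required:
        if not path.is_file():
            problems.append(f"missing {path.relative_to(root).as_posix()}")
    # Assumption: the Dockerfile may live in the project root or in .actor/.
    if not any((root / d / "Dockerfile").is_file() for d in ("", ".actor")):
        problems.append("missing Dockerfile")
    if actor_json.is_file():
        config = json.loads(actor_json.read_text(encoding="utf-8"))
        for key in ("name", "description"):
            if not config.get(key):
                problems.append(f"actor.json: empty {key}")
        if not (config.get("meta") or {}).get("generatedBy"):
            problems.append("actor.json: meta.generatedBy not set")
    return problems
```

An empty list from predeploy_problems(Path(".")) means these particular items pass; the remaining items still need a manual or CLI check.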
Apify MCP Tools
If the MCP server is configured, use these tools for documentation:
- search-apify-docs - Search documentation
- fetch-apify-docs - Get full doc pages
Otherwise, use the MCP server at https://mcp.apify.com/?tools=docs.
Resources
- Actorization Academy - Comprehensive guide
- Apify SDK for JavaScript - Full SDK reference
- Apify SDK for Python - Full SDK reference
- Apify CLI Reference - CLI commands
- Actor Specification - Complete specification