alibabacloud-dataworks-infra-manage
DataWorks Infrastructure Management
Unified management of Data Sources, Compute Resources, and Resource Groups in Alibaba Cloud DataWorks workspaces, supporting create and query operations, plus binding and unbinding of resource groups.
Architecture
DataWorks
├── Workspaces ─── Query and search workspaces
│ ├── Data Sources ─── 51 types: MySQL, Hologres, MaxCompute, ...
│ └── Compute Resources ─── Hologres, MaxCompute, Flink, Spark
└── Resource Groups ─── Serverless resource group management (cross-workspace)
Dependencies:
Workspace ◀── Data Sources, Compute Resources (must belong to a workspace)
Workspace ◀── Resource Groups (associated via binding; one resource group can bind to multiple workspaces)
Connectivity Test ──depends on──▶ Resource Group (must be bound to the workspace of the data source)
Standard Mode ──requires──▶ Dev (Development) + Prod (Production) dual data sources and compute resources
Global Rules
Prerequisites
- Aliyun CLI >= 3.3.1: `aliyun version` (Installation guide: references/cli-installation-guide.md)
- First-time use: `aliyun configure set --auto-plugin-install true`
- jq (required for resource group operations): `which jq`
- Credential status: `aliyun configure list`, verify valid credentials exist
- DataWorks edition: Basic edition or above required
Security Rules: DO NOT read/print/echo AK/SK values, DO NOT let users input AK/SK directly, ONLY use `aliyun configure list` to check credential status.
Command Formatting
- User-Agent (mandatory): All `aliyun` CLI commands must include the `--user-agent AlibabaCloud-Agent-Skills` parameter to identify the source.
- Single-line commands: When executing Bash commands, construct each command as a single-line string; do not use `\` for line breaks.
- jq step-by-step execution: First execute the `aliyun` command to get JSON, then format with `jq` (to avoid multi-line security prompts).
- Endpoint mandatory: When specifying the `--region` parameter, you must also add `--endpoint dataworks.<REGION_ID>.aliyuncs.com`. Not needed when `--region` is not specified.
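The formatting rules above can be sketched as a small command builder. `dw_cmd` is a hypothetical helper name, not part of the Aliyun CLI; it only assembles the string and does not execute anything, and it assumes the caller has already quoted any extra arguments.

```shell
# Hypothetical helper: build one single-line aliyun command string.
# Usage: dw_cmd <APIAction> <region-or-empty-string> [extra args...]
dw_cmd() {
  local action="$1"
  local region="$2"
  shift 2
  # The user-agent flag is always present.
  local cmd="aliyun dataworks-public ${action} --user-agent AlibabaCloud-Agent-Skills"
  # --endpoint is added only together with --region.
  if [ -n "$region" ]; then
    cmd="${cmd} --region ${region} --endpoint dataworks.${region}.aliyuncs.com"
  fi
  # Remaining arguments are appended as given, separated by single spaces.
  if [ "$#" -gt 0 ]; then
    cmd="${cmd} $*"
  fi
  printf '%s\n' "$cmd"
}
```

For example, `dw_cmd GetProject cn-hangzhou --Id 123` prints a command that carries both `--region cn-hangzhou` and the matching `--endpoint`, while passing an empty string as the region omits both flags.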
Parameter Confirmation
Before executing any command, all user-customizable parameters must be confirmed by the user. Do not assume or use default values. Exception: When the user has explicitly specified parameter values in the conversation, use them directly without re-confirmation.
Resource group related parameters (mandatory user selection): VPC, VSwitch, Resource Group ID (for binding/connectivity testing) — involve networking and billing, DO NOT auto-select; must display a list for the user to explicitly choose. Confirm even if there is only one option.
RAM Permissions
All operations require dataworks:<APIAction> permissions. Creating resource groups additionally requires AliyunBSSOrderAccess and vpc:DescribeVpcs, vpc:DescribeVSwitches.
Full permission matrix: references/ram-policies.md
Quick Start: New Workspace Infrastructure Initialization
When the user is unsure about specific operations or has vague requirements, guide them through the following process:
- Environment check: Check CLI and credentials per Prerequisites
- Confirm workspace: Use `ListProjects` to locate the workspace, then `GetProject` to confirm the mode (Simple/Standard)
- Create compute resources: Guide engine type selection; the system will automatically create corresponding data sources. Standard Mode requires Dev+Prod pairs. Only pure storage-type data sources (MySQL, Kafka, etc.) need separate data source creation
- Create/bind resource groups: Query existing resource groups → let user select → bind. Guide creation when no resource groups are available
- Test connectivity: Test with bound resource groups; when all pass, inform "Infrastructure configuration complete"
After each step, proactively suggest the next action.
Next Step Guidance
After each write operation is completed and verified, proactively suggest follow-up actions:
| Completed Operation | Recommended Next Step |
|---|---|
| Create compute resource | Standard Mode: "Create the corresponding Dev resource?"; "Test connectivity?" |
| Create data source separately | "Test connectivity?"; Standard Mode: "Create Dev/Prod environment data sources?" |
| Create resource group | "Bind to a workspace?" |
| Bind resource group | "Test data source connectivity?" |
| Connectivity test passed | "Infrastructure is ready." |
| Connectivity test failed | Analyze the error cause, guide the fix |
| Unbind resource group | "Bind to another workspace?" |
Trigger Rules
Trigger scenarios: Data source create/query, compute resource create/query, resource group management, infrastructure initialization, colloquial aliases (DW database connection failure, configure holo/mc resources, create rg)
Not triggered: Data development tasks, scheduling configuration, MaxCompute table management, data integration tasks, ECS/RDS/OSS, workspace member management, data quality/lineage/preview. Standalone workspace queries are handled by the alibabacloud-dataworks-workspace-manage skill.
Interaction Flow
All operations follow: Identify module → Environment check → Collect parameters → Execute command → Verify result → Guide next step
Common aliases: DW=DataWorks, holo=Hologres, mc/MC/odps=MaxCompute, pg=PostgreSQL, rg=Resource Group, ds=Data Source, RDS=InstanceMode MySQL/PG/SQLServer, ADB=AnalyticDB
Naming suggestions: Data source {type}_{business}_{purpose}, Compute resource {type}_{business}, Resource group dw_{purpose}_rg_{env}
Module 0: Workspace Query
If the `alibabacloud-dataworks-workspace-manage` skill is available, prefer using it for workspace queries. The following is only a fallback.
aliyun dataworks-public ListProjects --user-agent AlibabaCloud-Agent-Skills --Status Available --PageSize 100
When searching by name, first get the full list then filter .PagingInfo.Projects[] by Name/DisplayName using jq.
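A minimal sketch of the name filter, assuming the `ListProjects` output was first saved to a file (in line with the step-by-step jq rule). `filter_projects` is a hypothetical helper name, and the `Id` field in the projection is an assumption about the response; only `.PagingInfo.Projects[]`, `Name`, and `DisplayName` come from the text above.

```shell
# Hypothetical helper: case-insensitive substring match on Name or DisplayName.
# Usage: filter_projects <saved-json-file> <keyword>
filter_projects() {
  jq -c --arg kw "$2" '.PagingInfo.Projects[] | select(((.Name // "") + " " + (.DisplayName // "")) | ascii_downcase | contains($kw | ascii_downcase)) | {Id, Name, DisplayName}' "$1"
}
```

Each matching project is printed as one compact JSON object per line, which keeps the list easy to present to the user for selection.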
Module 1: Data Source Management
Supports 51 data source types. See references/data-sources/README.md for details.
When do you need to create a data source separately? Creating a compute resource (Module 2) will automatically create the corresponding data source. Only pure storage-type databases (MySQL, PostgreSQL, Kafka, MongoDB, etc.) need separate creation.
Some types do not currently support OpenAPI:
`polardb-o`, `polardb-x-2-0`, `oceanbase`, `oss-hdfs`, `graph-database`, `bigquery`, `dlf`, `hdfs`, `ssh`, `redis`, `salesforce`, `elasticsearch`, `httpfile`
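A pre-flight check against this list might look like the sketch below. `supports_openapi` is a hypothetical helper; it only rejects the types listed above and does not validate that any other string is a real data source type.

```shell
# Hypothetical helper: return 1 for types listed as not supporting OpenAPI.
# Usage: supports_openapi <type>
supports_openapi() {
  case "$1" in
    polardb-o|polardb-x-2-0|oceanbase|oss-hdfs|graph-database|bigquery|dlf|hdfs|ssh|redis|salesforce|elasticsearch|httpfile)
      return 1 ;;
    *)
      return 0 ;;
  esac
}
```

Calling it before `CreateDataSource` lets the flow point the user to the DataWorks console early instead of failing on the API call.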
Connection modes: UrlMode (self-hosted databases, requires host/port) or InstanceMode (Alibaba Cloud managed instances, requires instanceId). When unsure, proactively ask the user. InstanceMode is preferred.
Instance query APIs: references/data-sources/instance-apis.md
⚠️ Security Restriction
IMPORTANT: For security reasons, this skill does NOT support modifying or deleting data sources. These operations are disabled to prevent:
- Accidental data loss or service interruption
- Exposure of sensitive credentials (passwords, connection strings)
- Disruption of running data integration tasks
- Unintended changes to production data source configurations
If you need to modify or delete a data source, please use the DataWorks console directly or contact your administrator.
Workspace Mode
Environment note: Prod (Production) is for production data processing; Dev (Development) is for development and debugging, physically isolated from production.
aliyun dataworks-public GetProject --user-agent AlibabaCloud-Agent-Skills --Id <PROJECT_ID>
Check `DevEnvironmentEnabled` in the response:
- `false` → Simple Mode (1 data source, envType=Prod)
- `true` → Standard Mode (2 data sources, Dev + Prod, physically isolated)
Full mode comparison: references/data-sources/README.md
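The mode check can be reduced to a small mapping from the flag to the environments that need a data source. This is a sketch: `required_envs` is a hypothetical helper, and it expects the `DevEnvironmentEnabled` value to be passed in as the text `true` or `false` after it has been read from the `GetProject` response.

```shell
# Hypothetical helper: map DevEnvironmentEnabled to the envType values to create.
# Usage: required_envs <true|false>
required_envs() {
  case "$1" in
    true)  echo "Dev Prod" ;;   # Standard Mode: one data source per environment
    false) echo "Prod" ;;       # Simple Mode: a single Prod data source
    *)
      echo "unexpected DevEnvironmentEnabled value: $1" >&2
      return 1 ;;
  esac
}
```

A caller could loop over the result, for example `for env in $(required_envs true); do ...; done`, and confirm the parameters for each environment with the user before creating anything.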
Task 1.1: Create Data Source (CreateDataSource)
aliyun dataworks-public CreateDataSource --user-agent AlibabaCloud-Agent-Skills [--region <REGION_ID> --endpoint dataworks.<REGION_ID>.aliyuncs.com] --ProjectId <PROJECT_ID> --Name <NAME> --Type <TYPE> --ConnectionPropertiesMode <UrlMode|InstanceMode> --ConnectionProperties '<JSON>' --Description "<DESC>"
ConnectionProperties common structure:
- UrlMode: `{"envType":"Prod","address":[{"host":"<IP>","port":<PORT>}],"database":"<DB>","username":"<USER>","password":"<PWD>"}`
- InstanceMode: `{"envType":"Prod","instanceId":"<ID>","regionId":"<REGION>","database":"<DB>","username":"<USER>","password":"<PWD>"}`
Special type structures (Oracle, MaxCompute, HBase, etc.): see references/data-sources/ per-type docs
Cross-account data source configuration: references/cross-account-datasources.md
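Hand-writing the JSON is error-prone when values contain quotes or backslashes. The sketch below builds the common UrlMode structure with `jq`, which handles escaping. `build_url_mode_props` and the `DS_PASSWORD` environment variable are hypothetical names chosen for illustration; the field names follow the common structure shown above and do not cover the special types.

```shell
# Hypothetical helper: emit the UrlMode ConnectionProperties JSON on one line.
# Usage: build_url_mode_props <envType> <host> <port> <database> <username>
# The password is read from the DS_PASSWORD environment variable (an assumption
# of this sketch) so that it does not appear in the argument list.
build_url_mode_props() {
  jq -cn --arg envType "$1" --arg host "$2" --argjson port "$3" --arg database "$4" --arg username "$5" --arg password "${DS_PASSWORD:?set DS_PASSWORD first}" '{envType: $envType, address: [{host: $host, port: $port}], database: $database, username: $username, password: $password}'
}
```

The result can be captured in a variable and passed as `--ConnectionProperties "$props"`. The port is passed with `--argjson` so it is emitted as a number rather than a string.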
Task 1.2: Get Data Source (GetDataSource)
aliyun dataworks-public GetDataSource --user-agent AlibabaCloud-Agent-Skills --Id <DATASOURCE_ID> [--region <REGION_ID> --endpoint dataworks.<REGION_ID>.aliyuncs.com]
Task 1.3: List Data Sources (ListDataSources)
aliyun dataworks-public ListDataSources --user-agent AlibabaCloud-Agent-Skills --ProjectId <PROJECT_ID> [--Types '["mysql"]'] [--EnvType <Dev|Prod>] [--PageNumber 1] [--PageSize 20]
Returns nested structure `DataSources[].DataSource[]`; Name/Type are in the outer layer, Id/Description in the inner layer.
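Because of the nesting, a flat listing needs to carry the outer fields into each inner entry. The sketch below assumes the response was saved to a file and that the `DataSources` array sits either under `PagingInfo` or at the top level; that wrapper location is an assumption, as is the helper name `flatten_data_sources`.

```shell
# Hypothetical helper: print one tab-separated row per inner entry:
# Name, Type (outer layer), Id, Description (inner layer).
# Usage: flatten_data_sources <saved-json-file>
flatten_data_sources() {
  jq -r '(.PagingInfo.DataSources // .DataSources)[] | . as $outer | $outer.DataSource[] | [$outer.Name, $outer.Type, (.Id | tostring), (.Description // "")] | @tsv' "$1"
}
```

In Standard Mode a single name would typically yield two rows, one for each environment, which makes a missing Dev or Prod entry easy to spot.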
Task 1.4: Test Connectivity (TestDataSourceConnectivity)
Process: Query resource group list → Let user select a resource group → Execute test.
# Step 1: Query project resource groups
aliyun dataworks-public ListResourceGroups --user-agent AlibabaCloud-Agent-Skills --ProjectId <PROJECT_ID>
# Step 2: Execute test after user selects a resource group
aliyun dataworks-public TestDataSourceConnectivity --user-agent AlibabaCloud-Agent-Skills --DataSourceId <ID> --ProjectId <PROJECT_ID> --ResourceGroupId "<RG_ID>"
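A wrapper around Step 2 can inspect the captured output for the unbound-resource-group message before deciding what to suggest next. This is a sketch that matches on the literal message text `resourceGroupId is not in the project`; `needs_binding` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed when the captured CLI output indicates that the
# resource group is not bound to the workspace.
# Usage: needs_binding "<captured output>"
needs_binding() {
  case "$1" in
    *"resourceGroupId is not in the project"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

When it succeeds, the next step is to ask the user to confirm the binding rather than running it automatically.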
If the error "resourceGroupId is not in the project" is returned, the resource group needs to be bound first (confirm with the user, then execute `AssociateProjectToResourceGroup`).
Module 2: Compute Resource Management
Supports Hologres, MaxCompute, Flink, Spark, and other types. When a compute resource is created, the system automatically creates the corresponding data source.
⚠️ Security Restriction
IMPORTANT: For security reasons, this skill does NOT support modifying or deleting compute resources. These operations are disabled to prevent:
- Accidental data loss or service interruption
- Disruption of running data development and scheduling tasks
- Unintended changes to production compute resource configurations
If you need to modify or delete a compute resource, please use the DataWorks console directly or contact your administrator.
authType Rules
- Dev environment: `authType` is fixed as `Executor`
- Prod environment: Options are `PrimaryAccount` (recommended), `TaskOwner`, `SubAccount`, `RamRole`. Default recommendation is `PrimaryAccount` unless the user has special requirements
authType details and guidance: references/compute-resources/README.md
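The rules above can be checked before building ConnectionProperties. A minimal sketch, with `validate_auth_type` as a hypothetical helper name; it encodes only the combinations listed in this section.

```shell
# Hypothetical helper: validate an authType against the environment.
# Usage: validate_auth_type <Dev|Prod> <authType>
validate_auth_type() {
  case "$1:$2" in
    Dev:Executor)
      return 0 ;;
    Prod:PrimaryAccount|Prod:TaskOwner|Prod:SubAccount|Prod:RamRole)
      return 0 ;;
    *)
      echo "authType '$2' is not allowed for envType '$1'" >&2
      return 1 ;;
  esac
}
```

A failed check is a cue to go back to the user with the valid options for that environment instead of submitting the request.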
Type-Specific Notes
- Hologres: Only supports InstanceMode, requires `instanceId`, `securityProtocol`
- MaxCompute: Only supports UrlMode, requires `project`, `endpointMode`
Full ConnectionProperties examples: references/compute-resources/README.md
Task 2.1: Create Compute Resource (CreateComputeResource)
aliyun dataworks-public CreateComputeResource --user-agent AlibabaCloud-Agent-Skills [--region <REGION_ID> --endpoint dataworks.<REGION_ID>.aliyuncs.com] --ProjectId <PROJECT_ID> --Name <NAME> --Type <TYPE> --ConnectionPropertiesMode <InstanceMode|UrlMode> --ConnectionProperties '<JSON>' [--Description "<DESC>"]
After creation, use `ListDataSources` to verify the corresponding data source was auto-generated.
Task 2.2: Get Compute Resource (GetComputeResource)
aliyun dataworks-public GetComputeResource --user-agent AlibabaCloud-Agent-Skills --Id <ID> --ProjectId <PROJECT_ID>
Task 2.3: List Compute Resources (ListComputeResources)
aliyun dataworks-public ListComputeResources --user-agent AlibabaCloud-Agent-Skills --ProjectId <PROJECT_ID> [--Name <FILTER>] [--EnvType <Dev|Prod>] [--PageSize 20] [--SortBy CreateTime] [--Order Desc]
Returns nested structure `ComputeResources[].ComputeResource[]`; Name/Type are in the outer layer, Id in the inner layer.
Module 3: Resource Group Management
Covers creating, querying, binding, and unbinding Serverless resource groups.
Task 3.1: Create Resource Group (CreateResourceGroup)
Requires `AliyunBSSOrderAccess` permission.
Interaction flow (let user choose at each step, DO NOT auto-select):
- Query and select VPC:
aliyun vpc DescribeVpcs --user-agent AlibabaCloud-Agent-Skills --RegionId "<REGION_ID>" --PageSize 50
If the list is empty, guide the user to create a VPC; DO NOT auto-create.
- Query and select VSwitch:
aliyun vpc DescribeVSwitches --user-agent AlibabaCloud-Agent-Skills --RegionId "<REGION_ID>" --VpcId "<VPC_ID>" --PageSize 50
- Confirm name and specification → Execute creation:
aliyun dataworks-public CreateResourceGroup --user-agent AlibabaCloud-Agent-Skills [--region <REGION_ID> --endpoint dataworks.<REGION_ID>.aliyuncs.com] --Name "<NAME>" --PaymentType PostPaid --VpcId "<VPC_ID>" --VswitchId "<VSWITCH_ID>" --ClientToken "$(uuidgen 2>/dev/null || echo "token-$(date +%s)")" --Remark "Created by Agent"
After creation, poll GetResourceGroup until status becomes Normal (every 10 seconds, up to 10 minutes).
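The polling step can be sketched as a bounded loop: 60 attempts at 10-second intervals gives the 10-minute ceiling. `wait_until_normal` is a hypothetical helper, and it takes the status lookup as a command name so that the exact `GetResourceGroup` response field is left to the caller (this sketch does not assume where the status sits in the JSON).

```shell
# Hypothetical helper: poll until the status command prints "Normal".
# Usage: wait_until_normal <status-command> [interval-seconds] [max-attempts]
# <status-command> must print the current status on stdout.
wait_until_normal() {
  local status_cmd="$1"
  local interval="${2:-10}"
  local max_attempts="${3:-60}"
  local attempt=1
  local status=""
  while [ "$attempt" -le "$max_attempts" ]; do
    # Left unquoted on purpose so "cmd arg" is split into command and argument.
    status="$($status_cmd)"
    if [ "$status" = "Normal" ]; then
      echo "Normal after ${attempt} attempt(s)"
      return 0
    fi
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  echo "timed out; last status: ${status}" >&2
  return 1
}
```

In practice the status command would be a small function that runs `GetResourceGroup`, saves the JSON, and extracts the status with `jq`. On timeout, report the last status to the user rather than retrying indefinitely.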
Task 3.2: Get Resource Group (GetResourceGroup)
aliyun dataworks-public GetResourceGroup --user-agent AlibabaCloud-Agent-Skills --Id "<ID>"
Task 3.3: List Resource Groups (ListResourceGroups)
aliyun dataworks-public ListResourceGroups --user-agent AlibabaCloud-Agent-Skills [--ProjectId <PROJECT_ID>] [--Statuses '["Normal"]'] --PageSize 100
Task 3.4: Bind Resource Group (AssociateProjectToResourceGroup)
Process: Query available resource groups → Display list for user to select → Bind after user confirms.
aliyun dataworks-public AssociateProjectToResourceGroup --user-agent AlibabaCloud-Agent-Skills --ResourceGroupId "<RG_ID>" --ProjectId "<PROJECT_ID>"
Task 3.5: Query Binding Relationships
aliyun dataworks-public ListResourceGroupAssociateProjects --user-agent AlibabaCloud-Agent-Skills --ResourceGroupId "<RG_ID>"
Task 3.6: Unbind Resource Group (DissociateProjectFromResourceGroup)
aliyun dataworks-public DissociateProjectFromResourceGroup --user-agent AlibabaCloud-Agent-Skills --ResourceGroupId "<RG_ID>" --ProjectId "<PROJECT_ID>"
Success Verification
After all write operations, use the corresponding Get/List command to verify the result.
Common Errors
| Error Code | Solution |
|---|---|
| Forbidden.Access / PermissionDenied | Check RAM permissions, see references/ram-policies.md |
| InvalidParameter | Check ConnectionProperties JSON and required parameters |
| EntityNotExists | Verify the ID and Region are correct |
| QuotaExceeded | Delete unused resources or request a quota increase |
| Duplicate* | Use a different name |
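The table translates directly into a lookup that a wrapper script could use when reporting failures. A sketch only: `suggest_fix` is a hypothetical helper, and it covers just the codes listed here, falling back to a generic message for anything else.

```shell
# Hypothetical helper: map an error code to the suggested fix from the table.
# Usage: suggest_fix <ErrorCode>
suggest_fix() {
  case "$1" in
    Forbidden.Access|PermissionDenied)
      echo "Check RAM permissions, see references/ram-policies.md" ;;
    InvalidParameter)
      echo "Check ConnectionProperties JSON and required parameters" ;;
    EntityNotExists)
      echo "Verify the ID and Region are correct" ;;
    QuotaExceeded)
      echo "Delete unused resources or request a quota increase" ;;
    Duplicate*)
      echo "Use a different name" ;;
    *)
      echo "Unrecognized error code; review the full error response" ;;
  esac
}
```

The `Duplicate*` pattern mirrors the wildcard row in the table, so any code starting with `Duplicate` maps to the rename suggestion.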
Region
Common: cn-hangzhou, cn-shanghai, cn-beijing, cn-shenzhen. Endpoint: dataworks.<region-id>.aliyuncs.com
Full list: references/related-apis.md
Best Practices
- Query before action — Confirm current state before create operations
- Manage by environment — Manage Dev and Prod resources separately
- Verify operations — Use Get/List to verify after each write operation
- Proactive guidance — Suggest the next step after each step completes
- Protect data sources and compute resources — Never modify or delete data sources or compute resources via this skill; use the DataWorks console for such operations
Reference Links
| Reference | Description |
|---|---|
| references/data-sources/README.md | Data source type list and ConnectionProperties examples |
| references/data-sources/ | Detailed configuration docs for each data source type (51 files) |
| references/cross-account-datasources.md | Cross-account data source configuration guide |
| references/compute-resources/README.md | Compute resource ConnectionProperties examples |
| references/cli-installation-guide.md | Aliyun CLI installation guide |
| references/ram-policies.md | RAM permission configuration and policy examples |
| references/related-apis.md | API parameter details and Region Endpoints |