sf-datacloud: Salesforce Data Cloud Orchestrator

Use this skill when the user needs product-level Data Cloud workflow guidance rather than a single isolated command family: pipeline setup, cross-phase troubleshooting, data spaces, data kits, or deciding whether a task belongs in Connect, Prepare, Harmonize, Segment, Act, or Retrieve.

This skill intentionally follows sf-skills house style while using the external sf data360 command surface as the runtime. The plugin is not vendored into this repo.


When This Skill Owns the Task

Use sf-datacloud when the work involves:

  • multi-phase Data Cloud setup or remediation
  • data spaces (sf data360 data-space *)
  • data kits (sf data360 data-kit *)
  • health checks (sf data360 doctor)
  • CRM-to-unified-profile pipeline design
  • deciding how to move from ingestion → harmonization → segmentation → activation
  • cross-phase troubleshooting where the root cause is not yet clear

Delegate to a phase-specific skill when the user is focused on one area:

| Phase | Use this skill | Typical scope |
| --- | --- | --- |
| Connect | sf-datacloud-connect | connections, connectors, source discovery |
| Prepare | sf-datacloud-prepare | data streams, DLOs, transforms, DocAI |
| Harmonize | sf-datacloud-harmonize | DMOs, mappings, identity resolution, data graphs |
| Segment | sf-datacloud-segment | segments, calculated insights |
| Act | sf-datacloud-act | activations, activation targets, data actions |
| Retrieve | sf-datacloud-retrieve | SQL, search indexes, vector search, async query |

Delegate outside the family when the task belongs to a different product area entirely; the Cross-Skill Integration section below maps common needs to sibling skills.


Required Context to Gather First

Ask for or infer:

  • target org alias
  • whether the plugin is already installed and linked
  • whether the user wants design guidance, read-only inspection, or live mutation
  • data sources involved: CRM objects, external databases, file ingestion, knowledge, etc.
  • desired outcome: unified profiles, segments, activations, vector search, analytics, or troubleshooting
  • whether the user is working in the default data space or a custom one
  • whether the org has already been classified with scripts/diagnose-org.mjs
  • which command family is failing today, if any

If plugin availability or org readiness is uncertain, start with:

bash ~/.claude/skills/sf-datacloud/scripts/verify-plugin.sh <alias>
node ~/.claude/skills/sf-datacloud/scripts/diagnose-org.mjs -o <alias> --json

Core Operating Rules

  • Use the external sf data360 plugin runtime; do not reimplement or vendor the command layer.
  • Prefer the smallest phase-specific skill once the task is localized.
  • Run readiness classification before mutation-heavy work. Prefer scripts/diagnose-org.mjs over guessing from one failing command.
  • For sf data360 commands, suppress linked-plugin warning noise with 2>/dev/null unless the stderr output is needed for debugging.
  • Distinguish Data Cloud SQL from CRM SOQL.
  • Do not treat sf data360 doctor as a full-product readiness check; the current upstream command only checks the search-index surface.
  • Do not treat query describe as a universal tenant probe; only use it with a known DMO/DLO table after broader readiness is confirmed.
  • Preserve Data Cloud-specific API-version workarounds when they matter.
  • Prefer generic, reusable JSON definition files over org-specific workshop payloads.

Recommended Workflow

1. Verify the runtime and auth

Confirm:

  • sf is installed
  • the community Data Cloud plugin is linked
  • the target org is authenticated

Recommended checks:

sf data360 man
sf org display -o <alias>
bash ~/.claude/skills/sf-datacloud/scripts/verify-plugin.sh <alias>

Treat sf data360 doctor as one health signal, not the sole gate: the current upstream command checks only the search-index surface, and on partially provisioned orgs it can fail even when read-only command families like connectors, DMOs, or segments still work.
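A minimal sketch of treating doctor as advisory rather than fatal, assuming a shell environment and an org alias of your own (the alias value below is a placeholder):

```shell
# Sketch: run doctor as an advisory signal; on failure, fall back to a
# targeted read-only smoke check instead of declaring the org blocked.
ORG=my-org-alias   # assumption: replace with your target org alias
if ! sf data360 doctor -o "$ORG" 2>/dev/null; then
  echo "doctor failed; trying a targeted read-only check instead"
  sf data360 data-space list -o "$ORG" 2>/dev/null \
    && echo "read-only surface still responds"
fi
```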

2. Classify readiness before changing anything

Run the shared classifier first:

node ~/.claude/skills/sf-datacloud/scripts/diagnose-org.mjs -o <org> --json

Only use a query-plane probe after you know the table name is real:

node ~/.claude/skills/sf-datacloud/scripts/diagnose-org.mjs -o <org> --phase retrieve --describe-table MyDMO__dlm --json

Use the classifier to distinguish:

  • empty-but-enabled modules
  • feature-gated modules
  • query-plane issues
  • runtime/auth failures
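When scripting on top of the classifier, its JSON output can be filtered with jq. Note that the `.readiness` field name below is a guess for illustration, not the script's documented schema; inspect the actual `--json` output first.

```shell
# Hypothetical: ".readiness" is an assumed field name, shown only to
# illustrate extracting the classification for use in a pipeline.
node ~/.claude/skills/sf-datacloud/scripts/diagnose-org.mjs -o "$ORG" --json \
  | jq -r '.readiness // "unknown"'
```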

3. Discover existing state with read-only commands

Use targeted inspection after classification:

sf data360 doctor -o <org> 2>/dev/null
sf data360 data-space list -o <org> 2>/dev/null
sf data360 data-stream list -o <org> 2>/dev/null
sf data360 dmo list -o <org> 2>/dev/null
sf data360 identity-resolution list -o <org> 2>/dev/null
sf data360 segment list -o <org> 2>/dev/null
sf data360 activation platforms -o <org> 2>/dev/null
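When several command families need checking at once, the same read-only sweep can be scripted. This is a sketch built from the commands above; the org alias is a placeholder.

```shell
# Read-only sweep: report which command families respond, without mutating anything.
ORG=my-org-alias   # assumption: your target org alias
for family in "data-space list" "data-stream list" "dmo list" \
              "identity-resolution list" "segment list" "activation platforms"; do
  if sf data360 $family -o "$ORG" >/dev/null 2>&1; then
    echo "ok:      $family"
  else
    echo "blocked: $family"
  fi
done
```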

4. Localize the phase

Route the task:

  • source/connector issue → Connect
  • ingestion/DLO/stream issue → Prepare
  • mapping/IR/unified profile issue → Harmonize
  • audience or insight issue → Segment
  • downstream push issue → Act
  • SQL/search/index issue → Retrieve
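The routing above can be sketched as a small helper. The keyword-to-skill mapping is illustrative, not an official API:

```shell
# Map an issue keyword to the phase-specific skill (illustrative mapping).
route_phase() {
  case "$1" in
    connector|source)     echo "sf-datacloud-connect" ;;
    ingestion|dlo|stream) echo "sf-datacloud-prepare" ;;
    mapping|ir|profile)   echo "sf-datacloud-harmonize" ;;
    segment|insight)      echo "sf-datacloud-segment" ;;
    activation|action)    echo "sf-datacloud-act" ;;
    sql|search|index)     echo "sf-datacloud-retrieve" ;;
    *)                    echo "sf-datacloud" ;;  # stay at the orchestrator
  esac
}

route_phase dlo      # → sf-datacloud-prepare
route_phase search   # → sf-datacloud-retrieve
```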

5. Choose deterministic artifacts when possible

Prefer JSON definition files and repeatable scripts over one-off manual steps. Generic templates live in:

  • assets/definitions/data-stream.template.json
  • assets/definitions/dmo.template.json
  • assets/definitions/mapping.template.json
  • assets/definitions/relationship.template.json
  • assets/definitions/identity-resolution.template.json
  • assets/definitions/data-graph.template.json
  • assets/definitions/calculated-insight.template.json
  • assets/definitions/segment.template.json
  • assets/definitions/activation-target.template.json
  • assets/definitions/activation.template.json
  • assets/definitions/data-action-target.template.json
  • assets/definitions/data-action.template.json
  • assets/definitions/search-index.template.json

6. Verify after each phase

Typical verification:

  • stream/DLO exists
  • DMO/mapping exists
  • identity resolution run completed
  • unified records or segment counts look correct
  • activation/search index status is healthy
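A simple existence check covers most of these. This is a sketch, assuming the artifact name and org alias shown are placeholders you replace:

```shell
# Post-phase verification sketch: confirm an artifact shows up in the
# relevant list command before moving to the next phase.
ORG=my-org-alias   # assumption: your target org alias
NAME=MySegment     # assumption: the artifact you just created
sf data360 segment list -o "$ORG" 2>/dev/null | grep -q "$NAME" \
  && echo "verified: $NAME" \
  || echo "missing: $NAME"
```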

High-Signal Gotchas

  • connection list requires --connector-type.
  • dmo list --all is useful when you need the full catalog, but first-page dmo list is often enough for readiness checks and much faster.
  • Segment creation may need --api-version 64.0.
  • segment members returns opaque IDs; use SQL joins for human-readable details.
  • sf data360 doctor can fail on partially provisioned orgs even when some read-only commands still work; fall back to targeted smoke checks.
  • query describe errors such as Couldn't find CDP tenant ID or DataModelEntity ... not found are query-plane clues, not automatic proof that the whole product is disabled.
  • Many long-running jobs are asynchronous in practice even when the command returns quickly.
  • Some Data Cloud operations still require UI setup outside the CLI runtime.
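Two of these gotchas translate directly into command form. The flag placements below mirror the notes above, and the segment create subcommand name is assumed from the list/members pattern; verify both against sf data360 <command> --help:

```shell
# connection list requires an explicit connector type.
sf data360 connection list --connector-type <type> -o <org> 2>/dev/null

# Segment creation may need a pinned API version on some orgs
# (subcommand name assumed; remaining create flags omitted here).
sf data360 segment create --api-version 64.0 -o <org>
```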

Output Format

When finishing, report in this order:

  1. Task classification
  2. Runtime status
  3. Readiness classification
  4. Phase(s) involved
  5. Commands or artifacts used
  6. Verification result
  7. Next recommended step

Suggested shape:

Data Cloud task: <setup / inspect / troubleshoot / migrate>
Runtime: <plugin ready / missing / partially verified>
Readiness: <ready / ready_empty / partial / feature_gated / blocked>
Phases: <connect / prepare / harmonize / segment / act / retrieve>
Artifacts: <json files, commands, scripts>
Verification: <passed / partial / blocked>
Next step: <next phase, setup guidance, or cross-skill handoff>

Cross-Skill Integration

| Need | Delegate to | Reason |
| --- | --- | --- |
| load or clean CRM source data | sf-data | seed or fix source records before ingestion |
| create missing CRM schema | sf-metadata | Data Cloud expects existing objects/fields |
| deploy permissions or bundles | sf-deploy | environment preparation |
| write Apex against Data Cloud outputs | sf-apex | code implementation |
| Flow automation after segmentation/activation | sf-flow | declarative orchestration |
| session tracing / STDM / parquet analysis | sf-ai-agentforce-observability | different Data Cloud use case |

Reference Map

Start here

Phase skills

Deterministic helpers
