# Dify Knowledge Base Upload
## Use this skill when
- The target Dify knowledge base is backed by a published pipeline datasource.
- UI uploads succeed but `document/create-by-file` or `create-by-text` yields `completed` with `0` chunks.
- You need to upload one local file and optionally write existing metadata fields afterward.
## Prepare caller-side inputs
- Provide a local file path to upload.
- For pure text, save it as `.txt` or `.md` first and upload that file.
- Set caller env values: `DIFY_API_BASE_URL`, `DIFY_DATASET_ID`, `DIFY_API_KEY`.
- Optionally provide pipeline `inputs` JSON if the published pipeline exposes user input variables.
- Optionally provide metadata JSON:
  - a flat object mapping existing Dify metadata field names to values, or
  - a list of `{ "name": ..., "value": ... }` / `{ "id": ..., "value": ... }` items.
- The skill does not hardcode your metadata schema. Pass the current field names or ids in the JSON you provide at runtime.
- Read `references/env.md` only if you need env or debug details.
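The two accepted metadata JSON shapes above can be normalized into the single list form before any API call. A minimal sketch, assuming those shapes as described; the function name is hypothetical and not taken from `scripts/upload_to_dataset.py`:

```python
def normalize_metadata(raw):
    """Accept either a flat {field_name: value} object or a list of
    {"name"/"id": ..., "value": ...} items; return the list form."""
    if isinstance(raw, dict):
        # Flat object: every key is assumed to be an existing Dify field name.
        return [{"name": k, "value": v} for k, v in raw.items()]
    if isinstance(raw, list):
        for item in raw:
            # Each item must carry a value plus either a field name or id.
            if not isinstance(item, dict) or "value" not in item or not ({"name", "id"} & item.keys()):
                raise ValueError(f"bad metadata item: {item!r}")
        return raw
    raise TypeError("metadata JSON must be an object or a list")
```

Failing fast here mirrors the skill's behavior of rejecting malformed metadata before any write request is sent.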
## Run workflow
- Use `scripts/upload_to_dataset.py` to:
  - discover the published datasource plugin from `${DIFY_API_BASE_URL}/datasets/${DIFY_DATASET_ID}/pipeline/datasource-plugins?is_published=true`
  - upload the file to `${DIFY_API_BASE_URL}/datasets/pipeline/file-upload`
  - run the published pipeline through `${DIFY_API_BASE_URL}/datasets/${DIFY_DATASET_ID}/pipeline/run`
  - poll `${DIFY_API_BASE_URL}/datasets/${DIFY_DATASET_ID}/documents/{batch}/indexing-status`
  - fetch `${DIFY_API_BASE_URL}/datasets/${DIFY_DATASET_ID}/documents/{document_id}?metadata=without`
  - optionally resolve metadata field names through `${DIFY_API_BASE_URL}/datasets/${DIFY_DATASET_ID}/metadata`
  - optionally update metadata through `${DIFY_API_BASE_URL}/datasets/${DIFY_DATASET_ID}/documents/metadata`
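The endpoint sequence above can be collected into a small URL builder. This is a sketch assuming only the paths listed in this document; the helper names are hypothetical and not part of the script:

```python
import os

def dify_endpoints(base_url=None, dataset_id=None):
    """Build the workflow's endpoint URLs, reading env vars by default."""
    base = (base_url or os.environ["DIFY_API_BASE_URL"]).rstrip("/")
    ds = dataset_id or os.environ["DIFY_DATASET_ID"]
    return {
        "discover": f"{base}/datasets/{ds}/pipeline/datasource-plugins?is_published=true",
        "upload": f"{base}/datasets/pipeline/file-upload",
        "run": f"{base}/datasets/{ds}/pipeline/run",
        # {batch} and {document_id} are filled in after the pipeline run returns.
        "indexing_status": f"{base}/datasets/{ds}/documents/{{batch}}/indexing-status",
        "document": f"{base}/datasets/{ds}/documents/{{document_id}}?metadata=without",
        "metadata_fields": f"{base}/datasets/{ds}/metadata",
        "metadata_update": f"{base}/datasets/{ds}/documents/metadata",
    }

def auth_headers(api_key=None):
    """Every call carries a Bearer token from DIFY_API_KEY."""
    return {"Authorization": f"Bearer {api_key or os.environ['DIFY_API_KEY']}"}
```

Note that the `file-upload` endpoint is dataset-independent, while every other call is scoped to `DIFY_DATASET_ID`.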
- Upload without metadata:
  `python3 scripts/upload_to_dataset.py --file /path/to/document.pdf`
- Upload with pipeline inputs:
  `python3 scripts/upload_to_dataset.py --file /path/to/document.pdf --inputs-json assets/example-pipeline-inputs.json`
- Upload and then apply metadata:
  `python3 scripts/upload_to_dataset.py --file /path/to/document.pdf --metadata-json assets/example-metadata.json`
## Interpret response
- Success prints JSON with `batch`, `document_id`, `file_upload_response`, `pipeline_run_response`, `indexing_status_response`, `document_response`, optional `metadata_response`, and `validation`. `validation.ok` is only true when:
  - `indexing_status = "completed"`
  - `total_segments > 0`
  - `segment_count > 0`
- If `tokens` is returned by the document detail API, it should also be greater than `0`. Some pipeline deployments leave `tokens = null`; the script reports that as a warning instead of a hard failure.
- The script exits non-zero if the required checks fail.
- If `--metadata-json` is omitted, the script uploads the file only and skips metadata API calls.
- Metadata keys must already exist in the target dataset. Unknown names fail fast before the metadata write request.
- `assets/example-metadata.json` is only a template. Replace its keys with your own existing Dify metadata field names before live use.
- Use `--dry-run` to validate local files and request shape without calling Dify.
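The success criteria above can be restated as one predicate. A minimal sketch of the same checks (not the script's exact code), including the `tokens = null` warning case:

```python
def validate_indexing(indexing_status, total_segments, segment_count, tokens=None):
    """ok is True only for a completed run with real segments; a null tokens
    value is downgraded to a warning, matching the skill's behavior."""
    warnings = []
    ok = (indexing_status == "completed"
          and (total_segments or 0) > 0
          and (segment_count or 0) > 0)
    if tokens is None:
        warnings.append("tokens is null; some pipeline deployments never set it")
    elif tokens <= 0:
        # tokens was reported but is zero: treat as a hard failure.
        ok = False
    return {"ok": ok, "warnings": warnings}
```

A `completed` status with zero segments is exactly the silent-failure mode this skill exists to catch, so it fails validation rather than passing on status alone.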
## Troubleshoot quickly
- If auth fails, verify `DIFY_API_KEY` and the `Authorization` header format.
- If datasource discovery fails, verify the dataset has a published pipeline datasource and the key belongs to that dataset.
- If the pipeline run fails with missing input errors, inspect `user_input_variables` from `discovery_response` and pass the missing keys in `--inputs-json`.
- If the published datasource is not `local_file`, this skill is not the right uploader.
- If metadata resolution fails, rename keys in the metadata JSON to the exact field names defined in Dify.
- If you only have raw text, save it to `.txt` or `.md` and upload that file.
- In the current tested deployment, `.md` uploaded successfully while `.txt` hit a Dify-side indexing `400`. If plain text fails as `.txt`, retry as `.md`.
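The `.txt`-to-`.md` retry can be automated with a small copy helper. A sketch under the assumption above (the helper name is hypothetical, not part of the script):

```python
import shutil
from pathlib import Path

def as_markdown_copy(txt_path):
    """Copy a .txt file to a sibling .md file and return the new path,
    so a failed .txt upload can be retried as .md without touching the original."""
    src = Path(txt_path)
    dst = src.with_suffix(".md")
    shutil.copyfile(src, dst)
    return dst
```

Copying rather than renaming keeps the original `.txt` intact in case the `.md` retry also needs debugging.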
## References
- `references/env.md`
- `references/request-response.md`
- `references/testing.md`

## Assets
- `assets/example-pipeline-inputs.json`
- `assets/example-metadata.json`
Repository: `tiangong-ai/skills`