RAGFlow Skill
Use this skill to operate RAGFlow through `scripts/ragflow.js`. The CLI wraps the full v0.25.6 REST API; every action goes through `node {baseDir}/scripts/ragflow.js <command> [options]`. Prefer `--json` on any command when the output will be parsed or chained into another step.
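As a rough illustration of the invocation pattern above, the following sketch assembles the argument list for one CLI call and appends `--json` by default. The helper name `buildRagflowArgs` and the `/path/to/skill` base directory are hypothetical and not part of the skill.

```javascript
// Sketch: build the argv for one CLI call. `baseDir` is whatever the skill
// runtime substitutes for {baseDir}; the helper name is hypothetical.
function buildRagflowArgs(baseDir, command, options = [], { json = true } = {}) {
  const script = `${baseDir}/scripts/ragflow.js`;
  const args = [script, command, ...options];
  // Ask for machine-readable output when the result will be parsed or chained.
  if (json && !args.includes("--json")) args.push("--json");
  return args;
}

const args = buildRagflowArgs("/path/to/skill", "list-datasets");
console.log(["node", ...args].join(" "));
// prints: node /path/to/skill/scripts/ragflow.js list-datasets --json
```

Passing the array to something like Node's `child_process.execFileSync("node", args)` avoids shell quoting problems with values such as `--question "What is X?"`.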
Requirements
- Set `RAGFLOW_URL` and `RAGFLOW_API_KEY` in the environment or this skill's `.env`.
- Use Node.js to run bundled scripts.
- Tune chunk deletion retries only when needed with `RAGFLOW_DELETE_CHUNK_RETRIES` and `RAGFLOW_DELETE_CHUNK_RETRY_DELAY_MS`.
- Tune the chunk deletion diagnostic script only when needed with `RAGFLOW_REPRO_TIMEOUT_MS`, `RAGFLOW_REPRO_DELETE_RETRIES`, `RAGFLOW_REPRO_DELETE_RETRY_DELAY_MS`, and `RAGFLOW_REPRO_EMBEDDING_MODEL`.
Security Notes
- Use HTTPS in production. Production deployments should use `https://` for `RAGFLOW_URL` to protect the API key in transit. Local development (`http://localhost`) is acceptable for testing.
- Use least-privilege API keys. Consider creating dedicated API keys with minimal permissions for specific workflows rather than using admin-level keys.
- Protect your API key. Never share `RAGFLOW_API_KEY` in chat messages or commit it to version control. Use environment variables or the skill's `.env` file.
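The HTTPS guidance above can be checked before any request is sent. This is a minimal sketch, not something the CLI is known to do; the function name and the list of hostnames treated as local are assumptions.

```javascript
// Sketch: flag a RAGFLOW_URL that would send the API key over plain HTTP.
// The function name and the set of "local" hostnames are assumptions.
function checkRagflowUrl(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return { ok: false, reason: "RAGFLOW_URL is not a valid URL" };
  }
  if (url.protocol === "https:") return { ok: true, reason: "HTTPS" };
  const localHosts = ["localhost", "127.0.0.1"];
  if (url.protocol === "http:" && localHosts.includes(url.hostname)) {
    return { ok: true, reason: "plain HTTP to a local host (testing only)" };
  }
  return { ok: false, reason: "use https:// outside local development" };
}

console.log(checkRagflowUrl(process.env.RAGFLOW_URL ?? "http://localhost"));
```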
Quick Command Reference
| Scenario | Commands |
|----------|----------|
| Knowledge base setup | create-dataset, list-datasets, get-dataset, update-dataset, delete-datasets |
| Document ingestion | upload-documents, list-documents, get-document, update-document, delete-documents, download-document, preview-document, metadata-summary |
| Parsing & chunking | start-parsing, stop-parsing, wait-parsing, list-chunks, add-chunk, update-chunk, delete-chunks |
| Direct retrieval | retrieve |
| Chat assistant | create-chat, list-chats, get-chat, update-chat, patch-chat, delete-chats |
| Chat sessions | create-session, list-sessions, delete-sessions, chat, chat-session |
| Agent | create-agent, list-agents, get-agent, update-agent, delete-agents |
| Agent tags | list-agent-tags, update-agent-tags |
| Agent sessions | create-agent-session, list-agent-sessions, delete-agent-sessions, agent-chat |
| Connector | list-connectors, create-connector, get-connector, update-connector, delete-connector |
| RAPTOR | run-raptor, trace-raptor |
| Embedded website access | list-system-tokens, create-system-token, delete-system-token, embed-code, embed-info, embed-chat, embed-agent-chat |
| Model discovery | list-models |
| System | system-version, get-log-levels, set-log-level |
Common Workflows
Full RAG pipeline (upload -> parse -> retrieve)
1. `create-dataset --name "My KB" --chunk-method naive`
2. `upload-documents --dataset <id> --files ./doc1.pdf ./doc2.txt`
3. `start-parsing --dataset <id> --doc-ids <doc_id1> <doc_id2>`
4. `wait-parsing --dataset <id> --doc-ids <doc_id1> <doc_id2>`
5. `retrieve --question "What is X?" --datasets <id>`
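When the pipeline is scripted, the order of the steps matters more than the individual flags. The sketch below lists the argument vectors for the steps that follow `create-dataset`, assuming the dataset ID and the uploaded document IDs have already been read from earlier `--json` results. The helper name is hypothetical and the placeholder IDs are not real values.

```javascript
// Sketch: argv lists for the steps after create-dataset, once the dataset ID
// and the uploaded document IDs are known. The helper name is hypothetical.
function ragPipelineSteps({ datasetId, files, docIds, question }) {
  return [
    ["upload-documents", "--dataset", datasetId, "--files", ...files],
    // Parsing is a separate, explicit step after upload.
    ["start-parsing", "--dataset", datasetId, "--doc-ids", ...docIds],
    ["wait-parsing", "--dataset", datasetId, "--doc-ids", ...docIds],
    ["retrieve", "--question", question, "--datasets", datasetId],
  ];
}

const steps = ragPipelineSteps({
  datasetId: "<id>",
  files: ["./doc1.pdf", "./doc2.txt"],
  docIds: ["<doc_id1>", "<doc_id2>"],
  question: "What is X?",
});
for (const step of steps) console.log(step.join(" "));
```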
Chat assistant with sessions
1. `create-chat --name "Q&A" --datasets <id> --llm-id qwen-turbo@Tongyi-Qianwen`
2. `create-session --chat <chat_id>`
3. `chat-session --chat <chat_id> --session <session_id> --question "Hello"`
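The `--llm-id` value in this workflow follows the `model@provider` identifier format. A small shape check like the one below can catch a bare numeric row ID before the request is sent. It is a sketch with a hypothetical helper name; splitting on the last `@` is an assumption, and the check does not confirm that the model exists (use `list-models` for that).

```javascript
// Sketch: split a tenant model identifier of the form model@provider.
// Hypothetical helper; it only checks the shape, not whether the model exists.
function parseModelId(id) {
  const at = id.lastIndexOf("@");
  if (at <= 0 || at === id.length - 1) {
    throw new Error(`expected model@provider, got "${id}"`);
  }
  return { model: id.slice(0, at), provider: id.slice(at + 1) };
}

console.log(parseModelId("qwen-turbo@Tongyi-Qianwen"));
// { model: 'qwen-turbo', provider: 'Tongyi-Qianwen' }
```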
Agent workflow
1. `create-agent --title "Assistant" --dsl @agent_dsl.json`
2. `create-agent-session --agent <agent_id>`
3. `agent-chat --agent <agent_id> --session <session_id> --question "Hello"`
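Before passing a hand-authored DSL file to `create-agent`, a quick structural check can save a round trip. The sketch below uses the top-level keys this skill lists under Key Constraints. The helper name is hypothetical, and it treats every graph node as component-backed, which may be stricter than the server requires.

```javascript
// Sketch: report what a hand-authored agent DSL is missing before create-agent.
// The key list comes from this skill's constraints; the helper is hypothetical.
const REQUIRED_DSL_KEYS = [
  "components", "history", "path", "retrieval", "variables", "globals", "graph",
];

function dslProblems(dsl) {
  const problems = REQUIRED_DSL_KEYS
    .filter((key) => !(key in dsl))
    .map((key) => `missing top-level key: ${key}`);
  // Assumption: every node is component-backed and therefore needs data.name.
  const nodes = dsl.graph?.nodes ?? [];
  nodes.forEach((node, i) => {
    if (!node.data?.name) problems.push(`graph.nodes[${i}].data.name is missing`);
  });
  return problems;
}

console.log(dslProblems({ components: {}, graph: { nodes: [{ data: {} }] } }));
```

An empty array means the basic shape is present; `references/AGENT_GUIDE.md` remains the authority on the schema itself.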
Agent tags workflow
1. `list-agent-tags --agent <agent_id>`
2. `update-agent-tags --agent <agent_id> --tags "Tag1,Tag2"`
Connector workflow
1. `create-connector --name "GitHub" --type github --token <token>`
2. `list-connectors`
3. `get-connector --id <id>`
RAPTOR workflow
1. `run-raptor --dataset <id> --method raptor`
2. `trace-raptor --id <id>`

`agent-chat` is streaming by default. Use `--stream false` when you need the final JSON result in one response.
Embedded website access
1. `embed-code --chat <chat_id> --type fullscreen` or `embed-code --agent <agent_id> --type widget`
2. `embed-info --chat <chat_id>` or `embed-info --agent <agent_id>`
3. `embed-chat --chat <chat_id> --question "Hello"` or `embed-agent-chat --agent <agent_id> --question "Hello"`
`embed-chat` automatically creates the embedded chatbot session when `--session` is omitted. RAGFlow's shared-site route only creates a session and returns the prologue on the first no-session request, so the CLI bootstraps `session_id` first and then sends the real question.
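The two-step bootstrap described above can be pictured roughly as follows. This is a sketch of the control flow only: `send` is a hypothetical stand-in for the shared-site completion request, and the `question` and `answer` field names are assumptions, not the documented payload.

```javascript
// Sketch of the two-step bootstrap. `send` is a hypothetical stand-in for the
// shared-site completion request; field names other than session_id are assumed.
async function embedChat(send, question, sessionId) {
  if (!sessionId) {
    // First no-session request: the server creates a session and returns the prologue.
    const first = await send({});
    sessionId = first.session_id;
  }
  // The follow-up request carries the real question in that session.
  const reply = await send({ session_id: sessionId, question });
  return { sessionId, reply };
}

// Fake transport, for illustration only.
const fakeSend = async (payload) =>
  payload.session_id
    ? { session_id: payload.session_id, answer: `echo: ${payload.question}` }
    : { session_id: "s-1", answer: "prologue" };

embedChat(fakeSend, "Hello").then((r) => console.log(r.sessionId, r.reply.answer));
// prints: s-1 echo: Hello
```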
Workflow Decision Guide
The first step in any RAGFlow operation is resolving the target resource ID. After that, choose the right path:
- Authoring or debugging a custom agent DSL? -> Read `references/AGENT_GUIDE.md`. It is a self-contained guide to the current RAGFlow agent DSL schema and includes minimal examples.
- Need CLI syntax or option details? -> Read `references/COMMANDS.md`. It is organized by workflow scenario with full option tables.
- Editing client code or checking request/response shapes? -> Read `references/API.md`. It has code examples for every `RagflowClient` method.
- A command failed? -> Read `references/TROUBLESHOOTING.md` for common errors with causes and fixes.
- Formatting output for the user? -> Read `references/REFERENCE.md` for consistent response templates and status labels.
Key Constraints
- Destructive deletes need confirmation. RAGFlow deletes are immediate and irreversible. Confirm before running `delete-datasets`, `delete-documents`, `delete-chunks`, `delete-chats`, `delete-sessions`, or `delete-agents`, unless the resource is a temporary artifact you created in the same workflow and the user asked you to clean up.
- Upload and parsing are separate steps. RAGFlow does not auto-parse on upload because different documents may need different chunk methods. Upload first, adjust config if needed, then start parsing explicitly.
- Preserve user-uploaded filenames. RAGFlow stores the multipart `filename` as the document name. If a user attachment is materialized as a task ID or temporary path, pass the original filename inline: `upload-documents --files <original-name>=<path>`.
- Use v0.25.6 route shapes from the references. The reference docs match the current skill.
- Tenant model identifiers use the `model@provider` format. When creating datasets with `--embedding-model` or chat assistants with `--llm-id`, the server expects the full identifier, for example `text-embedding-v4@Tongyi-Qianwen` or `qwen-turbo@Tongyi-Qianwen`, not a numeric model row ID. Use `list-models` to discover model names and providers.
- Chat sessions use the v0.25.6 route. `chat-session` posts to `/api/v1/chat/completions` with `chat_id` and `session_id` in the body.
- Chat session history behavior changed in v0.25.6. By default, `POST /api/v1/chat/completions` now appends only the latest message to stored history. Use `--pass-all-history` or set `pass_all_history_messages: true` in the API payload to replace the entire history. `conversation_id` is accepted as an alias for `session_id`.
- Embedded access uses beta tokens and embedded sessions. `embed-code`, `embed-info`, `embed-chat`, and `embed-agent-chat` use the shared-site `/api/v1/chatbots/*` or `/api/v1/agentbots/*` routes. If `--beta` is not supplied, the CLI reuses the first `/api/v1/system/tokens` item with `beta` or creates one. For chatbot completions, the CLI auto-bootstraps `session_id` unless `--session` is supplied.
- Treat embed auth material as sensitive output. System tokens, `beta` values, and embed URLs or iframe HTML containing `auth=` are operational secrets. Use them when needed for the task, but do not print the full values back to the user unless the user explicitly asks for them.
- Embed URL generation assumes a public RAGFlow origin. `embed-code` uses `--origin` when supplied; otherwise it falls back to `RAGFLOW_URL`. When the API base URL and the public web origin differ, pass `--origin` explicitly so the generated iframe points at the actual shared-site page.
- Prefer the current Agent DSL schema from `AGENT_GUIDE.md`. In practice, hand-authored agents should include `components`, `history`, `path`, `retrieval`, `variables`, `globals`, and `graph`, plus `graph.nodes[].data.name` for every component-backed node.
- Agent tags must be comma-separated strings. When updating agent tags, pass them as a single string of comma-separated values.
- Connectors require valid auth tokens. Ensure the target service token is valid before creating a connector.
- Agent chat uses the v0.25.6 route. `agent-chat` posts to `/api/v1/agents/chat/completions` with `agent_id` in the body.
- Iteration agents should iterate over a real list output. When an upstream `Agent` produces loop items, prefer an object-shaped structured output such as `{"items":[...]}` and point `Iteration.params.items_ref` at `agent:0@structured.items`. Start from `references/examples/agents/04-iteration-agent.json`.
- Chunk deletion may need retries. Some servers can return `rm_chunk deleted chunks 0, expect N` due to document-store refresh lag even when the chunk exists. The CLI handles this automatically: it retries after confirming the chunk is still visible via exact ID lookup. If retries still fail, run `scripts/repro-delete-chunks.js` for a clean diagnosis.
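The chunk deletion retry described in the last constraint can be sketched as a loop that retries only while an exact ID lookup still finds the chunk. This is an illustration of the idea, not the CLI's actual code: `deleteChunk` and `chunkVisible` are hypothetical callbacks, and the fallback values of 3 retries and 1000 ms are assumptions rather than documented defaults.

```javascript
// Sketch of the retry idea. deleteChunk and chunkVisible are hypothetical
// callbacks; the fallbacks (3 retries, 1000 ms) are assumptions, not the
// CLI's documented defaults.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function deleteChunkWithRetry(deleteChunk, chunkVisible, chunkId, opts = {}) {
  const retries = opts.retries ?? Number(process.env.RAGFLOW_DELETE_CHUNK_RETRIES ?? 3);
  const delayMs = opts.delayMs ?? Number(process.env.RAGFLOW_DELETE_CHUNK_RETRY_DELAY_MS ?? 1000);
  for (let attempt = 0; attempt <= retries; attempt++) {
    const deleted = await deleteChunk(chunkId); // number of chunks removed
    if (deleted > 0) return { deleted: true, attempts: attempt + 1 };
    // Zero deleted: retry only if an exact ID lookup still finds the chunk.
    if (!(await chunkVisible(chunkId))) return { deleted: false, attempts: attempt + 1 };
    if (attempt < retries) await sleep(delayMs);
  }
  throw new Error(`chunk ${chunkId} still present after ${retries} retries`);
}
```

Checking visibility before each retry separates refresh lag (the chunk exists but the delete reported zero) from a chunk that is simply gone, in which case retrying would never succeed.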
Output Format
When presenting results to the user, follow the templates in references/REFERENCE.md. Key conventions:
- Use a two-layer output model. For execution, chaining, and parsing, prefer the CLI's raw `--json` output. For the final user-facing response, convert that raw result into a concise summary that follows the reference templates instead of pasting the CLI payload verbatim.
- 3+ items with attributes -> Table, abbreviating long IDs
- Sequential steps -> Numbered list
- Parsing status -> Use labels: `UNSTART`, `RUNNING`, `CANCEL`, `DONE`, `FAIL`
- Search results -> Table with similarity scores, content as quote blocks
- Embed/token operations -> Summarize what was generated or fetched; redact `token`, `beta`, and any `auth=` query value unless the user explicitly asks for the secret
- Errors -> Show code and human-readable message
For embed and token-related commands, apply these response rules:
- Use the CLI result internally, but do not mirror the raw JSON back to the user by default.
- Lead with the operational outcome: what resource was targeted, what mode was used, whether a token was reused or created, and whether a session was created or reused.
- Only include the minimum secret material needed to complete the user's request. If the user did not explicitly ask for the value, redact it.
- If the user needs copy-paste embed material, provide it only when explicitly requested and call out that it contains sensitive auth data.
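One way to apply the redaction rules above is to scrub a parsed CLI result before summarizing it. The sketch below is illustrative: the helper names, the `[REDACTED]` marker, and the example URL are assumptions, and real payloads may carry secrets under other keys.

```javascript
// Sketch: redact embed auth material before echoing CLI output to the user.
// Key names follow the conventions above; the helper names are hypothetical.
const SECRET_KEYS = new Set(["token", "beta"]);

function redactText(text) {
  // Replace the value of any auth= query parameter (URLs, iframe HTML).
  return text.replace(/(auth=)[^&"'\s<>]+/g, "$1[REDACTED]");
}

function redact(value, key = "") {
  if (SECRET_KEYS.has(key) && value != null) return "[REDACTED]";
  if (typeof value === "string") return redactText(value);
  if (Array.isArray(value)) return value.map((v) => redact(v));
  if (value && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) => [k, redact(v, k)]),
    );
  }
  return value;
}

// The URL here is a made-up example, not a real RAGFlow route.
console.log(redact({ beta: "abc123", url: "https://example.com/page?auth=abc123&x=1" }));
// { beta: '[REDACTED]', url: 'https://example.com/page?auth=[REDACTED]&x=1' }
```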