pkm-retrieval
Use this skill when the user wants information from an avatar's personal knowledge base (PKM), such as remembered notes, uploaded files, or avatar-specific reference material.
Configuration
PKM credentials are stored in:
- skills/pkm-retrieval/config.json: machine-readable config (base_url, dataset_id, api_key)
- TOOLS.md: human-readable notes with curl examples
Load the config file when making API calls. Do not hardcode secrets.
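A minimal sketch of loading the config at call time, assuming config.json is a flat JSON object with the three keys listed above. The helper name load_pkm_config and the validation rule are illustrative and not part of the skill.

```python
import json
from pathlib import Path

# Path taken from the skill text; adjust if the skill is installed elsewhere.
CONFIG_PATH = Path("skills/pkm-retrieval/config.json")


def load_pkm_config(path=CONFIG_PATH):
    """Read PKM settings from disk so secrets never live in source code."""
    with open(path, encoding="utf-8") as fh:
        config = json.load(fh)
    # base_url and api_key are needed for every call. dataset_id is treated as
    # optional here (an assumption) because the user may supply it per request.
    missing = [key for key in ("base_url", "api_key") if not config.get(key)]
    if missing:
        raise ValueError(f"config.json is missing: {', '.join(missing)}")
    return config
```

Failing early on a missing key keeps a half-configured skill from sending unauthenticated requests.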
When to use
- The user asks to search an avatar's personal knowledge base.
- The task depends on content stored in a customer-managed PKM dataset.
- The user already has a dataset_id or can provide one.
Do not use
- General web or product knowledge questions.
- Cases where only avatar_id is available and the user has not provided a dataset_id.
- Tasks that should go through the Avatar chat workflow instead of direct dataset retrieval.
Required input
- dataset_id
- Retrieval query
If dataset_id is missing
- Ask the user for the avatar's PKM dataset_id.
- Explain that this customer-facing interface uses a client-maintained dataset_id.
- Do not assume or derive dataset_id from avatar_id unless the user explicitly asks for the internal admin flow.
Question template
Please provide the PKM dataset_id for this avatar so I can run the retrieval.
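The decision rules for a missing dataset_id can be sketched as a small function. The name next_action, its return shape, and the internal_admin_flow flag are hypothetical, introduced only to illustrate the rules.

```python
QUESTION_TEMPLATE = (
    "Please provide the PKM dataset_id for this avatar so I can run the retrieval."
)


def next_action(dataset_id=None, avatar_id=None, internal_admin_flow=False):
    """Return ("retrieve", dataset_id), ("lookup", avatar_id) or ("ask", message)."""
    if dataset_id and dataset_id.strip():
        return ("retrieve", dataset_id.strip())
    # Only derive dataset_id from avatar_id when the user explicitly asked
    # for the internal admin flow.
    if avatar_id and internal_admin_flow:
        return ("lookup", avatar_id)
    # Default customer-facing path: pause and ask.
    return ("ask", QUESTION_TEMPLATE)
```

With only an avatar_id and no explicit admin request, the function falls through to the question template rather than guessing a dataset.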
API behavior
- Endpoint: POST /v1/datasets/{dataset_id}/retrieve
- In this environment, the UI option Vector Search corresponds to the API setting search_method: "semantic_search".
- Authentication must be configured outside this skill. Never hardcode API keys or secrets in the skill instructions.
- Load credentials from skills/pkm-retrieval/config.json when making requests.
- There is an internal two-step route, avatar_id -> pkb_{avatar_id} -> /v1/datasets lookup -> dataset_id -> /retrieve, but it should not be the default customer-facing path.
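As a rough illustration of the endpoint shape, the URL can be assembled from base_url and dataset_id. This sketch assumes base_url does not already end in /v1. The helper name and the mapping table are hypothetical, and only the Vector Search entry comes from this document.

```python
from urllib.parse import quote

# UI label -> API search_method. Only this one pair is stated in the skill text.
UI_TO_API_SEARCH_METHOD = {"Vector Search": "semantic_search"}


def retrieve_url(base_url, dataset_id):
    """Build the POST target for /v1/datasets/{dataset_id}/retrieve."""
    # Percent-encode the id so an unexpected character cannot alter the path.
    # Assumption: base_url has no trailing /v1 segment of its own.
    return f"{base_url.rstrip('/')}/v1/datasets/{quote(dataset_id, safe='')}/retrieve"
```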
Executing a Retrieval Query
When the user wants to search the PKM and has already provided a dataset_id:
- Read skills/pkm-retrieval/config.json to get base_url and api_key.
- Construct the request from the user's query and the request body template below.
- Call the endpoint and return a concise, summarized answer.
- Include source document names when helpful.
Request body template
{
  "query": "<user query>",
  "retrieval_model": {
    "search_method": "semantic_search",
    "reranking_enable": false,
    "reranking_mode": "weighted_score",
    "reranking_model": {
      "reranking_provider_name": "langgenius/xinference/xinference",
      "reranking_model_name": "bge-reranker-large"
    },
    "weights": {
      "weight_type": "customized",
      "keyword_setting": {
        "keyword_weight": 0.5
      },
      "vector_setting": {
        "vector_weight": 0.5,
        "embedding_model_name": "jina-embeddings-v3",
        "embedding_provider_name": "langgenius/xinference/xinference"
      }
    },
    "top_k": 20,
    "score_threshold_enabled": true,
    "score_threshold": 0.3
  }
}
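One way to turn the template into a request, sketched with the Python standard library. The function name is hypothetical, body_template is the JSON above parsed into a dict, and Bearer authentication is an assumption that should be checked against TOOLS.md.

```python
import copy
import json
import urllib.request


def build_retrieve_request(base_url, api_key, dataset_id, query, body_template):
    """Prepare (but do not send) the POST request for a PKM retrieval."""
    body = copy.deepcopy(body_template)  # never mutate the shared template
    body["query"] = query
    url = f"{base_url.rstrip('/')}/v1/datasets/{dataset_id}/retrieve"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumed auth scheme; the skill text does not state it.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending is left to the caller, for example:
# with urllib.request.urlopen(request, timeout=30) as resp:
#     result = json.load(resp)
```

Building the request separately from sending it makes the payload easy to inspect before any network call is made.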
How to interpret results
- Prefer the highest-score chunk first.
- If one chunk is a clear exact match and the rest are noisy PDF or image-preview fragments, answer from the best chunk and ignore the noisy tail.
- Summarize relevant content for the user instead of dumping raw sign_content or long image-preview markup.
- Mention source document names when helpful.
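The ranking guidance above could look roughly like this. The response shape is an assumption: each record is taken to carry a score plus a segment with content and a document name, so verify these field names against a real response before relying on them.

```python
def best_chunks(records, min_score=0.3, max_chunks=3):
    """Return the strongest chunks as (score, document_name, text), best first."""
    ranked = sorted(records, key=lambda r: r.get("score", 0.0), reverse=True)
    picked = []
    for record in ranked[:max_chunks]:
        score = record.get("score", 0.0)
        if score < min_score:
            break  # everything after this point is the noisy tail
        segment = record.get("segment", {})
        name = segment.get("document", {}).get("name", "unknown document")
        # Use the plain content field rather than raw sign_content markup.
        picked.append((score, name, segment.get("content", "").strip()))
    return picked
```

The returned tuples hold what a concise answer needs: the text to summarize and the document name for an optional source line.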
Response style
- Return a concise answer first.
- If needed, add a short source line with the document name.
- If nothing relevant is found, say so clearly.
OpenClaw behavior
- If the user asks to use PKM but has not given a dataset_id, pause and ask for it before calling the endpoint.
- If the user only gives avatar_id, explain that customer-facing PKM retrieval expects a stored dataset_id, and ask whether they want to provide it.