pkm-retrieval

Use this skill when the user wants information from an avatar's personal knowledge base (PKM), such as remembered notes, uploaded files, or avatar-specific reference material.

Configuration

PKM credentials are stored in:

skills/pkm-retrieval/config.json — machine-readable config (base_url, dataset_id, api_key)
TOOLS.md — human-readable notes with curl examples

Load the config file when making API calls. Do not hardcode secrets.
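
A minimal sketch of loading the config, assuming the file is a flat JSON object with the keys listed above. The function name load_pkm_config and the choice to require only base_url and api_key (leaving dataset_id optional, since the user normally supplies it) are illustrative assumptions, not part of the skill.

```python
import json
from pathlib import Path

# Keys every request needs; dataset_id may come from the user instead.
REQUIRED_KEYS = ("base_url", "api_key")


def load_pkm_config(path="skills/pkm-retrieval/config.json"):
    """Read the PKM config file and fail early if a required key is missing."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError(f"config is missing: {', '.join(missing)}")
    return config
```

Failing early on a missing key keeps an unauthenticated request from ever being sent.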

When to use

The user asks to search an avatar's personal knowledge base.
The task depends on content stored in a customer-managed PKM dataset.
The user already has a dataset_id or can provide one.

Do not use

General web or product knowledge questions.
Cases where only avatar_id is available and the user has not provided a dataset_id.
Tasks that should go through the Avatar chat workflow instead of direct dataset retrieval.

Required input

dataset_id
Retrieval query

If dataset_id is missing

Ask the user for the avatar's PKM dataset_id.
Explain that this customer-facing interface uses a client-maintained dataset_id.
Do not assume or derive dataset_id from avatar_id unless the user explicitly asks for the internal admin flow.

Question template

Please provide the PKM dataset_id for this avatar so I can run the retrieval.

API behavior

Endpoint: POST /v1/datasets/{dataset_id}/retrieve
In this environment, the "Vector Search" option in the UI corresponds to search_method: "semantic_search" in the API.
Never hardcode API keys or secrets in the skill instructions; credentials are configured separately.
Load credentials from skills/pkm-retrieval/config.json when making requests.
An internal two-step route also exists: map avatar_id to pkb_{avatar_id}, look that up via /v1/datasets to get the dataset_id, then call /retrieve. Do not use it as the default customer-facing path.

Executing a Retrieval Query

When the user wants to search the PKM and has already provided a dataset_id:

Read skills/pkm-retrieval/config.json to get base_url and api_key.
Construct the request with the user's query and the request body template below.
Call the endpoint and return a concise, summarized answer.
Include source document names when helpful.

Request body template

{
  "query": "<user query>",
  "retrieval_model": {
    "search_method": "semantic_search",
    "reranking_enable": false,
    "reranking_mode": "weighted_score",
    "reranking_model": {
      "reranking_provider_name": "langgenius/xinference/xinference",
      "reranking_model_name": "bge-reranker-large"
    },
    "weights": {
      "weight_type": "customized",
      "keyword_setting": {
        "keyword_weight": 0.5
      },
      "vector_setting": {
        "vector_weight": 0.5,
        "embedding_model_name": "jina-embeddings-v3",
        "embedding_provider_name": "langgenius/xinference/xinference"
      }
    },
    "top_k": 20,
    "score_threshold_enabled": true,
    "score_threshold": 0.3
  }
}
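
The steps above might be wired together roughly as follows, using only the Python standard library. This is a sketch under stated assumptions: the Bearer authorization header and a base_url that does not already end in /v1 are guesses about the deployment, and the function names are hypothetical. The request body mirrors the template above.

```python
import json
import urllib.request
from urllib.parse import quote

# Retrieval settings copied from the request body template.
RETRIEVAL_MODEL = {
    "search_method": "semantic_search",
    "reranking_enable": False,
    "reranking_mode": "weighted_score",
    "reranking_model": {
        "reranking_provider_name": "langgenius/xinference/xinference",
        "reranking_model_name": "bge-reranker-large",
    },
    "weights": {
        "weight_type": "customized",
        "keyword_setting": {"keyword_weight": 0.5},
        "vector_setting": {
            "vector_weight": 0.5,
            "embedding_model_name": "jina-embeddings-v3",
            "embedding_provider_name": "langgenius/xinference/xinference",
        },
    },
    "top_k": 20,
    "score_threshold_enabled": True,
    "score_threshold": 0.3,
}


def build_retrieve_request(base_url, api_key, dataset_id, query):
    """Build the POST request for /v1/datasets/{dataset_id}/retrieve."""
    # Assumption: base_url has no trailing /v1 segment.
    url = f"{base_url.rstrip('/')}/v1/datasets/{quote(dataset_id, safe='')}/retrieve"
    body = {"query": query, "retrieval_model": RETRIEVAL_MODEL}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def retrieve(base_url, api_key, dataset_id, query, timeout=30):
    """Send the request and return the decoded JSON response."""
    request = build_retrieve_request(base_url, api_key, dataset_id, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from the network call makes it possible to inspect the URL and body without contacting the server.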

How to interpret results

Prefer the highest-score chunk first.
If one chunk is a clear exact match and the rest are noisy PDF or image-preview fragments, answer from the best chunk and ignore the noisy tail.
Summarize relevant content for the user instead of dumping raw sign_content or long image-preview markup.
Mention source document names when helpful.
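
One way to apply these rules in code is to keep only the chunks whose score is close to the best one. The sketch below assumes a response shaped like {"records": [{"score": ..., "segment": {"content": ..., "document": {"name": ...}}}]}; that shape, the function name pick_best_chunks, and the margin and max_chunks defaults are assumptions to adjust against the real response.

```python
def pick_best_chunks(response, margin=0.15, max_chunks=3):
    """Return the top-scoring chunks, dropping the low-score tail."""
    records = sorted(
        response.get("records", []),
        key=lambda record: record.get("score") or 0.0,
        reverse=True,
    )
    if not records:
        return []

    top_score = records[0].get("score") or 0.0
    best = []
    for record in records[:max_chunks]:
        score = record.get("score") or 0.0
        # Stop once a chunk falls too far below the best match.
        if top_score - score > margin:
            break
        segment = record.get("segment") or {}
        document = segment.get("document") or {}
        best.append(
            {
                "score": score,
                "content": segment.get("content", ""),
                "source": document.get("name"),
            }
        )
    return best
```

An empty result from this helper corresponds to the "nothing relevant found" case, which should be reported to the user as such.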

Response style

Return a concise answer first.
If needed, add a short source line with the document name.
If nothing relevant is found, say so clearly.

OpenClaw behavior

If the user asks to use PKM but has not given a dataset_id, pause and ask for it before calling the endpoint.
If the user only gives avatar_id, explain that customer-facing PKM retrieval expects a stored dataset_id, and ask whether they want to provide it.
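
The pause-and-ask behavior can be sketched as a small decision function. The name next_action and the ("retrieve" / "ask") return convention are hypothetical; the prompt text reuses the question template from this skill.

```python
ASK_DATASET_ID = (
    "Please provide the PKM dataset_id for this avatar so I can run the retrieval."
)


def next_action(dataset_id=None, avatar_id=None):
    """Return ("retrieve", None) or ("ask", message) for the given inputs."""
    if dataset_id:
        return ("retrieve", None)
    if avatar_id:
        # Only avatar_id was given: explain the expectation, then ask.
        message = (
            "Customer-facing PKM retrieval expects a stored dataset_id rather "
            "than an avatar_id. " + ASK_DATASET_ID
        )
        return ("ask", message)
    return ("ask", ASK_DATASET_ID)
```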
