Category: Other | API key required

Alibabacloud Milvus Manage

Alibaba Cloud Milvus full-stack Skill for two planes: control-plane instance management via aliyun CLI, and data-plane Milvus operations via pymilvus. Use wh...

Author: sdk-team | clawhub

Alibaba Cloud Milvus Full-Stack Skill

Prerequisites

[MUST] CLI User-Agent — Every aliyun CLI command invocation must include: --user-agent AlibabaCloud-Agent-Skills/alibabacloud-milvus-manage

[MUST] PyMilvus User-Agent — Every pymilvus SDK connection must include: grpc_options={"grpc.primary_user_agent": "AlibabaCloud-Agent-Skills/alibabacloud-milvus-manage"}

Alibaba Cloud CLI version must be >= 3.3.3.

Handle two distinct planes:

  • Control-plane: manage Alibaba Cloud managed Milvus instances with aliyun CLI.
  • Data-plane: operate Milvus with pymilvus Python code.

Treat SKILL.md as the router. Load references/*.md for detailed commands, parameters, and examples.

Scope

Use this skill for:

  • Alibaba Cloud managed Milvus instance lifecycle: create, inspect, scale, rename, configure, network, whitelist.
  • Milvus Python SDK workflows with pymilvus: connect, collections, vectors, search, indexes, partitions, databases, RBAC.
  • Retrieval use cases built on Milvus: semantic search, hybrid search, full-text search, RAG patterns.

Do not use this skill for:

  • self-hosted Milvus deployment on Docker, Helm, Kubernetes, or Milvus Operator,
  • Milvus Java / Go / Node SDKs,
  • other Alibaba Cloud products such as ECS, RDS, OSS, EMR, Kafka, StarRocks,
  • other vector databases such as Zilliz Cloud, Pinecone, Qdrant, or Weaviate.

Route The Request

Control-plane

Route here when the user asks about:

  • creating, scaling, renaming, or inspecting a Milvus instance,
  • connection address, component spec, configuration, public network, whitelist,
  • VPC/VSwitch prerequisites for Alibaba Cloud Milvus,
  • Milvus REST-style CLI APIs, creation parameters, or control-plane troubleshooting.

Read:

Data-plane

Route here when the user asks about:

  • connecting to Milvus with Python,
  • creating collections or schemas,
  • inserting, upserting, querying, deleting, or searching vectors,
  • hybrid search, BM25 full-text search, iterators, indexes,
  • partitions, databases, users, roles, or privileges,
  • Milvus-based RAG or semantic retrieval patterns.

Read:

Shared Guardrails

  • Decide the plane first. Do not mix control-plane instance operations with data-plane SDK code.
  • Confirm destructive actions before execution.
  • Validate untrusted user input before passing it into shell commands or code.
  • Prefer loading a targeted reference doc instead of keeping large inline examples in this file.
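
As one way to apply the input-validation guardrail, the sketch below checks a value against an allowlist pattern before it is placed into a command or code. Both patterns are assumptions made for illustration (they are not documented Alibaba Cloud ID formats) and should be replaced with the real shapes.

```python
import re

# Hypothetical allowlist patterns; the real RegionId and instance ID
# formats are assumptions here, not documented values.
REGION_RE = re.compile(r"[a-z]{2}-[a-z0-9-]{2,30}")
INSTANCE_RE = re.compile(r"c-[0-9a-f]{8,32}")


def validate(value: str, pattern: re.Pattern, field: str) -> str:
    """Return value unchanged if it fully matches pattern, else raise ValueError."""
    if not isinstance(value, str) or not pattern.fullmatch(value):
        raise ValueError(f"invalid {field}: {value!r}")
    return value
```

Using `fullmatch` rather than `search` means a value with trailing shell metacharacters is rejected instead of partially accepted.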

Control-Plane Rules

Required Environment

  • Reuse the configured aliyun profile. Verify credentials are configured before API calls.
  • Every aliyun CLI invocation must include the required User-Agent flag:
aliyun ... --user-agent AlibabaCloud-Agent-Skills/alibabacloud-milvus-manage
  • Milvus OpenAPI calls through aliyun must include --force.

Preconditions

Before create or major modify operations:

  1. Confirm RegionId with the user.
  2. Verify VPC and VSwitch resources in that region.
  3. For create, record ZoneId, VpcId, and VSwitchId.
  4. If the request is ambiguous, ask whether the user wants a dev/test standalone instance or a production HA cluster.

Baseline decision rule:

  • standalone_pro is the default for dev/test.
  • HA cluster is for production.
  • In HA mode, streaming, data, mix_coordinator, and query must use at least 4 CU; proxy must use at least 2 CU.
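
The HA minimums above can be checked before a create request is assembled. This is a minimal sketch that assumes the planned sizes are available as a plain mapping from component name to CU; the real request layout is defined in the reference docs, and the helper name is hypothetical.

```python
from typing import Dict, List

# Minimums from the rule above: 4 CU for these four components, 2 CU for proxy.
HA_MIN_CU = {"streaming": 4, "data": 4, "mix_coordinator": 4, "query": 4, "proxy": 2}


def ha_cu_violations(cu_by_component: Dict[str, int]) -> List[str]:
    """Return one message per HA component that is missing or below its minimum CU."""
    problems = []
    for name, minimum in HA_MIN_CU.items():
        cu = cu_by_component.get(name)
        if cu is None:
            problems.append(f"{name}: missing (need >= {minimum} CU)")
        elif cu < minimum:
            problems.append(f"{name}: {cu} CU (need >= {minimum} CU)")
    return problems
```

An empty list means the sizing satisfies the HA rule; otherwise the messages can be shown to the user before any API call is made.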

Detailed templates and field definitions live in references/instance-lifecycle.md and references/create-params.md.

CLI Calling Modes

Use the API's expected parameter mode. Do not improvise.

# get / delete: business params in URL query
aliyun milvus get "/path?RegionId=<region>&instanceId=<id>" --RegionId <region> --force --user-agent AlibabaCloud-Agent-Skills/alibabacloud-milvus-manage

# post / put with request body: business params in --body JSON
aliyun milvus post "/path?RegionId=<region>" --RegionId <region> --body '{...}' --force --user-agent AlibabaCloud-Agent-Skills/alibabacloud-milvus-manage

# post with query-style flags: business params as --Flag value
aliyun milvus post "/path" --RegionId <region> --InstanceId <id> --force --user-agent AlibabaCloud-Agent-Skills/alibabacloud-milvus-manage

Rules:

  • Always pass --RegionId <region>.
  • For CreateInstance and UpdateInstance, use --body.
  • For query-style POST APIs such as detail, config, network, ACL, and rename operations, use --Flag value.
  • Do not put user-provided raw text directly into a shell command unless it has been validated.
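
The three calling modes can be captured as small builders that return an argument list. This is a sketch under assumptions: the function names are hypothetical, `/path` stands in for whatever the reference docs specify, and inputs are expected to be validated already.

```python
import json
from typing import Dict, List
from urllib.parse import urlencode

USER_AGENT = "AlibabaCloud-Agent-Skills/alibabacloud-milvus-manage"
# Flags this skill requires on every Milvus call.
REQUIRED_TAIL = ["--force", "--user-agent", USER_AGENT]


def get_call(path: str, region: str, params: Dict[str, str]) -> List[str]:
    """get: business params go into the URL query string."""
    query = urlencode({"RegionId": region, **params})
    return ["aliyun", "milvus", "get", f"{path}?{query}",
            "--RegionId", region, *REQUIRED_TAIL]


def body_call(method: str, path: str, region: str, body: dict) -> List[str]:
    """post / put with a request body: business params go into --body JSON."""
    return ["aliyun", "milvus", method, f"{path}?RegionId={region}",
            "--RegionId", region, "--body", json.dumps(body), *REQUIRED_TAIL]


def flag_call(path: str, region: str, flags: Dict[str, str]) -> List[str]:
    """post with query-style flags: business params become --Flag value pairs."""
    argv = ["aliyun", "milvus", "post", path, "--RegionId", region]
    for name, value in flags.items():
        argv += [f"--{name}", value]
    return argv + REQUIRED_TAIL
```

Returning a list rather than a command string lets the caller run it without a shell, so values are never re-parsed as shell syntax.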

Runtime Safety

  • Do not download and execute remote scripts or unaudited dependencies during control-plane work.
  • Do not use eval or source with untrusted input.
  • Set reasonable timeouts on CLI calls. Prefer short timeouts for reads and bounded polling for long-running async operations.
  • For list APIs, do not rely on the reported total alone; check the length and contents of the returned array.
  • Read the full error message before retrying. Automatic retry is appropriate for throttling, not for arbitrary failures.
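
Bounded polling for a long-running operation might look like the sketch below. `fetch_status` is a hypothetical callable supplied by the caller (for example, a wrapper around a read call run with a short timeout), and the status strings it returns are assumptions, not documented values.

```python
import time
from typing import Callable, Optional


def poll_until(fetch_status: Callable[[], str], target: str,
               max_attempts: int = 30, interval_s: float = 10.0,
               sleep: Callable[[float], None] = time.sleep) -> Optional[int]:
    """Call fetch_status at most max_attempts times.

    Returns the attempt number on which it returned target,
    or None if the attempt budget ran out first.
    """
    for attempt in range(1, max_attempts + 1):
        if fetch_status() == target:
            return attempt
        if attempt < max_attempts:
            sleep(interval_s)  # wait only when another attempt will follow
    return None
```

The attempt cap keeps an async wait from running indefinitely, and a `None` result is the signal to report the last known state to the user rather than keep retrying.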

Forbidden Operations

  • Instance deletion (DeleteInstance) is strictly forbidden through this Skill. If the user asks to delete or release a Milvus instance, do not run the Milvus delete command through the aliyun CLI. Instead, direct the user to delete the instance in the Alibaba Cloud Milvus Console.

Destructive Operations

Require explicit confirmation before:

  • modifying instance config,
  • disabling public network access.

Use this template:

About to execute: <API>, Target: <InstanceId>, Impact: <Description>. Continue?
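
A small helper can keep the wording of this template consistent across operations. The sketch below is illustrative only: both function names are hypothetical, and treating just "y" or "yes" as consent is an assumption about what counts as explicit confirmation.

```python
def confirmation_prompt(api: str, instance_id: str, impact: str) -> str:
    """Fill the confirmation template with the API name, target, and impact."""
    return f"About to execute: {api}, Target: {instance_id}, Impact: {impact}. Continue?"


def is_confirmed(answer: str) -> bool:
    """Treat only an explicit yes as confirmation; anything else means stop."""
    return answer.strip().lower() in {"y", "yes"}
```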

For configuration-change and network-troubleshooting flows, read references/operations.md or references/instance-lifecycle.md first.

Output Style

  • Summarize instance lists as a compact table.
  • Highlight instanceId, instanceName, status, dbVersion, ha, paymentType, and connection endpoints when relevant.
  • Convert timestamps to readable time.
  • Use --cli-query or jq to trim noisy payloads when useful.
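
One possible rendering of these output rules is sketched below. It assumes each instance is a dict keyed by the field names listed above and that timestamps arrive as millisecond epoch values; both are assumptions to verify against an actual response.

```python
from datetime import datetime, timezone
from typing import Iterable, List, Mapping

COLUMNS = ["instanceId", "instanceName", "status", "dbVersion", "ha", "paymentType"]


def readable_time(epoch_ms: int) -> str:
    """Convert a millisecond epoch timestamp (assumed unit) to a UTC string."""
    moment = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


def instance_table(instances: Iterable[Mapping]) -> str:
    """Render the highlighted fields as a padded table, one row per instance."""
    rows: List[List[str]] = [COLUMNS]
    for inst in instances:
        rows.append([str(inst.get(col, "-")) for col in COLUMNS])
    widths = [max(len(row[i]) for row in rows) for i in range(len(COLUMNS))]
    return "\n".join(
        " | ".join(cell.ljust(widths[i]) for i, cell in enumerate(row)).rstrip()
        for row in rows
    )
```

Because the table is built from the returned array itself, its row count reflects what the API actually sent back.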

Data-Plane Rules

Connection First

Before writing any pymilvus code, ask for:

  1. deployment type: Milvus Lite, self-hosted standalone/cluster, or Alibaba Cloud managed instance,
  2. URI or endpoint,
  3. authentication method and credentials if needed,
  4. database name if not using default.

Do not assume connection parameters. Use Milvus Lite only when the user explicitly wants local embedded mode.

Minimal connection shape:

from pymilvus import MilvusClient

PYMILVUS_GRPC_OPTIONS = {
    "grpc.primary_user_agent": "AlibabaCloud-Agent-Skills/alibabacloud-milvus-manage"
}

client = MilvusClient(
    uri="<USER_URI>",
    token="<USER_TOKEN>",
    grpc_options=PYMILVUS_GRPC_OPTIONS,
)
  • Every MilvusClient(...) and connections.connect(...) example must pass grpc_options=PYMILVUS_GRPC_OPTIONS.
  • Do not emit pymilvus SDK connection code without grpc_options=PYMILVUS_GRPC_OPTIONS.

For async usage, schema details, and deployment-specific patterns, load the relevant reference doc.
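
To keep connection code from assuming values the user never supplied, the keyword arguments can be assembled in one place. This sketch uses a hypothetical helper name and a simple placeholder check; the `grpc_options` argument follows the connection shape above.

```python
from typing import Any, Dict, Optional

PYMILVUS_GRPC_OPTIONS = {
    "grpc.primary_user_agent": "AlibabaCloud-Agent-Skills/alibabacloud-milvus-manage"
}


def client_kwargs(uri: str, token: Optional[str] = None,
                  db_name: Optional[str] = None) -> Dict[str, Any]:
    """Build MilvusClient keyword arguments from user-supplied details only."""
    # Reject empty values and unfilled template placeholders such as "<USER_URI>".
    if not uri or uri.startswith("<"):
        raise ValueError("ask the user for a real URI before connecting")
    kwargs: Dict[str, Any] = {"uri": uri, "grpc_options": dict(PYMILVUS_GRPC_OPTIONS)}
    if token:
        kwargs["token"] = token
    if db_name:
        kwargs["db_name"] = db_name
    return kwargs
```

The result would then be passed as `MilvusClient(**client_kwargs(...))`, so the User-Agent option is present on every connection by construction.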

Data Safety And Correctness

  • Never generate fake or placeholder vectors. Always use a real embedding model.
  • The query embedding model must match the model used to create stored vectors.
  • Vector dimensions must exactly match the collection schema.
  • A collection must be loaded before search or query.
  • Confirm destructive operations such as drop_collection, drop_database, or large deletes before executing.
  • Prefer AUTOINDEX unless the user has explicit performance requirements.
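
The dimension rule above is cheap to enforce on the client side before an insert or search. A minimal sketch, assuming the embeddings are already computed by a real model and `schema_dim` was read from the collection schema (the helper name is hypothetical):

```python
from typing import List, Sequence


def check_dimensions(vectors: Sequence[Sequence[float]], schema_dim: int) -> None:
    """Raise ValueError if any vector's length differs from the schema dimension."""
    if not vectors:
        raise ValueError("no vectors supplied")
    bad: List[int] = [i for i, v in enumerate(vectors) if len(v) != schema_dim]
    if bad:
        raise ValueError(
            f"{len(bad)} vector(s) do not match dim={schema_dim}; first bad index: {bad[0]}"
        )
```

A mismatch found here usually means the query and stored vectors came from different embedding models, which ties back to the model-matching rule.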

Minimal Workflow

For most SDK tasks:

  1. load references/collection.md for schema and collection operations,
  2. load references/vector.md for insert/search/query/delete patterns,
  3. load references/index.md if the user cares about index type, metric, or tuning,
  4. add partition/database/RBAC references only if the task actually needs them.

Common Patterns

Suggested Response Flow

If control-plane

  1. Confirm region and target instance scope.
  2. Read the matching control-plane reference.
  3. Run the command with the correct parameter mode.
  4. Report the key fields, next state, and any follow-up wait conditions.

If data-plane

  1. Ask for connection details first.
  2. Read only the references needed for the requested SDK task.
  3. Write or explain pymilvus code with real embeddings, real connection placeholders, and grpc_options=PYMILVUS_GRPC_OPTIONS.
  4. Call out schema, load-state, index, and dimension pitfalls if they matter.

Reference Map