Alibaba Cloud Milvus Full-Stack Skill
Prerequisites
[MUST] CLI User-Agent — Every aliyun CLI command invocation must include:
--user-agent AlibabaCloud-Agent-Skills/alibabacloud-milvus-manage
[MUST] PyMilvus User-Agent — Every pymilvus SDK connection must include:
grpc_options={"grpc.primary_user_agent": "AlibabaCloud-Agent-Skills/alibabacloud-milvus-manage"}
Alibaba Cloud CLI version must be >= 3.3.3.
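The version requirement can be checked before any control-plane work. The sketch below is illustrative only: it assumes the installed version is available as a string (for example, captured from the CLI's version output), and `parse_version` and `cli_version_ok` are hypothetical helper names, not part of the CLI.

```python
import re

MIN_CLI_VERSION = (3, 3, 3)  # minimum stated in the prerequisites


def parse_version(text: str) -> tuple:
    """Extract the first MAJOR.MINOR.PATCH triple found in text."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def cli_version_ok(text: str) -> bool:
    """Return True when the reported version is at least MIN_CLI_VERSION."""
    # Tuples compare element by element, so (3, 10, 0) > (3, 3, 3).
    return parse_version(text) >= MIN_CLI_VERSION
```

Comparing integer tuples avoids the string-comparison trap where "3.10.0" sorts before "3.3.3".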
Handle two distinct planes:
- Control-plane: manage Alibaba Cloud managed Milvus instances with the aliyun CLI.
- Data-plane: operate Milvus with pymilvus Python code.
Treat SKILL.md as the router. Load references/*.md for detailed commands, parameters, and examples.
Scope
Use this skill for:
- Alibaba Cloud managed Milvus instance lifecycle: create, inspect, scale, rename, configure, network, whitelist.
- Milvus Python SDK workflows with pymilvus: connect, collections, vectors, search, indexes, partitions, databases, RBAC.
- Retrieval use cases built on Milvus: semantic search, hybrid search, full-text search, RAG patterns.
Do not use this skill for:
- self-hosted Milvus deployment on Docker, Helm, Kubernetes, or Milvus Operator,
- Milvus Java / Go / Node SDKs,
- other Alibaba Cloud products such as ECS, RDS, OSS, EMR, Kafka, StarRocks,
- other vector databases such as Zilliz Cloud, Pinecone, Qdrant, or Weaviate.
Route The Request
Control-plane
Route here when the user asks about:
- creating, scaling, renaming, or inspecting a Milvus instance,
- connection address, component spec, configuration, public network, whitelist,
- VPC/VSwitch prerequisites for Alibaba Cloud Milvus,
- Milvus REST-style CLI APIs, creation parameters, or control-plane troubleshooting.
Read:
- first-time flow: references/getting-started.md
- create / list / detail / scale / release: references/instance-lifecycle.md
- config / network / inspection / troubleshooting: references/operations.md
- creation field meanings and templates: references/create-params.md
- raw API field reference: references/api-reference.md
- RAM permissions: references/ram-policies.md
Data-plane
Route here when the user asks about:
- connecting to Milvus with Python,
- creating collections or schemas,
- inserting, upserting, querying, deleting, or searching vectors,
- hybrid search, BM25 full-text search, iterators, indexes,
- partitions, databases, users, roles, or privileges,
- Milvus-based RAG or semantic retrieval patterns.
Read:
- collection schema and lifecycle: references/collection.md
- vector CRUD, search, hybrid search, full-text search: references/vector.md
- index types and metrics: references/index.md
- partitions: references/partition.md
- databases: references/database.md
- RBAC: references/user-role.md
- common solution patterns: references/patterns.md
Shared Guardrails
- Decide the plane first. Do not mix control-plane instance operations with data-plane SDK code.
- Confirm destructive actions before execution.
- Validate untrusted user input before passing it into shell commands or code.
- Prefer loading a targeted reference doc instead of keeping large inline examples in this file.
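One way to apply the input-validation guardrail is an allow-list check that rejects anything outside a narrow character set. This is a minimal sketch: both regular expressions are assumptions chosen for illustration, not official formats for region or instance identifiers, and `validate_token` is a hypothetical helper.

```python
import re

# Assumed patterns, for illustration only; adjust to the real identifier formats.
REGION_RE = re.compile(r"[a-z]{2}(-[a-z0-9]+)+")
INSTANCE_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]{0,63}")


def validate_token(value, pattern, label):
    """Return value unchanged if it fully matches pattern, else raise ValueError."""
    if not isinstance(value, str) or pattern.fullmatch(value) is None:
        raise ValueError(f"invalid {label}: {value!r}")
    return value
```

Using `fullmatch` rather than `search` matters here: a value such as `cn-hangzhou; rm -rf /` contains a valid prefix but is rejected as a whole.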
Control-Plane Rules
Required Environment
- Reuse the configured aliyun profile. Verify credentials are configured before API calls.
- Every aliyun CLI invocation must include the required User-Agent flag:
aliyun ... --user-agent AlibabaCloud-Agent-Skills/alibabacloud-milvus-manage
- Milvus OpenAPI calls through aliyun must include --force.
Preconditions
Before create or major modify operations:
- Confirm RegionId with the user.
- Verify VPC and VSwitch resources in that region.
- For create, record ZoneId, VpcId, and VSwitchId.
- If the request is ambiguous, ask whether the user wants dev/test standalone or production HA cluster.
Baseline decision rule:
- standalone_pro is the default for dev/test.
- HA cluster is for production.
- In HA mode, streaming, data, mix_coordinator, and query must use at least 4 CU; proxy must use at least 2 CU.
Detailed templates and field definitions live in references/instance-lifecycle.md and references/create-params.md.
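The HA minimums in the baseline decision rule can be checked before a create request is built. The sketch below assumes the requested sizes are available as a plain mapping from component name to CU count; that input shape and the name `ha_cu_violations` are hypothetical and do not reflect the actual CreateInstance body format.

```python
# Minimum CU per component in HA mode, as stated in the baseline decision rule.
HA_MIN_CU = {
    "streaming": 4,
    "data": 4,
    "mix_coordinator": 4,
    "query": 4,
    "proxy": 2,
}


def ha_cu_violations(cu_by_component):
    """Return a list of human-readable problems; an empty list means the sizing passes."""
    problems = []
    for name, minimum in HA_MIN_CU.items():
        actual = cu_by_component.get(name)
        if actual is None:
            problems.append(f"{name}: missing (needs >= {minimum} CU)")
        elif actual < minimum:
            problems.append(f"{name}: {actual} CU is below the minimum of {minimum} CU")
    return problems
```

Returning every violation at once, instead of raising on the first, lets the user fix the whole sizing in one round.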
CLI Calling Modes
Use the API's expected parameter mode. Do not improvise.
# get / delete: business params in URL query
aliyun milvus get "/path?RegionId=<region>&instanceId=<id>" --RegionId <region> --force --user-agent AlibabaCloud-Agent-Skills/alibabacloud-milvus-manage
# post / put with request body: business params in --body JSON
aliyun milvus post "/path?RegionId=<region>" --RegionId <region> --body '{...}' --force --user-agent AlibabaCloud-Agent-Skills/alibabacloud-milvus-manage
# post with query-style flags: business params as --Flag value
aliyun milvus post "/path" --RegionId <region> --InstanceId <id> --force --user-agent AlibabaCloud-Agent-Skills/alibabacloud-milvus-manage
Rules:
- Always pass --RegionId <region>.
- For CreateInstance and UpdateInstance, use --body.
- For query-style POST APIs such as detail, config, network, ACL, and rename operations, use --Flag value.
- Do not put user-provided raw text directly into a shell command unless it has been validated.
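When the CLI is driven from Python, the calling modes above can be assembled as an argument list instead of a shell string, which sidesteps shell quoting entirely. This is a sketch under assumptions: `build_aliyun_args` is a hypothetical helper, and the path and flag names are whatever the caller supplies from the reference docs.

```python
import json

USER_AGENT = "AlibabaCloud-Agent-Skills/alibabacloud-milvus-manage"


def build_aliyun_args(method, path, region, body=None, flags=None):
    """Build an argv list for `aliyun milvus ...` with the mandatory flags appended."""
    if method not in {"get", "post", "put", "delete"}:
        raise ValueError(f"unsupported method: {method!r}")
    args = ["aliyun", "milvus", method, path, "--RegionId", region]
    # Query-style mode: business params as --Flag value pairs.
    for name, value in (flags or {}).items():
        args += [f"--{name}", str(value)]
    # Body mode: business params serialized as JSON.
    if body is not None:
        args += ["--body", json.dumps(body)]
    # Required on every invocation.
    args += ["--force", "--user-agent", USER_AGENT]
    return args
```

The resulting list can be passed to `subprocess.run(args, capture_output=True, text=True, timeout=30)`; because no shell is involved, values are never re-parsed as shell syntax, though they should still be validated first.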
Runtime Safety
- Do not download and execute remote scripts or unaudited dependencies during control-plane work.
- Do not use eval or source with untrusted input.
- Set reasonable timeouts on CLI calls. Prefer short timeouts for reads and bounded polling for long-running async operations.
- For list APIs, do not trust total blindly; inspect the returned array.
- Read the full error message before retrying. Automatic retry is appropriate for throttling, not for arbitrary failures.
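Bounded polling for a long-running operation can be sketched as a loop with a hard deadline. The helper name `poll_until`, its default timeout and interval, and the convention that `check` returns `None` while the operation is still pending are all assumptions made for this example.

```python
import time


def poll_until(check, timeout_s=600.0, interval_s=10.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns a non-None value or the deadline passes."""
    deadline = clock() + timeout_s
    while True:
        result = check()
        if result is not None:
            return result
        # Stop before sleeping if the next attempt would land past the deadline.
        if clock() + interval_s > deadline:
            raise TimeoutError(f"condition not met within {timeout_s} seconds")
        sleep(interval_s)
```

In practice `check` would wrap a single read call with its own short timeout and return the instance status once it reaches the expected state. A monotonic clock keeps the deadline stable if the system time changes.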
Forbidden Operations
- Instance deletion (DeleteInstance) is strictly forbidden through this Skill. If the user asks to delete or release a Milvus instance, do not execute the Milvus delete command through the aliyun CLI. Instead, instruct the user to delete the instance via the Alibaba Cloud Milvus Console.
Destructive Operations
Require explicit confirmation before:
- modifying instance config,
- disabling public network access.
Use this template:
About to execute: <API>, Target: <InstanceId>, Impact: <Description>. Continue?
For config change and network troubleshooting flows, read references/operations.md or references/instance-lifecycle.md first.
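The confirmation template can be filled in programmatically so that the wording stays consistent across operations. This is a minimal sketch; `confirmation_prompt` and `is_confirmed` are hypothetical helpers, and treating only "y" or "yes" as consent is an assumption.

```python
def confirmation_prompt(api, instance_id, impact):
    """Render the confirmation template for a destructive operation."""
    return f"About to execute: {api}, Target: {instance_id}, Impact: {impact}. Continue?"


def is_confirmed(reply):
    """Treat only an explicit yes as consent; anything else means stop."""
    return reply.strip().lower() in {"y", "yes"}
```

Defaulting to "not confirmed" for empty or ambiguous replies keeps the destructive path opt-in.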
Output Style
- Summarize instance lists as a compact table.
- Highlight instanceId, instanceName, status, dbVersion, ha, paymentType, and connection endpoints when relevant.
- Convert timestamps to readable time.
- Use --cli-query or jq to trim noisy payloads when useful.
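The output rules above might be implemented along these lines. The sketch assumes timestamps arrive as epoch milliseconds and that each instance is a dictionary keyed by the field names listed above; both are assumptions about the payload, and the helper names are hypothetical.

```python
from datetime import datetime, timezone


def readable_time(epoch_ms):
    """Format an epoch-millisecond timestamp as a UTC string (assumed unit: ms)."""
    moment = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


def instance_table(instances, columns=("instanceId", "instanceName", "status")):
    """Render selected fields as a compact, space-aligned text table."""
    rows = [[str(item.get(col, "-")) for col in columns] for item in instances]
    # Each column is as wide as its header or its longest cell.
    widths = [max([len(col)] + [len(row[i]) for row in rows])
              for i, col in enumerate(columns)]
    lines = ["  ".join(col.ljust(w) for col, w in zip(columns, widths))]
    lines += ["  ".join(cell.ljust(w) for cell, w in zip(row, widths)) for row in rows]
    return "\n".join(line.rstrip() for line in lines)
```

Missing fields render as "-" rather than raising, so a partial payload still produces a readable table.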
Data-Plane Rules
Connection First
Before writing any pymilvus code, ask for:
- deployment type: Milvus Lite, self-hosted standalone/cluster, or Alibaba Cloud managed instance,
- URI or endpoint,
- authentication method and credentials if needed,
- database name if not using default.
Do not assume connection parameters. Use Milvus Lite only when the user explicitly wants local embedded mode.
Minimal connection shape:
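from pymilvus import MilvusClient

# Required User-Agent for every pymilvus connection made through this skill.
PYMILVUS_GRPC_OPTIONS = {
    "grpc.primary_user_agent": "AlibabaCloud-Agent-Skills/alibabacloud-milvus-manage"
}

# Replace the placeholders with the URI and token supplied by the user.
client = MilvusClient(
    uri="<USER_URI>",
    token="<USER_TOKEN>",
    grpc_options=PYMILVUS_GRPC_OPTIONS,
)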
- Every MilvusClient(...) and connections.connect(...) example must pass grpc_options=PYMILVUS_GRPC_OPTIONS.
- Do not emit pymilvus SDK connection code without grpc_options=PYMILVUS_GRPC_OPTIONS.
For async usage, schema details, and deployment-specific patterns, load the relevant reference doc.
Data Safety And Correctness
- Never generate fake or placeholder vectors. Always use a real embedding model.
- The query embedding model must match the model used to create stored vectors.
- Vector dimensions must exactly match the collection schema.
- A collection must be loaded before search or query.
- Confirm destructive operations such as drop_collection, drop_database, or large deletes before executing.
- Prefer AUTOINDEX unless the user has explicit performance requirements.
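The dimension rule can be enforced with a small pre-insert check that runs on the embedding model's output before anything is sent to Milvus. This sketch is plain Python and does not call pymilvus; `check_dimensions` is a hypothetical helper, and `expected_dim` is assumed to come from the collection schema.

```python
def check_dimensions(vectors, expected_dim):
    """Raise ValueError if any vector's length differs from the schema dimension."""
    for position, vector in enumerate(vectors):
        if len(vector) != expected_dim:
            raise ValueError(
                f"vector {position} has dimension {len(vector)}, expected {expected_dim}"
            )
```

Failing fast here gives a clearer message than a server-side rejection and usually points to a mismatch between the embedding model and the one used when the collection was created.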
Minimal Workflow
For most SDK tasks:
- load references/collection.md for schema and collection operations,
- load references/vector.md for insert/search/query/delete patterns,
- load references/index.md if the user cares about index type, metric, or tuning,
- add partition/database/RBAC references only if the task actually needs them.
Common Patterns
- quick prototype collection: references/collection.md
- vector CRUD and similarity search: references/vector.md
- hybrid search or full-text search: references/vector.md
- RAG / semantic retrieval patterns: references/patterns.md
- index tuning: references/index.md
Suggested Response Flow
If control-plane
- Confirm region and target instance scope.
- Read the matching control-plane reference.
- Run the command with the correct parameter mode.
- Report the key fields, next state, and any follow-up wait conditions.
If data-plane
- Ask for connection details first.
- Read only the references needed for the requested SDK task.
- Write or explain pymilvus code with real embeddings, real connection placeholders, and grpc_options=PYMILVUS_GRPC_OPTIONS.
- Call out schema, load-state, index, and dimension pitfalls if they matter.
Reference Map
- references/getting-started.md: first Milvus instance from scratch
- references/instance-lifecycle.md: create, inspect, scale, rename, release
- references/operations.md: config, network, ACL, inspection, troubleshooting
- references/create-params.md: create body fields and component templates
- references/api-reference.md: raw API signatures and return fields
- references/collection.md: schema and collection lifecycle
- references/vector.md: insert, search, hybrid search, BM25, iterators
- references/index.md: index types and metric guidance
- references/partition.md: partition operations
- references/database.md: database operations
- references/user-role.md: users, roles, privileges
- references/patterns.md: RAG and semantic search patterns
- references/ram-policies.md: IAM/RAM policies