返回 Skill 列表
extension
分类: 其它需要 API Key

Qwen Vision

利用通义千问视觉API(阿里云灵积)分析图像和视频,支持图像理解、OCR及视觉推理。

person作者: perchoulihubclawhub

Qwen Vision

Analyze images and videos using Alibaba Cloud's Qwen Vision API (通义千问视觉模型).

Usage

Analyze an image:

uv run {baseDir}/scripts/analyze_image.py --image "/path/to/image.jpg" --prompt "请描述这张图片" --api-key sk-xxx

With custom model:

uv run {baseDir}/scripts/analyze_image.py --image "/path/to/image.jpg" --model qwen-vl-max-latest --api-key sk-xxx

API Key

Get your API key from:

  • models.providers.bailian.apiKey in ~/.openclaw/openclaw.json
  • Or skills."qwen-image".apiKey in ~/.openclaw/openclaw.json
  • Or DASHSCOPE_API_KEY environment variable
  • Or https://dashscope.console.aliyun.com/

Models

| Model | Description | |-------|-------------| | qwen-vl-max-latest | Latest max model (default) | | qwen-vl-plus-latest | Faster, cost-effective |

Prompt Examples

| Task | Prompt | |------|--------| | Describe | "请详细描述这张图片的内容" | | OCR | "提取图片中的所有文字" | | Count | "数一下图中有多少个物体" | | Analyze | "分析这张图表的数据趋势" | | Identify | "这是什么地方/物品?" |

Notes

  • Supports JPG, PNG, GIF, WebP, BMP formats
  • Images are encoded as base64 and sent via API
  • Response time varies by image size and complexity