Qwen Vision

Analyze images and videos using Alibaba Cloud's Qwen Vision API (通义千问视觉模型).

Usage

Analyze an image:

uv run {baseDir}/scripts/analyze_image.py --image "/path/to/image.jpg" --prompt "请描述这张图片" --api-key sk-xxx

With custom model:

uv run {baseDir}/scripts/analyze_image.py --image "/path/to/image.jpg" --model qwen-vl-max-latest --api-key sk-xxx

API Key

Get your API key from:

models.providers.bailian.apiKey in ~/.openclaw/openclaw.json
Or skills."qwen-image".apiKey in ~/.openclaw/openclaw.json
Or DASHSCOPE_API_KEY environment variable
Or https://dashscope.console.aliyun.com/

Models

| Model | Description | |-------|-------------| | qwen-vl-max-latest | Latest max model (default) | | qwen-vl-plus-latest | Faster, cost-effective |

Prompt Examples

| Task | Prompt | |------|--------| | Describe | "请详细描述这张图片的内容" | | OCR | "提取图片中的所有文字" | | Count | "数一下图中有多少个物体" | | Analyze | "分析这张图表的数据趋势" | | Identify | "这是什么地方/物品？" |

Notes

Supports JPG, PNG, GIF, WebP, BMP formats
Images are encoded as base64 and sent via API
Response time varies by image size and complexity