JinDaGe - opticmcp MCP Details

🚀 OpticMCP

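OpticMCP is a Model Context Protocol (MCP) server that gives AI assistants camera and vision tools. It connects to cameras and captures images for large language models (LLMs) to use.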

🚀 Quick Start

Requirements

Python 3.10 or later
A USB camera connected to the system

Running

Install from PyPI (recommended)

pip install optic-mcp

Or with uv:

uv pip install optic-mcp

After installing from PyPI, start the MCP server with:

optic-mcp

Or run it with uvx (no installation required):

uvx optic-mcp

Run from source

# Clone the repository
git clone https://github.com/Timorleiderman/OpticMCP.git
cd OpticMCP

# Install dependencies with uv
uv sync

# Run the server
uv run optic-mcp

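✨ Key Features

OpticMCP aims to be a universal camera interface for AI assistants. It supports the following sources and capabilities:

USB cameras ✅
IP/network cameras ✅ - RTSP, HLS, and MJPEG streams
Screen capture ✅ - desktop/monitor capture
HTTP images ✅ - download images from URLs
QR/barcode decoding ✅ - decode QR codes and barcodes
Image analysis ✅ - metadata, statistics, histograms, and dominant colors
Image comparison ✅ - SSIM, MSE, perceptual hashing, and visual diffs
Detection ✅ - face, motion, and edge detection
Raspberry Pi cameras (planned) - CSI camera modules
Mobile cameras (planned) - phone camera integration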

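Available Tools

USB cameras

list_cameras - scan for and list all available USB cameras
save_image - capture a frame and save it directly to a file

Camera streaming

start_stream - start streaming a camera to a local HTTP server (MJPEG format)
stop_stream - stop streaming a camera
list_streams - list all active camera streams

Multi-camera dashboard

start_dashboard - start a dynamic dashboard that shows all active camera streams in a responsive grid
stop_dashboard - stop the dashboard server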

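RTSP streams

rtsp_save_image - capture and save a frame from an RTSP stream
rtsp_check_stream - validate an RTSP stream and get its properties

HLS streams (HTTP Live Streaming)

hls_save_image - capture and save a frame from an HLS stream
hls_check_stream - validate an HLS stream and get its properties

MJPEG streams

mjpeg_save_image - capture a frame from an MJPEG stream (common on IP cameras and the ESP32-CAM)
mjpeg_check_stream - check whether an MJPEG stream is available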

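Screen capture

screen_list_monitors - list all available monitors
screen_save_image - capture a full-screen screenshot of a monitor
screen_save_region - capture a specific region of the screen

HTTP images

http_save_image - download and save an image from any URL
http_check_image - check whether a URL points to a valid image

QR/barcode decoding (requires libzbar)

decode_qr - decode QR codes from an image
decode_barcode - decode barcodes (EAN, UPC, Code128, and others)
decode_all - decode all QR codes and barcodes in an image
decode_and_annotate - decode codes and save an annotated image with bounding boxes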

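Image analysis

image_get_metadata - extract image metadata, including EXIF data
image_get_stats - compute an image's brightness, contrast, and sharpness
image_get_histogram - compute a color histogram, with optional visualization
image_get_dominant_colors - extract dominant colors using K-means clustering

Image comparison

image_compare_ssim - compare images using the Structural Similarity Index (SSIM)
image_compare_mse - compare images using Mean Squared Error (MSE)
image_compare_hash - compare images using perceptual hashes (phash, dhash, ahash)
image_get_hash - generate a perceptual hash for an image
image_diff - create a visual diff that highlights where two images differ
image_compare_histograms - compare images by their color histograms

Detection

detect_faces - detect faces using Haar cascades or a deep neural network (DNN)
detect_faces_save - detect faces and save an annotated image with bounding boxes
detect_motion - compare two frames to detect motion
detect_edges - detect edges using the Canny, Sobel, or Laplacian method
detect_objects - detect common objects using MobileNet SSD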

📦 Installation

Install from PyPI (recommended)

pip install optic-mcp

Or with uv:

uv pip install optic-mcp

Install from source

# Clone the repository
git clone https://github.com/Timorleiderman/OpticMCP.git
cd OpticMCP

# Install dependencies with uv
uv sync

💻 Usage Examples

Running the MCP server

After installing from PyPI

optic-mcp

With `uvx` (no installation required)

uvx optic-mcp

From source

uv run optic-mcp

MCP Configuration

Claude Desktop

Add the following to your Claude Desktop configuration file:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "optic-mcp": {
      "command": "uvx",
      "args": ["optic-mcp"]
    }
  }
}
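If the configuration file already lists other servers, the new entry has to be merged in rather than pasted over the whole file. The following is a minimal sketch of that merge; the helper name merge_optic_mcp and the "other-server" entry are hypothetical and only illustrate the idea.

```python
import json

# The entry shown in the configuration above.
OPTIC_MCP_ENTRY = {"command": "uvx", "args": ["optic-mcp"]}


def merge_optic_mcp(config: dict) -> dict:
    """Return a copy of `config` with optic-mcp added under "mcpServers".

    Servers that are already configured are left untouched.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["optic-mcp"] = dict(OPTIC_MCP_ENTRY)
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "other"}}}
print(json.dumps(merge_optic_mcp(existing), indent=2))
```

Load the real file with json.load, pass the result through the helper, and write it back with json.dump to apply the change.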

OpenCode

Add the following to your opencode.json file (in the project's .opencode/ directory or the global ~/.opencode/ directory):

{
  "mcp": {
    "optic-mcp": {
      "type": "local",
      "command": ["uvx", "optic-mcp"]
    }
  }
}

Other MCP Clients

Using uvx (recommended, no installation required):

{
  "mcpServers": {
    "optic-mcp": {
      "command": "uvx",
      "args": ["optic-mcp"]
    }
  }
}

Using a pip installation:

{
  "mcpServers": {
    "optic-mcp": {
      "command": "optic-mcp"
    }
  }
}

Running from source:

{
  "mcpServers": {
    "optic-mcp": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/OpticMCP", "optic-mcp"]
    }
  }
}

📚 Detailed Documentation

Tool Reference

list_cameras

Scans for available USB cameras (indices 0-9) and returns their status.

[
  {
    "index": 0,
    "status": "available",
    "backend": "AVFOUNDATION",
    "description": "Camera 0 (AVFOUNDATION)"
  }
]
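A client will usually want only the indices it can actually capture from. The sketch below filters a list_cameras response for that; the "unavailable" status value in the sample data is an assumption, since only "available" appears in the example above, and the helper name is hypothetical.

```python
import json


def available_camera_indices(response_text):
    """Return the indices of cameras whose status is "available"."""
    cameras = json.loads(response_text)
    return [cam["index"] for cam in cameras if cam.get("status") == "available"]


# Sample response; the second entry uses an assumed status value.
sample = json.dumps([
    {"index": 0, "status": "available", "backend": "AVFOUNDATION"},
    {"index": 1, "status": "unavailable"},
])
print(available_camera_indices(sample))
```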

save_image

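Captures a frame and saves it to disk.

Parameters:

file_path (str) - path where the image is saved
camera_index (int, default: 0) - index of the camera to capture from

Returns: a success message containing the file path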

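Streaming Tools

Stream a camera to a local HTTP server so it can be viewed live in any browser.

start_stream

Starts streaming a camera to a local HTTP server. The stream uses the widely supported MJPEG format.

Parameters:

camera_index (int, default: 0) - index of the camera to stream
port (int, default: 8080) - port the stream is served on

Returns: a dictionary containing the stream URLs and status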

{
  "status": "started",
  "camera_index": 0,
  "port": 8080,
  "url": "http://localhost:8080",
  "stream_url": "http://localhost:8080/stream"
}

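Usage:

Open http://localhost:8080 in a browser to view the stream in a simple UI
Use http://localhost:8080/stream for the raw MJPEG stream (embeddable in other applications)

stop_stream

Stops streaming a camera.

Parameters:

camera_index (int, default: 0) - index of the camera whose stream should stop

Returns: a dictionary containing the status

list_streams

Lists all active camera streams.

Returns: a list with information about each active stream, including its URL and port

Dashboard Tools

start_dashboard

Starts a dynamic multi-camera dashboard server. The dashboard automatically detects all active camera streams and displays them in a responsive grid.

Parameters:

port (int, default: 9000) - port the dashboard is served on

Returns: a dictionary containing the dashboard URL and status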

{
  "status": "started",
  "port": 9000,
  "url": "http://localhost:9000"
}

Usage:

Start one or more camera streams with start_stream.
Start the dashboard with start_dashboard.
Open http://localhost:9000 in a browser.
The dashboard refreshes every 3 seconds to pick up new or removed streams.
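At the protocol level, each of these steps is an MCP tools/call request sent as a JSON-RPC 2.0 message. The sketch below builds the request sequence for two cameras followed by the dashboard. It assumes each stream gets its own port (8080, 8081, and so on), which the source does not state explicitly, and the helper names are hypothetical. In practice an MCP client library sends these messages after the usual initialization handshake.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC ids only need to be unique per session


def tool_call(name, **arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def dashboard_workflow(camera_indices, first_port=8080, dashboard_port=9000):
    """Return the requests that start one stream per camera, then the dashboard."""
    calls = [
        tool_call("start_stream", camera_index=index, port=first_port + offset)
        for offset, index in enumerate(camera_indices)
    ]
    calls.append(tool_call("start_dashboard", port=dashboard_port))
    return calls


for request in dashboard_workflow([0, 1]):
    print(json.dumps(request))
```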

stop_dashboard

Stops the dashboard server.

Returns: a dictionary containing the status

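RTSP Tools

Note: the RTSP tools have not yet been tested against real RTSP hardware or streams. They are implemented, but may need adjustments for specific camera vendors.

rtsp_save_image

Captures a frame from an RTSP stream and saves it to disk.

Parameters:

rtsp_url (str) - RTSP stream URL (for example, rtsp://ip:554/stream)
file_path (str) - path where the image is saved
timeout_seconds (int, default: 10) - connection timeout in seconds

Returns: a success message containing the file path

rtsp_check_stream

Validates an RTSP stream and returns stream information.

Parameters:

rtsp_url (str) - RTSP stream URL to validate
timeout_seconds (int, default: 10) - connection timeout in seconds

Returns: a dictionary containing the stream status and properties (width, height, frame rate, codec)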

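HLS Tools

hls_save_image

Captures a frame from an HLS stream and saves it to disk.

Parameters:

hls_url (str) - HLS stream URL (usually ends in .m3u8)
file_path (str) - path where the image is saved
timeout_seconds (int, default: 30) - connection timeout in seconds

Returns: a success message containing the file path

hls_check_stream

Validates an HLS stream and returns stream information.

Parameters:

hls_url (str) - HLS stream URL to validate
timeout_seconds (int, default: 30) - connection timeout in seconds

Returns: a dictionary containing the stream status and properties (width, height, frame rate, codec)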

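MJPEG Tools

mjpeg_save_image

Captures a frame from an MJPEG stream (common on IP cameras, the ESP32-CAM, and Arduino cameras).

Parameters:

mjpeg_url (str) - MJPEG stream URL (for example, http://camera/video.mjpg)
file_path (str) - path where the image is saved
timeout_seconds (int, default: 10) - connection timeout in seconds

Returns: a dictionary containing the status, file path, and size in bytes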
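An MJPEG stream is a sequence of complete JPEG images separated by multipart boundaries, so one common way to grab a frame is to scan the received bytes for the JPEG start-of-image (FF D8) and end-of-image (FF D9) markers. The sketch below shows that technique on an in-memory buffer. It is not necessarily how OpticMCP implements the tool, the function name is hypothetical, and it can be fooled by JPEGs that embed a thumbnail with its own markers.

```python
from typing import Optional

JPEG_START = b"\xff\xd8"  # SOI (start of image) marker
JPEG_END = b"\xff\xd9"    # EOI (end of image) marker


def first_jpeg_frame(buffer: bytes) -> Optional[bytes]:
    """Return the first complete JPEG found in `buffer`, or None if incomplete."""
    start = buffer.find(JPEG_START)
    if start == -1:
        return None
    end = buffer.find(JPEG_END, start + len(JPEG_START))
    if end == -1:
        return None  # the frame has not fully arrived yet; read more bytes
    return buffer[start:end + len(JPEG_END)]


# Synthetic multipart chunk standing in for bytes read from a camera.
chunk = (
    b"--frame\r\nContent-Type: image/jpeg\r\n\r\n"
    + JPEG_START + b"payload" + JPEG_END
    + b"\r\n--frame"
)
frame = first_jpeg_frame(chunk)
print(len(frame) if frame else "no complete frame yet")
```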

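mjpeg_check_stream

Validates an MJPEG stream URL.

Parameters:

mjpeg_url (str) - MJPEG stream URL to validate
timeout_seconds (int, default: 10) - connection timeout in seconds

Returns: a dictionary containing the status, URL, and content type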

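Screen Capture Tools

screen_list_monitors

Lists all available monitors.

Returns: a list containing each monitor's ID, dimensions, and position

screen_save_image

Captures a full-screen screenshot of a monitor.

Parameters:

file_path (str) - path where the image is saved
monitor (int, default: 0) - monitor index (0 means all monitors combined)

Returns: a dictionary containing the status, file path, and dimensions

screen_save_region

Captures a specific region of the screen.

Parameters:

file_path (str) - path where the image is saved
x (int) - X coordinate of the top-left corner
y (int) - Y coordinate of the top-left corner
width (int) - width in pixels
height (int) - height in pixels

Returns: a dictionary containing the status, file path, and region details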

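HTTP Image Tools

http_save_image

Downloads an image from a URL and saves it to disk.

Parameters:

url (str) - image URL (http:// or https://)
file_path (str) - path where the image is saved
timeout_seconds (int, default: 30) - connection timeout in seconds

Returns: a dictionary containing the status, file path, size in bytes, and content type

http_check_image

Validates an image URL using a HEAD request.

Parameters:

url (str) - image URL to validate
timeout_seconds (int, default: 10) - connection timeout in seconds

Returns: a dictionary containing the status, content type, and size in bytes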
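A HEAD request returns only the response headers, which is enough to read the content type and size without downloading the image. The sketch below shows the idea with the standard library. The function names and the keys of the returned dictionary are assumptions made for illustration, not OpticMCP's actual output format.

```python
import urllib.request


def build_head_request(url: str) -> urllib.request.Request:
    """Create a request that asks for headers only."""
    return urllib.request.Request(url, method="HEAD")


def check_image(url: str, timeout_seconds: int = 10) -> dict:
    """Send a HEAD request and report whether the URL looks like an image."""
    request = build_head_request(url)
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        content_type = response.headers.get_content_type()
        length = response.headers.get("Content-Length")
    return {
        # Key names below are hypothetical.
        "status": "valid" if content_type.startswith("image/") else "invalid",
        "content_type": content_type,
        "size_bytes": int(length) if length is not None else None,
    }


# Building the request does not touch the network; check_image() does.
print(build_head_request("https://example.com/picture.jpg").get_method())
```

Some servers reject HEAD or omit Content-Length, so a robust client would fall back to a ranged GET in those cases.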

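QR/Barcode Tools

Note: these tools require the libzbar system library. Install it with brew install zbar on macOS, or apt install libzbar0 on Linux.

decode_qr

Decodes QR codes from an image file.

Parameters:

file_path (str) - path to the image file

Returns: a dictionary containing whether any codes were found, the count, and a list of codes

decode_barcode

Decodes barcodes (EAN, UPC, Code128, and others) from an image file.

Parameters:

file_path (str) - path to the image file

Returns: a dictionary containing whether any codes were found, the count, and a list of codes

decode_all

Decodes all QR codes and barcodes from an image file.

Parameters:

file_path (str) - path to the image file

Returns: a dictionary containing whether any codes were found, the count, and a list of codes

decode_and_annotate

Decodes codes and saves an annotated image with bounding boxes.

Parameters:

file_path (str) - path to the input image
output_path (str) - path for the annotated output image

Returns: a dictionary containing whether any codes were found, the count, the output path, and a list of codes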

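Image Analysis Tools

image_get_metadata

Extracts metadata from an image file, including dimensions, format, and EXIF data.

Parameters:

file_path (str) - path to the image file

Returns: a dictionary containing the width, height, format, mode, file size in bytes, and an EXIF dictionary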

{
  "width": 1920,
  "height": 1080,
  "format": "JPEG",
  "mode": "RGB",
  "file_size_bytes": 245678,
  "exif": {"Make": "Canon", "Model": "EOS R5", ...}
}

image_get_stats

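Computes basic image statistics, including brightness, contrast, and sharpness.

Parameters:

file_path (str) - path to the image file

Returns: a dictionary containing the brightness (0-1), contrast (0-1), sharpness, and whether the image is grayscale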

{
  "brightness": 0.65,
  "contrast": 0.42,
  "sharpness": 2.35,
  "is_grayscale": false
}

image_get_histogram

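Computes a color histogram for each channel (R, G, B), with optional visualization.

Parameters:

file_path (str) - path to the image file
output_path (str, optional) - path where the histogram visualization is saved

Returns: a dictionary containing the channels (r, g, and b arrays of 256 values each) and the output path, if one was provided

image_get_dominant_colors

Extracts dominant colors using K-means clustering.

Parameters:

file_path (str) - path to the image file
num_colors (int, default: 5) - number of colors to extract (1-20)

Returns: a list of colors, each with its RGB values, hex code, and percentage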

{
  "colors": [
    {"rgb": [64, 128, 192], "hex": "#4080C0", "percentage": 35.2},
    {"rgb": [255, 255, 255], "hex": "#FFFFFF", "percentage": 28.1}
  ]
}
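The hex and percentage fields follow directly from the cluster centers and cluster sizes. The sketch below shows that last formatting step only, assuming the clustering has already produced an RGB center and a pixel count per cluster. The helper names and sample counts are hypothetical.

```python
def rgb_to_hex(rgb):
    """Format an [R, G, B] triple (each 0-255) as an uppercase hex code."""
    r, g, b = rgb
    return f"#{r:02X}{g:02X}{b:02X}"


def cluster_percentages(pixel_counts):
    """Convert per-cluster pixel counts to percentages rounded to one decimal."""
    total = sum(pixel_counts)
    return [round(100 * n / total, 1) for n in pixel_counts]


# Hypothetical clustering output: cluster centers and pixels per cluster.
centers = [[64, 128, 192], [255, 255, 255]]
counts = [352, 648]
for center, pct in zip(centers, cluster_percentages(counts)):
    print({"rgb": center, "hex": rgb_to_hex(center), "percentage": pct})
```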

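Image Comparison Tools

image_compare_ssim

Compares two images using the Structural Similarity Index (SSIM).

Parameters:

file_path_1 (str) - path to the first image
file_path_2 (str) - path to the second image
threshold (float, default: 0.95) - similarity threshold

Returns: a dictionary containing the SSIM score (-1 to 1), whether the images are similar, and the threshold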

{
  "ssim_score": 0.9823,
  "is_similar": true,
  "threshold": 0.95
}
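For reference, the standard SSIM definition for two image windows x and y is given below. How OpticMCP windows and averages the score is not stated in the source, so this is only the textbook formula. Here the means, variances, and covariance are taken over the window, L is the pixel dynamic range (255 for 8-bit images), and K_1 = 0.01 and K_2 = 0.03 are the conventional constants.

```latex
\mathrm{SSIM}(x, y) =
  \frac{(2\mu_x \mu_y + C_1)\,(2\sigma_{xy} + C_2)}
       {(\mu_x^2 + \mu_y^2 + C_1)\,(\sigma_x^2 + \sigma_y^2 + C_2)},
\qquad C_1 = (K_1 L)^2, \quad C_2 = (K_2 L)^2
```

Identical windows give a score of 1, which is why is_similar is reported when the score reaches the threshold.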

image_compare_mse

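Compares two images using Mean Squared Error (MSE).

Parameters:

file_path_1 (str) - path to the first image
file_path_2 (str) - path to the second image

Returns: a dictionary containing the MSE, whether the images are identical, and the normalized MSE (0-1)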
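MSE is the average of the squared per-pixel differences, so identical images score 0. The sketch below computes it on small grayscale images given as nested lists. Normalizing by 255 squared to reach the 0-1 range is an assumption about how the normalized value is derived, and the function names are hypothetical.

```python
def mse(image_a, image_b):
    """Mean squared error between two equally sized grayscale images."""
    flat_a = [pixel for row in image_a for pixel in row]
    flat_b = [pixel for row in image_b for pixel in row]
    if not flat_a or len(flat_a) != len(flat_b):
        raise ValueError("images must be non-empty and the same size")
    return sum((a - b) ** 2 for a, b in zip(flat_a, flat_b)) / len(flat_a)


def normalized_mse(value, max_pixel=255):
    """Scale an MSE to 0-1 by the largest possible squared difference (assumed)."""
    return value / max_pixel ** 2


a = [[10, 20], [30, 40]]
b = [[13, 24], [30, 40]]
error = mse(a, b)
print(error, normalized_mse(error))
```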

image_compare_hash

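Compares two images using perceptual hashes.

Parameters:

file_path_1 (str) - path to the first image
file_path_2 (str) - path to the second image
hash_type (str, default: "phash") - hash type: "phash", "dhash", or "ahash"

Returns: a dictionary containing hash 1, hash 2, the distance, whether the images are similar, and the hash type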

{
  "hash_1": "8f0f0f0f0f0f0f0f",
  "hash_2": "8f0f0f0f0f0f0f0f",
  "distance": 0,
  "is_similar": true,
  "hash_type": "phash"
}
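The distance between two perceptual hashes is conventionally the Hamming distance, the number of bits in which they differ. The sketch below computes it from hex strings like the ones in the example response. That OpticMCP uses exactly this measure is an assumption, and the function name is hypothetical.

```python
def hamming_distance(hash_1: str, hash_2: str) -> int:
    """Count the differing bits between two equal-length hex hash strings."""
    if len(hash_1) != len(hash_2):
        raise ValueError("hashes must have the same length")
    # XOR leaves a 1 bit wherever the two hashes disagree.
    return bin(int(hash_1, 16) ^ int(hash_2, 16)).count("1")


print(hamming_distance("8f0f0f0f0f0f0f0f", "8f0f0f0f0f0f0f0e"))
```

A small distance relative to the hash length (64 bits for a 16-character hex string) indicates visually similar images.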

image_get_hash

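Generates a perceptual hash for a single image.

Parameters:

file_path (str) - path to the image file
hash_type (str, default: "phash") - hash type: "phash", "dhash", or "ahash"

Returns: a dictionary containing the hash (hex string) and the hash type

image_diff

Creates a visual diff that highlights where two images differ.

Parameters:

file_path_1 (str) - path to the reference image
file_path_2 (str) - path to the comparison image
output_path (str) - path where the diff visualization is saved
threshold (int, default: 30) - pixel difference threshold (0-255)

Returns: a dictionary containing the status, output path, percentage of differing pixels, and number of differing pixels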

{
  "status": "success",
  "output_path": "/path/to/diff.png",
  "diff_percentage": 12.5,
  "diff_pixels": 25600
}

image_compare_histograms

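Compares two images by their color histograms.

Parameters:

file_path_1 (str) - path to the first image
file_path_2 (str) - path to the second image
method (str, default: "correlation") - method: "correlation", "chi_square", "intersection", or "bhattacharyya"

Returns: a dictionary containing the score, the method, and whether the images are similar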
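The default "correlation" method is usually the Pearson correlation between the two histograms: 1 means the same shape, 0 means no linear relationship, and -1 means inverted. The sketch below implements that formula on plain lists. It illustrates the metric rather than OpticMCP's code, and returning 0.0 for a perfectly flat histogram is a choice made for this sketch.

```python
import math


def histogram_correlation(hist_1, hist_2):
    """Pearson correlation between two histograms of equal length."""
    if len(hist_1) != len(hist_2) or not hist_1:
        raise ValueError("histograms must be non-empty and the same length")
    mean_1 = sum(hist_1) / len(hist_1)
    mean_2 = sum(hist_2) / len(hist_2)
    numerator = sum((a - mean_1) * (b - mean_2) for a, b in zip(hist_1, hist_2))
    denominator = math.sqrt(
        sum((a - mean_1) ** 2 for a in hist_1)
        * sum((b - mean_2) ** 2 for b in hist_2)
    )
    # A flat histogram has zero variance; report 0.0 instead of dividing by zero.
    return numerator / denominator if denominator else 0.0


print(histogram_correlation([1, 2, 3, 4], [2, 4, 6, 8]))
```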

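Detection Tools

detect_faces

Detects faces in an image using Haar cascades or a deep neural network (DNN).

Parameters:

file_path (str) - path to the image file
method (str, default: "haar") - detection method: "haar" (fast) or "dnn" (accurate)

Returns: a dictionary containing whether any faces were found, the count, and a list of faces (each with x, y, width, height, and, for DNN only, confidence)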

{
  "found": true,
  "count": 2,
  "faces": [
    {"x": 120, "y": 80, "width": 150, "height": 150},
    {"x": 400, "y": 100, "width": 140, "height": 140, "confidence": 0.95}
  ]
}

detect_faces_save

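Detects faces and saves an annotated image with bounding boxes.

Parameters:

file_path (str) - path to the input image
output_path (str) - path where the annotated image is saved
method (str, default: "haar") - detection method: "haar" or "dnn"

Returns: a dictionary containing whether any faces were found, the count, the output path, and a list of faces

detect_motion

Compares two frames to detect motion between them.

Parameters:

file_path_1 (str) - path to the first (earlier) image
file_path_2 (str) - path to the second (later) image
threshold (float, default: 25.0) - pixel difference threshold (0-255)

Returns: a dictionary containing whether motion was detected, the motion percentage, a list of motion regions, and the number of changed pixels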

{
  "motion_detected": true,
  "motion_percentage": 15.3,
  "motion_regions": [
    {"x": 200, "y": 150, "width": 80, "height": 120}
  ],
  "changed_pixels": 31250
}
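The changed_pixels and motion_percentage fields can be understood through simple frame differencing: a pixel counts as changed when its absolute difference between the two frames exceeds the threshold. The sketch below applies that to small grayscale frames given as nested lists. Treating any changed pixel as motion is an assumption of this sketch, region extraction is omitted, and the function name is hypothetical.

```python
def motion_stats(frame_1, frame_2, threshold=25.0):
    """Count pixels whose absolute difference exceeds `threshold`."""
    changed = 0
    total = 0
    for row_1, row_2 in zip(frame_1, frame_2):
        for pixel_1, pixel_2 in zip(row_1, row_2):
            total += 1
            if abs(pixel_1 - pixel_2) > threshold:
                changed += 1
    percentage = 100.0 * changed / total if total else 0.0
    return {
        "motion_detected": changed > 0,  # assumed rule; a real tool may require more
        "motion_percentage": percentage,
        "changed_pixels": changed,
    }


earlier = [[0, 0], [0, 0]]
later = [[0, 100], [0, 10]]
print(motion_stats(earlier, later))
```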

detect_edges

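Detects edges in an image using one of several methods.

Parameters:

file_path (str) - path to the input image
output_path (str) - path where the edge detection output is saved
method (str, default: "canny") - method: "canny", "sobel", or "laplacian"

Returns: a dictionary containing the status, output path, and method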

{
  "status": "success",
  "output_path": "/path/to/edges.png",
  "method": "canny"
}

detect_objects

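Detects common objects using MobileNet SSD.

Parameters:

file_path (str) - path to the image file
confidence_threshold (float, default: 0.5) - minimum confidence (0-1)

Returns: a dictionary containing whether any objects were found, the count, and a list of objects

Note: this tool requires pre-trained MobileNet SSD model files. If the model is not available, it returns an empty result.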

{
  "found": true,
  "count": 3,
  "objects": [
    {"class": "person", "confidence": 0.92, "x": 50, "y": 100, "width": 200, "height": 400},
    {"class": "car", "confidence": 0.87, "x": 300, "y": 250, "width": 180, "height": 120}
  ]
}

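🔧 Technical Details

OpenCV + MCP Compatibility

OpenCV prints debug messages to standard error (stderr), which breaks MCP's stdio communication. To prevent this, the server suppresses stderr at the file-descriptor level before importing cv2.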
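Suppressing at the file-descriptor level matters because native libraries write to descriptor 2 directly, bypassing Python's sys.stderr object. The sketch below shows the general technique with os.dup2. The function names are hypothetical, and restoring stderr afterwards is a choice made for this sketch, since the source does not say whether OpticMCP restores it.

```python
import os
import sys


def silence_stderr_fd() -> int:
    """Redirect file descriptor 2 to the null device.

    Returns a duplicate of the original descriptor so it can be restored.
    """
    sys.stderr.flush()
    saved_fd = os.dup(2)
    devnull_fd = os.open(os.devnull, os.O_WRONLY)
    os.dup2(devnull_fd, 2)  # fd 2 now points at the null device
    os.close(devnull_fd)
    return saved_fd


def restore_stderr_fd(saved_fd: int) -> None:
    """Point file descriptor 2 back at the original stderr."""
    os.dup2(saved_fd, 2)
    os.close(saved_fd)


saved = silence_stderr_fd()
try:
    # A noisy native import such as `import cv2` would go here.
    os.write(2, b"this line is discarded\n")
finally:
    restore_stderr_fd(saved)
```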

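📄 License

This project is licensed under the MIT License.

Contributing

Contributions are welcome! See CONTRIBUTING.md for guidelines.

Roadmap

[x] v0.1.0 - USB camera support via OpenCV
[x] v0.2.0 - IP camera support (RTSP and HLS streams)
[x] v0.3.0 - multi-camera dashboard with live streaming
[x] v0.4.0 - screen capture, MJPEG streams, HTTP images, QR/barcode decoding
[x] v0.5.0 - image analysis and comparison tools (metadata, statistics, SSIM, hashing, diffs)
[x] v0.6.0 - detection tools (face, motion, and edge detection)
[ ] v0.7.0 - camera configuration (resolution, format, and so on)
[ ] v0.8.0 - video recording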
