
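Intelligent Document Extraction - ComIDP

ComIDP MCP Server is a lightweight Model Context Protocol (MCP) server that integrates ComIDP with AI chatbots and provides unstructured document processing, such as parsing documents or extracting data from PDF files.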

README

Intelligent Document Extraction - ComIDP

Supported Feature: Intelligent Document Extraction

ComIDP Intelligent Document Extraction automatically extracts key information from uploaded unstructured documents such as PDFs, converts it into structured data, and supports batch processing to significantly improve document handling efficiency.

In the future, we will support more document formats (e.g., JPG and PNG) and integrate with other ComIDP tools for advanced processing.

What is ComIDP MCP Server

ComIDP MCP Server is a lightweight Model Context Protocol (MCP) server designed to integrate ComIDP seamlessly with AI chatbots. It provides unstructured document processing functions, such as extracting data from PDF files, and returns results in a structured plain-text format suitable for downstream processing or archival.

License

This project is licensed under the Apache License 2.0. Please contact us for a trial license key.

ComIDP MCP Server for Claude Desktop

Setup

  1. Dependencies:
  • Ensure you have the following dependencies installed:

    • Python 3.10 or higher

    • pip (Python package installer)

    • uv

      pip install uv 
      
  • Create a virtual environment and install the required packages:

    • Windows:
      cd comidp-mcp\src
      python -m venv .venv
      .venv\Scripts\activate
      pip install -r requirements.txt

    • Linux / macOS:
      cd comidp-mcp/src
      python -m venv .venv
      source .venv/bin/activate
      pip install -r requirements.txt
    
  2. Configure Claude Desktop

    To configure the integration with Claude Desktop, you need to edit the claude_desktop_config.json file.

    If the file does not already exist, you can create and open it directly from Claude Desktop by following these steps:

    1. Open Claude Desktop.

    2. Click the Claude icon in the top-left corner of the window.

    3. Navigate to File → Settings → Developer → Edit Config.

    This will automatically open (or create) the claude_desktop_config.json file in your system's default editor.

    Once the file is open, add a comidp-mcp server entry to the mcpServers section of the file. Here is an example configuration:

    {
        "mcpServers": { 
            "comidp-mcp": {     
                "command": "uv", 
                "args": [
                        "run", 
                        "PATH/TO/comidp-mcp/src/virtual environment python",
                        "PATH/TO/comidp-mcp/src/comidp_tools.py"
                ],
                "env": {
                    "IDPKEY": "your_idp_key_here"
                }
            }
        }
    }
    
    • Note:
      1. The virtual environment Python path should point to the Python executable inside the virtual environment created under comidp-mcp/src. It should look like:
        • Windows: C:\path\to\comidp-mcp\src\.venv\Scripts\python.exe
        • Linux / macOS: /path/to/comidp-mcp/src/.venv/bin/python
      2. All paths must be absolute. On Windows, write each backslash as \\ inside the JSON file, because a single backslash starts an escape sequence in JSON.
      3. Replace your_idp_key_here with your actual IDPKEY API key.
  3. Restart Claude Desktop.
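
The Configure Claude Desktop step can be sketched in Python. This is a minimal sketch rather than part of ComIDP: the helpers `venv_python`, `build_server_entry`, and `merge_into_config` are hypothetical names, and the code assumes the virtual environment lives at `comidp-mcp/src/.venv`, as created in the Dependencies step. Building the entry as a dictionary and serializing it with `json.dumps` takes care of absolute paths and of backslash escaping on Windows.

```python
import json
import os
from pathlib import Path


def venv_python(src_dir: Path, windows: bool = (os.name == "nt")) -> Path:
    """Return the interpreter path inside src_dir/.venv (layout differs by OS)."""
    venv = src_dir / ".venv"
    return venv / "Scripts" / "python.exe" if windows else venv / "bin" / "python"


def build_server_entry(src_dir: str, idp_key: str,
                       windows: bool = (os.name == "nt")) -> dict:
    """Build the "comidp-mcp" entry for the mcpServers section."""
    src = Path(src_dir).resolve()  # the README asks for absolute paths
    return {
        "command": "uv",
        "args": ["run", str(venv_python(src, windows)), str(src / "comidp_tools.py")],
        "env": {"IDPKEY": idp_key},
    }


def merge_into_config(config: dict, entry: dict) -> dict:
    """Return a copy of config with the comidp-mcp entry added, keeping other servers."""
    servers = dict(config.get("mcpServers", {}))
    servers["comidp-mcp"] = entry
    return {**config, "mcpServers": servers}


if __name__ == "__main__":
    entry = build_server_entry("comidp-mcp/src", "your_idp_key_here")
    print(json.dumps(merge_into_config({}, entry), indent=4))
```

Running the script from the directory that contains `comidp-mcp` prints a JSON document that can be pasted into claude_desktop_config.json. Pass an existing configuration to `merge_into_config` instead of `{}` to keep servers that are already defined.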

API Reference

Data extraction

from typing import Dict

def data_extraction(filenames: list, save_dir_path: str = "output", key: str = "", err_msg_lang: str = "en") -> Dict[str, str]:
    """
    Extract data from PDF files and save to TXT files in the specified folder.

    Params:
        filenames: A list of PDF file paths.
        save_dir_path: Folder where the result TXT files will be saved.
        key: The API key for IDPKEY. Required on the first call.
        err_msg_lang: Optional language code for error messages (e.g., 'zh' or 'en'). Defaults to 'en'.

    Returns:
        A dictionary mapping each input file path to its corresponding output TXT file path.
        If an error occurs, the value will be an error message.
    """

def data_extraction_from_folder(folder: str, save_dir_path: str, recursive: bool = False, key: str = "", err_msg_lang: str = "en") -> Dict[str, str]:
    """
    Extract data from PDF files in a folder and save to TXT files in the specified folder.

    Params:
        folder: Path to the folder containing PDF files.
        save_dir_path: Path to the folder where the result files will be saved.
        recursive: If true, recursively search subdirectories for PDF files.
        key: The API key for IDPKEY. Required on the first call.
        err_msg_lang: Optional language code for error messages (e.g., 'zh' or 'en'). Defaults to 'en'.

    Returns:
        A dictionary mapping each input file path to its corresponding output TXT file path.
        If an error occurs, the value will be an error message.
    """

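Both functions return a dictionary whose values are either an output TXT path or an error message, so a caller has to tell the two apart. The docstrings do not say how, so the sketch below assumes that a value counts as a success only if it names an existing file. `split_results` is a hypothetical helper, not part of the ComIDP API.

```python
import os
from typing import Dict, Tuple


def split_results(results: Dict[str, str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split a data_extraction result into (succeeded, failed).

    Assumption: a value that is an existing file is an output TXT path;
    anything else is treated as an error message.
    """
    succeeded: Dict[str, str] = {}
    failed: Dict[str, str] = {}
    for source, value in results.items():
        if os.path.isfile(value):
            succeeded[source] = value
        else:
            failed[source] = value
    return succeeded, failed
```

A batch job could retry or log the entries in `failed` while passing the paths in `succeeded` to downstream processing.
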
Support

If you encounter any issues or need support, please open an issue or contact our R&D team.

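About run modes

Hosted

Hosted usually means the MCP server runs in the provider's environment. Users generally connect through the connection method or authorization flow shown on the page and do not need to keep an MCP process running locally.

  1. Open the provider's connection page
  2. Complete authorization or copy the endpoint
  3. Connect from your MCP client

Local / other

Running locally usually requires installing dependencies on your own computer or server, copying server_config into your MCP client, and filling in the environment variables, keys, or other settings listed in env_schema.

  1. Copy server_config
  2. Install the required dependencies
  3. Fill in the environment variables, then restart the client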
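
Before restarting the client, it can help to confirm that no required environment variable was left empty or unedited. The check below is a minimal sketch: `unfilled_env` is a hypothetical helper, and treating values that start with `your_` as placeholders is an assumption based on the `your_idp_key_here` example in the README.

```python
from typing import Dict, List


def unfilled_env(server_entry: Dict, required: List[str]) -> List[str]:
    """Return required env names that are missing, empty, or still a placeholder."""
    env = server_entry.get("env", {})
    return [
        name for name in required
        # Assumption: placeholder values follow the "your_..." pattern.
        if not env.get(name) or env[name].startswith("your_")
    ]
```

For the comidp-mcp entry, calling `unfilled_env(entry, ["IDPKEY"])` returns an empty list once a real key has been filled in.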