MXCP Expert Skill

MXCP is an enterprise framework for building production-ready AI tools with SQL and Python.

MXCP Mindset

Internalize these before implementing anything:

MXCP is opinionated - There's ONE right way to do most things. Don't invent patterns.
If it's common, MXCP provides it - Auth, testing, data access, policies. Check before building.
Schema docs are truth - When unsure about syntax, read the schema doc. Don't guess.
Validate constantly - Run mxcp validate after every file change. Errors compound.
Read before writing - 2 minutes reading docs saves 20 minutes debugging.

Pre-Implementation Checklist

Before writing ANY YAML or code:

[ ] Read common-mistakes.md - saves 90% of debugging time
[ ] Read the relevant schema doc (tool.md, resource.md, or prompt.md)
[ ] Check if MXCP already provides this feature (see Capabilities table)
[ ] Know the required fields and valid types

MXCP Capabilities

| Category | Features | When to Use | |----------|----------|-------------| | Endpoints | Tools, Resources, Prompts | Tools=actions/queries, Resources=data by URI, Prompts=message templates | | Languages | SQL, Python | SQL=database/simple, Python=complex logic/APIs | | Data Access | DuckDB (local files, HTTP, S3, PostgreSQL, MySQL, SQLite) | Connect to any data source via DuckDB extensions | | Data Transform | dbt (seeds, SQL models, Python models) | Clean, test, materialize static data | | Security | OAuth, CEL policies, audit logs | Authentication + authorization | | Quality | validate, test, lint, evals | Ensure correctness and LLM usability | | Deployment | stdio, streamable-http | Local dev (stdio), production (HTTP) |

Reference Documentation

| Category | Key References | |----------|----------------| | Getting Started | quickstart, hello-world | | Endpoints | sql-endpoints, python-endpoints | | Schemas | tool, resource, prompt | | Quality | testing, validation, linting | | Security | authentication, policies | | Operations | configuration, deployment | | Reference | cli, sql, python, type-system | | Integrations | dbt, duckdb, excel |

Quick Reference: What Docs to Read

| When implementing... | Read first | |---------------------|------------| | Any YAML | common-mistakes.md | | Tools, Resources, Prompts | tool.md, resource.md, prompt.md | | Authentication/Authorization | authentication.md, policies.md | | Tests | testing.md | | Data access (files, DBs) | duckdb.md | | Data transformation | dbt.md | | Python endpoints | python.md | | SQL endpoints | sql.md |

Implementation Methodology

Follow this methodology for every MXCP project. Run mxcp validate after EVERY file change.

Step 0: Project Setup

mkdir my-project && cd my-project
uv venv && source .venv/bin/activate
uv pip install mxcp
mxcp init --bootstrap
mxcp validate  # Verify setup

Step 1: Task Analysis & Data Ingestion

Analyze the task first:

What is the user trying to accomplish?
Is data ingestion needed? What format (CSV, Excel, API, database)?
Is the data properly structured or does it need transformation?
What questions will users need answered? (Design schema accordingly)

Decision: Ingest or query directly?

| Data Characteristic | Approach | Why | |---------------------|----------|-----| | Static/one-time (loaded once) | Ingest with dbt | Data quality tests, transformations, persistence | | Dynamic/changing (files updated) | DuckDB direct read | Always reads latest data, no sync needed |

Ingestion approaches (for static data):

| Scenario | Approach | |----------|----------| | Simple CSV, static reference data | mxcp dbt seed | | Excel, complex transformations | dbt Python models |

Direct read approaches (for dynamic data):

-- DuckDB reads files directly - always gets latest data
SELECT * FROM read_csv_auto('data/sales.csv');
SELECT * FROM read_parquet('data/*.parquet');
SELECT * FROM read_json_auto('https://api.example.com/data.json');

Connect to external databases via DuckDB:

-- PostgreSQL (requires postgres extension)
ATTACH 'postgresql://user:pass@host:5432/db' AS pg (TYPE postgres);
SELECT * FROM pg.public.users;

-- MySQL (requires mysql extension)
ATTACH 'host=localhost user=root database=mydb' AS mysql (TYPE mysql);
SELECT * FROM mysql.orders;

See duckdb.md for S3, HTTP auth, and secret management.

After ingestion (if using dbt), verify:

mxcp dbt test                    # Data quality tests
mxcp query "SELECT * FROM table LIMIT 5"  # Manual verification

Step 2: Implementation

Choose endpoint type based on use case:

| Use Case | Endpoint Type | Example | |----------|---------------|---------| | Query data, perform actions | Tool | get_customer, create_order | | Access data by URI/path | Resource | employee://{id}/profile | | Reusable message templates | Prompt | data_analysis with Jinja2 |

Choose implementation language:

| Scenario | Language | Reference | |----------|----------|-----------| | Database queries, aggregations, file reading | SQL | sql-endpoints.md | | Complex logic, external APIs, ML, file processing | Python | python-endpoints.md |

Development cycle for each endpoint:

# 1. Create the YAML definition
mxcp validate                    # Fix errors immediately

# 2. Create the implementation (SQL or Python)
mxcp validate                    # Validate again

# 3. Manual verification
mxcp run tool NAME --param key=value

# 4. Add tests and run
mxcp test

Python code requirements:

Modular, maintainable code
Each module independently testable
Use pytest for Python logic testing

Step 3: Metadata Quality

Tools will be used by LLMs. Ensure clear metadata:

name: Descriptive, follows snake_case
description: Clear purpose, when to use, what it returns
parameters: Each has description, correct type, examples
return: Documented structure with property descriptions

tool:
  name: search_customers
  description: |
    Search customers by name or email. Returns matching customer records
    with contact info and account status. Use for customer lookups.
  parameters:
    - name: query
      type: string
      description: Search term (matches name or email, case-insensitive)
      examples: ["john", "smith@example.com"]

Step 4: Validation

Run after every file change:

mxcp validate
mxcp validate --debug  # For detailed errors

Step 5: Linting

Check metadata quality for LLM consumption:

mxcp lint

Address all warnings about descriptions, examples, and documentation.

Step 6: Evals (Only if Requested)

Create evals only if the user explicitly asks:

mxcp evals  # AI behavior testing

Step 7: Security & Features (Only if Requested)

Implement only if the user requests authentication, policies, or observability:

Authentication: Configure in ~/.mxcp/config.yml (see Security Features section)
Policies: Add CEL expressions to tool definitions
Observability: Configure OpenTelemetry

Test security with simulated user context:

mxcp run tool NAME --param key=value \
  --user-context '{"role": "admin", "email": "test@example.com"}'

Step 8: Deployment (Only if Requested)

Implement only if the user explicitly asks for deployment:

| Transport | Use Case | Command | |-----------|----------|---------| | stdio | Local dev, Claude Desktop | mxcp serve (default) | | streamable-http | Production, web clients | mxcp serve --transport streamable-http --port 8000 |

See Deployment for Docker, systemd, production setup.

Definition of Done

A project is complete when:

[ ] mxcp validate passes with no errors
[ ] mxcp test passes with all tests green
[ ] mxcp dbt test passes (if using dbt)
[ ] mxcp lint shows no critical issues
[ ] Manual verification with mxcp run confirms expected behavior
[ ] Security tested with --user-context (if auth/policies configured)

Testing Requirements

| Test Type | Must Verify | Reference | |-----------|-------------|-----------| | MXCP endpoint | Valid inputs, edge cases (nulls, boundaries), error handling | testing.md | | dbt data | not_null, unique, relationships, accepted_values | dbt.md | | Python modules | Unit tests with pytest | - |

Critical: Use the Default Database

MXCP automatically creates and manages a DuckDB database. Do not configure a custom database path unless the user explicitly asks for it.

When you run mxcp init, MXCP creates:

Database at data/db-default.duckdb (or data/db-{profile}.duckdb)
All tables, seeds, and dbt models go into this database automatically

Use the default (no database configuration needed):

# mxcp-site.yml - Minimal config
mxcp: 1
project: my-project
profile: default
# Database is automatically created at data/db-default.duckdb

Only configure duckdb.path if the user explicitly requests it (e.g., shared database, specific location, read-only mode). Do not proactively add database configuration.

Common Mistakes

Root cause of most errors: implementing without reading docs first.

Before implementing, always:

Read the relevant schema doc (tool.md, resource.md, prompt.md)
Check common-mistakes.md for known pitfalls
Run mxcp validate after every change

Valid types: string, number, integer, boolean, array, object

SQL syntax: Verify DuckDB-specific syntax in duckdb.md. See common-mistakes.md for pitfalls.

Project Structure

mxcp-project/
├── mxcp-site.yml       # Project configuration (required)
├── tools/              # Tool definitions (.yml)
├── resources/          # Resource definitions (.yml)
├── prompts/            # Prompt definitions (.yml)
├── sql/                # SQL implementations
├── python/             # Python implementations
├── evals/              # LLM evaluation tests
└── data/               # Database files (db-default.duckdb)

Directory rules:

Tools MUST be in tools/*.yml
Resources MUST be in resources/*.yml
Prompts MUST be in prompts/*.yml
SQL files in sql/, referenced via relative paths
Python files in python/, referenced via relative paths

Golden Path: Complete Tool Example

This shows a complete, correct tool with all required fields and tests:

# tools/get_customer.yml
mxcp: 1
tool:
  name: get_customer
  description: Get customer by ID. Returns customer profile with contact info.
  parameters:
    - name: customer_id
      type: integer
      description: The customer's unique identifier
  return:
    type: object
    properties:
      id: {type: integer}
      name: {type: string}
      email: {type: string}
  source:
    file: ../sql/get_customer.sql
  tests:
    - name: existing_customer
      arguments: [{key: customer_id, value: 1}]
      result_contains: {id: 1}
    - name: not_found
      arguments: [{key: customer_id, value: 99999}]
      result: null

-- sql/get_customer.sql
SELECT id, name, email FROM customers WHERE id = $customer_id

SQL vs Python: Use SQL for queries/aggregations. Use Python (language: python) for complex logic, APIs, ML.

Security Features

CRITICAL: Use MXCP built-in security. NEVER write custom authentication code.

| Feature | Built-in Solution | Reference | |---------|-------------------|-----------| | Authentication | OAuth in ~/.mxcp/config.yml | authentication.md | | Access Control | CEL policies in YAML | policies.md | | User Context | SQL: get_username(), get_user_email() | sql.md | | External APIs | SQL: get_user_external_token() | authentication.md | | Audit Logs | Built-in logging | auditing.md |

Supported OAuth providers: GitHub, Google, Atlassian, Salesforce, Keycloak

CLI Quick Reference

# Project
mxcp init --bootstrap        # Create new project
mxcp list                    # List all endpoints

# Quality
mxcp validate                # Check structure
mxcp test                    # Run tests
mxcp lint                    # Check metadata
mxcp evals                   # AI behavior tests

# Running
mxcp serve                   # Start MCP server
mxcp run tool NAME --param k=v   # Run tool manually

# Database
mxcp query "SELECT 1"        # Execute SQL

# Operations
mxcp drift-snapshot          # Create baseline
mxcp drift-check             # Detect changes
mxcp log --since 1h          # Query audit logs

Troubleshooting

mxcp validate --debug        # Detailed validation errors
mxcp run tool NAME --debug   # Debug tool execution
mxcp list                    # See available endpoints

Common issues: YAML syntax, missing required fields, invalid types, file paths.

Project Templates

Complete runnable examples in assets/project-templates/. Start with:

python-demo - Python endpoint patterns
covid_owid - Data workflow with dbt

cp -r assets/project-templates/python-demo my-project
cd my-project
mxcp validate && mxcp test

See Configuration for mxcp-site.yml and config.yml options.