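🚀 Semantic Frame

Semantic Frame performs efficient semantic compression of numerical data. It converts raw numerical data (NumPy, Pandas, or Polars) into natural-language descriptions suited to large language models (LLMs). Instead of sending thousands of data points to an AI agent, you send a semantic summary of about 50 words.

🚀 Quick Start

Analyze a single series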

from semantic_frame import describe_series
import numpy as np

# Works with NumPy arrays
data = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])
result = describe_series(data, context="Daily Sales")
print(result)
# "The Daily Sales data shows a rapidly rising pattern with moderate variability..."
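The "rapidly rising" and "moderate variability" wording comes from summary statistics over the series. As a rough sketch of the two statistics this README names for trend and volatility (linear regression slope and coefficient of variation), the standard-library code below computes both for the same ten values. The helper names are hypothetical and this is not the library's implementation; how Semantic Frame maps the numbers to labels is not covered here.

```python
from statistics import fmean, pstdev

def regression_slope(values):
    """Least-squares slope of values against their positions 0..n-1."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = fmean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den

def coefficient_of_variation(values):
    """Population standard deviation divided by the absolute mean."""
    return pstdev(values) / abs(fmean(values))

data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
slope = regression_slope(data)
cv = coefficient_of_variation(data)
print(f"slope per step: {slope:.2f}")          # slope per step: 10.00
print(f"coefficient of variation: {cv:.3f}")   # coefficient of variation: 0.522
```

The sketch assumes at least two points and a non-zero mean; real input would need guards for both.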

Analyze a DataFrame

from semantic_frame import describe_dataframe
import pandas as pd

df = pd.DataFrame({
    'cpu': [40, 42, 41, 95, 40, 41],
    'memory': [60, 61, 60, 60, 61, 60],
})

results = describe_dataframe(df, context="Server Metrics")
print(results['cpu'].narrative)
# "The Server Metrics - cpu data shows a flat/stationary pattern..."

Get structured output

result = describe_series(data, output="full")

print(result.trend)          # TrendState.RISING_SHARP
print(result.volatility)     # VolatilityState.MODERATE
print(result.anomalies)      # [AnomalyInfo(index=4, value=500.0, z_score=4.2)]
print(result.compression_ratio)  # 0.95
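A compression ratio of 0.95 means the description is about 95% smaller than the raw data. As a minimal sketch of how such a ratio could be measured, the code below counts whitespace-separated words as a crude stand-in for LLM tokens. That proxy and the `token_savings` helper are assumptions for illustration; the library's own token accounting is not documented here.

```python
def token_savings(raw_text, summary_text):
    """Fraction of tokens saved, using whitespace-split words as a crude token proxy."""
    raw_tokens = len(raw_text.split())
    summary_tokens = len(summary_text.split())
    if raw_tokens == 0:
        return 0.0
    return 1 - summary_tokens / raw_tokens

raw = " ".join(str(v) for v in range(1000))  # 1,000 raw data points as text
summary = "The data shows a steadily rising pattern with stable variability."
ratio = token_savings(raw, summary)
print(f"{ratio:.2f}")  # 0.99
```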

JSON output for APIs

result = describe_series(data, output="json")
# Returns dict ready for JSON serialization

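✨ Key Features

Efficient compression: converts raw numerical data into concise natural-language descriptions, cutting token usage by 95% or more.
Hallucination avoidance: provides deterministic analysis, reducing the risk of LLM hallucination.
Multiple data types: supports NumPy, Pandas, Polars, and plain Python lists.
Framework integrations: works with Anthropic Claude, LangChain, CrewAI, and MCP, among others.
Advanced tool support: supports Anthropic's advanced tool use features for efficient tool orchestration in complex agent workflows.
Trading module: provides specialized trading analysis tools for trading agents, portfolio managers, and financial applications.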

📦 Installation

pip install semantic-frame

Or install with uv:

uv add semantic-frame

💻 Usage Examples

Basic usage

from semantic_frame import describe_series
import pandas as pd

data = pd.Series([100, 102, 99, 101, 500, 100, 98])
print(describe_series(data, context="Server Latency (ms)"))

Output:

The Server Latency (ms) data shows a flat/stationary pattern with stable
variability. 1 anomaly detected at index 4 (value: 500.00).
Baseline: 100.00 (range: 98.00-500.00).
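With only seven points, a single extreme value inflates the standard deviation so much that its own z-score stays below common cutoffs such as 3, which is one reason an interquartile-range (IQR) rule is useful for short series. The sketch below applies the conventional 1.5 × IQR fences to the same latency values. The function name and the 1.5 multiplier are assumptions for illustration, not the library's actual thresholds.

```python
from statistics import fmean, quantiles, stdev

def iqr_outliers(values, k=1.5):
    """Return (index, value) pairs lying outside the k * IQR fences."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [(i, v) for i, v in enumerate(values) if v < low or v > high]

latency = [100, 102, 99, 101, 500, 100, 98]
outliers = iqr_outliers(latency)
print(outliers)  # [(4, 500)]

# For comparison: the z-score of the spike against the full sample.
z = (500 - fmean(latency)) / stdev(latency)
print(f"{z:.2f}")  # 2.27
```

The IQR rule flags index 4, while a fixed z-score cutoff of 3 would miss it on a sample this small.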

Advanced usage

# Analyze with Claude
import anthropic
from semantic_frame.integrations.anthropic import get_anthropic_tool, handle_tool_call

client = anthropic.Anthropic()
tool = get_anthropic_tool()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=[tool],
    messages=[{"role": "user", "content": "Analyze this sales data: [100, 120, 115, 500, 118]"}]
)

# Handle tool calls in the response
for block in response.content:
    if block.type == "tool_use" and block.name == "semantic_analysis":
        result = handle_tool_call(block.input)
        print(result)

📚 Detailed Documentation

Supported data types

NumPy: np.ndarray
Pandas: pd.Series, pd.DataFrame
Polars: pl.Series, pl.DataFrame
Python: list

Analysis features

| Feature | Method | Output |
|---------|--------|--------|
| Trend | Linear regression slope | RISING_SHARP, RISING_STEADY, FLAT, FALLING_STEADY, FALLING_SHARP |
| Volatility | Coefficient of variation | COMPRESSED, STABLE, MODERATE, EXPANDING, EXTREME |
| Anomalies | Z-score / IQR (adaptive) | Index, value, and z-score of each anomaly |
| Seasonality | Autocorrelation | NONE, WEAK, MODERATE, STRONG |
| Distribution | Skewness + kurtosis | NORMAL, LEFT_SKEWED, RIGHT_SKEWED, BIMODAL, UNIFORM |
| Step changes | Baseline shift detection | NONE, STEP_UP, STEP_DOWN |
| Data quality | Missing-value percentage | PRISTINE, GOOD, SPARSE, FRAGMENTED |
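To make the seasonality (autocorrelation) entry in the table above concrete, here is a minimal sketch of lag-k autocorrelation on a synthetic series that repeats every four steps. The helper name is hypothetical, and the cutoffs the library uses to map a coefficient to NONE, WEAK, MODERATE, or STRONG are not stated in this README, so none are assumed here.

```python
from statistics import fmean

def autocorrelation(values, lag):
    """Sample autocorrelation of values at the given positive lag."""
    mean = fmean(values)
    centered = [v - mean for v in values]
    denom = sum(c * c for c in centered)
    if denom == 0 or lag >= len(values):
        return 0.0
    num = sum(centered[i] * centered[i + lag] for i in range(len(values) - lag))
    return num / denom

series = [1, 3, 5, 3] * 6  # synthetic pattern with period 4
acf = {lag: autocorrelation(series, lag) for lag in range(1, 5)}
for lag, value in acf.items():
    print(lag, round(value, 3))
```

The coefficient peaks at lag 4, the true period of the synthetic series, which is the kind of signal a seasonality check looks for.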

LLM integration

System prompt injection

from semantic_frame import describe_series
from semantic_frame.interfaces import format_for_system_prompt

result = describe_series(data, output="full")
prompt = format_for_system_prompt(result)
# Returns formatted context block for system prompts

LangChain tool output

from semantic_frame.interfaces import format_for_langchain

output = format_for_langchain(result)
# {"output": "narrative...", "metadata": {...}}

Multi-column agent context

from semantic_frame import describe_dataframe
from semantic_frame.interfaces import create_agent_context

results = describe_dataframe(df)
context = create_agent_context(results)
# Combined narrative for all columns with attention flags

Framework integrations

Anthropic Claude (native tool use)

pip install semantic-frame[anthropic]

import anthropic
from semantic_frame.integrations.anthropic import get_anthropic_tool, handle_tool_call

client = anthropic.Anthropic()
tool = get_anthropic_tool()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=[tool],
    messages=[{"role": "user", "content": "Analyze this sales data: [100, 120, 115, 500, 118]"}]
)

# Handle tool use in the response
for block in response.content:
    if block.type == "tool_use" and block.name == "semantic_analysis":
        result = handle_tool_call(block.input)
        print(result)
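The loop above prints the analysis but does not send it back to Claude. In the Anthropic Messages API, a tool's output is returned as a `tool_result` block in a follow-up user message that echoes the `tool_use_id`. The sketch below builds that follow-up message from plain dictionaries so it runs without network access. `fake_handler` is a hypothetical stand-in for `handle_tool_call`, and the sample blocks (including the `data` input key) are invented for illustration.

```python
def build_tool_results(content_blocks, handler, tool_name="semantic_analysis"):
    """Build the user message that returns output for each matching tool_use block."""
    results = []
    for block in content_blocks:
        if block.get("type") == "tool_use" and block.get("name") == tool_name:
            results.append({
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": str(handler(block["input"])),
            })
    return {"role": "user", "content": results}

def fake_handler(tool_input):
    # Hypothetical stand-in for semantic_frame's handle_tool_call.
    return f"analyzed {len(tool_input['data'])} values"

sample_blocks = [
    {"type": "text", "text": "Let me analyze that."},
    {"type": "tool_use", "id": "toolu_example", "name": "semantic_analysis",
     "input": {"data": [100, 120, 115, 500, 118]}},
]
follow_up = build_tool_results(sample_blocks, fake_handler)
print(follow_up)
```

With the real SDK, response content blocks are objects rather than dictionaries, so read `block.id` and `block.input` as attributes, then append the assistant turn and this message to `messages` before calling `client.messages.create` again.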

LangChain

pip install semantic-frame[langchain]

from semantic_frame.integrations.langchain import get_semantic_tool

tool = get_semantic_tool()
# Use as a LangChain BaseTool in your agent

CrewAI

pip install semantic-frame[crewai]

from semantic_frame.integrations.crewai import get_crewai_tool

tool = get_crewai_tool()
# Use with CrewAI agents

MCP (Model Context Protocol)

pip install semantic-frame[mcp]

Run the MCP server:

mcp run semantic_frame.integrations.mcp:mcp

This exposes the describe_data tool to MCP clients such as ElizaOS, Claude Desktop, and Claude Code.

Claude Code

Add Semantic Frame to Claude Code as a native tool:

# Install the MCP dependencies
pip install semantic-frame[mcp]

# Add the MCP server to Claude Code
claude mcp add semantic-frame -- uv run --project /path/to/semantic-frame mcp run /path/to/semantic-frame/semantic_frame/integrations/mcp.py

# Restart Claude Code, then verify the connection
claude mcp list
# semantic-frame: ... - ✓ Connected

Once configured, ask Claude to analyze data and it will use the describe_data tool automatically.

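Advanced tool use (beta)

Features

| Feature | Benefit | API |
|---------|---------|-----|
| Input examples | Improves parameter accuracy by 18% | Included by default |
| Tool search | Supports 1,000+ tools without context bloat | defer_loading=True |
| Programmatic calling | Batch analysis via code execution | allowed_callers=["code_execution"] |

Quick start (advanced)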

import anthropic
from semantic_frame.integrations.anthropic import get_advanced_tool, handle_tool_call

client = anthropic.Anthropic()
tool = get_advanced_tool()  # Enables all advanced features

response = client.beta.messages.create(
    betas=["advanced-tool-use-2025-11-20"],
    model="claude-sonnet-4-5-20250929",
    max_tokens=4096,
    tools=[
        {"type": "tool_search_tool_regex_20251119", "name": "tool_search"},
        {"type": "code_execution_20250825", "name": "code_execution"},
        tool,
    ],
    messages=[{"role": "user", "content": "Analyze all columns in this dataset..."}]
)

Configuration options

from semantic_frame.integrations.anthropic import (
    get_anthropic_tool,          # Standard (includes examples)
    get_tool_for_discovery,      # For tool search
    get_tool_for_batch_processing,  # For code execution
    get_advanced_tool,           # Enables all features
)

MCP batch analysis

from semantic_frame.integrations.mcp import describe_batch

# Analyze several series in one call
result = describe_batch(
    datasets='{"cpu": [45, 47, 95, 44], "memory": [60, 61, 60, 61]}',
)
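Note that `datasets` in the call above is a JSON string, not a Python dict. A small sketch of building that string with the standard library, which avoids hand-written quoting mistakes (the variable names are illustrative):

```python
import json

series_by_name = {
    "cpu": [45, 47, 95, 44],
    "memory": [60, 61, 60, 61],
}
datasets_json = json.dumps(series_by_name)
print(datasets_json)
# {"cpu": [45, 47, 95, 44], "memory": [60, 61, 60, 61]}

# The string can then be passed as: describe_batch(datasets=datasets_json)
```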

See docs/advanced-tool-use.md for the full documentation.

Use cases

Cryptocurrency trading

btc_prices = pd.Series(hourly_btc_prices)
insight = describe_series(btc_prices, context="BTC/USD Hourly")
# "The BTC/USD Hourly data shows a rapidly rising pattern with extreme variability.
#  Step up detected at index 142. 2 anomalies detected at indices 89, 203."

DevOps monitoring

cpu_data = pd.Series(cpu_readings)
insight = describe_series(cpu_data, context="CPU Usage %")
# "The CPU Usage % data shows a flat/stationary pattern with stable variability
#  until index 850, where a critical anomaly was detected..."

Sales analysis

sales = pd.Series(daily_sales)
insight = describe_series(sales, context="Daily Revenue")
# "The Daily Revenue data shows a steadily rising pattern with weak cyclic pattern
#  detected. Baseline: $12,450 (range: $8,200-$18,900)."

IoT sensor data

temps = pl.Series("temperature", sensor_readings)
insight = describe_series(temps, context="Machine Temperature (C)")
# "The Machine Temperature (C) data is expanding with extreme outliers.
#  3 anomalies detected at indices 142, 156, 161."

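Trading module (v0.4.0)

Specialized semantic analysis for trading agents, portfolio managers, and financial applications.

Trading tools

| Tool | Description |
|------|-------------|
| describe_drawdown | Severity analysis of equity-curve drawdowns |
| describe_trading_performance | Win rate, Sharpe ratio, profit factor, and related metrics |
| describe_rankings | Comparison across multiple agents or strategies |
| describe_anomalies | Anomaly detection enriched with PnL context |
| describe_windows | Multi-timeframe trend alignment analysis |
| describe_regime | Market regime detection (bull/bear/sideways) |
| describe_allocation | Portfolio allocation suggestions ⚠️ |

Quick example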

from semantic_frame.trading import (
    describe_trading_performance,
    describe_drawdown,
    describe_regime,
    describe_allocation,
)

# Trading performance analysis
pnl = [100, -50, 75, -25, 150, -30, 80]
result = describe_trading_performance(pnl, context="My Bot")
print(result.narrative)
# "My Bot shows good performance with 57.1% win rate. Profit factor: 2.53..."

# Drawdown analysis
equity = [10000, 10500, 10200, 9800, 9500, 10100]
result = describe_drawdown(equity, context="Strategy")
print(result.narrative)
# "Strategy max drawdown: 9.5% (moderate). Currently recovering..."

# Market regime analysis
returns = [0.01, 0.015, 0.02, -0.01, 0.025, 0.018]  # daily returns
result = describe_regime(returns, context="BTC")
print(result.narrative)
# "BTC is in a strong bullish regime. Conditions favor trend-following..."

# Portfolio allocation (⚠️ for educational purposes only; not financial advice)
assets = {"BTC": [40000, 42000, 44000], "ETH": [2500, 2650, 2800]}
result = describe_allocation(assets, method="risk_parity")
print(result.narrative)
# "Suggested allocation: BTC (55%), ETH (45%). Risk: high..."
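The win rate and maximum drawdown quoted in the comments above can be reproduced by hand. The following sketch uses the conventional definitions: the share of trades with positive profit, and the largest peak-to-trough decline relative to the running peak. The function names are hypothetical, and the library may compute or classify these metrics differently.

```python
def win_rate(pnl):
    """Fraction of trades with strictly positive profit."""
    return sum(1 for p in pnl if p > 0) / len(pnl)

def max_drawdown(equity):
    """Largest decline from a running peak, as a fraction of that peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst

pnl = [100, -50, 75, -25, 150, -30, 80]
equity = [10000, 10500, 10200, 9800, 9500, 10100]
wr = win_rate(pnl)
mdd = max_drawdown(equity)
print(f"win rate: {wr:.1%}")        # win rate: 57.1%
print(f"max drawdown: {mdd:.1%}")   # max drawdown: 9.5%
```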

MCP integration

All trading tools are available through MCP:

semantic-frame-mcp

Available tools: describe_drawdown, describe_trading_performance, describe_rankings, describe_anomalies, describe_windows, describe_regime, describe_allocation

📖 Full trading documentation | Quick reference

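API reference

`describe_series(data, context=None, output="text")`

Analyzes a single data series.

Parameters:

data: input data (NumPy array, Pandas Series, Polars Series, or list)
context: optional label for the data (appears in the description)
output: return format, one of "text" (string), "json" (dict), or "full" (SemanticResult)

Returns: the semantic description in the requested format.

`describe_dataframe(df, context=None)`

Analyzes all numeric columns in a DataFrame.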