Category: Development and Engineering. No API key required.

metaxy

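Use this skill when the user asks to "define a feature", "create a BaseFeature class", "track feature versions", "set up a metadata store", "field-level dependencies", "FieldSpec", "FeatureDep", "run the metaxy CLI", or "metaxy migrations", or when they need guidance on metaxy feature definitions, versioning, metadata stores, CLI commands, or testing patterns.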

Author: jakexiaohub (GitHub)

Metaxy

Metaxy is a metadata layer for multi-modal data and ML pipelines. It tracks feature versions, dependencies, and data lineage across complex computational graphs.

Core Concepts

Feature Definitions

To define a feature, create a class that inherits from mx.BaseFeature and pass an mx.FeatureSpec through the spec class keyword argument:

import metaxy as mx


class MyFeature(
    mx.BaseFeature,
    spec=mx.FeatureSpec(
        key="my/feature",
        id_columns=["sample_id"],
        fields=["embedding", "score"],
    ),
):
    sample_id: str
    embedding: list[float]
    score: float

To add dependencies between features, use the deps parameter with FeatureDep. To specify field-level dependencies (where a field depends on only part of an upstream feature's data), use FieldSpec with FieldDep or FieldsMapping.
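
To illustrate why field-level dependencies are useful, the following standalone sketch models the idea in plain Python. It does not use Metaxy's API. The feature keys, the `FIELD_DEPS` mapping, and the `affected_fields` helper are all hypothetical names chosen for this example. Each downstream field declares exactly which upstream fields it reads, so a change to one upstream field only marks the fields that actually depend on it.

```python
# Conceptual sketch only: this does not use or reproduce Metaxy's API.
# Each downstream (feature, field) lists the upstream (feature, field) pairs it reads.
FieldRef = tuple[str, str]

FIELD_DEPS: dict[FieldRef, set[FieldRef]] = {
    ("video/faces", "boxes"): {("video/raw", "frames")},
    ("video/transcript", "text"): {("video/raw", "audio")},
    ("video/summary", "caption"): {
        ("video/faces", "boxes"),
        ("video/transcript", "text"),
    },
}


def affected_fields(changed: FieldRef) -> set[FieldRef]:
    """Return every downstream field that transitively reads `changed`."""
    affected: set[FieldRef] = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for field, upstream in FIELD_DEPS.items():
            if current in upstream and field not in affected:
                affected.add(field)
                frontier.append(field)
    return affected


# Changing only the audio field leaves the face-detection field untouched.
audio_impact = affected_fields(("video/raw", "audio"))
print(sorted(audio_impact))
```

With feature-level dependencies alone, any change to `video/raw` would invalidate all three downstream features. Tracking fields lets the face-detection output stay valid when only the audio changes.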

Data Versioning

Metaxy automatically tracks sample versions and propagates changes through the dependency graph. To trigger recomputation when code changes, set code_version on FieldSpec:

fields = [
    mx.FieldSpec(key="embedding", code_version="2"),  # Bump to invalidate downstream
]
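
The effect of bumping code_version can be pictured with a small standalone sketch. This is an assumption about the general technique (hashing a field's own code version together with its upstream versions), not a description of Metaxy's actual versioning scheme. The helpers `field_version` and `downstream_version` are hypothetical.

```python
# Conceptual sketch only: Metaxy's real versioning scheme may differ.
import hashlib


def field_version(code_version: str, upstream_versions: list[str]) -> str:
    """Hash a field's own code version together with its upstream versions."""
    digest = hashlib.sha256()
    digest.update(code_version.encode())
    for version in sorted(upstream_versions):
        digest.update(b"|" + version.encode())
    return digest.hexdigest()[:12]


def downstream_version(embedding_code_version: str) -> str:
    """Version of a "score" field that depends on an "embedding" field."""
    embedding = field_version(embedding_code_version, [])
    return field_version("1", [embedding])


before = downstream_version("1")
after = downstream_version("2")  # code_version bumped on the upstream field
print(before, after)
```

Because the downstream hash includes the upstream version, the bump changes the downstream version even though the downstream field's own code_version is unchanged.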

Metadata Stores

To configure a metadata store, create a metaxy.toml file or use programmatic configuration:

import metaxy as mx

with mx.MetaxyConfig(stores={"dev": mx.DeltaMetadataStore(root_path="/tmp/metaxy")}).use() as config:
    store = config.get_store("dev")

Supported backends: DuckDB, ClickHouse, BigQuery, LanceDB, Delta Lake.

Feature Graph

To visualize and manage the feature dependency graph, use the CLI:

mx graph render            # Terminal visualization
mx push --store dev        # Push graph to store

CLI

Metaxy provides a CLI (invoked as metaxy or through its mx alias) for managing features, metadata, and migrations:

mx list features --verbose     # List features with dependencies
mx graph render                # Visualize feature graph
mx metadata status --all-features  # Check metadata freshness (expensive!)
mx migrations apply            # Apply pending migrations
mx mcp                         # Start MCP server for AI assistants
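
For scripted use, for example in CI, the read-only commands above could be wrapped in a small Python helper. This is a minimal sketch that assumes the `mx` executable is on the PATH. The `COMMANDS` list and the `run_checks` function are hypothetical names, and only subcommands already listed in this section are used.

```python
# Sketch of a scripted check; the command list mirrors CLI calls from this section.
import shutil
import subprocess

COMMANDS = [
    ["mx", "list", "features", "--verbose"],
    ["mx", "graph", "render"],
]


def run_checks(commands: list[list[str]]) -> list[int]:
    """Run each command and return its exit code; return [] if `mx` is missing."""
    if shutil.which("mx") is None:
        return []
    return [subprocess.run(cmd, check=False).returncode for cmd in commands]


exit_codes = run_checks(COMMANDS)
print(exit_codes)
```

The expensive `mx metadata status --all-features` call is deliberately left out of the list so that the check stays cheap.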

Testing

To test features in isolation, use context managers to avoid polluting the global registry:

import pytest
import metaxy as mx
from metaxy.metadata_store.delta import DeltaMetadataStore


@pytest.fixture
def metaxy_env(tmp_path):
    with mx.FeatureGraph().use():
        store = DeltaMetadataStore(root_path=tmp_path / "delta_test")
        with mx.MetaxyConfig(stores={"test": store}).use() as config:
            yield config

Examples

For complete code examples, see:

  • examples/feature-definitions.md - Feature classes with dependencies and field-level deps
  • examples/configuration.md - TOML and programmatic configuration
  • examples/metadata-stores.md - Store operations
  • examples/testing.md - Test isolation patterns
  • examples/cli.md - CLI command reference

Documentation

For comprehensive documentation: https://anam-org.github.io/metaxy/

Key pages:

  • Quickstart: https://anam-org.github.io/metaxy/guide/overview/quickstart/
  • Feature Definitions: https://anam-org.github.io/metaxy/guide/learn/feature-definitions/
  • Data Versioning: https://anam-org.github.io/metaxy/guide/learn/data-versioning/
  • Metadata Stores: https://anam-org.github.io/metaxy/guide/learn/metadata-stores/
  • CLI Reference: https://anam-org.github.io/metaxy/reference/cli/
  • API Reference: https://anam-org.github.io/metaxy/reference/api/