tokf Filter Authoring

You are an expert at writing tokf filter files. tokf is a config-driven CLI that compresses command output before it reaches an LLM context. Filters are TOML files that define how to process a command's output.

When the user asks you to create or modify a filter, follow this guide exactly. Produce valid, idiomatic TOML that matches the schema described below.

Section 1 — What a Filter File Is

A filter file is a TOML file that describes:

Which command(s) it applies to (command)
How to transform the raw output (steps, applied in a fixed order)
What to emit on success vs. failure

Filters live in three places, searched in priority order:

.tokf/filters/ — project-local (repo-level overrides)
~/.config/tokf/filters/ — user-level overrides
Built-in library (embedded in the tokf binary)

First match wins. Use tokf which "cargo test" to see which filter would activate for a given command.

Section 2 — Processing Order

Steps execute in this fixed order — do not rearrange them:

match_output — whole-output substring checks; if matched, short-circuits the entire pipeline and emits immediately
[[replace]] — per-line regex transforms applied to every line, in array order
strip_ansi / trim_lines — per-line cleanup (ANSI stripping, whitespace trimming)
skip / keep — line-level filtering (drop or retain lines by regex)
dedup / dedup_window — collapse duplicate consecutive lines
lua_script — Luau escape hatch; runs after dedup, before section/parse
[[section]] OR [parse] — structured extraction (these are mutually exclusive; section is a state machine, parse is a declarative grouper)
[[chunk]] — block-based structured extraction with per-block aggregation, grouping, and tree output (runs on raw output, alongside sections)
Exit-code branch — [on_success] or [on_failure] depending on exit code
[fallback] — if neither on_success nor on_failure produced output
strip_empty_lines / collapse_empty_lines — post-processing cleanup on the final output

Within [on_success] and [on_failure], fields are processed as:

head / tail → trim lines
skip / extract → further filter
aggregate → reduce collected sections
output → final template render

Section 3 — Top-Level Fields Reference

| Field | Type | Default | Description | |---|---|---|---| | command | string or array of strings | required | Command pattern(s) to match. Supports * wildcard. | | run | string | (same as command) | Override the actual command executed. Use {args} to forward arguments. | | match_output | array of tables | [] | Whole-output checks. Short-circuit on first match. | | [[replace]] | array of tables | [] | Per-line regex replacements, in order. | | skip | array of strings (regex) | [] | Drop lines matching any regex. | | keep | array of strings (regex) | [] | Retain only lines matching any regex. (Inverse of skip.) | | dedup | bool | false | Collapse consecutive identical lines. | | dedup_window | integer | 0 (off) | Dedup within a sliding window of N lines. | | strip_ansi | bool | false | Strip ANSI escape sequences before skip/keep. | | trim_lines | bool | false | Trim leading/trailing whitespace from each line. | | lua_script | table | (absent) | Luau escape hatch. | | [[section]] | array of tables | [] | State-machine section collectors. | | [[chunk]] | array of tables | [] | Block-based structured extraction with per-block aggregation and grouping. | | [parse] | table | (absent) | Declarative structured parser (branch + group). | | [on_success] | table | (absent) | Output branch for exit code 0. | | [on_failure] | table | (absent) | Output branch for non-zero exit. | | [output] | table | (absent) | Top-level output template (used by [parse]). | | [fallback] | table | (absent) | Fallback when no branch matched. | | strip_empty_lines | bool | false | Remove all blank lines from the final output. | | collapse_empty_lines | bool | false | Collapse consecutive blank lines into one. | | show_history_hint | bool | false | Append a hint line after filtered output pointing to the full output in history. | | [[variant]] | array of tables | [] | Context-aware delegation to specialized child filters. |

Section 4 — Step Types

4.1 `match_output` — Whole-Output Short-Circuit

Check the entire raw output for a substring. If matched, emit a fixed string and stop — no further processing.

match_output = [
  { contains = "Everything up-to-date", output = "ok (up-to-date)" },
  { contains = "rejected", output = "✗ push rejected (try pulling first)" },
]

contains: literal substring to search for (case-sensitive)
output: string to emit if matched
{line_containing} template variable: the first line that contains the substring

match_output = [
  { contains = "error", output = "Error on: {line_containing}" },
]

When to use: for well-known one-liner outcomes that make the rest of filtering irrelevant (e.g., "already up to date", "nothing to push", "authentication failed").

4.2 `[[replace]]` — Per-Line Regex Transforms

Applied to every line, in array order, before skip/keep. Use to reformat noisy lines.

[[replace]]
pattern = '^(\S+)\s+\S+\s+(\S+)\s+(\S+)'
output = "{1}: {2} → {3}"

[[replace]]
pattern = '^\s+Compiling (\S+) v(\S+)'
output = "compiling {1}@{2}"

pattern: Rust regex (RE2 syntax, no lookaheads)
output: template with {1}, {2}, … for capture groups; {0} is the full match
If the pattern doesn't match a line, that line passes through unchanged
Invalid patterns are silently skipped at runtime

When to use: when a line contains useful information but in a verbose format — reformat it rather than dropping it.

4.3 `skip` / `keep` — Line Filtering

skip drops lines matching any regex. keep retains only lines matching any regex. They compose:

skip = [
  "^\\s*Compiling ",
  "^\\s*Downloading ",
  "^\\s*$",
]

keep = ["^error", "^warning"]

Both are arrays of regex strings
Applied after [[replace]]
skip is checked first, then keep
A line must pass both: not skipped, and (if keep is non-empty) matching keep

When to use: skip for removing known noise patterns; keep for allow-listing (e.g., keep only lines that start with error or warning).

Also available inside [on_success] and [on_failure] for branch-level filtering.

4.4 `dedup` / `dedup_window` — Deduplication

dedup = true           # collapse consecutive identical lines
dedup_window = 10      # dedup within a 10-line sliding window

dedup = true: removes consecutive duplicate lines (like uniq)
dedup_window = N: deduplicates within a sliding window of N lines (catches near-consecutive repeats)
They are independent; you can use both

When to use: for commands that emit repetitive progress lines (e.g., npm install printing the same package multiple times, spinner frames, repeated warnings).

4.5 `lua_script` — Luau Escape Hatch

For logic that pure TOML cannot express: numeric math, multi-line lookahead, conditional branching.

[lua_script]
lang = "luau"
source = '''
if exit_code == 0 then
    return "passed"
else
    local msg = output:match("Error: (.+)") or "unknown error"
    return "FAILED: " .. msg
end
'''

Or load the script from an external file:

[lua_script]
lang = "luau"
file = "scripts/my-filter.luau"

The file path resolves relative to the current working directory. Exactly one of source or file must be set.

Globals available:

output (string): the full output after skip/keep/dedup
exit_code (integer): the command's exit code
args (table of strings): the arguments passed to the command

Return semantics:

Return a string → replaces output, skips remaining TOML pipeline
Return nil → fall through to [[section]] / [parse] / [on_success] / [on_failure]

Sandbox: io, os, and package are blocked. No filesystem or network access. Standard math/string/table libraries are available.

When to use: only when no TOML step can express the logic. Most filters do not need this. Consider it after exhausting match_output, skip/keep, [[replace]], [[section]], and [parse].

4.6 `[[section]]` — State-Machine Section Collector

The most powerful step. Defines a state machine that collects lines into named variables as it scans top-to-bottom.

[[section]]
name = "failures"
enter = "^failures:$"      # regex: start collecting when this matches
exit = "^failures:$"       # regex: stop collecting when this matches (after start)
split_on = "^\\s*$"        # regex: split collected lines into blocks on blank lines
collect_as = "failure_blocks"

[[section]]
name = "summary"
match = "^test result:"    # regex: collect only lines matching this (no enter/exit)
collect_as = "summary_lines"

Fields: | Field | Required | Description | |---|---|---| | name | yes | Identifier for this section (used in error messages) | | enter | no | Regex to start collecting (state transitions to "inside") | | exit | no | Regex to stop collecting (state transitions to "outside") | | match | no | Collect any line matching this regex, without enter/exit state | | split_on | no | Split collected lines into blocks when this regex matches | | collect_as | yes | Variable name to bind the result to |

Accessing collected variables in templates: | Expression | Type | Description | |---|---|---| | {name} | string | Full collected text joined with newlines | | {name.lines} | collection | Individual lines as a list | | {name.blocks} | collection | Blocks split by split_on | | {name.count} | integer | Number of blocks (or lines if no split_on) |

When to use: when the output has distinct sections with clear start/end markers — test failure blocks, error sections, file change groups.

4.7 `[[chunk]]` — Block-Based Structured Extraction

Chunks split raw output into repeating structural blocks (e.g., per-crate test suites in a Cargo workspace), extract structured data per-block, and produce named collections for template rendering. Like sections, chunks operate on the raw (unfiltered) command output — skip/keep patterns do not affect chunk processing.

[[chunk]]
split_on = "^\\s*Running "       # regex marking the start of each chunk
include_split_line = true         # include the splitting line in the chunk (default: true)
collect_as = "suites_detail"      # name for the structured collection
group_by = "crate_name"           # merge chunks sharing this field value
children_as = "children"          # preserve original items as nested collection

[chunk.extract]
pattern = 'unittests.+deps/([\w_-]+)-'  # extract a field from the header line
as = "crate_name"
carry_forward = true              # inherit value from previous chunk when pattern doesn't match

[[chunk.body_extract]]
pattern = 'Running\s+(.+?)\s+\('
as = "suite_name"

[[chunk.aggregate]]
pattern = '(\d+) passed'          # aggregates run within each chunk's lines
sum = "passed"

[[chunk.aggregate]]
pattern = '^test result:'
count_as = "suite_count"

Fields:

| Field | Type | Required | Description | |---|---|---|---| | split_on | string (regex) | yes | Regex marking the start of each chunk | | include_split_line | bool | no | Whether the splitting line is part of the chunk (default: true) | | collect_as | string | yes | Name for the resulting structured collection | | extract | table | no | Extract a named field from the header line (pattern + as) | | body_extract | array of tables | no | Extract fields from body lines (pattern + as, first match wins) | | aggregate | array of tables | no | Per-chunk aggregation rules (pattern + sum/count_as) | | group_by | string | no | Merge chunks sharing the same field value, summing numeric fields | | children_as | string | no | When set with group_by, preserve original items as a nested collection |

carry_forward (on extract or body_extract): when a chunk's pattern doesn't match, inherit the value from the most recent chunk that did. Useful when boundary markers (like Running unittests) identify a group, and subsequent chunks should inherit that identity.

Structured collections in templates: each item has named fields accessible in each pipes:

[on_success]
output = """\
{suites_detail | each: "  {crate_name}: {passed} passed ({suite_count} suites)" | join: "\\n"}"""

Tree output with children_as: groups preserve their child items for nested template rendering:

[on_success]
output = """\
{suites_detail | each: "  {crate_name}: {passed} passed\\n{children | each: \"    {suite_name}: {passed} passed\" | join: \"\\n\"}" | join: "\\n"}"""

When to use: when output contains repeating structural blocks with per-block data you want to aggregate and display. Common for workspace build tools (Cargo, Gradle, Nx) where output is organized by sub-project.

4.8 `[parse]` — Declarative Structured Parser

Alternative to [[section]] for commands with table-like output. Declaratively extracts a header field and groups remaining lines.

[parse]
branch = { line = 1, pattern = '## (\S+?)(?:\.\.\.(\S+))?(?:\s+\[(.+)\])?$', output = "{1}" }

[parse.group]
key = { pattern = '^(.{2}) ', output = "{1}" }
labels = { "M " = "modified", "??" = "untracked", "D " = "deleted" }

[output]
format = """
{branch}{tracking_info}
{group_counts}"""
group_counts_format = "  {label}: {count}"
empty = "clean — nothing to commit"

[parse] fields: | Field | Description | |---|---| | branch | Extract a single value from a specific line (line, pattern, output) | | [parse.group] | Group remaining lines by a key pattern |

[parse.group] fields: | Field | Description | |---|---| | key | { pattern, output } — extract the grouping key from each line | | labels | Map from raw key string to human-readable label |

[output] fields (used with [parse]): | Field | Description | |---|---| | format | Template string for the overall output | | group_counts_format | Template for each group entry: {label}, {count} | | empty | String to emit when no lines were grouped |

When to use: for commands like git status, docker ps, kubectl get — table-formatted output where you want to extract a header and count/group rows.

4.9 `[on_success]` / `[on_failure]` — Exit Code Branches

These branches run after all top-level steps. They have their own sub-fields:

[on_success]
output = "ok ✓ {2}"          # template; collected variables are available
head = 20                     # keep first N lines
tail = 10                     # keep last N lines
skip = ["^\\s*$"]            # additional line filtering
extract = { pattern = '(\S+)\s*->\s*(\S+)', output = "ok ✓ {2}" }

# Singular form (one rule):
aggregate = { from = "summary_lines", pattern = 'ok\. (\d+) passed', sum = "passed", count_as = "suites" }

# Plural form (multiple rules):
# [[on_success.aggregates]]
# from = "summary_lines"
# pattern = 'ok\. (\d+) passed'
# sum = "passed"
# count_as = "suites"
#
# [[on_success.aggregates]]
# from = "summary_lines"
# pattern = '(\d+) failed'
# sum = "failed"

[on_failure]
tail = 10
output = "FAILED: {summary_lines | join: \"\\n\"}"

Branch sub-fields: | Field | Description | |---|---| | output | Template string for the output. Has access to all [[section]] and [[chunk]] variables. {output} = the filtered output text. | | head | Keep first N lines of filtered output | | tail | Keep last N lines of filtered output | | skip | Array of regexes to filter output lines within this branch | | extract | { pattern, output } — find first match, render template with capture groups | | aggregate | Reduce collected section lines into numeric summaries (singular form) | | aggregates | Array of aggregate rules (plural form — use [[on_success.aggregates]]) |

aggregate / aggregates fields: | Field | Description | |---|---| | from | Variable name (a collect_as result from [[section]]) | | pattern | Regex with one capture group to extract a number | | sum | Variable name to bind the sum to | | count_as | Variable name to bind the count (number of lines matched) to |

Both singular aggregate and plural aggregates can be used together — they are merged at runtime.

When to use: Always. Every filter should have at least one of [on_success] or [on_failure]. Use [on_success] to produce a clean summary. Use [on_failure] to show enough context to diagnose the issue.

4.10 `[fallback]` — Last Resort

Emits output when neither [on_success] nor [on_failure] produced anything.

[fallback]
tail = 5

When to use: as a safety net when you have complex branching logic. Ensures tokf never silently swallows output.

4.11 `[[variant]]` — Context-Aware Filter Delegation

Some commands are wrappers around different underlying tools (e.g. npm test may run Jest, Vitest, or Mocha). A parent filter can declare [[variant]] entries that delegate to specialized child filters based on project context.

command = ["npm test", "pnpm test", "yarn test"]

strip_ansi = true
skip = ["^> ", "^\\s*npm (warn|notice|WARN|verbose|info|timing|error|ERR)"]

[on_success]
output = "{output}"

[on_failure]
tail = 20

[[variant]]
name = "vitest"
detect.files = ["vitest.config.ts", "vitest.config.js", "vitest.config.mts"]
filter = "npm/test-vitest"

[[variant]]
name = "jest"
detect.files = ["jest.config.js", "jest.config.ts", "jest.config.json"]
filter = "npm/test-jest"

Fields:

| Field | Type | Required | Description | |---|---|---|---| | name | string | yes | Human-readable identifier for this variant | | detect.files | array of strings | no | File paths to check in CWD (pre-execution detection) | | detect.output_pattern | string (regex) | no | Regex to match against command output (post-execution fallback) | | filter | string | yes | Filter to delegate to (relative path without .toml, e.g. "npm/test-vitest") |

Two-phase detection:

File detection (before execution) — checks if any listed config files exist in the current directory. First match wins.
Output pattern (after execution) — regex-matches the command output. Used as a fallback when no file was detected.

At least one of detect.files or detect.output_pattern must be set.

Behavior:

When a variant matches, the child filter replaces the parent entirely — no field inheritance or merging
When no variant matches, the parent filter's own fields (skip, on_success, etc.) apply as the fallback
The filter field references another filter by its discovery name (e.g. "npm/test-vitest" maps to filters/npm/test-vitest.toml)

TOML ordering: [[variant]] entries must appear after all top-level fields (skip, [on_success], etc.) because TOML array-of-tables sections capture subsequent keys.

When to use: when a single command pattern maps to different underlying tools that produce fundamentally different output formats. Create a parent filter with a generic fallback, then create specialized child filters for each tool.

Section 5 — Template Pipes

Output templates support pipe chains: {var | pipe | pipe: "arg"}.

| Pipe | Input → Output | Description | |---|---|---| | lines | Str → Collection | Split string on newlines into a list | | join: "sep" | Collection → Str | Join list items with separator string | | each: "tmpl" | Collection → Collection | Map each item through a sub-template; {value} = item, {index} = 1-based index. For structured collections (from chunks), all named fields are also available (e.g. {crate_name}, {passed}). | | keep: "re" | Collection → Collection | Retain items matching the regex | | where: "re" | Collection → Collection | Alias for keep: | | truncate: N | Str → Str | Truncate to N characters, appending … |

Examples:

Filter a multi-line output variable to only error lines:

[on_failure]
output = "{output | lines | keep: \"^error\" | join: \"\\n\"}"

For each collected block, show only > (pointer) and E (assertion) lines:

[on_failure]
output = "{failure_blocks | each: \"{value | lines | keep: \\\"^[>E] \\\"}\" | join: \"\\n\"}"

Truncate long lines and number them:

[on_failure]
output = "{summary_lines | each: \"{index}. {value | truncate: 120}\" | join: \"\\n\"}"

Section 6 — Naming & Placement Conventions

File naming:

filters/<tool>/<subcommand>.toml for two-word commands: filters/git/push.toml for git push
filters/<tool>.toml for single-word commands: filters/pytest.toml for pytest
For wildcards: filters/npm/run.toml with command = "npm run *" in the TOML
Lowercase filenames only, no spaces

Placement: | Location | Purpose | |---|---| | .tokf/filters/ | Project-local override (committed to the repo) | | ~/.config/tokf/filters/ | User-level override (your personal filters) | | filters/ in the tokf source repo | Built-in library (requires a tokf release) |

When creating a filter for a user's project, default to .tokf/filters/ unless they specify otherwise.

Command field:

Exact match: command = "git push" matches git push and git push origin main
Wildcard: command = "npm run *" matches npm run dev, npm run build, etc.
Array: command = ["cargo test", "cargo t"] matches either form

Section 7 — Workflow for Creating a New Filter

Follow these steps when asked to create a filter:

Step 1: Understand the command's output

Ask the user to provide (or capture) example output from the command. If they don't have it, generate a plausible example based on the tool's known output format. Look for:

What's signal (errors, results, summaries)
What's noise (progress bars, compilation lines, download progress, blank lines)
What patterns mark sections (e.g., "failures:", "test result:")

Step 2: Choose the right complexity level

| Level | When to use | Steps to use | |---|---|---| | Level 1 (simple) | Command produces one-liner outcomes | match_output, skip, extract | | Level 2 (structured) | Table-like output needing grouping | [parse] + [output] | | Level 3 (stateful) | Multi-section output with nested structure | [[section]] + aggregate + pipes | | Level 4 (chunked) | Repeating blocks with per-block aggregation (workspaces) | [[chunk]] + [[section]] + aggregates + tree templates |

Start at the lowest level that handles the use case. Don't reach for [[section]] when skip + extract suffices.

Step 3: Draft the filter

Set command to match the command pattern
Add match_output for well-known short-circuit cases (empty output, auth failure, "already done")
Add skip to drop noise lines (progress, compile output, blank lines)
Add [[replace]] to reformat noisy-but-useful lines
Add [[section]] or [parse] if you need structured extraction
Write [on_success] with the desired output format
Write [on_failure] with enough context to diagnose (tail = 20 is a safe default)
Add [fallback] with tail = 5 as a safety net for complex filters

Step 4: Write test cases and verify

Create a <stem>_test/ directory adjacent to the filter TOML and add at least one test case per meaningful outcome (success, failure, edge cases):

filters/mytool/
  mysubcmd.toml       ← filter config
  mysubcmd_test/      ← test suite
    success.toml
    failure.toml

Each test case is a TOML file:

name = "success shows one clean line"
fixture = "tests/fixtures/mytool_success.txt"  # path relative to this file, then CWD
exit_code = 0

[[expect]]
equals = "ok ✓"

[[expect]]
not_contains = "noise"

Or with an inline fixture (no file needed):

name = "known error message"
inline = "Error: connection refused\n"
exit_code = 1

[[expect]]
contains = "connection refused"

Run the suite:

tokf verify mytool/mysubcmd   # run one suite
tokf verify                   # run all suites

For quick one-off testing without creating test files:

tokf test filters/mytool/mysubcmd.toml tests/fixtures/mytool_output.txt --exit-code 0

Step 5: Place and name the file correctly

Two-word command: .tokf/filters/mytool/mysubcmd.toml
Single-word command: .tokf/filters/mytool.toml
Wildcard command: .tokf/filters/mytool/run.toml with command = "mytool run *"

Section 8 — Three Annotated Examples

Example 1: `git push` (Level 1 — match_output + extract)

Goal: 15 lines of push noise → "ok ✓ main" (or failure message).

# filters/git/push.toml — Level 1
# Raw output: 15 lines of object counting, compression, "remote:" lines
# Filtered (success): "ok ✓ main"
# Filtered (up-to-date): "ok (up-to-date)"
# Filtered (rejected): "✗ push rejected (try pulling first)"

command = "git push"

# Check full output for well-known outcomes before any processing
match_output = [
  { contains = "Everything up-to-date", output = "ok (up-to-date)" },
  { contains = "rejected", output = "✗ push rejected (try pulling first)" },
]

[on_success]
# Drop all the noise lines
skip = [
  "^Enumerating objects:",
  "^Counting objects:",
  "^Delta compression",
  "^Compressing objects:",
  "^Writing objects:",
  "^Total \\d+",
  "^remote:",
  "^To ",
]
# Extract the branch name from the ref update line: "abc1234..def5678  main -> main"
extract = { pattern = '(\S+)\s*->\s*(\S+)', output = "ok ✓ {2}" }

[on_failure]
tail = 10

Key decisions:

match_output handles the two most common "instant" outcomes
extract captures the branch name from the ref update line
tail = 10 on failure gives enough context without overwhelming

Example 2: `git status` (Level 2 — parse + group)

Goal: 30+ lines of verbose status → branch name + grouped file counts.

# filters/git/status.toml — Level 2
# Raw output: 30+ lines with hints, file paths, status codes
# Filtered: "main [ahead 2]\n  modified: 3\n  untracked: 2"

command = "git status"

# Override: use porcelain format for reliable machine parsing
run = "git status --porcelain -b"

match_output = [
  { contains = "not a git repository", output = "Not a git repository" },
]

[parse]
# First line: "## main...origin/main [ahead 2]"
# Extract: branch name, upstream, ahead/behind info
branch = { line = 1, pattern = '## (\S+?)(?:\.\.\.(\S+))?(?:\s+\[(.+)\])?$', output = "{1}" }

[parse.group]
# Group remaining lines by their two-character status code
key = { pattern = '^(.{2}) ', output = "{1}" }
labels = {
  "M " = "modified",
  " M" = "modified (unstaged)",
  "MM" = "modified (staged+unstaged)",
  "A " = "added",
  "??" = "untracked",
  "D " = "deleted",
  " D" = "deleted (unstaged)",
  "R " = "renamed",
  "UU" = "conflict",
  "AM" = "added+modified"
}

[output]
format = """
{branch}{tracking_info}
{group_counts}"""
group_counts_format = "  {label}: {count}"
empty = "clean — nothing to commit"

Key decisions:

run overrides to porcelain format — machine-readable is easier to parse
[parse] extracts the branch header line declaratively
[parse.group] groups by status code without needing [[section]]
[output] uses built-in {group_counts} variable populated by the parser

Example 3: `cargo test` (Level 4 — section + chunk + aggregates + tree)

Goal: 200+ lines with compile noise, per-test "ok" lines, failure blocks → per-crate tree summary on pass, structured failure report on fail.

# filters/cargo/test.toml — Level 4
# Raw output: 200+ lines
# Filtered (pass): "✓ cargo test: 1279 passed, 0 failed, 119 ignored (42 suites)"
#                   with per-crate tree breakdown showing individual test suites
# Filtered (fail): failure details + summary

command = "cargo test"
strip_ansi = true

# Drop all the noise
skip = [
  "^\\s*Compiling ",
  "^\\s*Downloading ",
  "^\\s*Downloaded ",
  "^\\s*Finished ",
  "^\\s*Locking ",
  "^running \\d+ tests?$",
  "^test .+ \\.\\.\\. ok$",   # individual passing tests
  "^\\s*$",
  "^\\s*Doc-tests ",
]

# State machine: collect the "failures:" section into blocks split by blank lines
[[section]]
name = "failures"
enter = "^failures:$"
exit = "^failures:$"
split_on = "^\\s*$"
collect_as = "failure_blocks"

# Collect "test result: ok/FAILED" summary lines (one per test suite)
[[section]]
name = "summary"
match = "^test result:"
collect_as = "summary_lines"

# Chunk processing: per-crate breakdown from "Running" headers.
# "unittests" lines define crate boundaries; integration test suites
# inherit the crate name via carry_forward.
[[chunk]]
split_on = "^\\s*Running "
include_split_line = true
collect_as = "suites_detail"
group_by = "crate_name"
children_as = "children"

[chunk.extract]
pattern = 'unittests.+deps/([\w_-]+)-'
as = "crate_name"
carry_forward = true

[[chunk.body_extract]]
pattern = 'Running\s+(.+?)\s+\('
as = "suite_name"

[[chunk.aggregate]]
pattern = '(\d+) passed'
sum = "passed"

[[chunk.aggregate]]
pattern = '(\d+) failed'
sum = "failed"

[[chunk.aggregate]]
pattern = '(\d+) ignored'
sum = "ignored"

[[chunk.aggregate]]
pattern = '^test result:'
count_as = "suite_count"

# Success: aggregate summaries + per-crate tree breakdown
[on_success]
output = "✓ cargo test: {passed} passed, {failed} failed, {ignored} ignored ({suites} suites)\n{suites_detail | each: \"  {crate_name}: {passed} passed ({suite_count} suites)\\n{children | each: \\\"    {suite_name}: {passed} passed\\\" | join: \\\"\\\\n\\\"}\" | join: \"\\n\"}"

[[on_success.aggregates]]
from = "summary_lines"
pattern = 'ok\. (\d+) passed'
sum = "passed"
count_as = "suites"

[[on_success.aggregates]]
from = "summary_lines"
pattern = '(\d+) failed'
sum = "failed"

[[on_success.aggregates]]
from = "summary_lines"
pattern = '(\d+) ignored'
sum = "ignored"

# Failure: show failure details + summary
[on_failure]
output = "✗ cargo test: {passed} passed, {failed} failed ({suites} suites)\n\nFAILURES ({failure_blocks.count}):\n{failure_blocks | each: \"\\n── {index}. ──\\n{value}\" | join: \"\\n\"}\n\n{summary_lines | join: \"\\n\"}"

[[on_failure.aggregates]]
from = "summary_lines"
pattern = '(\d+) passed'
sum = "passed"
count_as = "suites"

[[on_failure.aggregates]]
from = "summary_lines"
pattern = '(\d+) failed'
sum = "failed"

[fallback]
tail = 5

Key decisions:

skip removes all per-test "ok" lines — only failures and summaries remain
[[section]] collectors handle failure blocks and summary lines
[[chunk]] splits on Running headers, extracts crate names from unittests lines
carry_forward = true makes integration test suites inherit the crate name from the preceding unit test suite
children_as = "children" preserves per-suite detail within each crate group
[[on_success.aggregates]] (plural) sums passed/failed/ignored across all suite summary lines
Nested each pipes produce tree output: crate → suites
[fallback] catches edge cases (compile errors with no test output)

Section 9 — Writing Test Cases

Every filter in the standard library has a <stem>_test/ directory with declarative test cases. When writing or modifying a filter, write test cases alongside it.

Test case format

name = "success output is a single clean line"   # required, human-readable
fixture = "tests/fixtures/cargo_build_success.txt"  # path to raw output file
# inline = "some raw output\nline two"            # alternative: inline fixture
exit_code = 0                                    # optional, default 0
args = []                                        # optional, forwarded to filter

[[expect]]
equals = "ok ✓"          # exact match

[[expect]]
not_contains = "Compiling"  # noise must be gone

Assertion types

| Field | Description | |---|---| | equals | Output exactly equals this string | | contains | Output contains this substring | | not_contains | Output does not contain this substring | | starts_with | Output starts with this string | | ends_with | Output ends with this string | | line_count | Output has exactly N non-empty lines | | matches | Output matches this regex | | not_matches | Output does not match this regex |

Every [[expect]] entry checks one assertion. A test case with multiple [[expect]] entries must pass all of them. A test case with no [[expect]] entries is an error.

What to test

For every filter, write at least:

Success case: the happy path produces the expected one-liner or summary
Failure case: a failing exit code produces enough context to diagnose
Edge cases: cover each match_output branch (e.g., "up-to-date", "rejected")

Directory convention

filters/
  git/
    push.toml          ← filter config
    push_test/         ← test suite (identified by _test suffix)
      success.toml
      up_to_date.toml
      rejected.toml
      failure.toml

The _test suffix makes suite directories immediately identifiable in file listings and distinguishes them from filter category directories.

Section 10 — Common Mistakes to Avoid

Don't use keep when skip is enough. keep is an allow-list — it drops everything that doesn't match. Use it only when you want to radically filter to a specific type of line.
Escape backslashes in TOML strings. In regular strings, \\d means literal \d in the regex. In TOML raw strings ('...'), backslashes are literal. Use raw strings for complex patterns.
match_output is a short-circuit. If it matches, nothing else runs. Don't put it at the end expecting it to be a fallback — it runs first.
[[section]] and [parse] are mutually exclusive. Use one or the other, not both.
{output} in branch templates is the filtered output text (after skip/keep/replace/dedup), not the raw command output.
Pipe chains need careful quoting. When nesting templates inside each:, escape inner quotes: {each: "{value | lines | keep: \\\"^error\\\"}"}.
Don't skip the [fallback]. Complex filters with [[section]] can produce empty output if sections don't match. Always add [fallback] tail = 5 as a safety net.
Test with realistic fixture data. A filter that works on a trimmed example may miss edge cases. Use real command output saved to a .txt fixture file.

utils:tokf-filter

tokf Filter Authoring

Section 1 — What a Filter File Is

Section 2 — Processing Order

Section 3 — Top-Level Fields Reference

Section 4 — Step Types

4.1 match_output — Whole-Output Short-Circuit

4.2 [[replace]] — Per-Line Regex Transforms

4.3 skip / keep — Line Filtering

4.4 dedup / dedup_window — Deduplication

4.5 lua_script — Luau Escape Hatch

4.6 [[section]] — State-Machine Section Collector

4.7 [[chunk]] — Block-Based Structured Extraction

4.8 [parse] — Declarative Structured Parser

4.9 [on_success] / [on_failure] — Exit Code Branches

4.10 [fallback] — Last Resort

4.11 [[variant]] — Context-Aware Filter Delegation

Section 5 — Template Pipes

Section 6 — Naming & Placement Conventions

Section 7 — Workflow for Creating a New Filter

Step 1: Understand the command's output

Step 2: Choose the right complexity level

Step 3: Draft the filter

Step 4: Write test cases and verify

Step 5: Place and name the file correctly

Section 8 — Three Annotated Examples

Example 1: git push (Level 1 — match_output + extract)

Example 2: git status (Level 2 — parse + group)

Example 3: cargo test (Level 4 — section + chunk + aggregates + tree)

Section 9 — Writing Test Cases

Test case format

Assertion types

What to test

Directory convention

Section 10 — Common Mistakes to Avoid

4.1 `match_output` — Whole-Output Short-Circuit

4.2 `[[replace]]` — Per-Line Regex Transforms

4.3 `skip` / `keep` — Line Filtering

4.4 `dedup` / `dedup_window` — Deduplication

4.5 `lua_script` — Luau Escape Hatch

4.6 `[[section]]` — State-Machine Section Collector

4.7 `[[chunk]]` — Block-Based Structured Extraction

4.8 `[parse]` — Declarative Structured Parser

4.9 `[on_success]` / `[on_failure]` — Exit Code Branches

4.10 `[fallback]` — Last Resort

4.11 `[[variant]]` — Context-Aware Filter Delegation

Example 1: `git push` (Level 1 — match_output + extract)

Example 2: `git status` (Level 2 — parse + group)

Example 3: `cargo test` (Level 4 — section + chunk + aggregates + tree)