galdr
Use this skill when an OpenClaw agent needs to analyze music from a YouTube URL or local audio file and produce a grounded listening-experience prompt from measurable audio structure.
galdr is a music perception CLI for AI agents. Its default workflow is ARC: analyze a track into time-ordered listener-state traces, then assemble those traces into a prompt for grounded listening-experience prose. The metrics are evidence. The ARC prompt is the main user-facing output.
Important: skill vs CLI
Installing this skill teaches OpenClaw how to use galdr. It does not install the galdr command itself.
Before starting:
galdr --version
If missing, install the CLI from a trusted source:
pip install galdr
# or from source:
git clone https://github.com/sellemain/galdr.git
cd galdr
pip install -e .
Preferred trusted sources:
If provenance matters, verify the PyPI metadata or install from the source repository before running it.
When to use this skill
Use galdr when the user asks to:
- analyze a song or music video
- describe what makes a track work structurally
- generate a grounded listening experience
- compare two tracks
- extract frames around structural moments in a music video
- create an evidence packet for another model to write from
Do not use galdr for:
- general music trivia
- ordinary recommendation lists
- purely lyrical interpretation without audio structure
- pretending the metrics prove private emotional intent
- downloading copyrighted audio unless the operator has appropriate rights/context
OpenClaw agent contract
Prefer the ARC path unless the user explicitly asks for raw metrics, comparison, debugging, or agent-internal traces.
Default sequence:
- Fetch or listen to the track.
- Analyze it into listener-state traces.
- Assemble the ARC prompt with
--template arc --mode full. - Review the prompt.
- Write the listening experience yourself or pass the prompt to the requested model.
The stream is evidence. Walk the track through time before summarizing. Do not invent emotional claims that the structure does not support.
Core Workflows
YouTube URL → ARC prompt (most common)
# Step 1: fetch audio + context (slug auto-derived from title)
galdr fetch "https://youtu.be/..." --analyze
# galdr prints the slug at the end:
# Slug : artist-song-title
# Next : galdr assemble artist-song-title --template arc --mode full
# Step 2: assemble the prompt locally
galdr assemble artist-song-title --template arc --mode full > prompt.txt
Override auto-derived metadata if needed:
galdr fetch "https://youtu.be/..." --artist "Oliver Anthony" --title "Rich Men North of Richmond" --analyze
If YouTube download behavior is flaky:
galdr doctor
galdr update-deps
galdr doctor reports the active Python executable, yt-dlp command/version, ffmpeg/ffprobe, JavaScript runtimes, and impersonation support. galdr update-deps upgrades yt-dlp[default,curl-cffi] in the same Python environment galdr is using.
Local file → ARC prompt
The analysis command is
galdr listen, notgaldr analyze.
galdr listen track.wav --name my-track
galdr assemble my-track --template arc --mode full > prompt.txt
Raw second-by-second analysis (advanced)
Galdr is strongest when read as a time-ordered listener-state trace. The stream is the primary evidence. Whole-track interpretation comes after walking the track through time.
Start with:
analysis/<slug>/<slug>_stream.jsonanalysis/<slug>/<slug>_perception.jsondocs/PERCEPTION-MODEL.md
Useful extras:
*_harmony_stream.json*_melody_stream.json*_overtone_stream.json*_report.jsongaldr assemble <slug> --mode blind
Reading order:
- Read
PERCEPTION-MODEL.mdfirst. - Treat
*_stream.jsonas the main evidence surface. - Walk the track in order.
- Mark transitions: silence, re-entry, pattern breaks, attention shifts, pressure-state changes, harmonic movement.
- Translate pressure fields into listening language: comes forward, holds, releases, empties. Do not quote LUFS values in experience prose.
- Only then compress upward into a larger interpretation.
Do not:
- jump straight to a whole-song mood summary
- treat summary metrics as more important than the stream
- ignore silence/re-entry structure
- overclaim emotional certainty from structure alone
- quote loudness/LUFS readings as if they were the experience
Minimal recipe:
galdr listen track.wav --name my-track
jq '.[0:12]' analysis/my-track/my-track_stream.json
jq '.summary' analysis/my-track/my-track_perception.json
galdr assemble my-track --mode blind > prompt.txt
Send the ARC prompt to another model
Only do this if the operator explicitly wants model-written prose. Review the assembled ARC prompt before piping it to claude, llm, or any other external model endpoint.
galdr assemble my-track --template arc --mode full | claude
galdr assemble my-track --template arc --mode full | llm
Optional Python agent pattern
import subprocess, re
fetch = subprocess.run(
["galdr", "fetch", url, "--analyze"],
capture_output=True, text=True, check=True
)
slug = re.search(r"Slug\s*:\s*(\S+)", fetch.stdout).group(1)
prompt = subprocess.run(
["galdr", "assemble", slug, "--template", "arc", "--mode", "full"],
capture_output=True, text=True, check=True
).stdout
# Review prompt before sending it to any external model endpoint.
Mode and template flags
| Mode | What's included |
|------|----------------|
| full (default) | metrics + lyrics + background + frames |
| lyrics | metrics + lyrics |
| context | metrics + background |
| blind | metrics only (structural, no cultural context) |
--template arc prepends the default listening-experience rules: tone, format, interpretation bounds, and the instruction to walk the track through time. Omit it only when you want a raw data block.
Interpreting galdr output
ARC is the default output path. The metrics exist to keep that prose grounded: use them as evidence for what changes, returns, releases, locks, or breaks over time.
See references/metrics.md for full metric reference.
Quick read:
patternnear 1.0 → listener is locked; near 0 → constant disruptiontexturenegative → harmonic dominant (warm, tonal); positive → percussive dominantpressure_balancebuilding/releasing/sustaining → heard-pressure shape across the track- Clustered
pattern_breaksat the end → planned release; distributed → varied structure silencedepth below -60dB with re-lock above 0.93 attention → structured withdrawal/return
Writing ARC experience prose yourself
When writing experience prose yourself from galdr evidence, prefer galdr assemble <slug> --template arc --mode full. If you are writing from raw assembled output without the template:
- First-person listener perspective, present tense
- Timestamps only at structural pivots: silences, pattern breaks, major energy shifts
- Translate metrics; describe what they mean, do not quote numbers
- LUFS/pressure values are evidence, not prose; write “pressure comes forward / holds / releases / empties”
- Body anchors such as chest, jaw, sternum sparingly; two or three for the whole piece
- End at the final sound event; no aftermath, no reflection
- Around 800 words, no section headers
Other commands
galdr compare track-a track-b # side-by-side structural comparison
galdr frames slug # extract + describe video frames at structural moments
galdr fetch "url" --no-download # context only (Wikipedia + lyrics), no audio
galdr fetch "url" --censor # sanitize explicit lyrics before saving
galdr doctor # inspect yt-dlp/media runtime health
galdr update-deps # upgrade yt-dlp reliability extras
galdr catalog # list all indexed tracks
galdr catalog --track NAME # summary card for one track
微信扫一扫