Text to Voice Local
Use the bundled scripts to create one stable local text-to-voice path.
Goal
Produce voice from any text file with:
- one canonical input text path,
- one canonical output mp3 path,
- one high-level wrapper for routine use,
- low-level scripts for debugging only.
Canonical paths
Default input text:
tmp/text-to-voice-input.txt
Canonical output:
tmp/voice-mode-latest.mp3
State directory:
skills/text-to-voice-local/state/
State pointer:
skills/text-to-voice-local/state/last-output.txt
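The canonical output and the state pointer can be pictured with a minimal sketch. The helper below is hypothetical and is not the bundled voice_reply.sh. It assumes that last-output.txt holds a single line containing the path of the most recent mp3, which the text above does not confirm.

```shell
# Hypothetical helper: publish a freshly generated mp3 as the canonical
# "latest" file and record its path in the state pointer.
# Assumption: last-output.txt holds one line with the output path.
publish_latest() {
  local new_mp3="$1"
  local canonical="${2:-tmp/voice-mode-latest.mp3}"
  local pointer="${3:-skills/text-to-voice-local/state/last-output.txt}"
  local staged

  # Refuse to overwrite the canonical file with an empty or missing result.
  if [ ! -s "$new_mp3" ]; then
    echo "refusing to publish empty file: $new_mp3" >&2
    return 1
  fi

  mkdir -p "$(dirname "$canonical")" "$(dirname "$pointer")"
  # Stage next to the target, then rename, so a reader never sees a
  # half-written canonical file.
  staged="$canonical.tmp.$$"
  cp "$new_mp3" "$staged" || return 1
  mv -f "$staged" "$canonical" || return 1
  printf '%s\n' "$canonical" > "$pointer"
}
```

Staging and renaming keeps the canonical path usable even if a run is interrupted partway through the copy.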
Main command
For normal use, run:
scripts/text_to_voice.sh voice <text-file> [voice] [max_direct_chars]
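One plausible reading of the optional max_direct_chars argument is that it sets the length at which the wrapper switches from single-pass synthesis to the chunked path. The sketch below illustrates that reading only. The function name pick_mode is hypothetical, and the 280 fallback is borrowed from the example invocation rather than taken from a documented default.

```shell
# Hypothetical helper: decide whether a text file is short enough for one
# synthesis pass or should go through the chunked path.
# Assumption: max_direct_chars is an inclusive upper bound on characters.
pick_mode() {
  local text_file="$1"
  local max_direct_chars="${2:-280}"
  local chars
  # wc -m counts characters (not bytes) according to the current locale.
  chars=$(wc -m < "$text_file" | tr -d '[:space:]')
  if [ "$chars" -le "$max_direct_chars" ]; then
    echo direct
  else
    echo chunked
  fi
}
```

For example, `pick_mode ./tmp/text-to-voice-input.txt 280` would print either `direct` or `chunked`, depending on the length of the file.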
Useful helpers:
scripts/text_to_voice.sh status
scripts/text_to_voice.sh voices
The status helper also checks runtime dependencies and prints install hints when something is missing.
Examples:
scripts/text_to_voice.sh text
scripts/text_to_voice.sh voice ./tmp/text-to-voice-input.txt
scripts/text_to_voice.sh voice ./tmp/text.txt ru-RU-SvetlanaNeural 280
What the scripts do
- scripts/text_to_voice.sh - high-level entrypoint for normal use
- scripts/tts_from_file.sh - one text file to one mp3
- scripts/tts_from_file_chunked.sh - long text to multiple chunks and final merged mp3
- scripts/voice_reply.sh - safe wrapper that updates canonical output and pointer
- scripts/voice_reply_latest.sh - always refresh canonical latest mp3
- state/text-to-voice.json - stores default voice, max chars, and canonical paths
- scripts/edge_tts.js - low-level TTS helper used by the file wrappers
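The chunked path (long text to multiple chunks, then one merged mp3) could be sketched as follows. This is an illustration, not the contents of tts_from_file_chunked.sh. All function names and the chunk file layout are assumptions; the only external tools used are fold and the ffmpeg concat demuxer.

```shell
# Hypothetical helpers; names and file layout are assumptions for illustration.

# Split a text file into chunk files of at most max_chars columns each,
# breaking at spaces. Some fold implementations count bytes rather than
# characters, so non-ASCII text may need a smaller limit.
split_text() {
  local text_file="$1" max_chars="$2" out_dir="$3"
  local i=0 line
  mkdir -p "$out_dir"
  fold -s -w "$max_chars" "$text_file" |
    while IFS= read -r line || [ -n "$line" ]; do
      [ -n "$line" ] || continue   # skip blank lines
      i=$((i + 1))
      printf '%s\n' "$line" > "$out_dir/chunk-$(printf '%03d' "$i").txt"
    done
}

# Write an ffmpeg concat-demuxer list for every mp3 in a directory, in name
# order. Pass an absolute mp3_dir: relative entries are resolved against the
# directory of the list file.
write_concat_list() {
  local mp3_dir="$1" list_file="$2" f
  : > "$list_file"
  for f in "$mp3_dir"/*.mp3; do
    [ -e "$f" ] || continue
    printf "file '%s'\n" "$f" >> "$list_file"
  done
}

# Merge the listed mp3 files into one output without re-encoding.
merge_chunks() {
  ffmpeg -y -f concat -safe 0 -i "$1" -c copy "$2"
}
```

Each chunk text file would be synthesized to an mp3 of the same base name before the list is written. Zero-padded names keep the merge order equal to the reading order.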
Install notes
Ensure these dependencies exist on the target machine:
- node
- ffmpeg
- node-edge-tts
The skill checks these at runtime and, if something is missing, prints suggested install commands instead of failing silently.
Verify:
node -v
ffmpeg -version
node -e "require('node-edge-tts'); console.log('node-edge-tts ok')"
If node-edge-tts is missing:
npm i -g node-edge-tts
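The runtime check described above might look roughly like the sketch below. It is a hypothetical reconstruction, not the code behind the status helper. The function names are invented for illustration, and the hint for node and ffmpeg is deliberately generic because the right install command depends on the platform.

```shell
# Hypothetical helper: print each given command name that is not on PATH.
missing_cmds() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Hypothetical status check: report missing dependencies with install hints
# and return non-zero if anything is missing.
check_deps() {
  local missing rc=0
  missing=$(missing_cmds node ffmpeg | tr '\n' ' ')
  if [ -n "$missing" ]; then
    echo "missing commands: $missing" >&2
    echo "hint: install them with your system package manager" >&2
    rc=1
  fi
  # The module check only makes sense once node itself is available.
  if command -v node >/dev/null 2>&1 &&
     ! node -e "require('node-edge-tts')" >/dev/null 2>&1; then
    echo "missing module: node-edge-tts" >&2
    echo "hint: npm i -g node-edge-tts" >&2
    rc=1
  fi
  return "$rc"
}
```

If the require check still fails after a global install, Node may not be searching the global module directory. In that case, setting NODE_PATH to the output of `npm root -g` is one possible fix.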
Setup steps on another OpenClaw instance
- Copy this skill folder into the target workspace skills directory.
- Make scripts executable.
- Ensure tmp/ exists.
- Put text into a txt file.
- Run the main command.
Minimal setup:
chmod +x skills/text-to-voice-local/scripts/*.sh
mkdir -p tmp
skills/text-to-voice-local/scripts/text_to_voice.sh voice ./tmp/text-to-voice-input.txt
Delivery rule
If the result is sent as Telegram voice, send only the canonical file:
./tmp/voice-mode-latest.mp3
Prefer sending text and voice as separate messages.
Important constraint
Progress printed by the shell scripts is useful for terminal diagnostics, but live progress editing on the chat side depends on OpenClaw preview streaming, not on shell stdout alone.
When to use low-level scripts
Use low-level scripts only for debugging or careful manual control. Default to the high-level wrapper unless there is a reason not to.