Setup VoiceMode

Install and configure VoiceMode MCP for voice interactions in the target client.

Steps

Install VoiceMode:

uvx voice-mode-install --yes

Add the MCP server to the target client:

claude mcp add --scope user voicemode -- uvx --refresh voice-mode

Configure local endpoints (Kokoro TTS + Whisper STT):

voicemode config set VOICEMODE_TTS_BASE_URLS http://127.0.0.1:8880/v1
voicemode config set VOICEMODE_STT_BASE_URLS http://127.0.0.1:2022/v1
voicemode config set VOICEMODE_PREFER_LOCAL true
voicemode config set VOICEMODE_ALWAYS_TRY_LOCAL true

This step is critical. Unless both _BASE_URLS settings are set explicitly, the default endpoint list includes https://api.openai.com/v1 as a fallback, and requests crash with OPENAI_API_KEY errors even when the local services are running.
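As a rough way to confirm that a _BASE_URLS value contains no cloud fallback, the sketch below checks that every entry in a comma-separated list points at a loopback host. The function name is_local_only is hypothetical (it is not part of VoiceMode), and the check assumes each URL carries an explicit port, as the endpoints above do.

```shell
# Hypothetical helper, not part of the voicemode CLI.
# Succeeds only if every URL in a comma-separated list targets loopback.
# Runs in a subshell so the IFS and globbing changes do not leak out.
is_local_only() (
  [ -n "$1" ] || exit 1          # an empty list is treated as "not local-only"
  IFS=,                          # split the list on commas
  set -f                         # disable globbing while splitting
  for url in $1; do
    case $url in
      http://127.0.0.1:*|http://localhost:*) ;;   # loopback entry, keep going
      *) exit 1 ;;                                # anything else fails the check
    esac
  done
  exit 0
)
```

For example, `is_local_only "http://127.0.0.1:8880/v1"` succeeds, while a list that also contains https://api.openai.com/v1 fails.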

Verify installation:

claude mcp list

Test voice mode:

Restart the target client
If the target runtime exposes a VoiceMode tool, use it to verify; otherwise confirm voice input and output directly in the restarted client
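The "Verify installation" step can also be scripted. The sketch below only looks for the server name used in the claude mcp add command above; it assumes nothing about the rest of the claude mcp list output format. The function name mcp_has_server is hypothetical.

```shell
# Hypothetical helper: reads `claude mcp list` output on stdin and
# succeeds if any line contains the given server name as a fixed string.
mcp_has_server() {
  grep -qF -- "$1"
}

# Usage (requires the claude CLI to be installed):
#   claude mcp list | mcp_has_server voicemode && echo "voicemode is registered"
```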

First Run Note

Kokoro TTS may take 5+ minutes to load on first run while it downloads and initializes the model (~111MB). Check status with:

voicemode service kokoro status

Two MCP restarts required:

After initial setup (the "Test voice mode" step)
After Kokoro model finishes downloading

Without the second restart, you may get "OpenAI API key" errors even with local config.
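To time the second restart, one option is to poll until Kokoro responds. The sketch below is a generic retry loop; wait_until is a hypothetical name, and the probe command you pass to it is your choice. The curl probe shown in the comment assumes the Kokoro server answers on a /v1/models path, which this document does not confirm.

```shell
# Hypothetical helper: wait_until <timeout_s> <interval_s> <command...>
# Re-runs the command until it succeeds or the timeout elapses.
wait_until() {
  wu_timeout=$1
  wu_interval=$2
  shift 2
  wu_elapsed=0
  while ! "$@" >/dev/null 2>&1; do
    if [ "$wu_elapsed" -ge "$wu_timeout" ]; then
      return 1                       # gave up: probe never succeeded
    fi
    sleep "$wu_interval"
    wu_elapsed=$((wu_elapsed + wu_interval))
  done
  return 0
}

# Usage (the endpoint path is an assumption, adjust as needed):
#   wait_until 600 10 curl -fsS -o /dev/null http://127.0.0.1:8880/v1/models \
#     && echo "Kokoro is answering; restart the MCP client now"
```

A successful HTTP response only shows that the server is listening; voicemode service kokoro status remains the authoritative check.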

Configuration Options

Edit config with:

voicemode config edit

List all options:

voicemode config list

Key Settings

Setting	Description
`VOICEMODE_PREFER_LOCAL`	Prefer local providers over cloud (true/false)
`VOICEMODE_ALWAYS_TRY_LOCAL`	Always attempt local providers first (true/false)
`VOICEMODE_SAVE_AUDIO`	Save audio files (true/false, default: false)
`VOICEMODE_WHISPER_MODEL`	Whisper model (tiny, base, small, medium, large-v2)
`VOICEMODE_KOKORO_DEFAULT_VOICE`	Default voice (e.g., af_sky)
`OPENAI_API_KEY`	Required only for cloud processing

Provider Options

Local-only (recommended; this is what the setup steps above configure): Set VOICEMODE_TTS_BASE_URLS=http://127.0.0.1:8880/v1 and VOICEMODE_STT_BASE_URLS=http://127.0.0.1:2022/v1 (no API key needed)
Cloud-only: Set OPENAI_API_KEY and set URLs to https://api.openai.com/v1
Hybrid (local-first, cloud fallback): Set OPENAI_API_KEY and set URLs to http://127.0.0.1:8880/v1,https://api.openai.com/v1 (TTS) and http://127.0.0.1:2022/v1,https://api.openai.com/v1 (STT)
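The three provider modes differ only in the URL lists they produce. As an illustration, the sketch below builds the value for each mode from the endpoints listed above; provider_urls and the LOCAL_TTS, LOCAL_STT, and OPENAI_URL variables are hypothetical names introduced for this example.

```shell
# Endpoints taken from this document.
LOCAL_TTS=http://127.0.0.1:8880/v1
LOCAL_STT=http://127.0.0.1:2022/v1
OPENAI_URL=https://api.openai.com/v1

# Hypothetical helper: provider_urls <local|cloud|hybrid> <tts|stt>
# Prints the comma-separated URL list for that mode and service.
provider_urls() {
  case $2 in
    tts) pu_local=$LOCAL_TTS ;;
    stt) pu_local=$LOCAL_STT ;;
    *)   echo "unknown service: $2" >&2; return 2 ;;
  esac
  case $1 in
    local)  printf '%s\n' "$pu_local" ;;
    cloud)  printf '%s\n' "$OPENAI_URL" ;;
    hybrid) printf '%s,%s\n' "$pu_local" "$OPENAI_URL" ;;   # local first, cloud fallback
    *)      echo "unknown mode: $1" >&2; return 2 ;;
  esac
}

# Usage:
#   voicemode config set VOICEMODE_TTS_BASE_URLS "$(provider_urls hybrid tts)"
#   voicemode config set VOICEMODE_STT_BASE_URLS "$(provider_urls hybrid stt)"
```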

Troubleshooting

OpenAI API key error: Ensure VOICEMODE_TTS_BASE_URLS and VOICEMODE_STT_BASE_URLS point to local endpoints only (see "Configure local endpoints" under Steps). The PREFER_LOCAL flag alone is NOT sufficient: it does not remove OpenAI from the fallback chain
Kokoro stuck "starting up": Wait 5+ minutes on first run, or check the logs with: voicemode service kokoro logs
macOS M3 crash: Known issue with ggml_metal; use CPU mode
WSL audio issues: Install PulseAudio packages
Slow transcription: Use GPU acceleration or smaller Whisper model

Improved Accuracy (Optional)

The default tiny model is fast but less accurate. For better transcription:

Model	Size	Accuracy	Speed
tiny	75MB	~70%	Fastest
small	466MB	~82%	Fast
medium	1.4GB	~88%	Moderate

voicemode config set VOICEMODE_WHISPER_MODEL small
# or for best accuracy:
voicemode config set VOICEMODE_WHISPER_MODEL medium

Restart Whisper service after changing:

voicemode service whisper restart
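If model size is the main constraint, the choice can be made mechanically from the table above. The sketch below picks the largest listed model that fits a size budget in MB; pick_whisper_model is a hypothetical name, and the 1.4GB entry is rounded to 1400 MB for the comparison.

```shell
# Hypothetical helper: pick_whisper_model <budget_mb>
# Prints the largest model from the table whose size fits the budget.
pick_whisper_model() {
  pw_budget=$1
  pw_choice=
  # name:size_mb pairs in ascending size order (sizes from the table above)
  for pw_entry in tiny:75 small:466 medium:1400; do
    pw_name=${pw_entry%%:*}
    pw_size=${pw_entry##*:}
    if [ "$pw_size" -le "$pw_budget" ]; then
      pw_choice=$pw_name
    fi
  done
  [ -n "$pw_choice" ] || return 1    # nothing fits the budget
  printf '%s\n' "$pw_choice"
}

# Usage:
#   voicemode config set VOICEMODE_WHISPER_MODEL "$(pick_whisper_model 500)"
#   voicemode service whisper restart
```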

macOS Metal GPU Acceleration (Optional)

For significantly faster transcription on Apple Silicon, convert the Whisper model to Core ML:

Prerequisites

# Install whisper.cpp via Homebrew
brew install whisper-cpp

# Set Whisper directory
WHISPER_DIR=~/.voicemode/services/whisper

Steps

1. Download model

cd "$WHISPER_DIR/models"
./download-ggml-model.sh medium

2. Install Python dependencies

pip3 install torch coremltools openai-whisper ane_transformers

3. Convert to Core ML

cd "$WHISPER_DIR"
./models/generate-coreml-model.sh medium

4. Update config

voicemode config set VOICEMODE_WHISPER_MODEL medium

5. Restart Whisper

voicemode service whisper restart

Verification

# Check Core ML model exists
ls -la "$WHISPER_DIR/models/ggml-medium-encoder.mlmodelc"

When running, logs should show: GPU: Metal, Core ML: Enabled
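The file check above can be wrapped in a small function that works for any model name. This is a sketch: coreml_model_ready is a hypothetical name, and the ggml-<model>.bin filename is an assumption based on whisper.cpp's usual naming for downloaded models.

```shell
# Hypothetical helper: coreml_model_ready <models_dir> <model>
# Succeeds if both the ggml weights and the compiled Core ML encoder exist.
coreml_model_ready() {
  cm_dir=$1
  cm_model=$2
  # Assumed weights filename; the .mlmodelc path matches the ls check above
  # and is a directory, not a single file.
  [ -f "${cm_dir}/ggml-${cm_model}.bin" ] &&
    [ -d "${cm_dir}/ggml-${cm_model}-encoder.mlmodelc" ]
}

# Usage:
#   coreml_model_ready "$WHISPER_DIR/models" medium && echo "Core ML model present"
```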
