Introducing Dictamac

Voice captured on Apple Watch, processed by Claude Code, enabled by one binary.

Back in January, I posted about wiring my Apple Watch up to Obsidian. I built two MCP servers (apple-voice-memo-mcp, whisper-mcp), an agent processed my recordings on demand, and I felt clever. The stack worked. It was also Node + npx + ffmpeg + whisper.cpp + two MCP servers + Full Disk Access, and every npm install or brew upgrade was a chance for it to drift.

macOS 26 ships SpeechAnalyzer, which I already wrote about for Steno. On-device, fast, no model to download, no Node, no whisper.cpp. I wanted the watch → Obsidian flow rebuilt on that: one signed Swift binary that handles both the Voice Memos lookup and the transcription, with CLI and MCP in the same process.

That’s dictamac.

The Loop That Actually Runs

I press the action button on my Apple Watch Ultra. Big, orange, impossible to miss. Voice Memos starts recording immediately.
I talk: a grocery list, a reminder to email someone, a half-baked product idea, a shower thought about a project.
iCloud syncs the memo to my Mac in the background.
A cron-triggered Claude Code agent fires on a regular cadence. It calls dictamac to list new memos and transcribe them. Then it follows whatever each one said — appends to the right Obsidian file with the right wikilinks, drafts an email, files a task, sends a Slack message, whatever the transcript asked for.

The dictamac call is dictamac --json --voice-memo "<query>". One process, one signed binary, one transcript. The agent doesn’t know or care that there’s a SpeechAnalyzer wrapper, a CloudRecordings.db reader, a TCC permission probe, or a fallback filesystem scanner underneath. It asks for a transcript and gets one.

What dictamac Looks Like

brew install jwulff/tap/dictamac

dictamac path/to/audio.m4a
→ Hello, world, this is a test.

cat audio.m4a | dictamac -
→ (same thing, via stdin)

dictamac --voice-memo "yesterday"
→ (resolves "yesterday" against my Voice Memos library, returns the transcript)

dictamac --list-voice-memos --since 7d --limit 5
→ (reverse-chronological listing, plaintext or --json)

Transcript on stdout, errors on stderr, exit code reflects what happened. The agent pipes it, shells out to it, parses it. No daemon, no state between invocations, no audio files left behind.
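For a sense of what the calling side of that contract can look like, here is a minimal Python sketch. Only the --json and --voice-memo flags come from this post. The names run_cli, transcribe_voice_memo, CliResult, and the command parameter are hypothetical. The post does not describe the shape of the JSON output, so the sketch returns whatever json.loads produces without assuming any fields.

```python
import json
import subprocess
from dataclasses import dataclass
from typing import Any, List, Optional, Sequence


@dataclass
class CliResult:
    """The three channels a one-shot CLI reports through."""
    stdout: str
    stderr: str
    exit_code: int


def run_cli(argv: List[str], stdin_bytes: Optional[bytes] = None) -> CliResult:
    """Run a command once and keep stdout, stderr, and the exit code separate."""
    proc = subprocess.run(argv, input=stdin_bytes, capture_output=True)
    return CliResult(
        stdout=proc.stdout.decode("utf-8", errors="replace"),
        stderr=proc.stderr.decode("utf-8", errors="replace"),
        exit_code=proc.returncode,
    )


def transcribe_voice_memo(query: str, command: Sequence[str] = ("dictamac",)) -> Any:
    """Ask for a Voice Memo transcript as JSON (hypothetical helper).

    `command` is the executable plus any leading arguments, so a test can
    substitute a stand-in program. The JSON schema is an unknown here, so
    the parsed value is returned untouched.
    """
    result = run_cli([*command, "--json", "--voice-memo", query])
    if result.exit_code != 0:
        # A non-zero exit means stderr carries the explanation.
        raise RuntimeError(
            f"exit {result.exit_code}: {result.stderr.strip()}"
        )
    return json.loads(result.stdout)
```

Because nothing persists between invocations, the caller needs no setup or teardown: one subprocess per transcript, and the exit code alone decides whether stdout is worth parsing.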

The MCP Side

dictamac --mcp flips it into a JSON-RPC stdio server. Three tools:

transcribe_file({path}) — transcribe an audio file by path
transcribe_voice_memo({query}) — find a Voice Memo and transcribe it
list_voice_memos({since, limit}) — list recent Voice Memos with metadata

That’s the bit I wanted most. Plug it into the agent’s tool list, give it a hint about when to use it, done. Whatever errors the CLI emits to stderr, the MCP server emits in its isError: true tool response — same text, byte-for-byte. There’s literally a test that asserts they don’t drift.
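To make the wire format concrete, here is a rough Python sketch of what a client sends and reads for one tool call. The tools/call method and the content / isError result shape follow the MCP specification. The tool name and its query argument come from the list above. The helper names tool_call_request and read_tool_result are hypothetical, and a real client also has to complete the initialize handshake before calling any tool, which this sketch leaves out.

```python
import json
from typing import Any, Dict


def tool_call_request(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Build one newline-terminated JSON-RPC 2.0 message for an MCP tools/call."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # The stdio transport sends one JSON message per line.
    return json.dumps(message) + "\n"


def read_tool_result(line: str) -> str:
    """Return the text of a tools/call response, raising on either kind of error."""
    response = json.loads(line)
    if "error" in response:
        # Protocol-level failure (unknown method, malformed params, ...).
        raise RuntimeError(response["error"].get("message", "JSON-RPC error"))
    result = response["result"]
    text = "".join(
        part.get("text", "")
        for part in result.get("content", [])
        if part.get("type") == "text"
    )
    if result.get("isError"):
        # Tool-level failure: the text is the error message itself.
        raise RuntimeError(text)
    return text
```

Writing tool_call_request(1, "transcribe_voice_memo", {"query": "yesterday"}) to the server's stdin and passing the reply line to read_tool_result gives either the transcript or an exception carrying the error text, which per the paragraph above should match what the CLI prints to stderr.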

What’s Different from the January Stack

The January post had me running two MCP servers, npx-ing them on every Claude Code session, keeping whisper-cpp and ffmpeg healthy via Homebrew, granting Full Disk Access to find Voice Memos, and trusting that the OpenAI Whisper model on disk hadn’t drifted out of compatibility with whatever whisper.cpp had updated to.

dictamac is one brew install. No Node. No whisper.cpp. No ffmpeg. SpeechAnalyzer ships with the OS so there’s no model to manage. The Voice Memos lookup is built in, with the right TCC deep-link for Files & Folders access baked into the error message so a missing permission gets fixed in one click.

The glue has opinions, and they’re all the ones I learned the hard way the first time around:

Stage stdin into a temp file so SpeechAnalyzer (which only takes URLs) can read it, then clean up after.
Find the Voice Memos library across the two paths Apple stores it in, fall back gracefully when one doesn’t exist.
Read CloudRecordings.db directly when present (SQLite, much faster than a filesystem walk), fall back to a recursive *.m4a scan when the schema drifts or the file isn’t there.
Parse a query like “yesterday’s standup” into a date filter plus a fuzzy title match against the index.
Map every failure to a stable POSIX exit code so a shell pipeline knows whether to retry or bail.

All of that was scattered across whisper-mcp + apple-voice-memo-mcp + the shell scripts gluing them together. Now it’s one binary that does the right thing.
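As an illustration of the query-parsing item in that list, here is one way "a date filter plus a fuzzy title match" could be sketched in Python. This is not dictamac's implementation, which is Swift and not shown in this post. The function names, the (day, title) memo tuples, the two recognized date words, and the 0.5 similarity threshold are all assumptions made for the example, and difflib stands in for whatever fuzzy matcher the real code uses.

```python
from datetime import date, timedelta
from difflib import SequenceMatcher
from typing import List, Optional, Tuple

# (recording day, title): a stand-in for a row of the real memo index.
Memo = Tuple[date, str]

# Relative-date words this sketch understands, as days before today.
RELATIVE_DAYS = {"today": 0, "yesterday": 1}


def parse_query(query: str, today: date) -> Tuple[Optional[date], str]:
    """Split a query into an optional day filter and the remaining title text."""
    day = None
    title_words = []
    # Normalize curly apostrophes so "yesterday’s" and "yesterday's" match.
    for word in query.lower().replace("\u2019", "'").split():
        stem = word[:-2] if word.endswith("'s") else word
        if day is None and stem in RELATIVE_DAYS:
            day = today - timedelta(days=RELATIVE_DAYS[stem])
        else:
            title_words.append(word)
    return day, " ".join(title_words)


def best_match(query: str, memos: List[Memo], today: date,
               threshold: float = 0.5) -> Optional[Memo]:
    """Apply the day filter, then pick the closest title above the threshold."""
    day, title = parse_query(query, today)
    candidates = [m for m in memos if day is None or m[0] == day]
    if not candidates:
        return None
    if not title:
        # Date-only query: take the most recent candidate.
        return max(candidates, key=lambda m: m[0])

    def score(memo: Memo) -> float:
        return SequenceMatcher(None, title, memo[1].lower()).ratio()

    best = max(candidates, key=score)
    return best if score(best) >= threshold else None
```

Filtering by date first keeps the fuzzy comparison to a handful of titles, and returning None below the threshold lets the caller report "no such memo" instead of transcribing the wrong recording.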

Try It

brew install jwulff/tap/dictamac
dictamac --help

Requires macOS 26 (Tahoe). Source: github.com/jwulff/dictamac.

If you want it as an MCP server in Claude Code, add this to your project’s .mcp.json (or, to make it available in every project, to the mcpServers section of ~/.claude.json):

{
  "mcpServers": {
    "dictamac": {
      "command": "/opt/homebrew/bin/dictamac",
      "args": ["--mcp"]
    }
  }
}

Then your agent can transcribe any audio file or Voice Memo on demand. Press the button on your watch, talk, and trust that the thought ends up where it belongs.

– John