<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://johnwulff.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://johnwulff.com/" rel="alternate" type="text/html" /><updated>2026-06-02T05:05:24+00:00</updated><id>https://johnwulff.com/feed.xml</id><title type="html">John Wulff</title><subtitle>Personal site, recipes, and writing from John Wulff.</subtitle><entry><title type="html">Feedback to Shipped Fix, No Touching</title><link href="https://johnwulff.com/2026/05/19/lucentbrief-feedback-loop/" rel="alternate" type="text/html" title="Feedback to Shipped Fix, No Touching" /><published>2026-05-19T00:00:00+00:00</published><updated>2026-05-19T00:00:00+00:00</updated><id>https://johnwulff.com/2026/05/19/lucentbrief-feedback-loop</id><content type="html" xml:base="https://johnwulff.com/2026/05/19/lucentbrief-feedback-loop/"><![CDATA[<p>I’ve been wiring a feedback loop into my hobby apps. A user flags something that’s wrong, types a sentence describing it, and a few hours later the fix is live and they have a personal email from me explaining what changed. No standup, no triage meeting, no decision about whether the fix was worth doing. I never touch the keyboard for any of it.</p>

<p>It’s running in a few of my projects now, and I want it in everything I build.</p>

<p>The cleanest version of it lives in <a href="https://lucentbrief.com">Lucent Brief</a>, the daily personalized news brief I’ve been building for myself and a few friends and family. Every section of the brief has a feedback link in the footer. When something goes sideways — a headline that doesn’t match the article, a podcast voice that invents a guest name — the reader hits it and describes the problem in plain language. The brief itself isn’t the point here; the loop around it is.</p>

<p>That link is the start of the loop.</p>

<h2 id="the-pipeline">The Pipeline</h2>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>reader clicks feedback link
        ↓
   GitHub issue (label: user-feedback)
        ↓
   /issues runs at :12 past every hour
        ↓
   sub-agent in a worktree: fix, PR, email reader
        ↓
   /shepherd runs at :42 past every hour
        ↓
   bot reviewers, CI, auto-merge
        ↓
   GitHub Actions deploys to ECS
        ↓
   /shepherd emails me "✅ deployed"
        ↓
   reader's next brief reflects the fix
</code></pre></div></div>

<p>I run two Claude Code routines on my Mac, each on an hourly launchd timer. They never talk to each other directly; they coordinate through GitHub state — issues, PRs, labels, comments.</p>

<p>It starts on the producer side. When a reader hits that feedback link, a small form takes whatever they type and a background job turns it into a GitHub issue titled <code class="language-plaintext highlighter-rouge">User Feedback from &lt;their_email&gt;</code>, labeled <code class="language-plaintext highlighter-rouge">user-feedback</code> plus a per-section label. That issue is the work queue.</p>
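
<p>A minimal sketch of that producer step, in Python for illustration. It only builds the payload a background job could send to GitHub’s create-issue endpoint. The function name, the body layout, and the exact label strings are assumptions modeled on the labels mentioned in this post, not the app’s actual code.</p>

```python
def build_feedback_issue(email: str, section: str, message: str) -> dict:
    """Build the JSON payload for GitHub's create-issue REST endpoint.

    Hypothetical helper: the title format and labels mirror the ones
    described in the post; the body layout is an assumption.
    """
    text = message.strip()
    if not text:
        raise ValueError("feedback message is empty")
    return {
        "title": f"User Feedback from {email}",
        "body": f"Section: {section}\n\n{text}",
        "labels": ["user-feedback", f"section:{section}"],
    }


if __name__ == "__main__":
    payload = build_feedback_issue(
        "reader@example.com",
        "this_day_in_history",
        "Why does this section say 'wait that's banned'?",
    )
    print(payload["title"])
    print(payload["labels"])
```

<p>Keeping the payload builder separate from the HTTP call makes the title and label conventions easy to test, which matters when those conventions are the only contract the downstream routines rely on.</p>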

<h2 id="issues-picks-it-up">/issues Picks It Up</h2>

<p>At :12 past every hour my Mac fires <code class="language-plaintext highlighter-rouge">/issues</code>. It’s a Claude Code slash command running in routine mode under launchd: no user in the loop, no <code class="language-plaintext highlighter-rouge">AskUserQuestion</code> allowed, capped at a 30-minute wall-clock budget so a slow tick doesn’t eat the next one.</p>
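
<p>For readers who haven’t set up an hourly launchd job, here is a rough sketch of what such a schedule can look like, generated with Python’s <code class="language-plaintext highlighter-rouge">plistlib</code>. The label and command line are placeholders I made up for illustration; only the <code class="language-plaintext highlighter-rouge">StartCalendarInterval</code> behavior (a lone <code class="language-plaintext highlighter-rouge">Minute</code> key fires once every hour) is standard launchd.</p>

```python
import plistlib


def hourly_job_plist(label: str, minute: int, argv: list[str]) -> bytes:
    """Render a launchd job that fires once an hour at the given minute.

    With only "Minute" set in StartCalendarInterval, launchd treats the
    other calendar fields as wildcards, so the job runs every hour.
    """
    if minute not in range(60):
        raise ValueError("minute must be between 0 and 59")
    job = {
        "Label": label,
        "ProgramArguments": argv,
        "StartCalendarInterval": {"Minute": minute},
    }
    return plistlib.dumps(job)


if __name__ == "__main__":
    # Label and command are placeholders, not the real setup from this post.
    data = hourly_job_plist("com.example.issues", 12, ["/usr/local/bin/run-issues"])
    print(data.decode("utf-8"))
```

<p>A second job with <code class="language-plaintext highlighter-rouge">minute=42</code> would give the thirty-minute offset between the two routines.</p>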

<p>Every tick it pulls the open <code class="language-plaintext highlighter-rouge">user-feedback</code> issues, skips any that already carry a <code class="language-plaintext highlighter-rouge">Message-ID:</code> comment (the “already emailed” signal), and dispatches one sub-agent per issue, each in its own worktree so they run in parallel. Each agent’s job is short: read the issue and quote a phrase back so the reporter knows they were read, pick a disposition (fix it, say it’s already fixed, acknowledge it, or explain a wontfix), do it, email the reporter immediately, and leave a <code class="language-plaintext highlighter-rouge">Message-ID:</code> comment so later ticks skip it. For a fix it opens a PR whose body starts with <code class="language-plaintext highlighter-rouge">Closes #N</code> so the issue auto-closes on merge.</p>
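
<p>The skip rule is the heart of the queue, so here is a small sketch of it. It assumes each issue has already been fetched into a plain dict with a <code class="language-plaintext highlighter-rouge">number</code> and a list of comment bodies; that shape and both function names are hypothetical, chosen only to show the filter.</p>

```python
def needs_dispatch(issue: dict) -> bool:
    """True when no comment on the issue starts with the Message-ID marker."""
    return not any(
        body.lstrip().startswith("Message-ID:") for body in issue["comments"]
    )


def pick_work(open_issues: list[dict]) -> list[int]:
    """Return the issue numbers that still need a reply, lowest number first."""
    return sorted(i["number"] for i in open_issues if needs_dispatch(i))


if __name__ == "__main__":
    issues = [
        {"number": 1586, "comments": []},
        {"number": 1590, "comments": ["Message-ID: example-id@mail.example"]},
    ]
    # Each number returned here would get its own sub-agent and worktree.
    print(pick_work(issues))
```

<p>Because the filter reads only issue comments, a tick can be re-run at any time without tracking local state.</p>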

<h2 id="why-the-email-goes-out-at-dispatch-not-after-deploy">Why the Email Goes Out at Dispatch, Not After Deploy</h2>

<p>The natural design is “email the reader after the fix actually ships.” That’s how I wrote the protocol originally. It broke.</p>

<p>In April, six user-feedback issues piled up unanswered over five days. <code class="language-plaintext highlighter-rouge">/issues</code> was correctly labeling them, but it was routing them into the “needs clarification” bucket and silently assigning them to me. The fix shipped quietly. The reader never heard a word. By the time I noticed, six readers had concluded — fairly — that nobody was reading their feedback.</p>

<p>So the protocol moved the email obligation to the front of the loop. The agent emails the reader <em>at dispatch time</em> — before the PR is reviewed, merged, or deployed — with a note tailored to what it’s doing: fixing it, still investigating, or explaining why not.</p>

<p>Every variant quotes a phrase from the reader’s message. Every variant ends <code class="language-plaintext highlighter-rouge">Thanks again, / John &amp; the Lucent Brief team</code>. Every variant is CC’d to me so I see what went out.</p>

<p>That <code class="language-plaintext highlighter-rouge">Message-ID:</code> comment only lands after the email actually sends, so a failed send just retries on the next tick. And because it lives on the issue, it survives reverts and force-pushes — the issue is the only state that matters.</p>
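
<p>The ordering matters more than the code, but a sketch makes it concrete. The two callables below are stand-ins for whatever sends the email and whatever posts the issue comment; their names and signatures are assumptions for illustration.</p>

```python
from typing import Callable


def reply_and_mark(
    issue_number: int,
    send_email: Callable[[int], str],
    add_comment: Callable[[int, str], None],
) -> bool:
    """Send the reply first; record the Message-ID only if the send worked.

    Returns True when the issue is now marked as answered. On a failed
    send nothing is written, so the next tick still sees it as pending.
    """
    try:
        message_id = send_email(issue_number)
    except Exception:
        # Any send failure leaves the issue unmarked so a later tick retries.
        return False
    add_comment(issue_number, f"Message-ID: {message_id}")
    return True


if __name__ == "__main__":
    posted = []
    ok = reply_and_mark(
        1586,
        lambda n: "example-id@mail.example",
        lambda n, body: posted.append((n, body)),
    )
    print(ok, posted)
```

<p>The trade-off in this ordering is that a send that succeeds followed by a failed comment would produce a second email on the next tick. A duplicate reply seems a better failure than a silent one.</p>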

<h2 id="shepherd-drives-it-to-merge">/shepherd Drives It to Merge</h2>

<p>Thirty minutes after <code class="language-plaintext highlighter-rouge">/issues</code>, <code class="language-plaintext highlighter-rouge">/shepherd</code> fires. It picks up the open PRs and drives each toward mergeable. Copilot and Codex are required reviewers on Lucent Brief PRs, so it reads their feedback, pushes fixes, re-requests review, and rebases against <code class="language-plaintext highlighter-rouge">main</code> when something else merges in the meantime.</p>

<p>A loop that fixes its own code has to police itself, not just the code, or it runs away. Three rules keep it bounded.</p>

<p>The first is a stuck-detector. Three consecutive <code class="language-plaintext highlighter-rouge">[shepherd]</code> commits with mutating reviewer feedback is the fingerprint of a feedback loop — the bot and the reviewer talking past each other — so the loop halts and escalates to me rather than push a fourth. The second is a reviewer-availability gate. If a required reviewer goes silent for three nudges across three ticks, it’s declared unavailable and the merge gate proceeds without it, so no single bot can hold a PR hostage by never showing up. The third is a quality bar that doesn’t bend: the full RSpec suite has to be green, line coverage has to clear 90%, and every review thread has to be resolved.</p>
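
<p>Each of the three rules reduces to a small predicate. The sketch below is a simplification under stated assumptions: it treats any run of three <code class="language-plaintext highlighter-rouge">[shepherd]</code>-prefixed commit subjects as stuck without checking whether the reviewer feedback kept changing, and it reads “clear 90%” as 90% or higher. The function names are mine.</p>

```python
def is_stuck(commit_subjects: list[str], limit: int = 3) -> bool:
    """True when the newest `limit` commits are all tagged [shepherd].

    commit_subjects is ordered oldest to newest.
    """
    recent = commit_subjects[-limit:]
    return len(recent) == limit and all(s.startswith("[shepherd]") for s in recent)


def reviewer_unavailable(unanswered_nudges: int, limit: int = 3) -> bool:
    """True once a reviewer has ignored `limit` nudges, one per tick."""
    return unanswered_nudges >= limit


def quality_bar_green(suite_green: bool, line_coverage: float, open_threads: int) -> bool:
    """Suite green, line coverage at 90% or more, no unresolved threads."""
    return suite_green and line_coverage >= 90.0 and open_threads == 0


if __name__ == "__main__":
    history = ["fix(tdih): strip leak", "[shepherd] a", "[shepherd] b", "[shepherd] c"]
    print(is_stuck(history), reviewer_unavailable(3), quality_bar_green(True, 93.5, 0))
```

<p>None of these needs memory between ticks; each can be recomputed from the PR’s commits, review requests, and check results.</p>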

<p>When the merge state is clean, every required bot has approved the latest SHA, and the quality bar is green, <code class="language-plaintext highlighter-rouge">/shepherd</code> squash-merges via the API. Low-risk fixes go straight through; anything touching auth, reader data, or the generation prompts waits for me.</p>
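
<p>Put together, the merge decision is a short function. This is a sketch, not the routine’s real logic: the area tags in <code class="language-plaintext highlighter-rouge">SENSITIVE_AREAS</code> and the idea of passing in a set of touched areas are assumptions standing in for however the risk check is actually done.</p>

```python
from enum import Enum


class Action(Enum):
    WAIT = "wait"
    ESCALATE = "escalate to human"
    MERGE = "squash-merge"


# Hypothetical tags for the areas the post says always wait for a human.
SENSITIVE_AREAS = {"auth", "reader_data", "generation_prompts"}


def decide(
    merge_state_clean: bool,
    approvals_on_head: set[str],
    required_reviewers: set[str],
    unavailable_reviewers: set[str],
    quality_green: bool,
    touched_areas: set[str],
) -> Action:
    """Pick the next step for a PR from its current, observable state."""
    # Reviewers declared unavailable no longer block the merge gate.
    still_needed = required_reviewers - unavailable_reviewers
    ready = (
        merge_state_clean
        and quality_green
        and still_needed.issubset(approvals_on_head)
    )
    if not ready:
        return Action.WAIT
    if touched_areas.intersection(SENSITIVE_AREAS):
        return Action.ESCALATE
    return Action.MERGE


if __name__ == "__main__":
    print(decide(True, {"codex"}, {"copilot", "codex"}, {"copilot"}, True, {"prompts_ui"}))
```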

<p>Merge triggers GitHub Actions, which builds a Docker image and rolls a new ECS Fargate task. Roughly 8-12 minutes from merge to live. Then <code class="language-plaintext highlighter-rouge">/shepherd</code> verifies the deploy succeeded and emails me one line — what shipped, readable on a lock screen.</p>

<p>That email is how I learn the loop closed. I read it at breakfast, and half the time I had no idea a fix had shipped overnight — the reader’s report, my reply, the PR, the merge, the deploy, all of it while I was asleep.</p>

<h2 id="a-real-bug-start-to-finish">A Real Bug, Start to Finish</h2>

<p>Here’s the one from two weeks ago, end to end. I touched it once: I read the thank-you email after it was over.</p>

<p>The “This Day In History” section rendered this into a reader’s brief:</p>

<blockquote>
  <p>The Hindenburg… wait — that’s banned. In 1954, Roger Bannister Runs the First Sub-Four-Minute Mile at Oxford’s Iffley Road Track.</p>
</blockquote>

<p>The model had recognized that “Hindenburg” overlapped a subject used in an earlier brief, and it narrated its own censorship into the headline mid-sentence before recovering and writing a legitimate event. The reader hit the section’s feedback link and typed:</p>

<blockquote>
  <p>Why are you frequently saying “wait that’s banned” in this section?</p>
</blockquote>

<div class="image-pair">
  <img src="/assets/images/posts/lucentbrief-feedback-loop/brief-leak-headline.jpeg" alt="The This Day In History section as it reached the reader, with the leaked 'wait — that's banned' text in the headline" />
  <img src="/assets/images/posts/lucentbrief-feedback-loop/feedback-form.jpeg" alt="The per-section feedback form holding the reader's question about the 'wait that's banned' text" />
</div>

<p>The feedback form attached the session context — brief run #863, section <code class="language-plaintext highlighter-rouge">sec_8_this_day_in_history</code>, the timestamp — and opened issue #1586, labeled <code class="language-plaintext highlighter-rouge">section:this_day_in_history</code> and <code class="language-plaintext highlighter-rouge">user-feedback</code>.</p>

<p><code class="language-plaintext highlighter-rouge">/issues</code> picked it up and did the work. It traced the leak to the prompt — a list of already-covered subjects sitting under a literal <code class="language-plaintext highlighter-rouge">BANNED subjects</code> header that the model had started reading back out loud — and shipped a two-part fix, softening the prompt and adding a guard that strips the phrase if it ever slips through again, with tests for every variant. Then the bounded-autonomy rules earned their keep: the bot reviewers pushed back a few times, and after three fix-cycle commits the stuck-detector handed the PR to me rather than loop a fourth time. One reviewer went quiet and was declared unavailable, the other approved, and <code class="language-plaintext highlighter-rouge">/shepherd</code> merged.</p>

<div class="image-pair">
  <img src="/assets/images/posts/lucentbrief-feedback-loop/github-issue.jpeg" alt="GitHub issue #1586, User Feedback from john@johnwulff.com, with the brief run, section ID, and the reader's verbatim message" />
  <img src="/assets/images/posts/lucentbrief-feedback-loop/merged-pr.jpeg" alt="GitHub pull request #1609, fix(tdih) strip the meta-leak, marked Merged with 4 files changed across 4 commits" />
</div>

<p>The loop closed with an email back to the reader from <code class="language-plaintext highlighter-rouge">john@johnwulff.com</code>, signed “John &amp; the Lucent Brief team.” It thanked them by name, quoted their question, explained in plain English that the model had been anchoring on the <code class="language-plaintext highlighter-rouge">BANNED subjects</code> header and narrating it, described both halves of the fix, and linked PR #1609. It ended: “Keep the feedback coming — even small stuff like this is useful.”</p>

<p><img src="/assets/images/posts/lucentbrief-feedback-loop/fix-email.jpeg" alt="The email the loop sent back to the reader, subject &quot;Re: 'wait that's banned' in This Day In History — fix is on the way&quot;, with sections explaining what was happening and the two-prong fix" /></p>

<p>Total human time on that bug: zero. I went from a reader’s one-sentence complaint to a thank-you email in my inbox and a fix in production, hands off the wheel the whole way.</p>

<h2 id="two-routines-one-pipeline">Two Routines, One Pipeline</h2>

<p><code class="language-plaintext highlighter-rouge">/issues</code> and <code class="language-plaintext highlighter-rouge">/shepherd</code> never talk to each other. They share no memory or event bus, just a few conventions on top of GitHub state, and each is small enough to reason about alone — if one breaks, the other keeps working until I fix it. I (with Claude Code) built both over a few evenings; the hard part wasn’t the code, it was naming the right handoff signal and writing the protocol down precisely enough that a sub-agent with zero context could follow it.</p>

<p>The loop changed which fixes get made at all. The old reason to say no to a small fix was never that it was hard; it was that intake is expensive: someone has to file it, debate it, prioritize it, schedule it. When approving a change costs thirty seconds instead, you accept far more of them, most of them small, and they compound. A reader writes in, and a few hours later a real fix is live and they have a real reply, and the only thing I did was design the loop. That’s the part I want in everything I build next.</p>

<p>– John</p>]]></content><author><name></name></author><summary type="html"><![CDATA[I've been wiring a feedback loop into my hobby apps that turns a user's plain-language complaint into shipped code — diagnosed, fixed, reviewed, deployed, and answered — with no human in the middle.]]></summary></entry><entry><title type="html">Introducing Dictamac</title><link href="https://johnwulff.com/2026/05/18/dictamac-cli-mcp-transcription/" rel="alternate" type="text/html" title="Introducing Dictamac" /><published>2026-05-18T00:00:00+00:00</published><updated>2026-05-18T00:00:00+00:00</updated><id>https://johnwulff.com/2026/05/18/dictamac-cli-mcp-transcription</id><content type="html" xml:base="https://johnwulff.com/2026/05/18/dictamac-cli-mcp-transcription/"><![CDATA[<p><em>Voice captured on Apple Watch, processed by Claude Code, enabled by one binary.</em></p>

<p>Back in <a href="/2026/01/11/voice-memos-to-second-brain/">January</a>, I posted about wiring my Apple Watch up to Obsidian. I built two MCP servers (<a href="https://github.com/jwulff/apple-voice-memo-mcp">apple-voice-memo-mcp</a>, <a href="https://github.com/jwulff/whisper-mcp">whisper-mcp</a>), an agent processed my recordings on demand, and I felt clever. The stack worked. It was also Node + npx + ffmpeg + whisper.cpp + two MCP servers + Full Disk Access, and every <code class="language-plaintext highlighter-rouge">npm install</code> or <code class="language-plaintext highlighter-rouge">brew upgrade</code> was a chance for it to drift.</p>

<p>macOS 26 ships <a href="https://developer.apple.com/documentation/speech/speechanalyzer">SpeechAnalyzer</a>, which I already wrote about for <a href="/2026/01/31/steno-speech-to-text-tui/">Steno</a>. On-device, fast, no model to download, no Node, no whisper.cpp. I wanted the watch → Obsidian flow rebuilt on that — one signed Swift binary, both the Voice Memos lookup and the transcription, CLI and MCP in the same process.</p>

<p>That’s <a href="https://github.com/jwulff/dictamac">dictamac</a>.</p>

<h2 id="the-loop-that-actually-runs">The Loop That Actually Runs</h2>

<ol>
  <li>I press the action button on my Apple Watch Ultra. Big, orange, impossible to miss. Voice Memos starts recording immediately.</li>
  <li>I talk: a grocery list, a reminder to email someone, a half-baked product idea, a shower thought about a project.</li>
  <li>iCloud syncs the memo to my Mac in the background.</li>
  <li>A cron-triggered Claude Code agent fires on a regular cadence. It calls dictamac to list new memos and transcribe them. Then it follows whatever each one said — appends to the right Obsidian file with the right wikilinks, drafts an email, files a task, sends a Slack message, whatever the transcript asked for.</li>
</ol>

<p>The dictamac call is <code class="language-plaintext highlighter-rouge">dictamac --json --voice-memo "&lt;query&gt;"</code>. One process, one signed binary, one transcript. The agent doesn’t know or care that there’s a SpeechAnalyzer wrapper, a CloudRecordings.db reader, a TCC permission probe, or a fallback filesystem scanner underneath. It asks for a transcript and gets one.</p>

<h2 id="what-dictamac-looks-like">What dictamac Looks Like</h2>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>brew <span class="nb">install </span>jwulff/tap/dictamac

dictamac path/to/audio.m4a
# → Hello, world, this is a test.

<span class="nb">cat </span>audio.m4a | dictamac -
# → (same thing, via stdin)

dictamac <span class="nt">--voice-memo</span> <span class="s2">"yesterday"</span>
# → (resolves "yesterday" against my Voice Memos library, returns the transcript)

dictamac <span class="nt">--list-voice-memos</span> <span class="nt">--since</span> 7d <span class="nt">--limit</span> 5
# → (reverse-chronological listing, plaintext or --json)
</code></pre></div></div>

<p>Transcript on stdout, errors on stderr, exit code reflects what happened. The agent pipes it, shells out to it, parses it. No daemon, no state between invocations, no audio files left behind.</p>
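
<p>That contract is what makes the binary easy to wrap. As a rough sketch of the calling side, here is a generic Python wrapper that keeps stdout, stderr, and the exit code separate. It is not tied to dictamac’s internals, and the demo uses the Python interpreter as a stand-in command so it runs anywhere.</p>

```python
import subprocess
import sys


def run_cli(argv: list[str], timeout: float = 120.0) -> tuple[int, str, str]:
    """Run a command and return (exit_code, stdout, stderr) as text."""
    done = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return done.returncode, done.stdout, done.stderr


def transcript_or_raise(argv: list[str]) -> str:
    """Return stdout on exit code 0; otherwise raise with the stderr text."""
    code, out, err = run_cli(argv)
    if code != 0:
        raise RuntimeError(f"exit {code}: {err.strip()}")
    return out.strip()


if __name__ == "__main__":
    # Stand-in command; on a Mac with dictamac installed the argv could be
    # ["dictamac", "--voice-memo", "yesterday"] as shown earlier in the post.
    print(transcript_or_raise([sys.executable, "-c", "print('hello')"]))
```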

<h2 id="the-mcp-side">The MCP Side</h2>

<p><code class="language-plaintext highlighter-rouge">dictamac --mcp</code> flips it into a JSON-RPC stdio server. Three tools:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">transcribe_file({path})</code> — transcribe an audio file by path</li>
  <li><code class="language-plaintext highlighter-rouge">transcribe_voice_memo({query})</code> — find a Voice Memo and transcribe it</li>
  <li><code class="language-plaintext highlighter-rouge">list_voice_memos({since, limit})</code> — list recent Voice Memos with metadata</li>
</ul>

<p>That’s the bit I wanted most. Plug it into the agent’s tool list, give it a hint about when to use it, done. Whatever errors the CLI emits to stderr, the MCP server emits in its <code class="language-plaintext highlighter-rouge">isError: true</code> tool response — same text, byte-for-byte. There’s literally a test that asserts they don’t drift.</p>
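
<p>For anyone curious what travels over that stdio pipe, here is a sketch of the message shapes as the MCP specification defines them: a JSON-RPC 2.0 <code class="language-plaintext highlighter-rouge">tools/call</code> request going in, and a result with <code class="language-plaintext highlighter-rouge">content</code> and <code class="language-plaintext highlighter-rouge">isError</code> coming back. The helper names are mine, the sample response is invented, and a real session also needs the <code class="language-plaintext highlighter-rouge">initialize</code> handshake first, which an MCP client normally handles.</p>

```python
import json
from itertools import count

_ids = count(1)


def tool_call(name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 request for MCP's tools/call method."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


def text_or_error(response_line: str) -> str:
    """Return the joined text content of a tool result, raising on isError."""
    result = json.loads(response_line)["result"]
    text = "".join(c["text"] for c in result["content"] if c.get("type") == "text")
    if result.get("isError"):
        raise RuntimeError(text)
    return text


if __name__ == "__main__":
    print(tool_call("transcribe_voice_memo", {"query": "yesterday"}))
    # Invented response, only to show the shape of a successful result.
    sample = {"jsonrpc": "2.0", "id": 1, "result": {
        "content": [{"type": "text", "text": "Hello, world."}], "isError": False}}
    print(text_or_error(json.dumps(sample)))
```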

<h2 id="whats-different-from-the-january-stack">What’s Different from the January Stack</h2>

<p>The January post had me running two MCP servers, npx-ing them on every Claude Code session, keeping <code class="language-plaintext highlighter-rouge">whisper-cpp</code> and <code class="language-plaintext highlighter-rouge">ffmpeg</code> healthy via Homebrew, granting Full Disk Access to find Voice Memos, and trusting that the OpenAI Whisper model on disk hadn’t drifted out of compatibility with whatever whisper.cpp had updated to.</p>

<p>dictamac is one <code class="language-plaintext highlighter-rouge">brew install</code>. No Node. No whisper.cpp. No ffmpeg. SpeechAnalyzer ships with the OS so there’s no model to manage. The Voice Memos lookup is built in, with the right TCC deep-link for Files &amp; Folders access baked into the error message so a missing permission gets fixed in one click.</p>

<p>The glue has opinions, and they’re all the ones I learned the hard way the first time around:</p>

<ul>
  <li>Stage stdin into a temp file so SpeechAnalyzer (which only takes URLs) can read it, then clean up after.</li>
  <li>Find the Voice Memos library across the two paths Apple stores it in, fall back gracefully when one doesn’t exist.</li>
  <li>Read <code class="language-plaintext highlighter-rouge">CloudRecordings.db</code> directly when present (SQLite, much faster than a filesystem walk), fall back to a recursive <code class="language-plaintext highlighter-rouge">*.m4a</code> scan when the schema drifts or the file isn’t there.</li>
  <li>Parse a query like “yesterday’s standup” into a date filter plus a fuzzy title match against the index.</li>
  <li>Map every failure to a stable POSIX exit code so a shell pipeline knows whether to retry or bail.</li>
</ul>
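
<p>To make the query-parsing item concrete, here is a toy version in Python. dictamac itself is Swift and I’m not reproducing its parser here; the two recognized day words, the similarity cutoff, and both function names are assumptions for illustration.</p>

```python
from datetime import date, timedelta
from difflib import SequenceMatcher
from typing import Optional


def parse_query(query: str, today: date) -> tuple[Optional[date], str]:
    """Split a query into an optional day filter and the remaining title words."""
    day = None
    title_words = []
    for raw in query.lower().split():
        # Drop a trailing possessive so "yesterday's" matches "yesterday".
        word = raw.removesuffix("’s").removesuffix("'s")
        if word == "today" and day is None:
            day = today
        elif word == "yesterday" and day is None:
            day = today - timedelta(days=1)
        else:
            title_words.append(raw)
    return day, " ".join(title_words)


def best_title(wanted: str, titles: list[str], cutoff: float = 0.6) -> Optional[str]:
    """Return the closest title by similarity ratio, or None below the cutoff."""
    scored = [(SequenceMatcher(None, wanted.lower(), t.lower()).ratio(), t) for t in titles]
    score, title = max(scored, default=(0.0, None))
    return title if score >= cutoff else None


if __name__ == "__main__":
    day, words = parse_query("yesterday’s standup", date(2026, 5, 18))
    print(day, words)
    print(best_title(words, ["Grocery list", "Standup notes", "Shower thought"]))
```

<p>Splitting the date filter from the title match means each half can fail independently: a query with no day word still gets a fuzzy title search, and a bare “yesterday” still narrows by date.</p>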

<p>All of that was scattered across whisper-mcp + apple-voice-memo-mcp + the shell scripts gluing them together. Now it’s one binary that does the right thing.</p>

<h2 id="try-it">Try It</h2>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>brew <span class="nb">install </span>jwulff/tap/dictamac
dictamac <span class="nt">--help</span>
</code></pre></div></div>

<p>Requires macOS 26 (Tahoe). Source: <a href="https://github.com/jwulff/dictamac">github.com/jwulff/dictamac</a>.</p>

<p>If you want it as an MCP server in Claude Code, add this to <code class="language-plaintext highlighter-rouge">~/.claude/settings.json</code> (or your project’s <code class="language-plaintext highlighter-rouge">.mcp.json</code>):</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"mcpServers"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"dictamac"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="nl">"command"</span><span class="p">:</span><span class="w"> </span><span class="s2">"/opt/homebrew/bin/dictamac"</span><span class="p">,</span><span class="w">
      </span><span class="nl">"args"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"--mcp"</span><span class="p">]</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>Then your agent can transcribe any audio file or Voice Memo on demand. Press the button on your watch, talk, and trust that the thought ends up where it belongs.</p>

<p>– John</p>]]></content><author><name></name></author><summary type="html"><![CDATA[A macOS 26 SpeechAnalyzer CLI plus MCP server that replaces the Node + whisper.cpp + two-MCP-server stack from my January voice memos post. Press the button on my watch, an agent does the rest.]]></summary></entry><entry><title type="html">Year Two of Beat the Bridge for Abigail</title><link href="https://johnwulff.com/2026/04/28/beat-the-bridge-2026/" rel="alternate" type="text/html" title="Year Two of Beat the Bridge for Abigail" /><published>2026-04-28T00:00:00+00:00</published><updated>2026-04-28T00:00:00+00:00</updated><id>https://johnwulff.com/2026/04/28/beat-the-bridge-2026</id><content type="html" xml:base="https://johnwulff.com/2026/04/28/beat-the-bridge-2026/"><![CDATA[<blockquote>
  <p><a href="https://www2.breakthrought1d.org/site/TR?fr_id=10543&amp;s_participantTrID=10543&amp;pg=team&amp;team_id=373961"><strong>Join us</strong></a> at 8am on Saturday May 9th at Husky Stadium for a 3-mile fun walk to support diabetes research or <a href="https://www2.breakthrought1d.org/site/Donation2?PROXY_ID=14178039&amp;mfc_pref=T&amp;idb=694615257&amp;df_id=25648&amp;PROXY_TYPE=20&amp;25648.donation=form1&amp;FR_ID=10543&amp;FR_ID=10543&amp;PROXY_ID=14178039&amp;PROXY_TYPE=20"><strong>donate in Abigail’s name here</strong></a>.</p>
</blockquote>

<p><img src="/assets/images/posts/beat-the-bridge-2026/team-wulff-2025.jpeg" alt="Team Wulff at the 43rd Annual Beat the Bridge in May 2025, about seventeen people in matching blue Beat the Bridge t-shirts with Abigail in front" class="full-width" /></p>

<p>My daughter Abigail has Type 1 diabetes (T1D). She was diagnosed in late October 2024, two days before Halloween. The photo above is from last May at the 43rd Annual Beat the Bridge for Breakthrough T1D in Seattle. That was six months in. We were still raw, still figuring out the new normal.</p>

<p>Eighteen months in is a different place.</p>

<p>We’ve built a team around Abigail. Seattle Children’s is incredible. Her school has a 504 plan and a group of adults who deeply know her and her diabetes. Family is dialed in. Friends step up. We don’t take any of it for granted.</p>

<p><img src="/assets/images/posts/beat-the-bridge-2026/abigail-diabadass.jpeg" alt="Abigail in a pink &quot;DIABADASS&quot; t-shirt being exuberant, mid-yell, arms out" class="float-right" /></p>

<p>She has not let any of it slow her down. If anything, T1D’s presence taunts us into living more, not less. School, friends, adventures, and lots of laughs. It’s not easy. It’s not always fun. But it’s hopeful and full of love.</p>

<h2 id="the-research-is-moving">The research is moving</h2>

<p>T1D is autoimmune. It has nothing to do with diet or lifestyle. The pancreas stops making insulin, and the body needs it from outside, forever. Unless something changes.</p>

<p>Things are changing. Vertex just published trial results for zimislecel, its stem-cell-derived islet therapy, in the New England Journal of Medicine: of twelve patients followed for at least a year after a single infusion, all twelve hit normal blood sugar targets and ten are no longer using insulin at all. The pivotal trial is enrolling now, with FDA submission expected this year.</p>

<p>Teplizumab (sold as Tzield) is the first drug that can delay T1D onset, by a median of 2.7 years in kids who test positive for the autoantibodies. The FDA just expanded its approval last week to children as young as one. None of this happened by accident. It happened because people funded the research.</p>

<p>That funding isn’t guaranteed. Federal research dollars are under real pressure right now, and the timing matters. We’re closer to meaningful breakthroughs than we’ve ever been. This is exactly the wrong moment for the pipeline to slow down.</p>

<h2 id="beat-the-bridge-2026">Beat the Bridge 2026</h2>

<p><img src="/assets/images/posts/beat-the-bridge-2026/abigail-shirt-by-water.jpeg" alt="Abigail by the water in her blue Beat the Bridge for Breakthrough T1D t-shirt, smiling" class="float-left" /></p>

<p>We’re walking again this year in the 44th Annual Beat the Bridge on Saturday, May 9, 2026, at Husky Stadium. We’re <a href="https://www2.breakthrought1d.org/site/TR?fr_id=10543&amp;s_participantTrID=10543&amp;pg=team&amp;team_id=373961">Team Wulff</a>, the name Abigail picked herself. Her pack means so much to her.</p>

<p>Breakthrough T1D has driven nearly every major advance in T1D care since the 1970s. What we raise here goes to fund research, advocacy, and family support.</p>

<p>To everyone who donated, joined us, or cheered us on last year, thank you. Our community showed up in a way that floored us. We won’t pretend any single contribution is going to cure diabetes, but you changed Abigail’s world.</p>

<p>As her parents, getting to show Abigail that the people in her life see her, that they care, that they have her back, meant so much to us. It’s really hard to be a kid with T1D. The people in her life showing up matters to her, and it matters to us. That’s the whole thing.</p>

<h2 id="join-us">Join Us!</h2>

<p>The 3-mile walk is fun and free, strollers welcome, and starts at 8:15 AM at Husky Stadium on Saturday, May 9. Parking is in lots E1/E18 just north of the stadium, or take the 1 Line to the University of Washington station.</p>

<p><a href="https://www2.breakthrought1d.org/site/TR?fr_id=10543&amp;s_participantTrID=10543&amp;pg=team&amp;team_id=373961"><strong>Sign up for the free walk at beatthebridge.org</strong></a>, then come find us at the start. If you can’t be there in person, <a href="https://www2.breakthrought1d.org/site/Donation2?PROXY_ID=14178039&amp;mfc_pref=T&amp;idb=694615257&amp;df_id=25648&amp;PROXY_TYPE=20&amp;25648.donation=form1&amp;FR_ID=10543&amp;FR_ID=10543&amp;PROXY_ID=14178039&amp;PROXY_TYPE=20"><strong>donate in Abigail’s name</strong></a> instead.</p>

<p>Walk with us. Donate. Share Abigail’s story. Whatever you can do helps. From our family to yours, thank you.</p>

<p>– John, Courtney, and Abigail</p>]]></content><author><name></name></author><summary type="html"><![CDATA[Eighteen months into our daughter Abigail's Type 1 diabetes journey, we're walking the 44th Annual Beat the Bridge for Breakthrough T1D on May 9 with Team Wulff.]]></summary></entry><entry><title type="html">Omnipod 5 Won’t Connect to Dexcom G7 After a Pod Change? Try Wrapping the Old Pod in Foil</title><link href="https://johnwulff.com/2026/04/26/omnipod-5-dexcom-g7-deactivated-pod-rf/" rel="alternate" type="text/html" title="Omnipod 5 Won’t Connect to Dexcom G7 After a Pod Change? Try Wrapping the Old Pod in Foil" /><published>2026-04-26T00:00:00+00:00</published><updated>2026-04-26T00:00:00+00:00</updated><id>https://johnwulff.com/2026/04/26/omnipod-5-dexcom-g7-deactivated-pod-rf</id><content type="html" xml:base="https://johnwulff.com/2026/04/26/omnipod-5-dexcom-g7-deactivated-pod-rf/"><![CDATA[<p><img src="/assets/images/posts/omnipod-dexcom-pod-rf/hero.png" alt="Risograph illustration of a tired dad in pajamas and bunny slippers holding a crumpled foil ball, surrounded by floating Omnipod 5 pods and Dexcom G7 sensors with red radio waves emanating outward" class="full-width" /></p>

<p>My daughter Abigail has Type 1 diabetes (T1D). She wears an Insulet Omnipod 5 insulin pump and a Dexcom G7 continuous glucose monitor (CGM). The pump talks directly to the CGM over a wireless link so it can adjust insulin in response to her glucose. When that link breaks, automated mode stops working and we’re back to manual dosing until it comes back.</p>

<p>A recent pod change went sideways and cost us a pod, a CGM replaced four days early, and a peaceful Saturday night before I figured it out. I couldn’t find anything about it online, so here’s my story in case it helps the next person in my situation.</p>

<h2 id="three-pods-two-cgms">Three Pods, Two CGMs</h2>

<p>There were five devices involved, so to keep things simple I’m going to call them Pods 1, 2, and 3 and Dexcoms 1 and 2.</p>

<ul>
  <li><strong>Pod 1.</strong> End of life. One of the best pods we’d ever had — really good time in range across the three days she wore it. Sad to see it go.</li>
  <li><strong>Pod 2.</strong> First replacement. Activated cleanly and paired with her controller (her iPhone), but never connected to Dexcom 1. It was also dropping its Bluetooth link to the controller intermittently.</li>
  <li><strong>Pod 3.</strong> Second replacement. The one she’s on now.</li>
  <li><strong>Dexcom 1.</strong> The CGM she’d been wearing all week. Still had four days of life left and was happily reporting to her phone the whole time this drama was unfolding.</li>
  <li><strong>Dexcom 2.</strong> The replacement CGM I put on after Pod 3 also failed to pair with Dexcom 1.</li>
</ul>

<h2 id="the-sequence">The Sequence</h2>

<p><img src="/assets/images/posts/omnipod-dexcom-pod-rf/glooko-timeline.png" alt="Glooko timeline of Abigail's afternoon and evening showing glucose readings, carb entries, insulin doses, and the two pod-swap markers around 7 and 8 PM" class="hero-left" /></p>

<p>Pod 1 hit end of life, so I started its deactivation at 6:43 PM. Something weird happened in the middle of it. The screen flashed some kind of warning — I remember it being something to the effect of “out of range,” but in the chaos of a pod change I didn’t catch the exact wording — and then re-presented the deactivate screen. I hit deactivate again. The second attempt finished normally, and I didn’t think much of it at the time.</p>

<p>Then I put Pod 2 on. It activated and paired with the controller. But it never made a connection to Dexcom 1. And while we were waiting, the controller’s link to Pod 2 itself kept dropping in and out.</p>

<p>That’s what made me suspect Pod 2 was the problem. If its radio was misbehaving on the Bluetooth side <em>and</em> on the link to the CGM, the simplest explanation was a bad pod. Cheaper to swap a pod than to swap a CGM that still had four days of life on it, so I started there.</p>

<p>Pod 2’s deactivation at 7:58 PM went totally normal. Both old pods went into the metal trash can in the kitchen.</p>

<p>Then I put Pod 3 on. It paired with the controller cleanly and held its link initially. But it also couldn’t see Dexcom 1.</p>

<p><img src="/assets/images/posts/omnipod-dexcom-pod-rf/dexcom-2-applicator.jpeg" alt="New Dexcom G7 sensor applicator with pairing code 3405 visible" class="float-right" /></p>

<p>At that point I’d burned a pod and the CGM still wasn’t talking to the new pod, so I bit the bullet and replaced Dexcom 1 at 8:05 PM. Now we had Pod 3 and Dexcom 2, and everything worked, briefly.</p>

<p>Abigail went to bed at her usual 8:30 PM. Shortly after that, it fell apart again. Pod 3 stayed solid on the controller, but it started losing Dexcom 2. Dexcom 2 started losing the controller too. The phone did occasionally get readings from the CGM, but the pod got readings from the CGM even less frequently. Nothing was stable.</p>

<p>I tried everything else I could think of. I rebooted the phone. Re-added the CGM in the Dexcom app and re-entered the sensor code in the Omnipod 5 app. No change.</p>

<p>By this time, Abigail’s pod was starting to alarm from not having data in over an hour. Luckily it didn’t wake her up, but it was raising the tension.</p>

<p>I moved Abigail’s phone as close to her as I could without waking her. That helped the CGM-to-phone link a little, but the CGM-to-pod link was already as close as it could get — both on her body — and wasn’t improving.</p>

<p>She slept through all this. I was running out of ideas.</p>

<h2 id="the-1012-pm-walk">The 10:12 PM Walk</h2>

<p><img src="/assets/images/posts/omnipod-dexcom-pod-rf/old-pods.jpeg" alt="The two old Omnipod 5 pods after I fished them out of the trash, lot numbers visible" class="float-left" /></p>

<p>The only thing I hadn’t tried was getting rid of the two old pods sitting in the kitchen trash can. I wasn’t optimistic about it — the metal can was already a partial Faraday cage and they were already across the house from her. But the deactivated pods were the last variable I hadn’t isolated.</p>

<p>I fished Pods 1 and 2 out of the trash. I can’t say definitively which is which in this photo, but I think the one with more corrosion on the case was Pod 1.</p>

<p>I took photos of their lot numbers, wrapped them in a tight ball of aluminum foil, walked them two blocks down the street, and dropped them in a neighbor’s trash can. At 10:12 PM.</p>

<p><img src="/assets/images/posts/omnipod-dexcom-pod-rf/foil-walk.gif" alt="Doorbell camera view of me walking out with the foil-wrapped pods" class="hero" /></p>

<p>Everything started working before I got back home. Pod 3 locked onto Dexcom 2. Dexcom 2 locked onto the controller. Automated mode came back on. Abigail slept through the whole thing.</p>

<p>If your new Omnipod won’t pair with your Dexcom and you’ve already swapped both, find every recently-deactivated pod in the house, wrap them in foil, and get them as far away as possible. A neighbor’s trash can two blocks away worked for me. Sixty seconds, no parts, no app reinstalls. If it fixes it, you’ve found your interferer. If it doesn’t, you’ve ruled it out cleanly.</p>

<h2 id="my-best-guess">My Best Guess</h2>

<p>I can’t fully isolate which old pod was at fault. Two candidates:</p>

<ol>
  <li><strong>Pod 1’s deactivation got corrupted somehow.</strong> That weird mid-deactivation hiccup may have left it in some abnormal RF state.</li>
  <li><strong>Pod 2 was a defective pod</strong>, with a radio that was misbehaving on every band it touched. That would also explain why it was dropping its Bluetooth link to the controller while it was still active, before I ever deactivated it.</li>
</ol>

<p>Either way, one of those two old pods was loud enough on radio frequency to step on the active pod’s connection to the new CGM from across the house. Loud enough that even the CGM-to-phone Bluetooth was struggling.</p>

<p>I’m reporting both lot numbers, <strong>PH1M09162521</strong> and <strong>PH1U12102412</strong>, to Insulet (1-800-591-3455). They track lot-level issues, and a deactivated pod that can jam pairing across the house is worth a data point.</p>

<h2 id="related-posts">Related Posts</h2>

<p>I write about diabetes tooling for Abigail a lot here:</p>

<ul>
  <li><a href="/2026/03/25/glucagent-daily-digest/">Glucagent</a> — a daily AI digest email that reads her last 24 hours of glucose, insulin, and carb data and explains what happened.</li>
  <li><a href="/2026/01/18/pixoo-signage/">The Pixoo signage series</a> — a 64x64 LED display in our kitchen showing her live glucose, insulin, and weather.</li>
</ul>

<p>– John</p>]]></content><author><name></name></author><summary type="html"><![CDATA[It cost us a pod and a CGM replaced four days early before I figured it out. The culprit was a deactivated pod broadcasting from a kitchen trash can across the house.]]></summary></entry><entry><title type="html">Steno: Always Listening, Always Queryable</title><link href="https://johnwulff.com/2026/03/28/steno-always-listening/" rel="alternate" type="text/html" title="Steno: Always Listening, Always Queryable" /><published>2026-03-28T00:00:00+00:00</published><updated>2026-03-28T00:00:00+00:00</updated><id>https://johnwulff.com/2026/03/28/steno-always-listening</id><content type="html" xml:base="https://johnwulff.com/2026/03/28/steno-always-listening/"><![CDATA[<p><em>This is Part 2 about Steno, a private speech-to-text tool for macOS. <a href="/2026/01/31/steno-speech-to-text-tui/">Part 1</a> covers building the initial TUI with macOS 26’s SpeechAnalyzer API.</em></p>

<p><img src="/assets/images/posts/steno-always-listening/steno-go-tui.png" alt="Steno Go TUI showing dual-source transcription with topics and level meters" class="full-width" /></p>

<p>Before <a href="https://github.com/jwulff/steno">Steno</a>, the way I’d capture conversations for later analysis was: manually start a recording, transcribe it after the fact, file it somewhere, then feed it into an agent later. It worked, but it was all manual and the transcripts were disconnected from everything else.</p>

<p>Now I fire up Steno and it captures in real time. My agents can query transcripts while they’re being recorded via MCP or direct DB queries, or after the fact, or both. I capture FaceTime conversations with friends, talk through problems on my hobby projects out loud, and all of it is immediately available to synthesize into docs, code, whatever I’m building.</p>

<p>Since the first post, Steno has grown into a two-process setup: a headless Swift daemon that records and transcribes continuously, and a Go terminal client that displays everything. The daemon just sits in the background for days, writing transcript segments into SQLite as they come in. No audio files piling up, just a database that keeps growing and stays queryable across sessions and timespans. I (with Claude Code) built all of it in about two weekends.</p>

<h2 id="both-sides-of-the-conversation">Both Sides of the Conversation</h2>

<p>The first version only heard my microphone. That’s fine for talking to myself, but FaceTime calls and group chats need both sides. Press <code class="language-plaintext highlighter-rouge">a</code> and Steno starts capturing system audio too. The transcript shows <code class="language-plaintext highlighter-rouge">[MIC]</code> for me and <code class="language-plaintext highlighter-rouge">[SYS]</code> for everyone else, with separate level meters.</p>

<p>Getting there took some figuring out. Core Audio Taps looked like the right API — create a process tap, capture the output. But macOS TCC (Transparency, Consent, and Control) doesn’t really work with unbundled CLI tools. The binary never appeared in System Settings, no recording indicator showed up, and macOS just silently delivered empty audio buffers. We spent a couple hours debugging zeros before we understood why.</p>

<p>ScreenCaptureKit turned out to be the answer. It triggers the proper permission dialog, shows the orange recording indicator, and actually delivers audio. Simpler API too. Each audio source gets its own SpeechAnalyzer instance: mic at 16kHz mono, system audio converted down from 48kHz stereo.</p>

<h2 id="always-running-in-the-background">Always Running in the Background</h2>

<p>The original app tied the UI to audio capture, speech recognition, topic extraction, and database storage. If I closed the terminal, the recording stopped. I wanted transcription running all the time without needing a window open.</p>

<p>So I pulled the backend out into a headless daemon. <code class="language-plaintext highlighter-rouge">steno-daemon</code> handles all the heavy lifting: microphone and system audio capture, SpeechAnalyzer transcription, topic extraction via on-device LLMs, and SQLite persistence. It talks to clients over a Unix socket using NDJSON (newline-delimited JSON). I start it once and it just runs, surviving terminal restarts and happily chugging along for days.</p>
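
<p>As a rough illustration of the NDJSON framing, here is a minimal TypeScript sketch of a client-side parser that buffers socket chunks and emits one object per complete line. The <code class="language-plaintext highlighter-rouge">StenoEvent</code> shape and the sample messages are hypothetical; the real daemon and client are Swift and Go, and their actual message fields aren’t shown in this post.</p>

```typescript
// Hypothetical event shape, for illustration only.
interface StenoEvent {
  type: string;
  [key: string]: unknown;
}

// Returns a function that accepts raw socket chunks and returns every complete
// JSON line seen so far, keeping any trailing partial line buffered.
function createNdjsonParser(): (chunk: string) => StenoEvent[] {
  let buffer = "";
  return (chunk: string): StenoEvent[] => {
    buffer += chunk;
    const lines = buffer.split("\n");
    // The last element is "" if the chunk ended on a newline, else a partial line.
    buffer = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as StenoEvent);
  };
}

// A message split across two chunks is only emitted once its newline arrives.
const parse = createNdjsonParser();
const first = parse('{"type":"segment","text":"hello"}\n{"type":"lev');
const second = parse('el","value":0.5}\n');
console.log(first.length, second.length); // 1 1
```

<p>The appeal of newline-delimited framing is that the client never needs a length prefix or a streaming JSON parser: a newline marks the end of a message.</p>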

<p>Since there’s no audio being saved, just text segments flowing into SQLite, the database stays small and anything that can read SQLite can query it.</p>

<h2 id="a-go-tui">A Go TUI</h2>

<p>With the backend living in its own process, the TUI became a pure display layer. I rewrote it in Go with <a href="https://github.com/charmbracelet/bubbletea">bubbletea</a>, which gives me Elm architecture for terminal apps: model, update, view. It connects to the daemon’s Unix socket for live events and reads topics directly from SQLite in WAL mode.</p>

<p>Running <code class="language-plaintext highlighter-rouge">steno</code> auto-starts the daemon if it isn’t already going, connects to the socket, and shows the live transcript with topics and level meters. One command and I’m up.</p>

<p>Once the Go TUI was solid, I deleted the old Swift one. Deleting 2,559 lines of dead code is always satisfying.</p>

<h2 id="best-language-for-the-job">Best Language for the Job</h2>

<p>The daemon is Swift because it has to be. SpeechAnalyzer, ScreenCaptureKit, and the audio stack are all Apple frameworks. But the TUI and MCP server are Go because Go has better terminal libraries and is just easier to work in for this kind of thing. bubbletea and lipgloss are fantastic, and pure-Go SQLite means I get a static binary with zero C dependencies.</p>

<p>I don’t know either language very well. Building with Claude Code means I can pick the best language for each part of the system without that being a blocker. Two languages, two processes, each doing what it’s best at.</p>

<h2 id="agents-listen-along">Agents Listen Along</h2>

<p>I added an MCP (Model Context Protocol) server so Claude Code and Claude Desktop can query the transcript database directly. Five read-only tools: search across sessions, list sessions, get session details with topics, retrieve full transcripts with time-window filtering, and a database overview.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>steno           <span class="c"># Launch TUI (auto-starts daemon)</span>
steno <span class="nt">--mcp</span>     <span class="c"># Run as MCP stdio server</span>
</code></pre></div></div>

<p>Both modes live in a single Go binary. I added <code class="language-plaintext highlighter-rouge">steno --mcp</code> to my Claude Desktop config and now my agents have ears.</p>

<p>I use it when I’m on a FaceTime call with a friend talking through ideas for a project. I tell Claude Code this is what I’m doing and it “listens along” with the conversation. It queries Steno, pulls the relevant segments, and weaves those ideas into whatever I’m building. No notes to take, no recordings to scrub through.</p>

<p>Same thing when I’m working solo. I’ll talk through a problem out loud, thinking about edge cases and trade-offs. An hour later I ask Claude to turn those thoughts into code or documentation. It finds the segments, knows the context, and gets it done.</p>

<p>Having everything queryable across timespans is super useful. “What have I been thinking about this week?” is a real question with a real answer now.</p>

<h2 id="still-open-source">Still Open Source</h2>

<p>Steno is open source: <a href="https://github.com/jwulff/steno">github.com/jwulff/steno</a>. Requires macOS 26 (Tahoe) and Apple Silicon.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl <span class="nt">-LO</span> https://github.com/jwulff/steno/releases/latest/download/steno-darwin-arm64.tar.gz
<span class="nb">tar </span>xzf steno-darwin-arm64.tar.gz
<span class="nb">mkdir</span> <span class="nt">-p</span> ~/.local/bin
<span class="nb">mv </span>steno steno-daemon ~/.local/bin/
</code></pre></div></div>

<p>– John</p>]]></content><author><name></name></author><summary type="html"><![CDATA[Steno went from a single Swift app to a two-process daemon architecture with system audio capture and an MCP server. Now my agents can query everything I've said.]]></summary></entry><entry><title type="html">Glucagent: A Daily Diabetes Digest</title><link href="https://johnwulff.com/2026/03/25/glucagent-daily-digest/" rel="alternate" type="text/html" title="Glucagent: A Daily Diabetes Digest" /><published>2026-03-25T00:00:00+00:00</published><updated>2026-03-25T00:00:00+00:00</updated><id>https://johnwulff.com/2026/03/25/glucagent-daily-digest</id><content type="html" xml:base="https://johnwulff.com/2026/03/25/glucagent-daily-digest/"><![CDATA[<p><img src="/assets/images/posts/glucagent/daily-digest.png" alt="Daily digest email showing glucose and insulin data" class="hero narrow-only" /></p>

<p>My daughter Abigail has Type 1 diabetes. Her body doesn’t produce insulin, so we manage it externally: an insulin pump delivers a steady background drip (basal), and she gets additional doses (boluses) at meals based on how many carbs she’s eating. A continuous glucose monitor (CGM) on her arm reads her blood sugar every five minutes and sends it to our phones. The goal is keeping glucose between 70 and 180 mg/dL as much of the time as possible; the share of time spent in that band is a metric called Time in Range (TIR).</p>

<p>I’ve been building tools around this data for a while now. The <a href="/2026/01/18/pixoo-signage/">LED display series</a> shows what’s happening in real time: glucose, trends, weather, short AI observations on a 64x64 pixel grid. It’s great for glancing at while we’re home.</p>

<p>Being a Type 1 diabetes (T1D) caregiver means constantly analyzing data. “Why did she spike after lunch?” “Was that low from too much insulin or not enough food?” “Is this week better or worse than last week?” The display shows what’s happening now. I wanted something that explains what happened yesterday and why. The kind of analysis we’d do ourselves if we had time to sit down with a spreadsheet every morning, backed by all the diabetes knowledge in the best frontier model I can access.</p>

<p>So I built Glucagent. Every morning at 7:15, Courtney and I get an email. Claude reads Abigail’s last 24 hours of glucose, insulin, and carb data from two sources: her <a href="https://www.dexcom.com/">Dexcom</a> G7 CGM for glucose readings every five minutes, and <a href="https://glooko.com/">Glooko</a> for insulin and carb data from her Insulet Omnipod 5 pump. It compares yesterday against the week and writes a report.</p>

<h2 id="what-the-email-looks-like">What the Email Looks Like</h2>

<div class="email-walkthrough-frame">
<iframe src="/assets/posts/glucagent/full-email.html" title="Sample Glucagent daily digest email" onload="
  var h=this.contentDocument.documentElement.scrollHeight;
  this.style.height=h+'px';
  if(window.matchMedia('(min-width:1101px)').matches){
    this.parentElement.style.height=Math.ceil(h*0.4)+'px';
  }
"></iframe>
</div>

<p>Claude’s analysis is broken into five sections. A one-sentence <strong>Quick Take</strong> captures the most important thing about yesterday. <strong>What the Week Shows</strong> puts yesterday in context against the seven-day trend, comparing carb ratios, correction frequency, and basal patterns across days. <strong>Yesterday Deep Dive</strong> walks through the day chronologically, connecting specific insulin doses to specific glucose responses. <strong>Pattern Insight</strong> offers one data-backed observation, like a timing mismatch between insulin action and carb absorption. And <strong>The Human Side</strong> ties encouragement to specific data points so it feels earned, not generic.</p>

<p>The prompt tells Claude to act as an endocrinologist, diabetes coach, and mental health ally. It gets the full 7-day data table, yesterday’s complete timeline of CGM readings and bolus events, the previous day’s analysis for continuity, and sensor health data. Never prescriptive (no dosing advice), but educational.</p>

<h2 id="sensor-integrity-detection">Sensor Integrity Detection</h2>

<p>Not all CGM data is trustworthy. Sensors get noisy, especially at the beginning and end of their 10-day lifespan, or when compressed during sleep. Noisy data can mislead both humans and AI.</p>

<p>I ported Dan Heller’s <a href="https://github.com/argv01/cgm-sensor-integrity">CGM Sensor Integrity Detector</a> algorithm from Python to TypeScript (CC BY-NC 4.0). It uses a rolling 30-minute window to analyze reversals, amplitude, and incoherence ratio in the glucose signal, flagging clusters of noisy readings.</p>

<p><img src="/assets/images/posts/glucagent/sensor-health.png" alt="Sensor health card showing Fair status, 97% clean data, 1 noise event" class="full-width" /></p>

<p>The original v5 thresholds were calibrated for earlier Dexcom models. When I ran them against Abigail’s Dexcom G7 data, they flagged 73% of days as noisy. That’s not useful. So I backtested against 73 days of production data and recalibrated. “Poor” days dropped from 58% to 25%, and “good” days went from 27% to 47%.</p>

<p>The email now includes a Sensor Health card that shows clean data percentage, noise events, and whether noise preceded any lows.</p>

<h2 id="bugs-and-production-realities">Bugs and Production Realities</h2>

<p>Dexcom has a Share API that makes getting glucose data straightforward. Insulin and carb data is harder. Insulet (the pump manufacturer) doesn’t have a public API, but they share data with Glooko, a diabetes data aggregation platform. Glooko doesn’t have a public API either, but it does have a CSV export in its web UI. So Glucagent runs a headless browser on AWS Lambda every four hours, logs into Glooko, triggers a CSV export, intercepts the download, and parses the insulin and carb records out of it. It’s fragile by nature, and debugging Puppeteer on Lambda has been the single biggest time sink of the project.</p>

<h2 id="what-it-costs">What It Costs</h2>

<p>The whole platform runs on AWS for roughly $8-13/month.</p>

<table>
  <thead>
    <tr>
      <th>Service</th>
      <th>What it does</th>
      <th>Cost</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>DynamoDB</td>
      <td>CGM, insulin, and carb storage</td>
      <td>~$2</td>
    </tr>
    <tr>
      <td>Lambda</td>
      <td>Dexcom, Glooko, and digest runs</td>
      <td>~$3</td>
    </tr>
    <tr>
      <td>Bedrock</td>
      <td>Claude Opus daily analysis</td>
      <td>~$2-5</td>
    </tr>
    <tr>
      <td>SES (Simple Email Service)</td>
      <td>Email delivery</td>
      <td>&lt; $1</td>
    </tr>
    <tr>
      <td>S3</td>
      <td>Raw export archival</td>
      <td>&lt; $1</td>
    </tr>
  </tbody>
</table>

<p>I (with Claude Code) built the core platform in about three hours on a Saturday: SST v3 infrastructure, Dexcom ingestion with session caching and dedup, the daily digest with Claude Opus, continuous integration with GitHub Actions, and CloudWatch alarms. The insulin pipeline and sensor integrity work came over the following weeks.</p>

<p>It’s still early. Single patient, email only. A progressive web app and more sophisticated agent features are on the list. But every morning at 7:15, Courtney and I get an email that tells us what happened overnight, connects it to the week’s patterns, and helps us ask better questions at Abigail’s next endo appointment. That’s the whole point.</p>

<p>– John</p>]]></content><author><name></name></author><summary type="html"><![CDATA[A daily email that reads my daughter's glucose, insulin, and carb data and explains what happened yesterday and why. Claude Opus, SST v3, ~$10/month.]]></summary></entry><entry><title type="html">Dashboard Update: From Agent Framework to One Prompt, Saving $1,700/Month</title><link href="https://johnwulff.com/2026/02/21/insights-cost-refactor/" rel="alternate" type="text/html" title="Dashboard Update: From Agent Framework to One Prompt, Saving $1,700/Month" /><published>2026-02-21T00:00:00+00:00</published><updated>2026-02-21T00:00:00+00:00</updated><id>https://johnwulff.com/2026/02/21/insights-cost-refactor</id><content type="html" xml:base="https://johnwulff.com/2026/02/21/insights-cost-refactor/"><![CDATA[<p><img src="/assets/images/posts/pixoo-signage/bedrock-cost-chart.png" alt="AWS Bedrock cost chart showing a cliff on February 11" class="hero" /></p>

<p><em>This is Part 5 in a series about an LED diabetes dashboard I built for my daughter Abigail. It shows her glucose, insulin, and weather on a 64x64 display, with an AI agent that offers short supportive observations. <a href="/2026/01/18/pixoo-signage/">Part 1</a> covers building it. <a href="/2026/01/27/pixoo-signage-insulin-tracking/">Part 2</a> adds insulin tracking. <a href="/2026/02/02/pixoo-signage-insights-agent/">Part 3</a> introduces the AI insights agent. <a href="/2026/02/03/insights-refinements/">Part 4</a> refines it after a stressful night.</em></p>

<p>I used an agent framework where a single prompt would have done the job. The overkill was on track to cost me $1,700/month.</p>

<p>I built the insights agent using <a href="https://aws.amazon.com/bedrock/agents/">AWS Bedrock Agents</a>. Bedrock Agents is a framework for building autonomous AI workflows where the model decides which tools to call, in what order, based on a conversation. It’s great for complex, fluid tasks where the reasoning path isn’t known in advance.</p>

<p>My task was not complex or fluid. I needed to take diabetes data, inject it into a prompt, and get a 30-character insight for an LED display. The same data, the same prompt structure, every time. A one-shot inference call.</p>

<p>Instead, I built a multi-turn agent with four action groups, OpenAPI schemas, IAM roles for each tool, and agent alias versioning. The agent would decide to call <code class="language-plaintext highlighter-rouge">getRecentGlucose</code>, then <code class="language-plaintext highlighter-rouge">getDailyStats</code>, then <code class="language-plaintext highlighter-rouge">getRecentTreatments</code>, then <code class="language-plaintext highlighter-rouge">getInsightHistory</code>, then <code class="language-plaintext highlighter-rouge">storeInsight</code>. Each roundtrip resent the full prompt and accumulated context. Five calls, each one bigger than the last.</p>

<h2 id="one-prompt-instead-of-five">One Prompt Instead of Five</h2>

<p>The fix was simple. Instead of letting the agent orchestrate its own data gathering, each Lambda pre-fetches the data it needs from DynamoDB, formats everything into a single prompt, and calls Claude once via <code class="language-plaintext highlighter-rouge">InvokeModel</code>. One request, one response.</p>

<p>I (with Claude Code) replaced the entire Bedrock Agent framework with a shared <a href="https://github.com/jwulff/signage/blob/185067a/packages/functions/src/diabetes/analysis/invoke-model.ts">invoke-model.ts</a> utility of about 80 lines. Deleting 2,559 lines of code is always satisfying.</p>

<p>Bedrock Agents are super cool. I’ll definitely build more agents on this service where I have more ambiguous tasks. But for a predictable prompt/response cycle where I know exactly what data the model needs, one-shot inference worked way better. Faster, cheaper, and about 80 lines of code instead of 2,559.</p>

<p>Post-deploy results after a full day of production data: zero errors, 32 insights generated, quality unchanged. Cost: roughly $1/day, down from $58/day.</p>

<h2 id="rate-limiting">Rate Limiting</h2>

<p>The architecture wasn’t the only inefficiency. The agent ran on every CGM reading: 288 per day, one every 5 minutes. Most invocations replaced a still-relevant insight minutes later.</p>

<p>I replaced the 5-minute debounce with smart triggers that only generate a new insight when something actually changes: enough time has passed, glucose moves significantly, or glucose crosses a zone boundary. This cut invocations from 288/day to about 36.</p>
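
<p>The three triggers can be sketched roughly like this. The thresholds (one hour, 15 mg/dL) and all the names are assumptions for illustration, not the values in the real code; the 70 and 180 mg/dL zone boundaries are the usual in-range band.</p>

```typescript
type Zone = "low" | "inRange" | "high";

interface Reading {
  mgdl: number;      // glucose value in mg/dL
  timestamp: number; // epoch milliseconds
}

// Assumed thresholds, for illustration only.
const MAX_AGE_MS = 60 * 60 * 1000; // regenerate at least once an hour
const SIGNIFICANT_DELTA = 15;      // mg/dL change that counts as significant

function zoneOf(mgdl: number): Zone {
  if (mgdl > 180) return "high";
  if (mgdl >= 70) return "inRange";
  return "low";
}

// `last` is the reading that produced the most recent insight, or null if none exists.
function shouldGenerateInsight(last: Reading | null, current: Reading): boolean {
  if (last === null) return true;
  // Trigger 1: enough time has passed.
  if (current.timestamp - last.timestamp >= MAX_AGE_MS) return true;
  // Trigger 2: glucose moved significantly.
  if (Math.abs(current.mgdl - last.mgdl) >= SIGNIFICANT_DELTA) return true;
  // Trigger 3: glucose crossed a zone boundary.
  return zoneOf(current.mgdl) !== zoneOf(last.mgdl);
}
```

<p>Comparing against the reading that produced the last insight, rather than the previous CGM reading, means a slow drift still triggers once it adds up to a significant change.</p>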

<p>I also tried switching from Claude Sonnet 4.5 to Haiku 4.5, figuring short LED insights didn’t need the bigger model. That didn’t go well, at all. Haiku hallucinated constantly. “Steady drop since 8pm” when the data showed 25 minutes of decline. “Great overnight!” when she’d been high for hours. It would confidently describe glucose patterns that weren’t in the data at all. For anything involving medical data, even short observations, grounding matters more than speed. I switched back to Sonnet the same night.</p>

<p>The nice thing: with rate limiting cutting volume by 87%, Sonnet at 36 invocations/day cost less than Haiku at 288/day. I didn’t have to compromise on quality to fix the cost problem.</p>

<table>
  <thead>
    <tr>
      <th>Configuration</th>
      <th>Invocations/day</th>
      <th>Cost/day</th>
      <th>Cost/month</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Bedrock Agent + 5min debounce</td>
      <td>~288</td>
      <td>~$58</td>
      <td>~$1,751</td>
    </tr>
    <tr>
      <td>Bedrock Agent + rate limiting</td>
      <td>~36</td>
      <td>~$7</td>
      <td>~$210</td>
    </tr>
    <tr>
      <td>One-shot inference + rate limiting</td>
      <td>~36</td>
      <td>~$1</td>
      <td>~$28</td>
    </tr>
  </tbody>
</table>

<h2 id="quality-fixes">Quality Fixes</h2>

<p>I fixed several quality issues along the way.</p>

<p><strong>Repetition.</strong> I analyzed 2,477 insights over a week and found a 54% repeat rate. “Best day this week!” appeared 260 times. The agent was copying example phrases from its prompt verbatim. I removed the examples, banned generic praise, and added a storage-layer dedup that rejects exact duplicates within a 6-hour window.</p>
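
<p>The dedup check itself is small. This sketch uses an in-memory array in place of the real storage layer, and the type and function names are made up for the example.</p>

```typescript
interface StoredInsight {
  text: string;
  createdAt: number; // epoch milliseconds
}

const DEDUP_WINDOW_MS = 6 * 60 * 60 * 1000; // the 6-hour window

// True when an insight with exactly the same text was stored within the window.
function isDuplicate(candidate: string, now: number, history: StoredInsight[]): boolean {
  const windowStart = now - DEDUP_WINDOW_MS;
  return history
    .filter((entry) => entry.createdAt >= windowStart)
    .some((entry) => entry.text === candidate);
}
```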

<p><strong>Timezone.</strong> Lambda runs in us-east-1, but its clock is UTC, so the agent was getting UTC hours from <code class="language-plaintext highlighter-rouge">Date.getHours()</code>. Noon Pacific is 8 PM UTC, so the agent would say “evening going well!” at lunchtime. I fixed it with <code class="language-plaintext highlighter-rouge">Intl.DateTimeFormat</code> using <code class="language-plaintext highlighter-rouge">America/Los_Angeles</code>.</p>
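
<p>For reference, a minimal version of the timezone fix looks something like this. The <code class="language-plaintext highlighter-rouge">localHour</code> helper is a name I made up for the example; <code class="language-plaintext highlighter-rouge">Intl.DateTimeFormat</code> and <code class="language-plaintext highlighter-rouge">formatToParts</code> are standard.</p>

```typescript
// Returns the hour of day (0-23) in the given IANA time zone,
// independent of the time zone the process itself runs in.
function localHour(date: Date, timeZone: string = "America/Los_Angeles"): number {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    hour: "numeric",
    hourCycle: "h23", // 0-23, so midnight is 0 rather than 24
  }).formatToParts(date);
  const hourPart = parts.find((part) => part.type === "hour");
  return Number(hourPart?.value);
}

const lunchtime = new Date("2026-02-21T20:00:00Z");
console.log(lunchtime.getUTCHours()); // 20
console.log(localHour(lunchtime));    // 12
```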

<h2 id="what-it-looks-like-now">What It Looks Like Now</h2>

<p>The display looks the same. Same 30-character insights in green, yellow, red, and rainbow. Same supportive voice noticing patterns. Abigail doesn’t know anything changed, which is exactly right.</p>

<p>The original dashboard ran for $4/month. The insights agent blew that up to $1,700/month. Now it’s about $30/month total: the original infrastructure plus one-shot inference at a sensible rate. The codebase is 2,559 lines lighter, the insights are less repetitive, and the agent knows what time zone it’s in.</p>

<p>Still open source on <a href="https://github.com/jwulff/signage">GitHub</a>.</p>

<p>– John</p>]]></content><author><name></name></author><summary type="html"><![CDATA[Replacing AWS Bedrock Agents with one-shot inference for the diabetes insights display. Same quality, 98% cheaper, 2,559 fewer lines of code.]]></summary></entry><entry><title type="html">Dashboard Update: Teaching the Agent to Think Like We Do</title><link href="https://johnwulff.com/2026/02/03/insights-refinements/" rel="alternate" type="text/html" title="Dashboard Update: Teaching the Agent to Think Like We Do" /><published>2026-02-03T00:00:00+00:00</published><updated>2026-02-03T00:00:00+00:00</updated><id>https://johnwulff.com/2026/02/03/insights-refinements</id><content type="html" xml:base="https://johnwulff.com/2026/02/03/insights-refinements/"><![CDATA[<p><em>This is Part 4. <a href="/2026/01/18/pixoo-signage/">Part 1</a> covers building the dashboard. <a href="/2026/01/27/pixoo-signage-insulin-tracking/">Part 2</a> adds insulin tracking. <a href="/2026/02/02/pixoo-signage-insights-agent/">Part 3</a> introduces the AI insights agent.</em></p>

<p><img src="/assets/images/posts/pixoo-signage/insight-landing.png" alt="The display showing &quot;SMOOTH LANDING!&quot; in green" class="hero" /></p>

<p>Yesterday I wrote about adding an AI insights agent to the diabetes dashboard. Getting it to sound human was the hard part. Since then, I’ve been teaching it to think more like we do.</p>

<h2 id="tonight">Tonight</h2>

<p>Abigail broke routine after school. A new dance class and a goldfish snack were all it took to spike high. It happens. We loaded her up with insulin all evening to bring her back down. After she went to bed, it became clear we’d overshot the correction. Also happens. Normally not a huge deal, we just had to wake her up and give her some sugar.</p>

<p>The first wakeup went fine. She took the juice, we waited. But it wasn’t enough. She kept dropping.</p>

<p>By the second wakeup she was low enough to feel it. Disoriented, uncooperative, and <em>very</em> unhappy to be woken up. This is when it gets scary. We had to hold her down and push honey over her screams. Sometimes it’s not gentle coaxing and juice boxes. Sometimes it’s restraining your screaming kid because the alternative is worse.</p>

<p>Eventually the sugar kicked in. She rebounded hard, then leveled off. Actually stable.</p>

<p>After she started to come up, the display said “SMOOTH LANDING!” Only it wasn’t. Not yet. She was barely out of the woods and climbing fast. The insight was wrong, the opposite of useful after a stressful situation.</p>

<h2 id="the-gap">The Gap</h2>

<p>The obvious problem: no intelligence about rate of change. The insights were over-indexed on the current value, not the trend.</p>

<p>After treating a low with sugar, glucose often spikes. Easy to overcorrect. A rise from 78 to 105 in a few readings isn’t a “smooth landing.” It’s a rebound that might overshoot high. The agent was celebrating these rebounds. “Coming up nicely!” when the trajectory suggested she’d blow past 200.</p>

<p>The same problem works in reverse. A glucose reading of 115 and dropping doesn’t tell you much by itself. But 115 dropping <em>and slowing down</em> is good news. That’s a landing. 115 dropping <em>and speeding up</em> is a problem. Same number, opposite actions. The agent would see a drop from 150 to 130 to 120 to 115 and say “Still dropping, eat!” That’s false urgency. The drop is decelerating. She’s leveling off.</p>
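
<p>In the dashboard this logic lives in the prompt, not in code, but the arithmetic behind it is easy to sketch. This hypothetical helper looks at the last three readings and compares the size of the two most recent drops.</p>

```typescript
type DropTrend = "decelerating" | "accelerating" | "steady" | "notDropping";

// Classifies the three most recent readings (oldest first, in mg/dL).
function classifyDrop(readings: number[]): DropTrend {
  const recent = readings.slice(-3);
  if (recent.length !== 3) throw new Error("need at least three readings");
  const [a, b, c] = recent as [number, number, number];
  const earlierDrop = a - b; // positive when glucose fell
  const latestDrop = b - c;
  if (0 >= latestDrop) return "notDropping";
  if (earlierDrop > latestDrop) return "decelerating"; // drops are shrinking
  if (latestDrop > earlierDrop) return "accelerating"; // drops are growing
  return "steady";
}

console.log(classifyDrop([150, 130, 120, 115])); // "decelerating"
```

<p>The 150, 130, 120, 115 sequence drops by 20, then 10, then 5. The direction is still down, but each step is half the one before, which is the landing pattern.</p>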

<h2 id="what-stable-actually-means">What “Stable” Actually Means</h2>

<p><img src="/assets/images/posts/pixoo-signage/insight-stable.png" alt="The display showing &quot;FINALLY STABLE AGAIN!&quot; in green" class="float-left" /></p>

<p>A subtler problem: the agent didn’t know what stable really means.</p>

<p>It was saying “leveling off nicely!” when glucose was still drifting toward a low. 80 and dropping slowly isn’t stable. It’s still dropping. But the agent saw the number in range and celebrated.</p>

<p>This matters most near the edges. At 78 and still drifting down, even slowly, “leveling off” creates false confidence. Stable means truly flat readings: two to three consecutive readings within ±3 mg/dL. That’s when “FINALLY STABLE AGAIN!” actually means something.</p>
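
<p>Written as code, that definition might look like the sketch below. I’m reading “within ±3 mg/dL” as the spread between the highest and lowest of the last few readings; that interpretation, the default of three readings, and the names are assumptions. The real rule is prompt guidance, not a function.</p>

```typescript
const STABLE_TOLERANCE = 3; // mg/dL

// True when the last `count` readings all sit within the tolerance of each other.
function isStable(readings: number[], count: number = 3): boolean {
  const recent = readings.slice(-count);
  if (count > recent.length) return false; // not enough data to call it stable
  const spread = Math.max(...recent) - Math.min(...recent);
  return STABLE_TOLERANCE >= spread;
}

console.log(isStable([92, 93, 91])); // true
console.log(isStable([80, 78, 76])); // false: a slow drift, not flat
```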

<h2 id="how-the-refinements-happen">How the Refinements Happen</h2>

<p>So I fired up Claude Code and together we refined the <a href="https://github.com/jwulff/signage/blob/45fdd8c/packages/functions/src/diabetes/analysis/stream-trigger.ts">agent prompt</a> to take all of this into account.</p>

<p>The cycle: I see a situation, note what the insight should have been, and Claude iterates on the prompt.</p>

<p>The agent says “Smooth landing!” during a steep rebound. Should have been: “Coming up fast, watch it.” Claude adds guidance about post-low spikes and overshoot risk.</p>

<p>The agent says “Leveling off!” while still drifting down. Should have been: “Still drifting, more?” Claude adds guidance that stable means flat for 2-3 readings.</p>

<p>The agent creates false urgency during a decelerating drop. Should have been: “Leveling off nicely!” Claude adds guidance to factor in acceleration, not just direction.</p>

<p>It’s iterative. The prompt grows more specific with each edge case, with explicit FORBIDDEN examples for each failure mode.</p>

<h2 id="capturing-the-reasoning">Capturing the Reasoning</h2>

<p>To make this cycle easier, I have the agent record its thinking with each insight so I can review it later and iterate accordingly.</p>

<p>Bedrock Agents don’t enforce required parameters, so even with emphatic prompts the agent wasn’t passing reasoning through the tool. But it does explain itself in its conversational response. So I parse that out and store it alongside the insight.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Insight: "[green]Leveling off nicely![/]"
Reasoning: Glucose dropped from high→less high→almost normal over
the past hour, but the rate is decelerating. Each reading shows a
smaller drop than the last. This is a landing pattern, not a
continuing fall. Celebrating the deceleration, not warning about
the direction.
</code></pre></div></div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Insight: "[yellow]Coming up fast![/]"
Reasoning: Post-low rebound in progress. Glucose was 72 twenty
minutes ago, now 118 and still climbing steeply. This +46 rise
suggests overcorrection from treatment. Not celebrating yet
because trajectory points toward overshoot into high territory.
</code></pre></div></div>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Insight: "[yellow]Still drifting, more?[/]"
Reasoning: Glucose at 81, down from 88 and 94. Still dropping
about 6-7 per reading. Not flat yet. Near the low threshold so
caution warranted. Suggesting additional carbs rather than
celebrating stability that hasn't arrived.
</code></pre></div></div>

<p>Now I can review not just what the agent said, but why it said it. Useful for debugging and for understanding when the logic needs refinement.</p>
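
<p>The parsing step might look something like the sketch below. The post doesn’t show the agent’s raw reply, so the <code class="language-plaintext highlighter-rouge">Reasoning:</code> label and the <code class="language-plaintext highlighter-rouge">extractReasoning</code> name are assumptions for illustration.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Hypothetical: assumes the agent's conversational reply contains a
// "Reasoning:" label and that everything after it is the explanation.
function extractReasoning(response: string): string | null {
  const marker = "Reasoning:";
  const start = response.indexOf(marker);
  if (start === -1) return null; // the agent did not explain itself this time

  const text = response.slice(start + marker.length).trim();
  // Collapse line breaks and repeated spaces so it stores as one line.
  return text.length > 0 ? text.replace(/\s+/g, " ") : null;
}
</code></pre></div></div>

<p>Returning <code class="language-plaintext highlighter-rouge">null</code> instead of throwing keeps a missing explanation from blocking the insight itself.</p>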

<h2 id="the-result">The Result</h2>

<p><img src="/assets/images/posts/pixoo-signage/insight-back-steady.png" alt="The display showing &quot;BACK STEADY AFTER THAT!&quot; in green" class="float-right" /></p>

<p>“BACK STEADY AFTER THAT!” acknowledges the journey. The agent saw the red spike in the chart, watched the recovery, and waited for truly flat readings before celebrating. It won’t mistake a rebound for a landing. It won’t panic during a decelerating drop. And it won’t claim stability while the drop keeps going slowly, like tonight.</p>

<p>That’s what I wanted: an agent that thinks about the situation the way we do.</p>

<p>I sure wish I could get real-time insulin data. The pump knows how much insulin is “on board” in these situations, which would help predict where things are headed. But Glooko syncs are delayed several hours. Something I’ll add the minute I can get my hands on real-time pump data. (Please please please give me API access to this, Insulet!)</p>

<p>Still open source on <a href="https://github.com/jwulff/signage">GitHub</a>.</p>

<p><em><a href="/2026/02/21/insights-cost-refactor/">Part 5</a> replaces the agent framework with one-shot inference to save $1,700/month.</em></p>

<p>– John</p>]]></content><author><name></name></author><summary type="html"><![CDATA[Refining the insights agent with rate of change intelligence, post-low rebound awareness, and a real definition of stable.]]></summary></entry><entry><title type="html">Dashboard Update: An AI Friend Who Watches the Numbers</title><link href="https://johnwulff.com/2026/02/02/pixoo-signage-insights-agent/" rel="alternate" type="text/html" title="Dashboard Update: An AI Friend Who Watches the Numbers" /><published>2026-02-02T00:00:00+00:00</published><updated>2026-02-02T00:00:00+00:00</updated><id>https://johnwulff.com/2026/02/02/pixoo-signage-insights-agent</id><content type="html" xml:base="https://johnwulff.com/2026/02/02/pixoo-signage-insights-agent/"><![CDATA[<p><em>This is Part 3. <a href="/2026/01/18/pixoo-signage/">Part 1</a> covers building the dashboard. <a href="/2026/01/27/pixoo-signage-insulin-tracking/">Part 2</a> adds insulin tracking.</em></p>

<p><img src="/assets/images/posts/pixoo-signage/insight-display.png" alt="The display showing &quot;BEEN HIGH A FEW HOURS&quot; insight" class="hero" /></p>

<p>The dashboard has a new feature: an AI that watches the glucose and insulin data, then offers short observations on the display. Not commands or clinical alerts, just a friendly voice that notices patterns we might miss.</p>

<h2 id="why-an-insights-agent">Why an Insights Agent</h2>

<p>Managing T1D means interpreting numbers constantly. Is this a trend or a blip? Did lunch hit harder than expected? Are overnight lows becoming a pattern? We do this interpretation all the time, but we can’t watch every moment (we try, though).</p>

<p>I wanted a second set of eyes and some validation. Something that could look at the last few hours (glucose, insulin, carbs) and surface one helpful observation. Not a diagnosis or medical advice, just a nudge: “steady all morning, nice!” or “rising after lunch, pre-bolus next time?”</p>

<h2 id="what-it-feels-like">What It Feels Like</h2>

<p>The display now has a voice. Glancing at it in the morning: “great overnight!” in green, a small celebration. After a rough lunch: “been high a while, bolus?” as a gentle check-in. Three days of tight control: “3 days in range!” cycling through rainbow colors.</p>

<p>It’s not a medical device. It’s not trying to be a doctor. It’s more like a friend who pays attention. Another person on the team who notices the patterns. A supportive cheer for the wins and gentle questions about the struggles.</p>

<h2 id="the-hard-part-making-it-human">The Hard Part: Making It Human</h2>

<p>Thanks to <a href="https://www.anthropic.com/claude-code">Claude Code</a>, getting the agent working was easy. It took most of a day, with Claude chugging away and me stopping in to give feedback every 30 minutes or so, but it didn’t need much input from me to get going. Getting it to sound human took more tweaking, and that’s where I felt I added value to the project.</p>

<p>Claude’s first attempt yielded abbreviations. “Hi 4h avg230 now241”. Readable if you studied it, but robotic. It was building a telegraph, not a friend.</p>

<p>Second attempt: natural language with exact numbers. “Glucose 241, been high 3 hours, consider bolus.” Better, but clinical. Felt like a medical device, not a companion.</p>

<p>I kept giving feedback: more ranges, more questions, some color, gentle. Instead of “241,” say “over 200.” Instead of “need bolus,” ask “bolus?” A caring friend suggests, doesn’t command. The difference between “you should eat” and “hungry?” is everything when things are already suboptimal.</p>

<p><a href="https://github.com/jwulff/signage/blob/7494ca8/packages/functions/src/diabetes/analysis/stream-trigger.ts#L90-L128">The final prompt</a> required explicit FORBIDDEN examples. I literally had to tell Claude that “Hi 4h avg230” was “robotic garbage.” Subtle guidance wasn’t enough, but we got there!</p>

<h2 id="colors-for-emotion">Colors for Emotion</h2>

<p>Plain white text couldn’t convey tone. A celebration looked the same as a warning.</p>

<p>I added <a href="https://github.com/jwulff/signage/blob/7494ca8/packages/functions/src/rendering/insight-renderer.ts#L54-L98">color markup</a> the agent can use. Green for celebrations and wins. Yellow for caution when running high. Red for urgent situations like lows. And rainbow for big wins, cycling through colors character by character.</p>
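
<p>A parser for this kind of markup can be small. The sketch below assumes tags of the form <code class="language-plaintext highlighter-rouge">[green]text[/]</code>, no nesting, and white for untagged text. The renderer linked above is the real implementation and may differ; <code class="language-plaintext highlighter-rouge">parseMarkup</code> is a hypothetical name.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code>type Segment = { text: string; color: string };

// Assumed grammar: "[color]text[/]" spans, no nesting.
// Text outside a span is treated as white.
function parseMarkup(input: string): Segment[] {
  const segments: Segment[] = [];
  const pattern = /\[(green|yellow|red|rainbow)\]([^\[]*)\[\/\]|([^\[]+)/g;

  let match = pattern.exec(input);
  while (match !== null) {
    if (match[1] !== undefined) {
      segments.push({ text: match[2], color: match[1] }); // tagged span
    } else {
      segments.push({ text: match[3], color: "white" });  // plain text
    }
    match = pattern.exec(input);
  }
  return segments;
}
</code></pre></div></div>

<p>Returning segments rather than pixels keeps parsing separate from drawing, so effects like the per-character rainbow can be handled later at render time.</p>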

<p>“steady all day!” in green hits different than plain white text. The display becomes expressive.</p>

<p>Colors are muted, about two-thirds brightness, so they don’t compete with the glucose reading. The number is still the star. The insight is a whisper, not a shout.</p>

<h2 id="the-constraints">The Constraints</h2>

<p>The LED display has room for two lines of about 15 characters each. 30 characters total. Every word counts.</p>

<p>The agent often generated 40, 50, even 80 characters despite the prompt. I added a retry loop: if the insight is too long, ask the agent to shorten it. Same session, same context, just “that was 45 chars, please make it 30.” Works most of the time. Force-truncate as a fallback.</p>
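
<p>The retry-then-truncate logic can be sketched as follows. <code class="language-plaintext highlighter-rouge">ask</code> is a hypothetical stand-in for one turn in the agent session, not a Bedrock API, and the retry count is an assumption. A real agent call would be asynchronous; it is synchronous here to keep the sketch short.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code>const MAX_CHARS = 30;  // two lines of about 15 characters each
const MAX_RETRIES = 2; // assumption: the post does not say how many retries it allows

// `ask` is a hypothetical callback that sends a prompt in the same session
// and returns the agent's reply. Length is counted on the raw reply text.
function fitInsight(ask: (prompt: string) => string, firstPrompt: string): string {
  let insight = ask(firstPrompt).trim();
  for (let attempt = 0; MAX_RETRIES > attempt; attempt++) {
    if (MAX_CHARS >= insight.length) return insight;
    insight = ask(`That was ${insight.length} chars, please make it ${MAX_CHARS}.`).trim();
  }
  // Fallback: force-truncate if the agent never complies.
  return insight.slice(0, MAX_CHARS);
}
</code></pre></div></div>

<p>Telling the agent the exact overage gives it something concrete to fix, and the final <code class="language-plaintext highlighter-rouge">slice</code> guarantees the display never receives more than it can show.</p>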

<p>Two lines also need to look balanced. “BEEN HIGH A FEW” + “HOURS” looks awkward. “BEEN HIGH A” + “FEW HOURS” looks intentional. The compositor now splits short text near the middle.</p>
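
<p>A minimal sketch of a near-the-middle split. The repo’s compositor may work differently; <code class="language-plaintext highlighter-rouge">splitBalanced</code> is a hypothetical name, and this version does not enforce the per-line width.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Hypothetical helper; the repo's compositor may work differently.
// Splits at the space closest to the middle of the text.
function splitBalanced(text: string): [string, string] {
  const middle = (text.length - 1) / 2;
  let best = -1;
  for (let i = 0; text.length > i; i++) {
    if (text[i] !== " ") continue;
    // ">=" lets a later space win ties, so the first line is the longer one.
    if (best === -1 || Math.abs(middle - best) >= Math.abs(middle - i)) best = i;
  }
  if (best === -1) return [text, ""]; // no space to split on
  return [text.slice(0, best), text.slice(best + 1)];
}
</code></pre></div></div>

<p>For “BEEN HIGH A FEW HOURS” this yields “BEEN HIGH A” and “FEW HOURS”.</p>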

<h2 id="how-it-works">How It Works</h2>

<p>I used this as an opportunity to learn <a href="https://aws.amazon.com/bedrock/agents/">AWS Bedrock Agents</a>, something we’re also embracing at work. The agent runs <a href="https://aws.amazon.com/bedrock/claude/">Claude Sonnet 4.5</a> with access to the diabetes data in DynamoDB. It can query glucose readings, insulin doses, carb entries, and computed stats like <a href="https://diabetes.org/about-diabetes/devices-technology/cgm-time-in-range">Time in Range</a>. Then it stores a short insight for the display.</p>

<p><a href="https://github.com/jwulff/signage/blob/7494ca8/packages/functions/src/diabetes/analysis/stream-trigger.ts#L43-L69">DynamoDB Streams</a> trigger the agent when new data arrives. Event-driven, not hourly cron. When Abigail’s CGM sends a new reading or Glooko syncs pump data, the agent runs within seconds.</p>

<h2 id="what-it-costs">What It Costs</h2>

<p>The insights agent changed the economics. The original dashboard ran for about $4/month. With the agent analyzing every CGM reading, costs jumped roughly tenfold.</p>

<p>CGM readings arrive every 5 minutes, 288 times per day. Each agent invocation uses Claude Sonnet 4.5 on Bedrock at $3 per million input tokens and $15 per million output tokens. With the prompt, data queries, and responses, that adds up to roughly $40-50/month for the agent alone.</p>
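
<p>The arithmetic behind that estimate, using only the figures above plus an assumed 30-day month. Token counts per run aren’t given here, so they are left as parameters rather than guessed; the function names are illustrative.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code>const READINGS_PER_DAY = (24 * 60) / 5; // one reading every 5 minutes = 288
const DAYS_PER_MONTH = 30;              // assumption: a 30-day month
const INPUT_USD_PER_MILLION = 3;        // Claude Sonnet 4.5 on Bedrock, per the post
const OUTPUT_USD_PER_MILLION = 15;

const RUNS_PER_MONTH = READINGS_PER_DAY * DAYS_PER_MONTH; // 8640

// What one agent run costs for a given monthly bill.
function costPerRunUsd(monthlyUsd: number): number {
  return monthlyUsd / RUNS_PER_MONTH;
}

// Monthly bill for a given token footprint per run (both counts are inputs).
function monthlyCostUsd(inputTokens: number, outputTokens: number): number {
  const perRun =
    (inputTokens * INPUT_USD_PER_MILLION + outputTokens * OUTPUT_USD_PER_MILLION) / 1e6;
  return perRun * RUNS_PER_MONTH;
}
</code></pre></div></div>

<p>At $45/month, <code class="language-plaintext highlighter-rouge">costPerRunUsd(45)</code> works out to roughly half a cent per run. Small per call, but 8,640 calls a month add up.</p>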

<p>Worth it for us. The supportive nudges and pattern recognition add real value to daily diabetes management. But it’s no longer a $4/month hobby project.</p>

<h2 id="whats-next">What’s Next</h2>

<p>That 10x cost jump needs work. Right now the agent runs on every CGM reading, even at 3am when we’re asleep and the numbers are steady. Smarter would be: only generate new insights when glucose moves significantly, or when it’s been a while, or when we’re probably awake. No need to pay Claude to tell a sleeping house that everything’s fine. (Though I appreciate the sentiment.)</p>
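
<p>Those three conditions could become a simple gate in front of the agent. Every threshold below is an assumption for illustration, as are the <code class="language-plaintext highlighter-rouge">Snapshot</code> shape and function names; nothing here is in the repo yet.</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// All thresholds are illustrative assumptions, not tuned values.
const SIGNIFICANT_MOVE_MGDL = 15; // "moves significantly"
const MAX_QUIET_MINUTES = 60;     // "it's been a while"
const WAKE_HOUR = 7;              // "probably awake" starts (local time)
const SLEEP_HOUR = 22;            // "probably awake" ends (local time)

type Snapshot = {
  glucose: number;                 // latest reading, mg/dL
  lastInsightGlucose: number;      // reading when the last insight was generated
  minutesSinceLastInsight: number;
  hourOfDay: number;               // 0-23, local time
};

function isAwakeHour(hour: number): boolean {
  if (WAKE_HOUR > hour) return false;
  return SLEEP_HOUR > hour;
}

// Run the agent if any one of the three conditions holds.
function shouldGenerateInsight(s: Snapshot): boolean {
  const moved = Math.abs(s.glucose - s.lastInsightGlucose) >= SIGNIFICANT_MOVE_MGDL;
  const stale = s.minutesSinceLastInsight >= MAX_QUIET_MINUTES;
  return moved || stale || isAwakeHour(s.hourOfDay);
}
</code></pre></div></div>

<p>With these numbers, a steady reading at 3am ten minutes after the last insight is skipped, while a 20 mg/dL move at the same hour still triggers a run.</p>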

<p>I’ll keep tweaking the prompt too. The voice is close but not perfect. More feedback, more iterations.</p>

<p>Stepping back from the display itself, there’s a bigger opportunity here. The agent already has access to weeks of glucose, insulin, and carb data. Why stop at 30-character insights on an LED?</p>

<p>Morning email reports summarizing how overnight went. Weekly trends with suggestions to try. Coaching on carb ratios and bolus timing. “You’ve been running high after breakfast three days in a row. Want to try a 10% increase in your morning ratio?” The data is there. The reasoning is there. The interface could be anything.</p>

<p>For more complex workflows, AWS has <a href="https://aws.amazon.com/bedrock/agentcore/">AgentCore</a>, a newer framework for building multi-agent systems. Could be fun to explore for coordinating analysis, recommendations, and delivery across different channels.</p>

<p>I’m excited to keep building. AI tools for diabetes management feel like exactly the kind of thing I want to spend time on. Personal, useful, iterative. A fun excuse to keep learning and a real way to help my family.</p>

<p>Still open source on <a href="https://github.com/jwulff/signage">GitHub</a>.</p>

<p><em><a href="/2026/02/03/insights-refinements/">Part 4</a> refines the insights after a stressful night.</em></p>

<p>– John</p>]]></content><author><name></name></author><summary type="html"><![CDATA[Adding a Bedrock-powered insights agent to the family glucose dashboard. Supportive observations, not robotic commands.]]></summary></entry><entry><title type="html">Building a Speech-to-Text TUI with Claude Code</title><link href="https://johnwulff.com/2026/01/31/steno-speech-to-text-tui/" rel="alternate" type="text/html" title="Building a Speech-to-Text TUI with Claude Code" /><published>2026-01-31T00:00:00+00:00</published><updated>2026-01-31T00:00:00+00:00</updated><id>https://johnwulff.com/2026/01/31/steno-speech-to-text-tui</id><content type="html" xml:base="https://johnwulff.com/2026/01/31/steno-speech-to-text-tui/"><![CDATA[<p>macOS Tahoe shipped with <a href="https://developer.apple.com/documentation/speech/speechanalyzer">SpeechAnalyzer</a>, Apple’s new on-device transcription API. It’s fast, handles long-form audio, and runs entirely locally. I spend a lot of time <a href="/2026/01/11/voice-memos-to-second-brain/">talking to myself</a>, so transcription tech is interesting and useful to me. I wanted a testbed to experiment with different patterns: recording audio, transcribing it, storing it in a database for future RAG and analysis into my second brain markdown vault (more on that later). So I built <a href="https://github.com/jwulff/steno">Steno</a>. It’s open source, it’s super fun, and I love it.</p>

<p><img src="/assets/images/posts/steno-speech-to-text-tui/steno-tui.png" alt="Steno TUI showing live transcription and AI analysis" class="full-width" /></p>

<h2 id="why-a-terminal-app">Why a Terminal App</h2>

<p>TUIs are faster to build than GUIs: no layout constraints, no asset pipelines, no Xcode storyboards. Just text and boxes. With Claude Code and SwiftTUI, I had a working interface in minutes. That kind of speed makes building things genuinely fun again.</p>

<p>SpeechAnalyzer is 55% faster than Whisper and handles continuous transcription properly. The old SFSpeechRecognizer was designed for Siri-style dictation, and its <code class="language-plaintext highlighter-rouge">isFinal</code> flag rarely triggered during long recordings, forcing workarounds like stabilization timers. SpeechAnalyzer just works.</p>

<h2 id="what-it-does">What It Does</h2>

<p>Steno runs in your terminal. You see a real-time audio level meter, the current transcript, and a status bar. Partial results appear in yellow as you speak, then turn white when finalized. Everything happens on-device.</p>

<p>Press space to start transcribing, again to stop. The other keys do what you’d expect: <code class="language-plaintext highlighter-rouge">s</code> for settings, <code class="language-plaintext highlighter-rouge">q</code> to quit, <code class="language-plaintext highlighter-rouge">i</code> to cycle inputs, <code class="language-plaintext highlighter-rouge">m</code> to switch models.</p>

<p>The AI piece is optional. If you add your Anthropic API key, Steno will summarize your transcript on demand. It fetches the current list of Claude models from the API and lets you pick one. I default to Haiku for speed.</p>

<h2 id="building-with-claude-code">Building with Claude Code</h2>

<p>This is the part I love. I described what I wanted (real-time transcription in a terminal) and Claude Code helped me build it. We started with SwiftTUI for the interface, wired up AVAudioEngine for audio capture, then connected it to the SpeechAnalyzer API.</p>

<p>We added global keyboard shortcuts, which turned out to be tricky. SwiftTUI handles input through a first-responder system, but I wanted single-keystroke shortcuts that work regardless of focus, and SwiftTUI doesn’t expose the internals needed to intercept keystrokes before they reach the responder chain.</p>

<p>Claude suggested forking SwiftTUI locally. We added a static <code class="language-plaintext highlighter-rouge">globalKeyHandlers</code> dictionary to the Application class. Now the input handler checks for global shortcuts first:</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="k">let</span> <span class="nv">handler</span> <span class="o">=</span> <span class="kt">Application</span><span class="o">.</span><span class="n">globalKeyHandlers</span><span class="p">[</span><span class="n">char</span><span class="p">]</span> <span class="p">{</span>
    <span class="nf">handler</span><span class="p">()</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
    <span class="n">window</span><span class="o">.</span><span class="n">firstResponder</span><span class="p">?</span><span class="o">.</span><span class="nf">handleEvent</span><span class="p">(</span><span class="n">char</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The whole patch was about ten lines. Swift 6’s strict concurrency required marking the static dictionary as <code class="language-plaintext highlighter-rouge">nonisolated(unsafe)</code>, but since SwiftTUI’s input handling is already single-threaded, that’s fine.</p>

<h2 id="the-audio-pipeline">The Audio Pipeline</h2>

<p>Getting audio from the microphone to SpeechAnalyzer required some format wrangling. Mics typically output 48kHz stereo. SpeechAnalyzer wants 16kHz mono. The AudioTapProcessor handles the conversion:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>AVAudioEngine → AudioTapProcessor → AsyncStream → SpeechAnalyzer
                     ↓                                  ↓
              (48kHz → 16kHz)                   SpeechTranscriber
                                                       ↓
                                               transcriber.results
</code></pre></div></div>

<p>The processor is isolated from the main actor to satisfy Swift 6’s concurrency requirements. Audio tap callbacks arrive on a background audio thread, so they can’t wait on the main thread.</p>

<h2 id="try-it">Try It</h2>

<p>Steno is open source: <a href="https://github.com/jwulff/steno">github.com/jwulff/steno</a></p>

<p>Requirements: macOS 26 (Tahoe) or later.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Download and unzip</span>
curl <span class="nt">-L</span> https://github.com/jwulff/steno/releases/latest/download/steno-macos-arm64.zip <span class="nt">-o</span> steno.zip
unzip steno.zip

<span class="c"># Remove quarantine attribute (required for unsigned binaries)</span>
xattr <span class="nt">-d</span> com.apple.quarantine steno

<span class="c"># Run</span>
./steno
</code></pre></div></div>

<p>On first run, you’ll need to grant microphone and speech recognition permissions. The speech models download automatically if needed.</p>

<p>If you want AI summarization, add your Anthropic API key in the settings screen (press <code class="language-plaintext highlighter-rouge">s</code>).</p>

<p>The best part of building with Claude Code is the velocity: describe a feature, watch it appear, iterate. The forked SwiftTUI approach would have taken me hours to figure out alone, but with Claude we had working keyboard shortcuts in twenty minutes. I forgot how much I missed this feeling of just making things.</p>

<h2 id="whats-next">What’s Next</h2>

<p>Next I want to play with more structured analysis and real-time feedback. Imagine something that listens and proactively researches what’s being discussed, a real-time expert deep diver for spitballing conversations. That’d be neat.</p>

<p><em><a href="/2026/03/28/steno-always-listening/">Part 2</a> adds system audio capture, a headless daemon, and an MCP server so agents can query your transcripts.</em></p>

<p>– John</p>]]></content><author><name></name></author><summary type="html"><![CDATA[A terminal app for real-time transcription using macOS 26's new SpeechAnalyzer API.]]></summary></entry></feed>