Blog

Field notes on building voice & chat agents.

Honest writeups on latency, evaluation, prompt design, and the surprisingly squishy parts of putting AI in front of real customers.

280ms
p50 voice round-trip

Engineering · 12 min read

How we got voice latency under 300ms — and what we still got wrong

An honest breakdown of the latency budget for production voice agents: STT, VAD, reasoning, tool calls, TTS — what helps, what doesn't, and the part everyone gets wrong.

Rafael Mendes · May 8, 2026

eval-harness/voice

Engineering

Open-sourcing our voice evaluation harness

Why benchmarks lie, what we actually measure, and the harness we wrote to keep ourselves honest.

May 1, 2026 · 8 min

"Let me check on that…"

Design

The acknowledgement problem in voice agents

Why agents need to narrate tool calls — and how to do it without sounding scripted.

Apr 22, 2026 · 6 min

94% / 1.4× / +1.2pp

Customer

How Northwind cut tier-1 support cost by 94%

Full case study with metrics, prompt structure, and the migration timeline.

Apr 15, 2026 · 14 min

memory.recall("user_X")

Engineering

Long-term memory without leaking secrets

How we scope memory per-user, per-bot, per-workspace — and the audit trail that comes with it.

Apr 4, 2026 · 9 min

v2.4.0

Release

Hybrid retrieval ships in Hania v2.4

Semantic + BM25 + cross-encoder reranking. Recall up 6pp on internal benchmarks.

Mar 28, 2026 · 5 min

tool.create(...)

Product

No-code tools for non-engineers

Why the next wave of agent platforms will be built by ops teams, not just engineers.

Mar 12, 2026 · 7 min
