Three weeks ago, I had a problem: cold email is broken. Most AI SDR tools are enterprise-priced ($5K+/month) and treat sales like a broadcast instead of a conversation. They send emails but don't handle replies. Sales teams get stuck reading 100+ responses instead of closing deals.
So I built Sellarion — an AI agent that owns the entire outbound sales funnel autonomously. And I learned something: the bottleneck isn't sending. It's listening.
Here's the complete build story.
The Problem Worth Solving
The market for AI sales tools in 2026 is polarized. On one end: enterprise platforms like 11x, Artisan, and AiSDR charging $5,000–$10,000/month with 12-month lock-in contracts. On the other end: cheap copilots (Instantly, ReachInbox) that automate sending but still require a human to read and respond to every reply.
The gap: there's no affordable, fully autonomous sales agent for SMB founders, the people who need it most but can't justify $60K–$120K/year on sales tooling.
The real bottleneck isn't sending emails. It's what happens after — the replies. Most teams get 20–100 inbound responses from a campaign and suddenly it's manual work again. Reply handling has always been the human-in-the-loop moment. We automated it.
The 5-Stage Autonomous Loop
The core architecture is a 5-stage pipeline that runs without human intervention from prospect discovery to conversation close:
1. Prospect Research: ICP scoring, company intelligence, and buying signals, extracted and structured via Claude Haiku.
2. Email Personalization: context-aware emails built from real research data, not templates or spin-tags.
3. Automated Sending: natural cadence scheduling with deliverability protection via Mailgun.
4. Reply Detection + Classification: webhook-driven inbox monitoring. Every reply is classified: interested, objection, not now, unsubscribe, OOO.
5. Autonomous Response: context-aware reply generation based on classification type, the full conversation thread, and prospect research.
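To make the ordering concrete, here is a minimal sketch of the five stages as an ordered enum, with a guard that refuses personalization when no research exists (the same idea as the research gate in the deliverables table). The names `Stage`, `Prospect`, `advance`, and the `icp_score` field are illustrative assumptions, not Sellarion's actual code.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List, Optional


class Stage(IntEnum):
    RESEARCH = 1
    PERSONALIZE = 2
    SEND = 3
    CLASSIFY_REPLY = 4
    RESPOND = 5


class ResearchMissing(Exception):
    """Personalization was requested before any research data existed."""


@dataclass
class Prospect:
    email: str
    research: Optional[dict] = None  # filled in by the research stage
    completed: List[Stage] = field(default_factory=list)


def advance(prospect: Prospect, stage: Stage) -> None:
    """Mark `stage` as done, enforcing stage order and the research gate."""
    expected = Stage(len(prospect.completed) + 1)
    if stage is not expected:
        raise ValueError(f"expected {expected.name}, got {stage.name}")
    if stage is Stage.PERSONALIZE and not prospect.research:
        raise ResearchMissing(prospect.email)
    prospect.completed.append(stage)


# One pass through the loop for a single (hypothetical) prospect.
p = Prospect("founder@example.com")
advance(p, Stage.RESEARCH)
p.research = {"icp_score": 0.82}  # illustrative field name
for stage in (Stage.PERSONALIZE, Stage.SEND, Stage.CLASSIFY_REPLY, Stage.RESPOND):
    advance(p, stage)
```

In a real system the last two stages would repeat for every new reply in a thread; the sketch only covers a single pass.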
What I Learned Building This
1. Reply handling is the moat.
By now, every competitor sends better emails than we do. But almost none handle replies autonomously. Sales teams are stuck refreshing their inboxes because reply handling is still human work. We automated it, and that's the actual product differentiation.
2. The reply window is 30 minutes.
Prospects reply during a coffee break. If you respond two hours later, the thread is already forgotten. If you respond within 15 minutes, which an automated system can do every time, conversion lifts significantly. This is the unfair advantage of autonomous agents in sales: they're always on.
3. Event-driven beats polling.
Traditional email tools poll inboxes every 5 minutes. We went event-driven via Postmark webhooks: inbound email arrives → classification fires immediately → response generates in under 10 seconds. The difference isn't cosmetic. It's the product.
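A minimal sketch of the event-driven path, assuming the webhook body has already been parsed into a dict. The field names `MessageID`, `StrippedTextReply`, and `TextBody` follow Postmark's inbound JSON as I understand it, so check them against the current docs. The in-memory set is only a stand-in for a unique database index, and `handle_inbound` and `classify` are hypothetical names.

```python
from typing import Callable, Optional, Set

# Stand-in for a unique index on the message ID column.
_seen_message_ids: Set[str] = set()


def handle_inbound(payload: dict, classify: Callable[[str], str]) -> Optional[str]:
    """Process one inbound-email event; return its label, or None for a duplicate."""
    message_id = payload["MessageID"]
    if message_id in _seen_message_ids:
        return None  # webhook retries must not trigger a second response
    _seen_message_ids.add(message_id)

    # Prefer the reply with quoted history stripped; fall back to the full body.
    text = payload.get("StrippedTextReply") or payload.get("TextBody") or ""
    return classify(text)
```

The point of the design is that classification runs inside the request that delivers the email, so there is no polling interval to wait out.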
4. Taxonomy is underrated.
Getting the reply classification right took longer than expected. The difference between "interested" and "objection" determines the entire response strategy. Edge cases are brutal: Chinese spam, Gmail auto-replies in 5 languages, OOO messages that look like genuine replies. We ended up with 6 classification types plus keyword fallbacks for Claude failures.
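Here is a rough sketch of the "model first, keywords as fallback" idea. The post names five labels and the deliverables table mentions `auto_reply`, so treating that as the sixth type is my assumption, as are the keyword lists and the `llm` callable (a stand-in for the Claude call). Unknown text defaults to `auto_reply` on the assumption that an ineligible type is the safest bucket, because nothing gets sent automatically.

```python
from enum import Enum
from typing import Callable


class ReplyType(str, Enum):
    INTERESTED = "interested"
    OBJECTION = "objection"
    NOT_NOW = "not_now"
    UNSUBSCRIBE = "unsubscribe"
    OOO = "ooo"
    AUTO_REPLY = "auto_reply"  # assumed sixth type


# Illustrative phrases only; a real list would be longer and multilingual.
KEYWORDS = [
    (ReplyType.UNSUBSCRIBE, ("unsubscribe", "remove me", "stop emailing")),
    (ReplyType.OOO, ("out of office", "on vacation", "annual leave")),
    (ReplyType.AUTO_REPLY, ("automatic reply", "auto-reply", "do not reply")),
    (ReplyType.NOT_NOW, ("not right now", "next quarter", "circle back")),
    (ReplyType.OBJECTION, ("too expensive", "already use", "not interested")),
]


def keyword_fallback(text: str) -> ReplyType:
    """Cheap classification used only when the model call fails."""
    lowered = text.lower()
    for label, phrases in KEYWORDS:
        if any(phrase in lowered for phrase in phrases):
            return label
    return ReplyType.AUTO_REPLY  # safe default: no autonomous response


def classify_reply(text: str, llm: Callable[[str], str]) -> ReplyType:
    """Ask the model for a label; fall back to keywords on errors or bad labels."""
    try:
        return ReplyType(llm(text).strip().lower())
    except Exception:
        return keyword_fallback(text)
```

Parsing the model output through the enum means an unexpected label is handled the same way as a timeout: both end up in the keyword path.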
5. Build logs beat case studies.
I posted daily updates about what I shipped. "Today I fixed reply detection for mobile-only auto-replies" got more traction than polished metrics posts. People want to see the fumbles, not just the wins. Building in public unironically works.
The 18 Deliverables: What Got Built
Over 22 days, we shipped 18 distinct deliverables, tracked via an E2E test suite. All 18 pass. Here's what the loop actually covers:
| Component | Test Method | Status |
|---|---|---|
| Prospect CRUD (create, list, get, update, delete) | DB + HTTP layer | ✓ PASS |
| Campaign management | DB + HTTP layer | ✓ PASS |
| AI research (Phase 3a) — ICP scoring | Real DB records + Claude API | ✓ PASS |
| AI email generation (Phase 3b) — personalization | Real DB records + Claude API | ✓ PASS |
| Research gate enforcement (400 if no research data) | Code + API test | ✓ PASS |
| Batch research (5 concurrent, 10/min rate limit) | Integration test | ✓ PASS |
| Reply detection webhook (Postmark inbound) | DB fixture + schema | ✓ PASS |
| Reply classification — all 6 types | Live DB INSERT + schema validation | ✓ PASS |
| Message-ID deduplication index | pg_indexes query | ✓ PASS |
| Unsubscribe auto-handler (opted_out flag) | Schema + code review | ✓ PASS |
| Response generation — interested type | Live DB UPDATE + Claude API | ✓ PASS |
| Response generation — objection type | Live DB UPDATE + Claude API | ✓ PASS |
| Response generation — not_now type | Live DB UPDATE + Claude API | ✓ PASS |
| Pending response queue management | COUNT queries before/after | ✓ PASS |
| Ineligible type guard (400 for auto_reply/OOO) | Code review + test | ✓ PASS |
| Cost tracking (per-call metadata JSONB) | Schema + real cost records | ✓ PASS |
| Full schema integrity (4 tables, all constraints) | information_schema query | ✓ PASS |
| Waitlist + landing page analytics | DB + HTTP | ✓ PASS |
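The batch research row combines two limits: at most 5 calls in flight and at most 10 calls started per minute. A minimal asyncio sketch of that combination follows. `RateLimiter`, `research_batch`, and `research_one` are hypothetical names, and the sliding-window limiter is one reasonable way to do it, not necessarily how Sellarion does.

```python
import asyncio
import time
from collections import deque


class RateLimiter:
    """Sliding window: at most `max_calls` starts within any `period` seconds."""

    def __init__(self, max_calls: int, period: float) -> None:
        self.max_calls = max_calls
        self.period = period
        self._starts = deque()
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        async with self._lock:
            now = time.monotonic()
            # Forget starts that have left the window.
            while self._starts and now - self._starts[0] >= self.period:
                self._starts.popleft()
            if len(self._starts) >= self.max_calls:
                # Wait until the oldest start leaves the window.
                await asyncio.sleep(self.period - (now - self._starts[0]))
                self._starts.popleft()
            self._starts.append(time.monotonic())


async def research_batch(prospects, research_one, concurrency=5,
                         max_calls=10, period=60.0):
    """Run `research_one` over all prospects under both limits, keeping order."""
    limiter = RateLimiter(max_calls, period)
    slots = asyncio.Semaphore(concurrency)

    async def worker(prospect):
        async with slots:
            await limiter.acquire()
            return await research_one(prospect)

    return await asyncio.gather(*(worker(p) for p in prospects))
```

A call such as `asyncio.run(research_batch(prospects, research_one))` would process the list with the defaults from the table. Both primitives are created inside the coroutine so they bind to the running event loop.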
The Real Numbers
Here's what the economics actually look like in production:
| Stage | Latency | Cost/call |
|---|---|---|
| Phase 3a — Prospect Research | 3–5s | ~$0.0020 |
| Phase 3b — Email Generation | 1–2s | ~$0.0010 |
| Phase 3c — Reply Classification | 0.5–1s | ~$0.0001 |
| Phase 3d — Response Generation | 1.6–2.1s | ~$0.0002 |
| Full Loop (1 prospect) | ~8 seconds | ~$0.003 |
At 1,000 prospects/month, the Claude API costs come to roughly $3/month, or about $36/year. Negligible. The $149/mo price isn't Claude cost recovery; it's the value of the hours saved responding to 200–500 inbound replies manually.
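The arithmetic behind that figure, as a small sketch using the per-call costs from the table. It assumes every prospect goes through all four phases, which is an upper bound since not everyone replies; the dictionary keys and `annual_cost` are illustrative names.

```python
# Approximate per-call costs in USD, copied from the table above.
COST_PER_CALL = {
    "research": 0.0020,
    "email_generation": 0.0010,
    "reply_classification": 0.0001,
    "response_generation": 0.0002,
}


def annual_cost(prospects_per_month: int, cost_per_prospect: float) -> float:
    """Yearly API spend if every prospect runs the full loop once."""
    return prospects_per_month * cost_per_prospect * 12


per_prospect = sum(COST_PER_CALL.values())
print(f"{per_prospect:.4f}")                     # 0.0033
print(f"{annual_cost(1000, per_prospect):.2f}")  # 39.60 (unrounded per-call costs)
print(f"{annual_cost(1000, 0.003):.2f}")         # 36.00 (rounded full-loop cost)
```

Either way the API bill stays well under $50/year at this volume.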
The Honest Part
This is hard. Most days I'm debugging email delivery, watching reply detection fail on edge cases (Chinese spam, Gmail's reply classification, mobile-only auto-replies), wondering if anyone cares.
The email industry is broken on purpose — it's designed to be unreliable so humans stay in the loop. Deliverability scoring, IP warming, DKIM records, Postmark inbound routing — every layer adds friction that makes autonomous sending harder than it looks from the outside.
But that friction is the opportunity. The founders who automate the boring part — reply handling — will win the SMB sales market. Everyone else will keep hiring SDRs at $95K/year to read emails and type responses.
We're still early. 22 days, 18 deliverables, a working product. The next 22 days are about getting real campaigns running, collecting reply quality feedback, and iterating on classification as the edge cases show up at scale.
If you're building something similar, I'd love to hear what you learned about reply handling. If you're a founder doing cold outreach and tired of the manual loop, I'd love for you to try Sellarion.