A production voice AI agent costs $0.07–0.33 per minute in infrastructure, plus a $5,000–$100,000+ build cost depending on complexity. At 10,000 minutes/month, platform choice alone changes your run rate by $1,500–$2,800/month. Here is the full breakdown.
Per-minute cost breakdown
Every voice AI call stacks four costs: speech-to-text (STT), LLM inference, text-to-speech (TTS), and telephony. Platform fees sit on top. The table below uses BYOK (bring your own key) pricing where applicable — you supply the API keys, which is typically 30–40% cheaper than letting the platform bundle them.
| Platform | Best for | STT | LLM | TTS | Telephony | Platform fee | Total/min |
|---|---|---|---|---|---|---|---|
| Vapi (BYOK) | API-first, fast deployment, outbound SDR, receptionist | $0.02–0.04 | $0.06–0.12 | $0.04–0.08 | $0.007–0.015 | $0.05 | $0.23–0.33 |
| Retell AI (BYOK) | HIPAA, regulated industries, visual flow editor | $0.02–0.04 | $0.06–0.12 | $0.04–0.08 | $0.007–0.015 | $0.07 | $0.21–0.30 |
| LiveKit + OpenAI Realtime | Sub-300ms latency, video+voice, 10k+ min/month | Bundled in Realtime | Bundled in Realtime | Bundled in Realtime | $0.007–0.015 | $0.06–0.10 | $0.15–0.20 |
| LiveKit (self-hosted) | 50k+ min/month, data sovereignty, custom pipeline | $0.006–0.015 | $0.03–0.08 | $0.015–0.04 | $0.007–0.015 | $0.02–0.04 | $0.07–0.15 |
Prices as of mid-2026. LLM costs assume GPT-4o mini or Claude Haiku for most interactions. Premium models (GPT-4o, Claude Sonnet) add $0.04–0.10/min.
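To see what the premium-model surcharge does to a quote, the add-on range can be applied to the quoted Total/min figures. This is a minimal sketch: the dictionary and function names are our own, the surcharge is added low to low and high to high, and the Realtime row is left out on the assumption that its bundled LLM is priced differently.

```python
from typing import Dict, Tuple

Range = Tuple[float, float]  # (low, high) in USD per minute

# Quoted Total/min figures from the table (GPT-4o mini or Claude Haiku assumed).
# LiveKit + OpenAI Realtime is omitted: its LLM cost is bundled, so we assume
# the premium surcharge does not apply to it in the same way.
BASE_TOTALS: Dict[str, Range] = {
    "Vapi (BYOK)": (0.23, 0.33),
    "Retell AI (BYOK)": (0.21, 0.30),
    "LiveKit (self-hosted)": (0.07, 0.15),
}

# Add-on for premium models (GPT-4o, Claude Sonnet), per the note under the table.
PREMIUM_SURCHARGE: Range = (0.04, 0.10)


def with_premium_model(base: Range, surcharge: Range = PREMIUM_SURCHARGE) -> Range:
    """Add the surcharge range to a per-minute range, low to low and high to high."""
    return (round(base[0] + surcharge[0], 3), round(base[1] + surcharge[1], 3))


if __name__ == "__main__":
    for platform, base in BASE_TOTALS.items():
        low, high = with_premium_model(base)
        print(f"{platform}: ${low:.2f}-{high:.2f}/min with a premium model")
```

Under these assumptions, Vapi (BYOK) moves from $0.23–0.33 to $0.27–0.43 per minute.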
Cost crossover analysis
Self-hosted LiveKit requires $3,000–8,000 in upfront infrastructure setup. But above 10,000 minutes/month, the per-minute savings compound fast. Here is the breakeven analysis at different call volumes.
| Monthly volume | Managed (Vapi/Retell) | LiveKit BYOK | Self-hosted LiveKit | Best option |
|---|---|---|---|---|
| 1,000 min | $230–330 | $230–330 | $70–150 | Managed or BYOK (simplicity wins) |
| 5,000 min | $1,150–1,650 | $1,150–1,650 | $350–750 | BYOK (managed overhead not worth it yet) |
| 10,000 min | $2,300–3,300 | $2,300–3,300 | $700–1,500 | LiveKit BYOK ($750–1,000/mo less than Vapi) |
| 25,000 min | $5,750–8,250 | $5,750–8,250 | $1,750–3,750 | Self-hosted (saves $2,000–4,500/mo) |
| 50,000 min | $11,500–16,500 | $11,500–16,500 | $3,500–7,500 | Self-hosted (saves $8,000–9,000/mo) |
The 10,000-minute crossover: Below 10k minutes/month, managed Vapi or Retell is the right choice, with less infrastructure complexity and faster iteration. Above 10k minutes/month, switching to LiveKit with self-managed providers saves $1,500–$8,000/month. For clients at 50k+ min/month, self-hosted LiveKit pays back its setup cost within the first month.
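The arithmetic behind the crossover can be sketched as follows. This is an illustration under stated assumptions, not a pricing tool: the function names are our own, the managed rate is the Vapi BYOK Total/min range, and savings are compared low with low and high with high, which reproduces the 50,000-minute row.

```python
from typing import Tuple

Range = Tuple[float, float]  # (low, high) in USD

MANAGED_PER_MIN: Range = (0.23, 0.33)      # Vapi BYOK Total/min
SELF_HOSTED_PER_MIN: Range = (0.07, 0.15)  # self-hosted LiveKit Total/min
SETUP_COST: Range = (3_000, 8_000)         # upfront self-hosting setup


def monthly_cost(per_min: Range, minutes: int) -> Range:
    """Monthly run cost for a per-minute range at a given call volume."""
    return (round(per_min[0] * minutes, 2), round(per_min[1] * minutes, 2))


def monthly_savings(minutes: int) -> Range:
    """Managed minus self-hosted, comparing low with low and high with high."""
    managed = monthly_cost(MANAGED_PER_MIN, minutes)
    hosted = monthly_cost(SELF_HOSTED_PER_MIN, minutes)
    return (round(managed[0] - hosted[0], 2), round(managed[1] - hosted[1], 2))


def payback_months(minutes: int) -> Range:
    """Best case: cheapest setup over largest savings. Worst case: the reverse."""
    low_save, high_save = monthly_savings(minutes)
    return (SETUP_COST[0] / high_save, SETUP_COST[1] / low_save)


if __name__ == "__main__":
    for volume in (10_000, 50_000):
        best, worst = payback_months(volume)
        print(f"{volume:,} min: savings {monthly_savings(volume)}, "
              f"payback {best:.1f}-{worst:.1f} months")
```

At 50,000 minutes this gives savings of $8,000–9,000/month and a payback of roughly 0.3 to 1.0 months, consistent with the first-month payback claim. At 10,000 minutes the same method gives a payback of roughly 1.7 to 5 months.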
ROI Example — Dental Group
| Metric | Value | Detail |
|---|---|---|
| Previous cost | $4,800/mo | Third-party answering service |
| AI agent run cost | $650/mo | Vapi BYOK at ~2,000 min/month |
| Monthly savings | $4,150/mo | $49,800/year recaptured |
| ROI on build cost | 3.1× | Payback in about 4.3 months |
A 6-location dental group replaced a $4,800/month human answering service with a Vapi voice agent integrated with Dentrix scheduling. The agent runs ~2,000 minutes/month across all locations. Infrastructure cost: $650/month (Vapi BYOK with Claude Haiku + Deepgram + ElevenLabs Turbo). Build cost: $18,000. Payback period: 4.3 months. The agent also captured 340 previously missed after-hours calls in its first 90 days — revenue the answering service was dropping entirely.
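The savings and payback figures in this example follow from three inputs. A minimal sketch, with a helper name of our own choosing:

```python
from typing import Dict

PREVIOUS_MONTHLY = 4_800  # human answering service, USD/month
AGENT_MONTHLY = 650       # Vapi BYOK run cost at ~2,000 min/month
BUILD_COST = 18_000       # one-time build cost


def roi_summary(previous: float, run_cost: float, build_cost: float) -> Dict[str, float]:
    """Monthly and annual savings, plus months needed to recover the build cost."""
    savings = previous - run_cost
    return {
        "monthly_savings": savings,
        "annual_savings": savings * 12,
        "payback_months": round(build_cost / savings, 1),
    }


if __name__ == "__main__":
    print(roi_summary(PREVIOUS_MONTHLY, AGENT_MONTHLY, BUILD_COST))
```

With the dental group's numbers this returns monthly savings of 4,150, annual savings of 49,800, and a payback of 4.3 months. Revenue from the recovered after-hours calls is not included.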
Watch out
Most voice AI cost estimates focus on per-minute rates. These five costs are real, frequent, and regularly missing from vendor quotes.
| Cost item | Why it matters |
|---|---|
| SIP trunk, DID numbers, outbound PSTN rates, Twilio or Telnyx carrier fees | Often forgotten in initial estimates. At 10k min/month adds $70–150/month minimum. |
| Provider concurrency limits mean burst traffic requires higher-tier plans or queuing infrastructure | A spike in call volume during a campaign can cause queuing latency if you haven't pre-provisioned capacity. |
| 4–8 hours/month of engineering time for system prompt iteration, edge case handling, and regression testing | Voice AI degrades without tuning as your business changes. This is the hidden maintenance tax. |
| Call recording storage, transcript search, anomaly detection, failed-call alerting | Without monitoring, you learn about broken calls from angry customers, not dashboards. |
| API schema changes, auth token rotation, field mapping updates as your CRM evolves | CRMs update without notice. Integration maintenance is real work, not set-and-forget. |
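Two of these items can be put in dollar terms from figures already quoted: carrier fees at $0.007–0.015/min and 4–8 hours/month of tuning. The sketch below does that. The function name is ours, and the hourly engineering rate is a hypothetical input you would replace with your own. Concurrency, monitoring, and CRM maintenance are left out because no figures are given for them.

```python
from typing import Dict, Tuple

Range = Tuple[float, float]  # (low, high) in USD per month

TELEPHONY_PER_MIN = (0.007, 0.015)  # carrier fees per minute
TUNING_HOURS = (4, 8)               # engineering hours per month


def hidden_monthly_overhead(minutes: int, hourly_rate: float) -> Dict[str, Range]:
    """Estimate carrier fees and prompt-tuning cost for one month.

    hourly_rate is an assumed engineering rate, not a figure from the article.
    """
    telephony = (round(TELEPHONY_PER_MIN[0] * minutes, 2),
                 round(TELEPHONY_PER_MIN[1] * minutes, 2))
    tuning = (TUNING_HOURS[0] * hourly_rate, TUNING_HOURS[1] * hourly_rate)
    total = (telephony[0] + tuning[0], telephony[1] + tuning[1])
    return {"telephony": telephony, "prompt_tuning": tuning, "total": total}


if __name__ == "__main__":
    # 150.0 USD/hour is a placeholder rate chosen purely for illustration.
    print(hidden_monthly_overhead(10_000, hourly_rate=150.0))
```

At 10,000 minutes the telephony line comes to $70–150/month, matching the figure above. The tuning line scales entirely with the rate you plug in.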
Build cost
Infrastructure costs (above) are ongoing. Build costs are one-time. Here is what each tier includes and costs.
| Timeline | Build cost | What it includes | Limits and notes |
|---|---|---|---|
| 2–4 weeks | $5,000–$15,000 | One use case (e.g. inbound receptionist or outbound SDR). Deployed on your real phone number. Integrated with one CRM or calendar. 50+ test call validation. Go/no-go recommendation with full-build quote. | Not production-hardened. Not multi-location. No monitoring dashboard. |
| 6–12 weeks | $25,000–$75,000 | Full production deployment with error handling, monitoring, multi-location support, and custom integrations. Handles real call volume with concurrency planning and fallback paths. | Not suitable for 50k+ min/month enterprise scale without infrastructure discussion. |
| 12–24 weeks | $100,000+ | Custom infrastructure (self-hosted LiveKit), multi-agent orchestration, custom LLM fine-tuning, regulatory compliance documentation, SLA-backed uptime, dedicated engineering team during build. | Quoted individually after architecture sessions. |
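Because build cost is one-time and infrastructure cost is ongoing, a first-year budget combines the two. A rough sketch, assuming a flat call volume for twelve months and using a function name of our own:

```python
from typing import Tuple

Range = Tuple[float, float]  # (low, high) in USD


def first_year_cost(build: Range, per_min: Range, minutes_per_month: int) -> Range:
    """One-time build cost plus twelve months of per-minute run cost."""
    run_low = per_min[0] * minutes_per_month * 12
    run_high = per_min[1] * minutes_per_month * 12
    return (round(build[0] + run_low, 2), round(build[1] + run_high, 2))


if __name__ == "__main__":
    # $25,000-$75,000 build on Vapi BYOK ($0.23-0.33/min) at 10,000 min/month.
    low, high = first_year_cost((25_000, 75_000), (0.23, 0.33), 10_000)
    print(f"First-year cost: ${low:,.0f}-{high:,.0f}")
```

For that example the first-year range is $52,600–114,600, of which $27,600–39,600 is run cost. Hidden costs such as tuning hours are not included.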
30-minute call. We scope the call volume, platform choice, and integrations, then give you a firm build price and monthly run-rate estimate before any work starts.