A step-by-step playbook for restaurants, clinics, gyms, and service providers deploying their first production voice AI agent.
Hestur AI
hestur.co
Typical Deploy: 2–3 weeks, from kick-off to live calls
Call Containment: 55–75% of calls handled without a human agent
Cost All-In: $0.23–0.33 per minute (BYOK)
Response Time: <400ms average first-word latency
Who This Guide Is For
This playbook targets service businesses receiving 50+ inbound calls per week: medical clinics, dental practices, physio studios, gyms, restaurants, home service providers, and salons. If your staff spends more than 2 hours per day answering calls that follow a predictable script (bookings, FAQs, directions, hours), a voice AI agent will pay for itself within 60–90 days.
Scope check: Voice AI works well for calls with a defined goal (book, cancel, answer a question, transfer). It does not replace complex relationship calls where context, empathy, and judgment are the product.
Step 1 — Choose Your Platform
| Platform | Best For | Cost/Min (BYOK) | Time to Deploy |
|---|---|---|---|
| Vapi | Fast prototypes, API-first teams | $0.05 + provider | 1–2 weeks |
| LiveKit Agents | Scale 10k+ min/month, custom infra | Infra cost only | 3–4 weeks |
| Retell AI | Non-technical teams, visual builder | $0.07 + provider | 1 week |
Our recommendation for service businesses: Start with Vapi. It has the fastest onboarding, solid BYOK support, and the best webhook ecosystem for CRM write-back. Migrate to LiveKit when your monthly bill exceeds $3–5K.
Step 2 — BYOK Provider Setup
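Bring Your Own Keys (BYOK) means you pay each AI provider directly instead of paying the platform's bundled rate, which cuts per-minute cost substantially. Set up accounts with each provider before building your agent: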
| Layer | Provider | Cost | Notes |
|---|---|---|---|
| STT (Speech-to-Text) | Deepgram Nova-2 | $0.006/min | Best accuracy for phone audio |
| LLM (Brain) | GPT-4o mini or Claude Haiku | $0.05–0.10/min | Haiku is faster; GPT-4o mini is cheaper |
| TTS (Voice) | ElevenLabs or PlayHT | $0.02–0.04/min | ElevenLabs has more natural pauses |
| Platform | Vapi | $0.05/min | Orchestration only, not AI processing |
Total BYOK: $0.13–0.20/min (the sum of the four layers above) vs $0.45–0.60/min bundled, which saves $15K–$30K/year at 10K min/month.
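The arithmetic behind the total can be checked with a short script. This is a minimal sketch that copies the per-layer ranges from the table; the function names are illustrative, and the annual figure depends on which bundled and BYOK rates you choose to compare.

```python
# Per-minute cost ranges (low, high) in USD, copied from the table above.
LAYERS = {
    "stt": (0.006, 0.006),     # Deepgram Nova-2
    "llm": (0.05, 0.10),       # GPT-4o mini or Claude Haiku
    "tts": (0.02, 0.04),       # ElevenLabs or PlayHT
    "platform": (0.05, 0.05),  # Vapi orchestration
}


def byok_rate(layers=LAYERS):
    """Sum the low and high end of every layer into one (low, high) range."""
    low = sum(lo for lo, _ in layers.values())
    high = sum(hi for _, hi in layers.values())
    return round(low, 3), round(high, 3)


def annual_savings(bundled_per_min, byok_per_min, minutes_per_month):
    """Yearly difference between a bundled rate and a BYOK rate."""
    return round((bundled_per_min - byok_per_min) * minutes_per_month * 12, 2)


low, high = byok_rate()
print(f"BYOK range: ${low:.3f} to ${high:.3f} per minute")
print(f"Monthly cost at 10K min: ${low * 10_000:,.0f} to ${high * 10_000:,.0f}")
```

Calling `annual_savings` with your own quoted bundled rate and measured BYOK rate gives a figure specific to your call volume.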
Step 3 — Write Your System Prompt
The system prompt is the agent's brain. Most voice agent failures are prompt failures, not infrastructure failures.
1. Define the role and persona. Give the agent a name, role, and business context. "You are Aria, the front desk AI for Oakwood Dental. You help patients schedule appointments, answer clinic questions, and handle appointment changes."
2. List what it CAN do. An explicit capability list prevents hallucination. Include: book appointments, check availability, confirm/cancel bookings, provide directions, answer FAQs, transfer to staff for complex requests.
3. Define hard stops. What the agent must never do: give medical/legal advice, quote prices it doesn't have, confirm appointments it can't verify. On uncertainty: always transfer, never guess.
4. Set the voice style. Service business tone: warm, brief, efficient. "Keep responses under 2 sentences. Do not over-explain. Mirror the patient's energy level: slower with elderly callers, brisk with busy professionals."
5. Add a fallback chain. Define the escalation path explicitly. "If you cannot resolve the caller's request in 2 attempts, say: 'Let me get someone who can help you better' and transfer to the front desk."
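One way to keep all five parts present is to assemble the prompt from a structured spec rather than editing one long string. The sketch below is illustrative only: `PromptSpec` and `build_system_prompt` are hypothetical names, and the wording reuses the examples from the steps above.

```python
from dataclasses import dataclass


@dataclass
class PromptSpec:
    agent_name: str
    role: str
    business: str
    capabilities: list[str]
    hard_stops: list[str]
    style: str
    max_attempts: int = 2
    fallback_line: str = "Let me get someone who can help you better"
    transfer_target: str = "the front desk"


def build_system_prompt(spec: PromptSpec) -> str:
    """Assemble the five parts in a fixed order so none is silently omitted."""
    can_do = "\n".join(f"- {item}" for item in spec.capabilities)
    never = "\n".join(f"- {item}" for item in spec.hard_stops)
    return "\n\n".join([
        f"You are {spec.agent_name}, the {spec.role} for {spec.business}.",
        f"You CAN:\n{can_do}",
        f"You must NEVER:\n{never}\nOn uncertainty: always transfer, never guess.",
        f"Style: {spec.style}",
        f"If you cannot resolve the caller's request in {spec.max_attempts} attempts, "
        f"say: '{spec.fallback_line}' and transfer to {spec.transfer_target}.",
    ])


spec = PromptSpec(
    agent_name="Aria",
    role="front desk AI",
    business="Oakwood Dental",
    capabilities=["book appointments", "check availability", "answer FAQs"],
    hard_stops=["give medical or legal advice", "quote prices you don't have"],
    style="Warm, brief, efficient. Keep responses under 2 sentences.",
)
print(build_system_prompt(spec))
```

Keeping the spec in version control makes it easier to see which part changed when call quality shifts.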
Step 4 — Connect Your Calendar
✓ Create a dedicated service account with calendar write access (do not use a personal account)
✓ For Google Calendar: enable the Calendar API in Google Cloud Console, create a service account, share your booking calendar with it
✓ For Acuity / Calendly: generate an API key from account settings
✓ For Jane App (clinics): use the Jane API (requires the clinic plan)
✓ Build a Vapi tool call: GET /availability → shows open slots, POST /appointments → creates booking
✓ Test with 5 real booking scenarios before going live
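The two tool calls in the checklist can be prototyped against an in-memory calendar before any real API is wired in. This is a sketch under stated assumptions: `InMemoryCalendar`, `handle_tool_call`, the tool names, and the argument shapes are all hypothetical and do not reflect Vapi's actual tool payload or any calendar provider's API.

```python
from datetime import date, datetime


class InMemoryCalendar:
    """Stand-in for Google Calendar, Acuity, Calendly, or Jane (hypothetical)."""

    def __init__(self, open_slots):
        self._open = set(open_slots)
        self.bookings = {}

    def availability(self, day):
        """Return the open slots on the given date, earliest first."""
        return sorted(s for s in self._open if s.date() == day)

    def book(self, slot, caller_name):
        """Book a slot; refuse if it is not open, so nothing is confirmed blindly."""
        if slot not in self._open:
            return {"ok": False, "reason": "slot_unavailable"}
        self._open.remove(slot)
        self.bookings[slot] = caller_name
        return {"ok": True, "slot": slot.isoformat(), "name": caller_name}


def handle_tool_call(calendar, name, args):
    """Dispatch a tool call; names mirror GET /availability and POST /appointments."""
    if name == "get_availability":
        day = date.fromisoformat(args["date"])
        return {"slots": [s.isoformat() for s in calendar.availability(day)]}
    if name == "create_appointment":
        slot = datetime.fromisoformat(args["slot"])
        return calendar.book(slot, args["name"])
    return {"ok": False, "reason": "unknown_tool"}


cal = InMemoryCalendar([datetime(2025, 3, 3, 9, 0), datetime(2025, 3, 3, 10, 0)])
print(handle_tool_call(cal, "get_availability", {"date": "2025-03-03"}))
print(handle_tool_call(cal, "create_appointment",
                       {"slot": "2025-03-03T09:00:00", "name": "Sam"}))
```

Returning an explicit failure when a slot is gone gives the agent something concrete to relay instead of confirming an appointment it cannot verify.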
Step 5 — SIP / Phone Number Setup
Connect the agent to a real phone number via Twilio or Vonage:
1. Buy a local number. Use Twilio or Vonage. Local numbers get higher pickup rates than toll-free. $1–2/month per number.
2. Configure call forwarding. Option A: Point the new number directly to Vapi (use as primary number). Option B: Forward your existing number to Vapi during business hours, ring to staff after hours.
3. Test voicemail detection. Vapi has built-in voicemail detection. Tune the `vmDetection` threshold; the default is often too aggressive for mobile numbers with custom greetings.
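The Option B routing rule comes down to a time-window check. The sketch below assumes hypothetical opening hours (Monday to Friday, 09:00 to 17:00) and hypothetical names `BUSINESS_HOURS` and `route_call`; in practice the rule would live in your Twilio or Vonage forwarding configuration rather than in application code.

```python
from datetime import datetime, time

# Hypothetical opening hours keyed by weekday (Monday = 0). Adjust to your business.
BUSINESS_HOURS = {day: (time(9, 0), time(17, 0)) for day in range(5)}


def route_call(now, hours=BUSINESS_HOURS):
    """Option B: send calls to the agent during business hours, to staff otherwise."""
    window = hours.get(now.weekday())
    if window is not None and window[0] <= now.time() < window[1]:
        return "agent"
    return "staff"


print(route_call(datetime(2025, 3, 3, 10, 30)))  # a Monday morning
print(route_call(datetime(2025, 3, 3, 18, 0)))   # the same Monday, after closing
```

Pass `now` in the business's local time zone so the window matches posted opening hours.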
Step 6 — Latency Tuning
Target: first word from agent under 800ms. Over 1s feels unnatural on phone calls.
| Issue | Cause | Fix |
|---|---|---|
| Slow first response | LLM cold start | Use streaming + smaller model for opening line |
| Unnatural pauses mid-sentence | TTS chunking | Enable sentence-level streaming in ElevenLabs |
| Agent interrupts caller | Low endpointing threshold | Increase VAD sensitivity, add 300ms silence buffer |
| Caller interrupts agent | No barge-in handling | Enable interruption in Vapi, shorten agent responses |
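Before tuning anything, it helps to measure first-word latency against the 800ms target. A minimal sketch, assuming you can export per-turn latencies in milliseconds from your call logs; the function name and the choice of a 95th-percentile figure are illustrative.

```python
import math
import statistics

TARGET_MS = 800      # first-word target from this step
UNNATURAL_MS = 1000  # responses slower than this feel unnatural on a phone call


def summarize_latency(samples_ms):
    """Summarize first-word latencies: average, nearest-rank p95, and slow turns."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    avg = statistics.mean(ordered)
    return {
        "avg_ms": avg,
        "p95_ms": ordered[p95_index],
        "over_1s": sum(1 for s in ordered if s > UNNATURAL_MS),
        "meets_target": avg < TARGET_MS,
    }


print(summarize_latency([300, 400, 500, 1200]))
```

The average can meet the target while individual turns still exceed one second, which is why the slow-turn count is reported separately.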
Step 7 — CRM Write-Back
After every call, push structured data to your CRM via Vapi webhooks:
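A rough sketch of the mapping step follows. The payload shape and every field name here are assumptions for illustration, not Vapi's documented webhook schema or any CRM's API; the HTTP receiver and the CRM client are omitted.

```python
def to_crm_record(payload):
    """Flatten a hypothetical end-of-call payload into one CRM row."""
    call = payload.get("call", {})
    return {
        "phone": call.get("caller_number"),
        "duration_sec": call.get("duration_sec", 0),
        "outcome": payload.get("outcome", "unknown"),
        "transferred": bool(payload.get("transferred", False)),
        "summary": (payload.get("summary") or "").strip(),
    }


example = {
    "call": {"caller_number": "+15555550100", "duration_sec": 142},
    "outcome": "booked",
    "transferred": False,
    "summary": "  Booked a cleaning for Tuesday at 9am. ",
}
print(to_crm_record(example))
```

Defaulting missing fields rather than raising keeps one malformed call from blocking the write-back for the rest.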
Latency profiling, interrupt handling, edge case prompting
Validation (12–14): Shadow mode alongside live calls, staff review, go-live decision
What to Measure in Week 1
Containment Rate: target 55%+ (calls resolved without transfer)
Avg Handle Time: target <3 min (vs 6–8 min with human)
Transfer Rate: target <30% (escalations to staff)
CSAT: target 4.0+ (post-call SMS survey)
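These four numbers can be computed from a simple call log. The sketch below assumes a hypothetical `CallLog` record and a 1 to 5 CSAT scale (the guide gives the 4.0+ target but not the scale); calls with no survey response are left out of the CSAT average.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CallLog:
    duration_sec: float
    transferred: bool
    resolved: bool
    csat: Optional[int] = None  # assumed 1-5 scale; None if no survey reply


def week_one_metrics(calls):
    """Compute containment, transfer rate, average handle time, and CSAT."""
    if not calls:
        raise ValueError("need at least one call")
    n = len(calls)
    contained = sum(1 for c in calls if c.resolved and not c.transferred)
    transfers = sum(1 for c in calls if c.transferred)
    scores = [c.csat for c in calls if c.csat is not None]
    return {
        "containment_rate": contained / n,
        "transfer_rate": transfers / n,
        "avg_handle_min": sum(c.duration_sec for c in calls) / n / 60,
        "csat": sum(scores) / len(scores) if scores else None,
    }


def meets_targets(m):
    """Compare each metric with the Week 1 targets listed above."""
    return {
        "containment": m["containment_rate"] >= 0.55,
        "handle_time": m["avg_handle_min"] < 3,
        "transfer": m["transfer_rate"] < 0.30,
        "csat": m["csat"] is not None and m["csat"] >= 4.0,
    }


calls = [
    CallLog(120, False, True, 5),
    CallLog(180, False, True, 4),
    CallLog(240, True, False),
    CallLog(60, False, False, 3),
]
print(meets_targets(week_one_metrics(calls)))
```

A call that is neither resolved nor transferred (for example, a hang-up) counts against containment without counting as a transfer, so the two rates need not sum to 100%.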
Common mistake: Going live on your main number on day one. Always run in shadow mode for at least 5 days: the agent listens on real calls but staff still answer them. This catches edge cases before they reach real customers.