Voice AI | 12 min read

AI Voice Agent Setup Guide for Service Businesses

A step-by-step playbook for restaurants, clinics, gyms, and service providers deploying their first production voice AI agent.

Hestur AI

hestur.co

Metric	Value	Detail
Typical Deploy	2–3 wks	from kick-off to live calls
Call Containment	55–75%	without human agent
Cost All-In	$0.23–0.33	per minute (BYOK)
Response Time	<400ms	avg first-word latency

Who This Guide Is For

This playbook targets service businesses receiving 50+ inbound calls per week: medical clinics, dental practices, physio studios, gyms, restaurants, home service providers, and salons. If your staff spends more than 2 hours per day answering calls that follow a predictable script (bookings, FAQs, directions, hours), a voice AI agent will pay for itself within 60–90 days.

Scope check: Voice AI works well for calls with a defined goal (book, cancel, answer a question, transfer). It does not replace complex relationship calls where context, empathy, and judgment are the product.

Step 1 — Choose Your Platform

Platform	Best For	Cost/Min (BYOK)	Time to Deploy
Vapi	Fast prototypes, API-first teams	$0.05 + provider	1–2 weeks
LiveKit Agents	Scale 10k+ min/month, custom infra	Infra cost only	3–4 weeks
Retell AI	Non-technical teams, visual builder	$0.07 + provider	1 week

Our recommendation for service businesses: Start with Vapi. It has the fastest onboarding, solid BYOK support, and the best webhook ecosystem for CRM write-back. Migrate to LiveKit when your monthly bill exceeds $3–5K.

Step 2 — BYOK Provider Setup

Bring Your Own Keys dramatically cuts per-minute cost. Set up accounts with each provider before building your agent:

Layer	Provider	Cost	Notes
STT (Speech-to-Text)	Deepgram Nova-2	$0.006/min	Best accuracy for phone audio
LLM (Brain)	GPT-4o mini or Claude Haiku	$0.05–0.10/min	Haiku is faster; GPT-4o mini is cheaper
TTS (Voice)	ElevenLabs or PlayHT	$0.02–0.04/min	ElevenLabs has more natural pauses
Platform	Vapi	$0.05/min	Orchestration only, not AI processing

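Total BYOK for the four layers above: about $0.13–0.20/min, vs $0.45–0.60/min bundled. Telephony and other overhead come on top, which is likely why the all-in figure at the top of this guide is $0.23–0.33/min. Expect savings of roughly $15K–$30K/year at 10K min/month.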
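To sanity-check the per-minute total, the table rows can be summed directly. The sketch below is only arithmetic on the prices listed above. The dictionary keys and function names are illustrative, and telephony charges are not included.

```python
# Per-minute prices copied from the table above, as (low, high) in USD.
# Key names are illustrative labels, not provider identifiers.
LAYER_COST_PER_MIN = {
    "stt_deepgram": (0.006, 0.006),
    "llm": (0.05, 0.10),
    "tts": (0.02, 0.04),
    "platform_vapi": (0.05, 0.05),
}


def total_cost_per_min(layers):
    """Sum the low and high ends of every layer's per-minute price."""
    low = sum(lo for lo, _ in layers.values())
    high = sum(hi for _, hi in layers.values())
    return round(low, 3), round(high, 3)


def monthly_cost(rate_per_min, minutes_per_month):
    """Monthly spend at a flat per-minute rate."""
    return round(rate_per_min * minutes_per_month, 2)


byok_low, byok_high = total_cost_per_min(LAYER_COST_PER_MIN)
print(f"BYOK per minute: ${byok_low:.3f} to ${byok_high:.3f}")
print(
    "BYOK at 10K min/month: "
    f"${monthly_cost(byok_low, 10_000):,.0f} to ${monthly_cost(byok_high, 10_000):,.0f}"
)
```

Swap in your own negotiated rates and call volume before using the result in a budget. The LLM row in particular depends heavily on prompt length and turn count.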

Step 3 — Write Your System Prompt

The system prompt is the agent's brain. Most voice agent failures are prompt failures, not infrastructure failures.

Define the role and persona

Give the agent a name, role, and business context. "You are Aria, the front desk AI for Oakwood Dental. You help patients schedule appointments, answer clinic questions, and handle appointment changes."

List what it CAN do

An explicit capability list reduces hallucination. Include: book appointments, check availability, confirm/cancel bookings, provide directions, answer FAQs, transfer to staff for complex requests.

Define hard stops

What the agent must never do: give medical/legal advice, quote prices it doesn't have, confirm appointments it can't verify. On uncertainty: always transfer, never guess.

Set the voice style

Service business tone: warm, brief, efficient. "Keep responses under 2 sentences. Do not over-explain. Mirror the patient's energy level — slower with elderly callers, brisk with busy professionals."

Add a fallback chain

Define the escalation path explicitly. "If you cannot resolve the caller's request in 2 attempts, say: 'Let me get someone who can help you better' and transfer to the front desk."
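The five parts above can be assembled into one prompt string. The following is a minimal sketch, assuming you keep the parts as structured fields so each can be edited and reviewed separately. `AgentSpec` and `build_system_prompt` are hypothetical names, not part of any platform SDK.

```python
from dataclasses import dataclass


@dataclass
class AgentSpec:
    """Hypothetical container for the five prompt parts described above."""
    name: str
    role: str
    business: str
    capabilities: list
    hard_stops: list
    style: str
    max_attempts: int = 2
    fallback_line: str = "Let me get someone who can help you better"
    transfer_target: str = "the front desk"


def build_system_prompt(spec):
    """Join persona, capabilities, hard stops, style, and fallback into one prompt."""
    can_do = "\n".join(f"- {item}" for item in spec.capabilities)
    never = "\n".join(f"- {item}" for item in spec.hard_stops)
    sections = [
        f"You are {spec.name}, the {spec.role} for {spec.business}.",
        f"You can:\n{can_do}",
        f"You must never:\n{never}\nOn uncertainty: always transfer, never guess.",
        f"Style: {spec.style}",
        (
            f"If you cannot resolve the caller's request in {spec.max_attempts} attempts, "
            f"say: '{spec.fallback_line}' and transfer to {spec.transfer_target}."
        ),
    ]
    return "\n\n".join(sections)


aria = AgentSpec(
    name="Aria",
    role="front desk AI",
    business="Oakwood Dental",
    capabilities=["book appointments", "check availability", "answer clinic questions"],
    hard_stops=["give medical or legal advice", "quote prices you do not have"],
    style="Warm, brief, efficient. Keep responses under 2 sentences.",
)
print(build_system_prompt(aria))
```

Keeping the capability and hard-stop lists as data makes it easier to diff prompt versions during the tuning phase.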

Step 4 — Connect Your Calendar

✓ Create a dedicated service account with calendar write access (do not use a personal account)
✓ For Google Calendar: enable the Calendar API in Google Cloud Console, create a service account, share your booking calendar with it
✓ For Acuity / Calendly: generate an API key from account settings
✓ For Jane App (clinics): use the Jane API (requires the clinic plan)
✓ Build two Vapi tool calls: GET /availability returns open slots, POST /appointments creates the booking
✓ Test with 5 real booking scenarios before going live
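The logic behind the two tool calls can be sketched independently of any calendar provider. The example below assumes fixed-length slots and an in-memory list of existing bookings. `open_slots` and `book` are hypothetical helpers standing in for whatever your GET /availability and POST /appointments endpoints do against the real calendar.

```python
from datetime import datetime, timedelta


def open_slots(day_start, day_end, booked, slot_minutes=30):
    """Return start times of free slots between day_start and day_end.

    booked is a list of (start, end) datetimes for existing appointments.
    """
    step = timedelta(minutes=slot_minutes)
    slots = []
    cursor = day_start
    while cursor + step <= day_end:
        slot_end = cursor + step
        # A slot is free when it overlaps no existing booking.
        if all(slot_end <= b_start or cursor >= b_end for b_start, b_end in booked):
            slots.append(cursor)
        cursor = slot_end
    return slots


def book(start, booked, slot_minutes=30):
    """Re-check the slot at booking time, then record it. Returns True on success."""
    end = start + timedelta(minutes=slot_minutes)
    if any(start < b_end and end > b_start for b_start, b_end in booked):
        return False  # someone took the slot since availability was read
    booked.append((start, end))
    return True


day = datetime(2025, 1, 6)
bookings = [(day.replace(hour=9, minute=30), day.replace(hour=10))]
free = open_slots(day.replace(hour=9), day.replace(hour=11), bookings)
print([slot.strftime("%H:%M") for slot in free])
```

Re-checking inside `book` matters because a caller may take a minute to choose a slot, and another booking can land in between. That is also the behaviour behind the hard stop "never confirm appointments it can't verify".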

Step 5 — SIP / Phone Number Setup

Connect the agent to a real phone number via Twilio or Vonage:

Buy a local number

Use Twilio or Vonage. Local numbers get higher pickup rates than toll-free. $1–2/month per number.

Configure call forwarding

Option A: Point the new number directly to Vapi (use as primary number). Option B: Forward your existing number to Vapi during business hours, ring to staff after hours.

Test voicemail detection

Vapi has built-in voicemail detection. Tune the `vmDetection` threshold — default is often too aggressive for mobile numbers with custom greetings.

Step 6 — Latency Tuning

Target: first word from agent under 800ms. Over 1s feels unnatural on phone calls.

Issue	Cause	Fix
Slow first response	LLM cold start	Use streaming + smaller model for opening line
Unnatural pauses mid-sentence	TTS chunking	Enable sentence-level streaming in ElevenLabs
Agent interrupts caller	Endpointing fires too early	Lengthen the end-of-speech wait: add a 300ms silence buffer and tune VAD sensitivity so brief pauses are not read as end of turn
Caller interrupts agent	No barge-in handling	Enable interruption in Vapi, shorten agent responses
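Before tuning anything, it helps to measure first-word latency per turn and compare it with the thresholds above. The sketch below assumes you can export, for each turn, the time the caller stopped speaking and the time the agent's first audio started, both in milliseconds. That input format is an assumption, not a documented export from any platform.

```python
import math
from statistics import median

TARGET_MS = 800      # target from this guide
UNNATURAL_MS = 1000  # above this, pauses feel unnatural on a phone call


def first_word_latencies(turns):
    """turns: list of (caller_stopped_ms, agent_first_audio_ms) pairs."""
    return [agent - caller for caller, agent in turns]


def summarize(latencies_ms):
    """Median, nearest-rank 95th percentile, and counts above each threshold."""
    ordered = sorted(latencies_ms)
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "median_ms": median(ordered),
        "p95_ms": ordered[p95_index],
        "over_target": sum(1 for v in ordered if v > TARGET_MS),
        "unnatural": sum(1 for v in ordered if v > UNNATURAL_MS),
    }


sample_turns = [(0, 400), (1000, 1700), (3000, 3900), (5000, 6200)]
report = summarize(first_word_latencies(sample_turns))
print(report)
```

Tracking the 95th percentile alongside the median is a design choice: a good average can hide the occasional slow turn that callers actually notice.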

Step 7 — CRM Write-Back

After every call, push structured data to your CRM via Vapi webhooks:

✓ call_ended webhook → parse transcript for intent, entities (name, phone, request type)
✓ Write contact + call summary to CRM (Salesforce, HubSpot, or your EHR)
✓ Flag calls where agent could not resolve for human review queue
✓ Log call duration, containment result, and transfer reason for weekly reporting
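The write-back steps above amount to turning one end-of-call payload into one CRM record. Here is a minimal sketch, assuming a simplified payload. The field names (`caller_name`, `caller_phone`, `transcript`, `duration_sec`, `transfer_reason`) are hypothetical and do not reflect Vapi's actual webhook schema, and the keyword matcher is a naive stand-in for real intent extraction.

```python
# Checked in order, so "reschedule" is matched before the broader "book" keywords.
INTENT_KEYWORDS = {
    "cancel": ("cancel",),
    "reschedule": ("reschedule", "move my"),
    "book": ("book", "appointment", "schedule"),
}


def classify_intent(transcript):
    """Naive keyword matcher; replace with an LLM or NLU step in production."""
    text = transcript.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return "other"


def to_crm_record(payload):
    """Map a (hypothetical) call_ended payload to a flat CRM record."""
    transferred = bool(payload.get("transfer_reason"))
    return {
        "name": payload.get("caller_name"),
        "phone": payload.get("caller_phone"),
        "intent": classify_intent(payload.get("transcript", "")),
        "duration_sec": payload.get("duration_sec", 0),
        "contained": not transferred,
        "transfer_reason": payload.get("transfer_reason"),
        "needs_review": transferred,  # unresolved calls go to the human review queue
    }


example_payload = {
    "caller_name": "Sam Lee",
    "caller_phone": "+15550100",
    "transcript": "Hi, I need to reschedule my cleaning to next week.",
    "duration_sec": 95,
    "transfer_reason": None,
}
print(to_crm_record(example_payload))
```

Map the real payload fields from your platform's webhook documentation before wiring this to Salesforce, HubSpot, or an EHR.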

2-Week Deployment Timeline

Phase	Days	What Happens
Architecture	1–3	Provider setup, BYOK keys, system prompt v1, tool call specs
Integration	4–8	Calendar API, CRM webhook, phone number, voicemail detection
Tuning	9–11	Latency profiling, interrupt handling, edge case prompting
Validation	12–14	Shadow mode alongside live calls, staff review, go-live decision

What to Measure in Week 1

Metric	Target	Detail
Containment Rate	55%+	calls resolved without transfer
Avg Handle Time	<3 min	vs 6–8 min with human
Transfer Rate	<30%	escalations to staff
CSAT	4.0+	post-call SMS survey
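These four numbers can be computed from the call log you write to the CRM. The sketch below assumes each logged call is a dict with `transferred`, `duration_sec`, and an optional `csat` score. Those field names are illustrative, and the targets are the ones listed above.

```python
def weekly_metrics(calls):
    """Compute containment, transfer rate, handle time, and CSAT from call logs."""
    total = len(calls)
    if total == 0:
        return None
    transferred = sum(1 for call in calls if call["transferred"])
    scores = [call["csat"] for call in calls if call.get("csat") is not None]
    return {
        "containment_rate": (total - transferred) / total,
        "transfer_rate": transferred / total,
        "avg_handle_min": sum(call["duration_sec"] for call in calls) / total / 60,
        "csat": sum(scores) / len(scores) if scores else None,
    }


def meets_targets(metrics):
    """Compare each metric with the week-1 targets from this guide."""
    return {
        "containment_rate": metrics["containment_rate"] >= 0.55,
        "transfer_rate": metrics["transfer_rate"] < 0.30,
        "avg_handle_min": metrics["avg_handle_min"] < 3,
        "csat": metrics["csat"] is not None and metrics["csat"] >= 4.0,
    }


sample_calls = [
    {"transferred": False, "duration_sec": 120, "csat": 5},
    {"transferred": False, "duration_sec": 180, "csat": None},
    {"transferred": True, "duration_sec": 60, "csat": 4},
    {"transferred": False, "duration_sec": 240},
]
metrics = weekly_metrics(sample_calls)
print(metrics)
print(meets_targets(metrics))
```

CSAT is averaged only over calls where a survey response came back, so report the response count next to the score.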

Common mistake: Going live on your main number on day one. Always run in shadow mode for at least 5 days first: the agent listens on real calls while staff still answer them. This catches edge cases before they reach real customers.
