LHP Telephony Project Report

⚙

System Architecture

Caller (PSTN) → Twilio Voice → Express API :3000 ↔ OpenAI (GPT-4o-mini / Whisper)

React SPA (Vite + Tailwind) ↔ Express API ↔ MySQL 8 :3306, MinIO / S3 (Recordings)

Browser Agent ↔ Twilio Voice SDK ↔ WebSocket /ws → Live Calls UI

★

Tech Stack

Frontend

React 18 + TypeScript

Vite (build & dev server)

Tailwind CSS + shadcn/ui

React Router v6

Twilio Voice SDK (browser)

React Query / date-fns / sonner

Backend

Node.js + Express

MySQL 8 (mysql2/promise)

Twilio REST API + TwiML

OpenAI (GPT-4o-mini + Whisper)

WebSocket (ws library)

AWS S3 SDK (MinIO compat)

Infrastructure

Docker Compose (4 services)

Nginx (frontend proxy)

MinIO (dev object storage)

Let's Encrypt / Certbot SSL

Domain: twml.lhpbo.com

Frontend: frontend-dev.lhpbo.com

✓

Completed Features

Feature	Description	Status
Authentication & Roles	JWT-based register/login, admin & agent roles, session restore via `/auth/me`, protected routes	Done
Outbound Calls	Browser-based dialing via Twilio Voice SDK. FloatingDialer keypad with country selector. OutboundCallModal with call states (connecting, ringing, in-call, ended). Auto call-log via `/twilio/call-status`	Done
AI Inbound Handling	Inbound calls on `+15046366772` answered by AI agent. OpenAI GPT-4o-mini generates responses. Multi-turn conversation via `/twilio/ai-respond` loop with speech recognition	Done
Business Hours & Voicemail	Per-day schedule with timezone support. Emergency closed toggle. Calls outside hours redirect to voicemail recording. Voicemail saved as call_log with type "voicemail"	Done
Dual-Phase Recording	Separate recordings for AI phase and human phase. AI recording started async after TwiML response. Human recording starts when agent takes over. Both stored in MinIO/S3	Done
Dual-Phase Transcription	Whisper STT + GPT-3.5 diarization for both phases. AI conversation also saved directly from chat history. Separate `transcript` and `transcript_human` columns	Done
Live Call Monitoring	WebSocket real-time updates. Live transcript feed with auto-scroll. "Listen In" (muted conference) and "Take Over" (unmuted, stops AI recording, starts human recording)	Done
Call Logs & Filters	Full CRUD. 7 filters: All, Inbound, Outbound, Voicemail, Failed, Busy/No Answer, Resolved. Count badges. Type column with icons. Recording availability indicator	Done
Call Detail Page	Dual transcript viewer (AI + Human with speaker bubbles). Dual audio player. Phase indicator banner for transferred calls. Metadata display	Done
Call Grading	Thumbs up/down grading on call detail page. Toggleable (click again to ungrade). Updates `grading_score` and `review_status`	Done
Training Feedback	Free-text correction notes per call. Saved to `correction_notes` column. Intended for AI persona improvement loop	Done
Contact Management	CRUD contacts with name + phone. Alphabetical grouping. Search. Last call info display. Click-to-call outbound	Done
Dashboard Stats	Total calls, inbound count, missed count, voicemail count. Recent 5 calls list	Done
Settings Page	AI greeting & system prompt config. Business hours UI with timezone picker (grouped by region, shows UTC offset). Emergency closed toggle. Local time preview for cross-region users	Done
Auto Webhook Config	`configureInboundNumber()` at startup auto-sets Twilio voice URL and status callback for inbound number	Done
Docker Full Stack	4-service compose: MySQL, MinIO, Backend, Frontend (Nginx). Auto-init DB schema from `init.sql`. Build-time env injection for frontend	Done
FloatingDialer — Recent Tab	Recent calls tab with layout, styling, and data display from call history	Done
FloatingDialer — Contacts Tab	Contacts tab with search and click-to-call integrated into FloatingDialer	Done
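
The business-hours routing described in the table (per-day schedule, timezone support, emergency closed toggle) can be sketched as a pure function. This is a minimal sketch, not the project's code: the shape of `business_hours` is not shown in this report, so the `BusinessHours` type and the `isOpenNow` name below are assumptions made for illustration.

```typescript
// Hypothetical shape for agent_settings.business_hours (the real JSON layout is not documented here).
type DayKey = "sun" | "mon" | "tue" | "wed" | "thu" | "fri" | "sat";
interface DayHours { open: string; close: string } // "HH:MM", 24-hour, in the business timezone
type BusinessHours = Partial<Record<DayKey, DayHours>>; // a missing day means closed all day

function isOpenNow(
  hours: BusinessHours,
  timezone: string,
  emergencyClosed: boolean,
  now: Date = new Date(),
): boolean {
  // The emergency toggle overrides the schedule entirely.
  if (emergencyClosed) return false;

  // Read the weekday and wall-clock time in the business timezone, not the server's.
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone: timezone,
    weekday: "short",
    hour: "2-digit",
    minute: "2-digit",
    hourCycle: "h23",
  }).formatToParts(now);
  const get = (type: string): string => parts.find((p) => p.type === type)?.value ?? "";

  const day = get("weekday").toLowerCase() as DayKey;
  const today = hours[day];
  if (!today) return false;

  // Zero-padded "HH:MM" strings compare correctly as plain strings.
  const current = `${get("hour")}:${get("minute")}`;
  return current >= today.open && current < today.close;
}
```

When this returns false, the inbound handler would redirect the call to the voicemail flow instead of the AI agent.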

✎

In Progress / Partial

Feature	Description	Status
Persona Feedback Loop	Training feedback (correction_notes) is stored per call. Next: aggregate grading + corrections into a persona summary. Auto-inject into AI system prompt for continuous improvement	Ongoing
Conference Controls	Hold, Transfer, and Keypad buttons exist in Live Calls UI but are currently disabled. Core listen & take-over work	Partial

⚠

Recent Bug Fixes (March 2026)

Fixed Recording Error 21220

Twilio "not eligible for recording": the recording request was awaited before the TwiML response was sent. Moved it to setImmediate with a 2 s delay so Twilio receives the TwiML first
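
The ordering behind this fix can be illustrated with a small sketch: answer the webhook first, then start the recording off the request path. The names `respondThenRecord`, `sendTwiml` and `startRecording` are hypothetical stand-ins for the real route code, and a plain `setTimeout` is used here as the timer, which may differ from the project's exact deferral mechanism.

```typescript
type Scheduler = (fn: () => void, delayMs: number) => unknown;

// sendTwiml and startRecording are hypothetical callbacks supplied by the route handler.
function respondThenRecord(
  sendTwiml: () => void,
  startRecording: () => Promise<void>,
  delayMs = 2000,
  schedule: Scheduler = (fn, ms) => setTimeout(fn, ms),
): void {
  // 1. Send the TwiML response so Twilio has instructions to execute.
  sendTwiml();

  // 2. Start the recording later, without blocking or failing the webhook.
  schedule(() => {
    startRecording().catch((err: unknown) => {
      console.error("recording start failed:", err);
    });
  }, delayMs);
}
```

The scheduler is injectable only so the ordering can be tested without waiting for a real timer.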

Fixed JSON Double-Parse

mysql2 auto-parses JSON columns into JS objects. Code called JSON.parse() on already-parsed objects → SyntaxError. Added safeParseJson() helper throughout
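
A helper along these lines would tolerate both raw JSON strings and values the driver has already parsed. This is a sketch of one possible shape; the signature of the project's actual `safeParseJson()` is not shown in this report, so the `fallback` parameter is an assumption.

```typescript
// Sketch only: the real helper's signature may differ.
function safeParseJson<T>(value: unknown, fallback: T): T {
  if (value === null || value === undefined) return fallback;

  // Already parsed by the driver: pass objects and arrays through untouched.
  if (typeof value !== "string") return value as T;

  try {
    return JSON.parse(value) as T;
  } catch {
    // Malformed JSON text: fall back instead of throwing.
    return fallback;
  }
}
```

Calling `JSON.parse()` on an object first coerces it to a string such as `"[object Object]"`, which is why the original code threw a SyntaxError.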

Fixed transcript_turns=0

Consequence of the JSON double-parse bug — conversation history silently returned []. Fixed by safeParseJson on all conversation_history reads

Fixed Missing Inbound Logs

inbound-status silently bailed when active_calls row was missing. Added fallback from Twilio params + pre-create call_log at call start

Fixed Recording Race Condition

recording-status fired before call_logs row existed. Fixed: cache in active_calls first, retry UPDATE up to 8 times with 3s delay
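
The retry part of this fix can be sketched as a generic helper. The defaults of 8 attempts and 3 s come from the description above; the name `retryUntil` and the convention that `attempt` resolves to `true` once the UPDATE affected a row are assumptions for illustration.

```typescript
// Sketch: retry an operation until it reports success or attempts run out.
async function retryUntil(
  attempt: () => Promise<boolean>,
  maxAttempts = 8,
  delayMs = 3000,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<boolean> {
  for (let i = 1; i <= maxAttempts; i++) {
    // e.g. run the UPDATE and return true when at least one row was affected
    if (await attempt()) return true;
    // Wait between attempts, but not after the final one.
    if (i < maxAttempts) await sleep(delayMs);
  }
  return false;
}
```

If every attempt fails, the caller can still rely on the copy cached in `active_calls`, as the fix describes.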

Fixed AI Transcript Skip Logic

hasAiTranscript compared parsed array to string "[]" — always truthy. Fixed with safeParseJson + array.length > 0 check
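
The corrected check might look like the sketch below. It is written to be self-contained, so it inlines the string-or-parsed handling rather than calling the project's `safeParseJson()`; the function body is an illustration, not the actual implementation.

```typescript
// Sketch: true only when the stored history contains at least one turn.
function hasAiTranscript(conversationHistory: unknown): boolean {
  let turns: unknown = conversationHistory;

  // Accept either a JSON string or an already-parsed value.
  if (typeof turns === "string") {
    try {
      turns = JSON.parse(turns);
    } catch {
      return false;
    }
  }

  // The buggy version compared a parsed array with the string "[]",
  // and an array is never strictly equal to a string.
  return Array.isArray(turns) && turns.length > 0;
}
```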

★

Persona & Feedback System (Design)

How the AI Persona Improves Over Time

Every inbound call generates grading data (good / needs improvement) and optional correction notes from the reviewer. This feedback is stored per-call in call_logs.grading_score and call_logs.correction_notes. The system will aggregate this data into a living Persona Summary that feeds back into the AI agent's system prompt.

1. Call Happens: AI handles inbound call. Transcript & recording saved

2. Review & Grade: Admin reviews call detail. Gives thumbs up/down + correction text

3. Store Feedback: grading_score + correction_notes saved to DB per call

4. Summarize: Periodic aggregation: collect all corrections → summarize patterns → update persona

5. Inject to Persona: Summary appended to agent_settings.system_prompt. AI behavior improves on next call
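
Steps 4 and 5 are not built yet, so the following is only a sketch of how they could fit together. The `ReviewedCall` type, the `"good"` / `"needs_improvement"` values, and the "Learned rules" marker are all assumptions. The summarization itself (turning raw notes into rules) is left to a reviewer or a separate model call and is not shown.

```typescript
interface ReviewedCall {
  gradingScore: "good" | "needs_improvement" | null; // hypothetical stored values
  correctionNotes: string | null;
}

// Step 4 input: gather the non-empty notes from calls graded as needing improvement.
function collectCorrections(calls: ReviewedCall[]): string[] {
  return calls
    .filter((c) => c.gradingScore === "needs_improvement")
    .map((c) => c.correctionNotes?.trim() ?? "")
    .filter((note) => note.length > 0);
}

// Hypothetical marker separating the hand-written persona from learned rules.
const RULES_MARKER = "\n\n## Learned rules\n";

// Step 5: attach curated rules to the base prompt, replacing any earlier rules block
// so repeated updates do not grow the prompt without bound.
function injectPersonaSummary(systemPrompt: string, rules: string[]): string {
  const idx = systemPrompt.indexOf(RULES_MARKER);
  const base = idx === -1 ? systemPrompt : systemPrompt.slice(0, idx);
  if (rules.length === 0) return base;
  return base + RULES_MARKER + rules.map((r) => `- ${r}`).join("\n");
}
```

Replacing the rules block on each update, rather than appending indefinitely, is a design choice made in this sketch to keep the prompt small.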

Component	Status	Details
Call Grading UI	Done	Thumbs up/down on CallDetailPage, toggleable, saves to DB
Correction Notes UI	Done	Textarea on CallDetailPage, saves to `correction_notes`
Data Storage	Done	`call_logs.grading_score` and `call_logs.correction_notes` columns exist
Feedback Dashboard	Planned	Visual report: % good vs needs improvement, common correction themes, persona version history

⇄

API Endpoints (30+)

Method	Endpoint	Purpose	Auth
POST	/auth/register	Create user account	—
POST	/auth/login	Login → JWT	—
GET	/auth/me	Current user profile	✓
POST	/twilio/token	Generate Voice SDK access token	✓
POST	/twilio/voice	TwiML webhook (outbound + conference join)	—
POST	/twilio/call-status	Outbound call completion callback	—
POST	/twilio/inbound	AI answers inbound call (business hours check)	—
POST	/twilio/ai-respond	AI processes speech & responds (loop)	—
POST	/twilio/inbound-status	Inbound call ended → finalize call_log	—
POST	/twilio/voicemail	Voicemail TwiML prompt	—
POST	/twilio/voicemail-done	Voicemail recording complete	—
POST	/twilio/conference-status	Conference lifecycle events	—
POST	/twilio/recording-status	Recording ready → download, store, transcribe	—
POST	/twilio/transcription-status	Twilio transcription completion	—
GET	/twilio/recording/:sid	Serve recording audio (MinIO/Twilio proxy)	✓
POST	/twilio/transcribe/:id	Retroactive Whisper+GPT transcription	✓
GET	/active-calls	List live AI-handled calls	✓
POST	/active-calls/:callSid/monitor	Listen in or take over call	✓
POST	/active-calls/:callSid/agent-response	Agent accept/reject transfer	✓
GET	/active-users	Online user list	✓
GET	/call-logs	List all call logs	✓
GET	/call-logs/:id	Single call detail	✓
POST	/call-logs	Create call log	✓
PUT	/call-logs/:id	Update call log (grade, feedback)	✓
DELETE	/call-logs/:id	Delete call log	✓
GET	/agent-settings	Get AI agent config	✓
POST	/agent-settings	Create agent settings	✓
PUT	/agent-settings/:id	Update agent settings	✓
GET	/dashboard/stats	Call stats & recent activity	✓
GET	/contacts	List contacts with last call info	✓
POST	/contacts	Create contact	✓
DELETE	/contacts/:id	Delete contact	✓
GET	/health	Health check	—
WS	/ws	Real-time admin updates (call events, transcripts)	msg
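
As an illustration of how a client might use the authenticated endpoints, here is a small sketch built around an injectable fetch function. The request and response shapes are not documented in this report, so the `{ token }` login response, the `Authorization: Bearer` header, and the `createApiClient` name are assumptions.

```typescript
// Minimal fetch signature so the sketch does not depend on DOM or Node typings.
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

function createApiClient(baseUrl: string, fetchImpl: FetchLike) {
  let token: string | null = null;

  async function request(method: string, path: string, body?: unknown): Promise<unknown> {
    const headers: Record<string, string> = { "Content-Type": "application/json" };
    if (token) headers["Authorization"] = `Bearer ${token}`; // assumed auth scheme
    const res = await fetchImpl(baseUrl + path, {
      method,
      headers,
      body: body === undefined ? undefined : JSON.stringify(body),
    });
    if (!res.ok) throw new Error(`${method} ${path} failed with status ${res.status}`);
    return res.json();
  }

  return {
    async login(email: string, password: string): Promise<void> {
      // Assumes the login response body looks like { token: "..." }.
      const data = (await request("POST", "/auth/login", { email, password })) as { token?: string };
      if (!data.token) throw new Error("login response had no token");
      token = data.token;
    },
    me: (): Promise<unknown> => request("GET", "/auth/me"),
    callLogs: (): Promise<unknown> => request("GET", "/call-logs"),
  };
}
```

In the browser, `baseUrl` would come from `VITE_API_URL`, and the global `fetch` should fit the `FetchLike` parameter.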

▣

Database Schema

users

id, email (unique), password_hash

full_name, role (admin/agent)

created_at, updated_at

call_logs

id, call_sid (unique)

caller_name, caller_number, call_type

disposition, disposition_label, duration

transcript (JSON), recording_url

transcript_human (JSON), recording_url_human

grading_score, correction_notes

review_status, created_at

active_calls

id, call_sid (unique)

caller_number, caller_name, status

conversation_history (JSON)

recording_sid, recording_sid_human

conference_name, started_at, recording_url

agent_settings

id, timezone

system_prompt, greeting_message

business_hours (JSON)

is_emergency_closed, voice_id

contacts

id, name, phone (unique)

created_at

profiles

id, user_id (unique)

full_name, avatar_url

➤

Planned / Next Up

Feature	Description	Priority
Conference Hold & Transfer	Implement Hold and Transfer buttons in Live Calls intervention panel	Medium

⚙

Environment & Config

Variable	Purpose	Scope
VITE_API_URL	Backend URL baked into frontend JS at build time	Frontend (build)
CORS_ORIGIN	Allowed origin for CORS	Backend
JWT_SECRET	JWT signing key	Backend
TWILIO_ACCOUNT_SID	Twilio account identifier	Backend
TWILIO_API_KEY / SECRET	Twilio API credentials	Backend
TWILIO_TWIML_APP_SID	TwiML App for Voice URL routing	Backend
TWILIO_PHONE_NUMBER	Outbound caller ID (+12253503828)	Backend
TWILIO_INBOUND_PHONE_NUMBER	AI inbound number (+15046366772)	Backend
OPENAI_API_KEY	OpenAI for GPT responses + Whisper transcription	Backend
MINIO_*	Object storage credentials & bucket	Backend
DB_HOST / DB_USER / DB_PASSWORD	MySQL connection	Backend
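
A startup check such as the sketch below can fail fast when a backend variable is missing. Only names spelled out in full in the table are listed. The Twilio API key/secret and `MINIO_*` variables are omitted because their exact names are not given here. `loadConfig` is a hypothetical helper, not part of the project.

```typescript
// Backend variables whose full names appear in the table above.
const REQUIRED_VARS = [
  "CORS_ORIGIN",
  "JWT_SECRET",
  "TWILIO_ACCOUNT_SID",
  "TWILIO_TWIML_APP_SID",
  "TWILIO_PHONE_NUMBER",
  "TWILIO_INBOUND_PHONE_NUMBER",
  "OPENAI_API_KEY",
  "DB_HOST",
  "DB_USER",
  "DB_PASSWORD",
] as const;

type RequiredVar = (typeof REQUIRED_VARS)[number];

// Takes the environment as a plain record (e.g. process.env) so it is easy to test.
function loadConfig(env: Record<string, string | undefined>): Record<RequiredVar, string> {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const config = {} as Record<RequiredVar, string>;
  for (const name of REQUIRED_VARS) {
    config[name] = env[name] as string;
  }
  return config;
}
```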

$

Deployment Cost Estimation (Monthly)

Staging Environment

Item	Details	Monthly Cost
EC2 (App Server)	t3.medium · 2 vCPU · 4 GB RAM · us-east-1	~$30.37
RDS MySQL	db.t3.small · 2 vCPU · 2 GB · 20 GB gp3 storage	~$24.82
S3 Storage	~5 GB recordings · Standard tier	~$0.12
EBS Volume	30 GB gp3 for EC2	~$2.40
Data Transfer	~10 GB outbound (light staging use)	~$0.90
Elastic IP	1 static IP (attached to instance)	$0.00
AWS Subtotal (Staging)		~$58.61/mo

Production Environment

Item	Details	Monthly Cost
EC2 (App Server)	t3.medium · 2 vCPU · 4 GB RAM · us-east-1	~$30.37
RDS MySQL	db.t3.small · 2 vCPU · 2 GB · 50 GB gp3 · Multi-AZ	~$49.64
S3 Storage	~50 GB recordings · Standard tier	~$1.15
EBS Volume	30 GB gp3 for EC2	~$2.40
Data Transfer	~50 GB outbound (recordings + API)	~$4.50
CloudWatch Logs	5 GB ingestion + basic monitoring	~$2.50
AWS Subtotal (Production)		~$90.56/mo

Third-Party Services (Shared Across Environments, Estimated)

Item	Details	Monthly Cost
Twilio (Phone Numbers)	2 numbers · +12253503828 (outbound) + +15046366772 (inbound)	~$2.00
Twilio (Voice Minutes)	Estimated ~500 min inbound + 200 min outbound · $0.0085–$0.014/min	~$8.50
Twilio (Recording Storage)	~500 recordings · $0.0025/min stored	~$1.75
OpenAI (GPT-4o-mini)	AI responses · ~500 calls × ~8 turns avg · $0.15/$0.60 per 1M tokens	~$2.50
OpenAI (Whisper)	Transcription · ~500 calls × avg 3 min · $0.006/min	~$9.00
OpenAI (GPT-3.5 Turbo)	Diarization post-processing · ~500 transcripts	~$0.75
Domain & SSL	lhpbo.com · Let's Encrypt (free SSL)	~$1.00
Services Subtotal		~$25.50/mo

Summary	Monthly Cost
AWS Staging	~$58.61
AWS Production	~$90.56
Third-Party Services	~$25.50
Total (Staging + Prod + Services)	~$174.67

Note — OpenAI Budget: Current OpenAI API key balance is $20.00. At estimated usage (~500 calls/mo), OpenAI costs run ~$12.25/mo (GPT-4o-mini + Whisper + GPT-3.5 diarization). The $20 balance covers approximately 1.5 months of operation at this volume. Monitor usage via platform.openai.com/usage and top up before depletion to avoid transcription/AI interruption.
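
The totals in this section can be re-derived from the line items. The short script below is only a check of the arithmetic using the figures quoted in this report. It makes no claim about actual AWS, Twilio or OpenAI pricing.

```typescript
const sum = (values: number[]): number => values.reduce((a, b) => a + b, 0);
const usd = (value: number): string => value.toFixed(2);

// Line items as listed in this report (USD per month).
const staging = sum([30.37, 24.82, 0.12, 2.4, 0.9, 0.0]);
const production = sum([30.37, 49.64, 1.15, 2.4, 4.5, 2.5]);
const services = sum([2.0, 8.5, 1.75, 2.5, 9.0, 0.75, 1.0]);
const total = staging + production + services;

// OpenAI share: GPT-4o-mini + Whisper + GPT-3.5 diarization.
const openAiMonthly = sum([2.5, 9.0, 0.75]);
const whisperMonthly = 500 * 3 * 0.006; // calls × minutes per call × price per minute
const runwayMonths = 20 / openAiMonthly; // months covered by the $20 balance

console.log(usd(staging));            // 58.61
console.log(usd(production));         // 90.56
console.log(usd(services));           // 25.50
console.log(usd(total));              // 174.67
console.log(usd(openAiMonthly));      // 12.25
console.log(usd(whisperMonthly));     // 9.00
console.log(runwayMonths.toFixed(1)); // 1.6
```

The runway works out to about 1.6 months, in line with the roughly 1.5 months quoted in the note.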

◈

Key Discussion Points

★ AI Training Strategy

Persona is the main source of AI behavior — the system prompt in agent_settings defines how the AI agent communicates
Feedback from call reviews is collected and stored per call (grading_score + correction_notes)
Feedback is summarized periodically, not directly injected into the persona
This prevents noise and ensures only validated, pattern-based improvements reach the AI

⚙ RAG (Future Approach)

Vector database (e.g. Pinecone, pgvector) for scalable knowledge retrieval
Enables AI to reference company policies, FAQs, product details without cramming everything into the system prompt
Not required for current scope — the persona-based approach handles present conversation complexity well
Will become necessary when knowledge base exceeds prompt context limits or requires dynamic updates

✎ Persona Management

Persona updated periodically (not in real-time) to maintain stability
Avoid too many inputs to prevent hallucination — a bloated prompt leads to confused, inconsistent responses
Each update should be a curated summary of the most impactful feedback patterns
Keep persona focused: core role, tone, key rules, known edge cases — not a dump of every correction ever received
Version the persona so rollbacks are possible if quality degrades
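
Versioning with rollback could be as simple as an append-only history, sketched below in memory. The `PersonaHistory` class and its fields are hypothetical. A real implementation would presumably persist versions in a database table, which this report does not yet define.

```typescript
interface PersonaVersion {
  version: number;
  systemPrompt: string;
  createdAt: string; // ISO timestamp
}

// In-memory sketch of an append-only persona history.
class PersonaHistory {
  private versions: PersonaVersion[] = [];

  // Every update creates a new version; older ones are never modified.
  publish(systemPrompt: string, now: Date = new Date()): PersonaVersion {
    const next: PersonaVersion = {
      version: this.versions.length + 1,
      systemPrompt,
      createdAt: now.toISOString(),
    };
    this.versions.push(next);
    return next;
  }

  current(): PersonaVersion | undefined {
    return this.versions[this.versions.length - 1];
  }

  // Roll back by re-publishing an older prompt, so the audit trail stays intact.
  rollbackTo(version: number): PersonaVersion {
    const target = this.versions.find((v) => v.version === version);
    if (!target) throw new Error(`Unknown persona version ${version}`);
    return this.publish(target.systemPrompt);
  }
}
```

Re-publishing the old prompt as a new version, instead of deleting newer ones, keeps a record of what was tried and when quality degraded.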

✓ Feedback & Grading

Every call can be graded: "Good" or "Needs Improvement"
Correction notes provide specific textual feedback on what the AI did wrong and how it should respond instead
Feedback is stored and reviewed before any persona update — human-in-the-loop ensures quality
Grading data enables evaluation metrics: % good over time, improvement trends, problem areas
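
The "% good over time" metric mentioned above could be computed per month as in this sketch. The `GradedCall` shape and the `"good"` / `"needs_improvement"` values are assumptions, since the stored format of `grading_score` is not specified in this report.

```typescript
interface GradedCall {
  createdAt: string; // ISO timestamp, e.g. "2026-03-14T10:00:00Z"
  gradingScore: "good" | "needs_improvement" | null; // hypothetical stored values
}

// Returns the percentage of graded calls marked "good", keyed by "YYYY-MM".
function percentGoodByMonth(calls: GradedCall[]): Record<string, number> {
  const tally: Record<string, { good: number; graded: number }> = {};

  for (const call of calls) {
    if (call.gradingScore === null) continue; // ungraded calls do not count
    const month = call.createdAt.slice(0, 7);
    const entry = tally[month] ?? { good: 0, graded: 0 };
    entry.graded += 1;
    if (call.gradingScore === "good") entry.good += 1;
    tally[month] = entry;
  }

  const result: Record<string, number> = {};
  for (const month of Object.keys(tally)) {
    const { good, graded } = tally[month];
    result[month] = Math.round((good / graded) * 100);
  }
  return result;
}
```

Comparing these monthly values before and after a persona update gives a rough signal of whether the update helped.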

⚡ AI Capabilities — Current vs Future

Current

Basic conversation — multi-turn speech-to-speech via GPT-4o-mini
Greeting, Q&A, and escalation to human agent
Business hours awareness (voicemail if closed)
Call recording & transcription (AI + human phases)

Future

Scheduling — AI books appointments directly via calendar API
Contact creation — AI adds new contacts from call context
Call summary generation — auto-generate post-call summaries for CRM
API integrations — connect to external systems (CRM, calendar, ticketing)
RAG knowledge retrieval — vector DB for dynamic company knowledge

✓

Decisions Made

✓ Use persona-based approach as the primary AI training method. The system prompt in agent_settings is the single source of truth for AI behavior. This keeps things simple, controllable, and sufficient for current call volume and complexity.

✓ Do NOT inject raw feedback directly into the persona. Raw correction notes are often context-specific, contradictory, or too granular. Injecting them verbatim would bloat the prompt and cause hallucination or inconsistent behavior.

✓ Use summarized feedback for persona updates. A periodic review process: collect all new corrections → identify recurring patterns → distill into concise behavioral rules → update persona. This ensures only high-signal, validated improvements reach the AI.

✓ RAG is deferred to a future phase. The current persona-based approach handles present needs. RAG (vector database for knowledge retrieval) will be implemented when the knowledge base grows beyond what fits in a system prompt or when dynamic, frequently-updated content is needed.

✓ Dual-environment AWS deployment: staging (t3.medium + db.t3.small) for testing, production (t3.medium + db.t3.small Multi-AZ) for live traffic. Both run Docker Compose with the same stack to maintain parity.

✓ OpenAI $20 budget is sufficient for initial launch (~1.5 months at 500 calls/mo). Monitor usage proactively and allocate top-up budget before depletion to avoid service interruption.