CAMLIN SPEECH

Recognition, prompts, and customer input capture

Camlin Speech gives the platform its voice. It handles recognition, prompts, validation, and structured capture across Contact, Avatar, and every channel. When agents author an Interaction in Architect, the speech prompts are part of the draft, and each journey can use the right provider without rebuilding anything.
5 speech providers: choose the right one
8 ways to capture input: spoken to structured
16 business entity types: dates, numbers, IDs


Recognition + prompting: capture, validate, and respond in one service
Structured output: business fields and confidence checks
THE SPEECH STACK

Stop re-tuning STT per channel

Recognition, prompts, TTS, and structured capture live here so Contact, Voice, Avatar, and every channel use the same speech service — not separate integrations.

5 providers, one config

Google, Deepgram, Azure, AWS Transcribe, and Whisper — choose per journey, per channel, or per environment. Stop re-tuning STT every time the channel changes.

Per-journey provider choice
8 recognition modes
Confidence-based retries
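As a minimal sketch of what per-journey provider choice could look like, assuming a simple layered config (the names `SPEECH_CONFIG` and `pick_provider` are illustrative, not the actual Camlin configuration API):

```python
# Illustrative sketch: one config, resolved per journey and environment.
# The five provider names match the list above; everything else is hypothetical.
SPEECH_CONFIG = {
    "defaults": {"provider": "google"},
    "journeys": {
        "billing-dispute": {"provider": "deepgram"},
        "address-change": {"provider": "azure"},
    },
    "environments": {
        "dev": {"provider": "whisper"},  # e.g. local transcription in dev
    },
}

def pick_provider(journey: str, environment: str = "prod") -> str:
    """Resolve a provider: environment override > journey setting > default."""
    env = SPEECH_CONFIG["environments"].get(environment, {})
    if "provider" in env:
        return env["provider"]
    journey_cfg = SPEECH_CONFIG["journeys"].get(journey, {})
    return journey_cfg.get("provider", SPEECH_CONFIG["defaults"]["provider"])
```

The point of the layering is that switching a journey (or a whole environment) to a different provider is one config change, not a re-tuned integration.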

16 entity types for structured capture

Dates, phone numbers, account IDs, addresses, and 12 more business field types. Move from free-form recognition into validated data with guided retries.

Dates, numbers, IDs, names
Validation-aware capture
Reusable across journeys
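A sketch of validation-aware capture with guided retries might look like the following (the entity patterns, field names, and retry limit here are assumptions for illustration, not Camlin's actual rules):

```python
import re

# Illustrative sketch: turn free-form utterances into validated fields,
# reprompting a bounded number of times before giving up.
ENTITY_PATTERNS = {
    "phone_number": re.compile(r"^\+?\d{10,15}$"),
    "account_id": re.compile(r"^[A-Z]{2}\d{6}$"),
    "date": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
}

def capture_entity(entity_type: str, attempts: list, max_retries: int = 2):
    """Return the first attempt that validates, or None once retries run out."""
    pattern = ENTITY_PATTERNS[entity_type]
    for attempt_no, utterance in enumerate(attempts):
        cleaned = utterance.strip().replace(" ", "")
        if pattern.fullmatch(cleaned):
            return cleaned
        if attempt_no >= max_retries:
            break  # hand off rather than looping forever
    return None
```

Because the capture rule is data (a pattern plus a retry budget), the same rule can be reused across journeys rather than re-implemented per flow.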

TTS and prompt control

The platform generates spoken responses using the TTS voice, speed, and style configured for each journey. Different Interactions can use different voices.

Per-journey voice settings
Multiple TTS engines
Consistent across channels
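Per-journey voice settings could be modelled as defaults plus overrides, roughly like this (field names and journey IDs are hypothetical):

```python
from dataclasses import dataclass

# Illustrative sketch: platform-wide TTS defaults with per-journey overrides.
@dataclass
class TtsSettings:
    voice: str = "en-GB-standard"
    speed: float = 1.0
    style: str = "neutral"

JOURNEY_TTS = {
    "claims-intake": TtsSettings(voice="en-GB-warm", speed=0.95, style="empathetic"),
    "outage-status": TtsSettings(speed=1.1),  # brisker delivery for status updates
}

def tts_for(journey: str) -> TtsSettings:
    """Each Interaction gets its own voice; unknown journeys fall back to defaults."""
    return JOURNEY_TTS.get(journey, TtsSettings())
```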

Reusable across every channel

The same recognition, prompts, and capture rules work in Contact, Voice, Avatar lip-sync, visual IVR, and chat — avoiding duplicated logic.

Phone, web, avatar, SMS
One tuning point
Shared confidence rules
WHERE SPEECH SHOWS UP

One speech service, every channel

Speech powers phone calls, avatar lip-sync, visual IVR, and chat. The same recognition and prompt rules apply everywhere — one config point.

Contact + Voice

Recognition, TTS, and structured capture power every live call. 5 providers, 16 entity types, and confidence-based retries drive the phone experience.

See Contact + Voice

Architect

When agents draft Interactions, the speech prompts, recognition grammar, TTS settings, and validation rules are authored alongside the service design.

See Architect

Avatar + Visual IVR

The same speech service drives avatar voice, lip-sync timing, and visual IVR input — so recognition stays consistent across screen and phone.

See Avatar + Visual IVR
SPEECH PIPELINE

How speech flows through the platform

Follow spoken or typed input through recognition, validation, and response. Each step connects to a different speech capability.

Caller speaks or types

Audio or text input arrives from the channel — phone, web chat, avatar, or SMS.

The channel determines which speech provider and mode to use based on the Interaction design.
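The step above can be sketched as a routing lookup, assuming a simple channel table (the channel names, mode names, and table contents are illustrative, not the platform's actual routing):

```python
# Illustrative sketch: the channel decides how input enters the speech pipeline.
# Typed channels skip STT entirely; spoken channels get a provider and a mode.
CHANNEL_ROUTING = {
    "phone": {"provider": "google", "mode": "streaming_audio"},
    "avatar": {"provider": "deepgram", "mode": "streaming_audio"},
    "web_chat": {"provider": None, "mode": "text_passthrough"},
    "sms": {"provider": None, "mode": "text_passthrough"},
}

def route_input(channel: str) -> dict:
    """Look up the provider and recognition mode for an incoming channel."""
    if channel not in CHANNEL_ROUTING:
        raise ValueError(f"unknown channel: {channel}")
    return CHANNEL_ROUTING[channel]
```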

AUTHORED IN ARCHITECT

Agents author speech as part of the service

When agents draft an Interaction, the speech prompts, recognition grammar, TTS settings, and validation rules are part of the draft — not configured separately.
Prompts in context

Agents see the prompt wording, TTS voice, and timing alongside the Interaction it belongs to.

Recognition rules

Grammar, confidence thresholds, retry logic, and capture fields are defined per-Interaction, not globally.

Provider per journey

Different Interactions can use different speech providers — Google for one, Deepgram for another — without splitting the service.
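The per-Interaction confidence thresholds and retry logic mentioned above might reduce to something like this three-way decision (the threshold values and function name are assumptions for illustration):

```python
# Illustrative sketch: confidence-based handling of a recognition result.
# Thresholds would be authored per Interaction, not hard-coded globally.
def handle_result(transcript: str, confidence: float,
                  accept_at: float = 0.85, confirm_at: float = 0.6) -> str:
    """Accept high-confidence results, confirm mid-confidence ones, reprompt the rest."""
    if confidence >= accept_at:
        return "accept"
    if confidence >= confirm_at:
        return "confirm"   # e.g. "Did you say {transcript}?"
    return "reprompt"
```

Because the thresholds are parameters, one Interaction can confirm aggressively while another accepts more loosely, without touching shared code.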

HEAR IT LIVE

Hear live speech recognition

Walk through how Speech handles recognition, TTS, and structured capture across Contact, Voice, and Avatar.
See Contact in context