CAMLIN SPEECH

Recognition, prompts, and customer input capture

Camlin Speech gives the platform its voice. It handles recognition, prompts, validation, and structured capture across Contact, Avatar, and every channel. When agents author an Interaction in Architect, the speech prompts are part of the draft, and each journey can use the right provider without rebuilding anything.
5 speech providers: choose the right one
8 ways to capture input: spoken to structured
16 business entity types: dates, numbers, IDs


Recognition + prompting: capture, validate, and respond in one service
Structured output: business fields and confidence checks
THE SPEECH STACK

Stop re-tuning STT per channel

Recognition, prompts, TTS, and structured capture live here so Contact, Voice, Avatar, and every channel use the same speech service — not separate integrations.

5 providers, one config

Google, Deepgram, Azure, AWS Transcribe, and Whisper — choose per journey, per channel, or per environment. Stop re-tuning STT every time the channel changes.

Per-journey provider choice
8 recognition modes
Confidence-based retries
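As a minimal sketch of what per-journey provider choice could look like, assuming a simple layered config (the names `SPEECH_CONFIG` and `pick_provider` are illustrative, not the actual Camlin configuration API):

```python
# Illustrative sketch: one config, resolved per journey and environment.
# The five provider names match the list above; everything else is hypothetical.
SPEECH_CONFIG = {
    "defaults": {"provider": "google"},
    "journeys": {
        "billing-dispute": {"provider": "deepgram"},
        "address-change": {"provider": "azure"},
    },
    "environments": {
        "dev": {"provider": "whisper"},  # e.g. local transcription in dev
    },
}

def pick_provider(journey: str, environment: str = "prod") -> str:
    """Resolve a provider: environment override > journey setting > default."""
    env = SPEECH_CONFIG["environments"].get(environment, {})
    if "provider" in env:
        return env["provider"]
    journey_cfg = SPEECH_CONFIG["journeys"].get(journey, {})
    return journey_cfg.get("provider", SPEECH_CONFIG["defaults"]["provider"])
```

The point of the layering is that switching a journey (or a whole environment) to a different provider is one config change, not a re-tuned integration.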

16 entity types for structured capture

Dates, phone numbers, account IDs, addresses, and 12 more business field types. Move from free-form recognition into validated data with guided retries.

Dates, numbers, IDs, names
Validation-aware capture
Reusable across journeys
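A sketch of validation-aware capture with guided retries might look like the following (the entity patterns, field names, and retry limit here are assumptions for illustration, not Camlin's actual rules):

```python
import re

# Illustrative sketch: turn free-form utterances into validated fields,
# reprompting a bounded number of times before giving up.
ENTITY_PATTERNS = {
    "phone_number": re.compile(r"^\+?\d{10,15}$"),
    "account_id": re.compile(r"^[A-Z]{2}\d{6}$"),
    "date": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
}

def capture_entity(entity_type: str, attempts: list, max_retries: int = 2):
    """Return the first attempt that validates, or None once retries run out."""
    pattern = ENTITY_PATTERNS[entity_type]
    for attempt_no, utterance in enumerate(attempts):
        cleaned = utterance.strip().replace(" ", "")
        if pattern.fullmatch(cleaned):
            return cleaned
        if attempt_no >= max_retries:
            break  # hand off rather than looping forever
    return None
```

Because the capture rule is data (a pattern plus a retry budget), the same rule can be reused across journeys rather than re-implemented per flow.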

TTS and prompt control

The platform generates spoken responses using the TTS voice, speed, and style configured for each journey. Different Interactions can use different voices.

Per-journey voice settings
Multiple TTS engines
Consistent across channels
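Per-journey voice settings could be modelled as defaults plus overrides, roughly like this (field names and journey IDs are hypothetical):

```python
from dataclasses import dataclass

# Illustrative sketch: platform-wide TTS defaults with per-journey overrides.
@dataclass
class TtsSettings:
    voice: str = "en-GB-standard"
    speed: float = 1.0
    style: str = "neutral"

JOURNEY_TTS = {
    "claims-intake": TtsSettings(voice="en-GB-warm", speed=0.95, style="empathetic"),
    "outage-status": TtsSettings(speed=1.1),  # brisker delivery for status updates
}

def tts_for(journey: str) -> TtsSettings:
    """Each Interaction gets its own voice; unknown journeys fall back to defaults."""
    return JOURNEY_TTS.get(journey, TtsSettings())
```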

Reusable across every channel

The same recognition, prompts, and capture rules work in Contact, Voice, Avatar lip-sync, visual IVR, and chat — avoiding duplicated logic.

Phone, web, avatar, SMS
One tuning point
Shared confidence rules
WHERE SPEECH SHOWS UP

One speech service, every channel

Speech powers phone calls, avatar lip-sync, visual IVR, and chat. The same recognition and prompt rules apply everywhere — one config point.

Contact + Voice

Recognition, TTS, and structured capture power every live call. 5 providers, 16 entity types, and confidence-based retries drive the phone experience.

See Contact + Voice

Architect

When agents draft Interactions, the speech prompts, recognition grammar, TTS settings, and validation rules are authored alongside the service design.

See Architect

Avatar + Visual IVR

The same speech service drives avatar voice, lip-sync timing, and visual IVR input — so recognition stays consistent across screen and phone.

See Avatar + Visual IVR
SPEECH PIPELINE

How speech flows through the platform

Follow spoken or typed input through recognition, validation, and response. Each step connects to a different speech capability.

Caller speaks or types

Audio or text input arrives from the channel — phone, web chat, avatar, or SMS.

The channel determines which speech provider and mode to use based on the Interaction design.
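The step above can be sketched as a routing lookup, assuming a simple channel table (the channel names, mode names, and table contents are illustrative, not the platform's actual routing):

```python
# Illustrative sketch: the channel decides how input enters the speech pipeline.
# Typed channels skip STT entirely; spoken channels get a provider and a mode.
CHANNEL_ROUTING = {
    "phone": {"provider": "google", "mode": "streaming_audio"},
    "avatar": {"provider": "deepgram", "mode": "streaming_audio"},
    "web_chat": {"provider": None, "mode": "text_passthrough"},
    "sms": {"provider": None, "mode": "text_passthrough"},
}

def route_input(channel: str) -> dict:
    """Look up the provider and recognition mode for an incoming channel."""
    if channel not in CHANNEL_ROUTING:
        raise ValueError(f"unknown channel: {channel}")
    return CHANNEL_ROUTING[channel]
```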

AUTHORED IN ARCHITECT

Agents author speech as part of the service

When agents draft an Interaction, the speech prompts, recognition grammar, TTS settings, and validation rules are part of the draft — not configured separately.
Prompts in context

Agents see the prompt wording, TTS voice, and timing alongside the Interaction it belongs to.

Recognition rules

Grammar, confidence thresholds, retry logic, and capture fields are defined per-Interaction, not globally.

Provider per journey

Different Interactions can use different speech providers — Google for one, Deepgram for another — without splitting the service.
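The per-Interaction confidence thresholds and retry logic mentioned above might reduce to something like this three-way decision (the threshold values and function name are assumptions for illustration):

```python
# Illustrative sketch: confidence-based handling of a recognition result.
# Thresholds would be authored per Interaction, not hard-coded globally.
def handle_result(transcript: str, confidence: float,
                  accept_at: float = 0.85, confirm_at: float = 0.6) -> str:
    """Accept high-confidence results, confirm mid-confidence ones, reprompt the rest."""
    if confidence >= accept_at:
        return "accept"
    if confidence >= confirm_at:
        return "confirm"   # e.g. "Did you say {transcript}?"
    return "reprompt"
```

Because the thresholds are parameters, one Interaction can confirm aggressively while another accepts more loosely, without touching shared code.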

HEAR IT LIVE

Hear live speech recognition

Walk through how Speech handles recognition, TTS, and structured capture across Contact, Voice, and Avatar.
See Contact in context