Prerequisites
1. Plivo Account
Sign up for Plivo and get your credentials:

| Credential | Where to Find |
|---|---|
| Auth ID | Plivo Console |
| Auth Token | Plivo Console |
2. Phone Number
You need a voice-enabled Plivo number to make or receive calls.

| Call Type | Number Requirement |
|---|---|
| Inbound | Callers dial your Plivo number, which triggers your Answer URL and starts the stream |
| Outbound | Your Plivo number is the Caller ID when making calls via API |

- Go to Phone Numbers > Buy Numbers
- Select country and type (local, toll-free, mobile)
- Filter by `voice_enabled = true`
- Purchase
India Numbers (Additional Requirements)
Indian phone numbers require KYC compliance:

| Requirement | Details |
|---|---|
| Account currency | Must be INR |
| KYC documents | Certificate of Incorporation (COI) + GST Certificate |
| Business registration | India-registered businesses only |

Submit compliance at Compliance Application before purchasing. See Rent India Numbers for details.
3. WebSocket Server
Your server must:

- Accept WebSocket connections over `wss://`
- Be publicly accessible (use ngrok for local development)
- Handle Plivo’s stream events (start, media, dtmf, stop)
4. AI Service Credentials (Optional)
For voice AI applications, you’ll typically need:

- Speech-to-Text: Deepgram, Google Speech, AWS Transcribe
- LLM: OpenAI, Anthropic, Google Gemini
- Text-to-Speech: ElevenLabs, Google TTS, Amazon Polly
How It Works
Plivo streams real-time audio between phone calls and your WebSocket server.

Architecture
Step-by-Step Flow
- Call Initiation: A caller dials your Plivo number, or your application initiates an outbound call.
- Answer URL Request: Plivo makes an HTTP request to your configured Answer URL.
- Stream XML Response: Your server responds with XML containing the `<Stream>` element, specifying the WebSocket URL and streaming parameters.
- WebSocket Connection: Plivo establishes a WebSocket connection to your specified URL.
- Start Event: Plivo sends a `start` event containing call metadata (call ID, stream ID, media format).
- Media Streaming:
  - Inbound: Plivo continuously sends `media` events containing base64-encoded audio chunks from the caller.
  - Outbound: Your server sends `playAudio` events with base64-encoded audio to be played to the caller.
- DTMF Events: When the caller presses keys, Plivo sends `dtmf` events with the digit information.
- Control Events: Your server can send `clearAudio` to interrupt playback or `checkpoint` to track playback progress.
- Connection Close: When the call ends or streaming stops, the WebSocket connection closes.
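
The receiving side of this flow can be sketched as a small dispatcher that routes each JSON message by its event name. This is a minimal illustration, not Plivo's implementation: the nested keys (`start.streamId`, `media.payload`, `dtmf.digit`) are assumptions about the message layout, and `handle_message` and `state` are hypothetical names.

```python
import base64
import json


def handle_message(raw: str, state: dict) -> None:
    """Route one JSON message from Plivo by its event type."""
    msg = json.loads(raw)
    event = msg.get("event")
    if event == "start":
        # Assumed layout: call metadata nested under a "start" key.
        state["stream_id"] = msg.get("start", {}).get("streamId")
    elif event == "media":
        # Assumed layout: base64 audio under media.payload.
        payload = msg.get("media", {}).get("payload", "")
        state.setdefault("audio", bytearray()).extend(base64.b64decode(payload))
    elif event == "dtmf":
        # Assumed layout: the pressed key under dtmf.digit.
        state.setdefault("digits", []).append(msg.get("dtmf", {}).get("digit"))
    # Any other event is ignored in this sketch.
```

In a real server, this function would be called once per message received on the WebSocket connection, with one `state` dict per call.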
Stream XML
The `<Stream>` XML element initiates audio streaming for a call. Include it in your Answer URL response.
Basic Syntax
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `bidirectional` | boolean | `false` | Enable two-way audio streaming. When `true`, you can send audio back to the caller. |
| `keepCallAlive` | boolean | `false` | Keep the call active after the stream ends. When `false`, the call ends when streaming stops. |
| `contentType` | string | `audio/x-mulaw;rate=8000` | Audio codec and sample rate. See Supported Content Types. |
| `statusCallbackUrl` | string | none | URL for stream status callbacks (started, stopped, failed). |
| `statusCallbackMethod` | string | `POST` | HTTP method for status callbacks (GET or POST). |
| `extraHeaders` | string | none | Custom headers to include in the start event. Format: `key1=value1;key2=value2` |
Supported Content Types
| Content Type | Description | Use Case |
|---|---|---|
| `audio/x-mulaw;rate=8000` | mu-law codec at 8kHz | Recommended. Standard telephony, lowest latency, best compatibility. |
| `audio/x-l16;rate=8000` | Linear PCM 16-bit at 8kHz | Higher quality for speech processing. |
| `audio/x-l16;rate=16000` | Linear PCM 16-bit at 16kHz | High-quality speech recognition. |
Examples
Bidirectional Stream with mu-law Codec
Stream with Status Callbacks and Extra Headers
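
As a rough sketch, both responses named above can be generated from Python with the standard library. It assumes the `<Stream>` element sits inside a `<Response>` root and carries the WebSocket URL as its text, which this section does not spell out; the URLs, header values, and the `build_stream_xml` helper are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_stream_xml(ws_url: str, **attrs: str) -> str:
    """Return an Answer URL response body containing one <Stream> element."""
    response = ET.Element("Response")  # assumed root element
    stream = ET.SubElement(response, "Stream", attrs)
    stream.text = ws_url  # assumed: the WebSocket URL is the element text
    return ET.tostring(response, encoding="unicode")


# Bidirectional stream with the mu-law codec.
bidirectional = build_stream_xml(
    "wss://example.com/stream",  # hypothetical URL
    bidirectional="true",
    contentType="audio/x-mulaw;rate=8000",
)

# Stream with status callbacks and extra headers.
with_callbacks = build_stream_xml(
    "wss://example.com/stream",
    statusCallbackUrl="https://example.com/stream-status",
    statusCallbackMethod="POST",
    extraHeaders="userId=12345;language=en",
)
```

Building the XML with `ElementTree` rather than string formatting keeps attribute values correctly escaped.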
Stream APIs
Control active streams programmatically via REST API calls.

Base URL
Authentication
Use HTTP Basic Authentication with your Plivo Auth ID and Auth Token.

Stop a Stream
Endpoint: `DELETE /v1/Account/{auth_id}/Call/{call_uuid}/Stream/`
Get Stream Details
Endpoint: `GET /v1/Account/{auth_id}/Call/{call_uuid}/Stream/`
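
A minimal sketch of preparing these two calls with Python's standard library follows. The base URL `https://api.plivo.com` is an assumption (this section does not state it), the credentials and call UUID are placeholders, and `stream_request` is a hypothetical helper. The request is built but not sent.

```python
import base64
import urllib.request

BASE_URL = "https://api.plivo.com"  # assumption: not stated in this section


def stream_request(method: str, auth_id: str, auth_token: str,
                   call_uuid: str) -> urllib.request.Request:
    """Build (but do not send) a request against a call's Stream resource."""
    url = f"{BASE_URL}/v1/Account/{auth_id}/Call/{call_uuid}/Stream/"
    # HTTP Basic Authentication: base64 of "auth_id:auth_token".
    credentials = base64.b64encode(f"{auth_id}:{auth_token}".encode()).decode()
    return urllib.request.Request(
        url,
        method=method,
        headers={"Authorization": f"Basic {credentials}"},
    )


# Placeholder credentials and call UUID.
stop = stream_request("DELETE", "MAXXXXXXXX", "token", "call-uuid-123")
details = stream_request("GET", "MAXXXXXXXX", "token", "call-uuid-123")
```

Passing either object to `urllib.request.urlopen` would perform the call.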
Using the Plivo SDK
Node.js
Python
Stream Status Callbacks
Configure a callback URL to receive notifications about stream lifecycle events.

Configuration
Callback Parameters
| Parameter | Type | Description |
|---|---|---|
| `CallUUID` | string | The unique identifier for the call |
| `StreamID` | string | The unique identifier for the stream |
| `Event` | string | The event type: started, stopped, failed |
| `Timestamp` | string | ISO 8601 timestamp of the event |
| `From` | string | The caller’s phone number |
| `To` | string | The called phone number |
| `Direction` | string | Call direction: inbound or outbound |
| `StatusReason` | string | Reason for status (on stopped or failed) |
| `Duration` | number | Stream duration in seconds (on stopped) |
Example Handler
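
A framework-agnostic sketch of the parsing step is shown here. It assumes the callback arrives as a form-encoded POST body, which this section does not confirm; the sample values and the `parse_stream_callback` helper are hypothetical.

```python
from urllib.parse import parse_qs


def parse_stream_callback(body: str) -> dict:
    """Flatten a form-encoded callback body into a dict of single values."""
    fields = {key: values[0] for key, values in parse_qs(body).items()}
    if "Duration" in fields:
        # Duration is documented as a number of seconds.
        fields["Duration"] = float(fields["Duration"])
    return fields


# Sample body with made-up values.
event = parse_stream_callback(
    "CallUUID=abc&StreamID=s1&Event=stopped&StatusReason=call_ended&Duration=42"
)
if event["Event"] == "failed":
    print("stream failed:", event.get("StatusReason"))
```

The returned dict can then be logged or used to clean up per-call state when `Event` is `stopped` or `failed`.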
Signature Validation
Plivo signs WebSocket connection requests so that you can verify their authenticity. Validate these signatures to ensure requests originate from Plivo.

V3 Signature Headers
| Header | Description |
|---|---|
| `X-Plivo-Signature-V3` | The HMAC-SHA256 signature |
| `X-Plivo-Signature-V3-Nonce` | A unique nonce for this request |
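
To illustrate the general idea of an HMAC-SHA256 check, here is a hedged sketch using only the standard library. The exact string Plivo signs and the encoding of the signature are not described in this section: the `"<url>.<nonce>"` layout and base64 encoding below are assumptions, so prefer the SDK's own validation in production. `sign` and `is_valid` are hypothetical names.

```python
import base64
import hashlib
import hmac


def sign(auth_token: str, url: str, nonce: str) -> str:
    """Compute a base64 HMAC-SHA256 signature keyed by the Auth Token."""
    # Assumed message layout: "<url>.<nonce>"; verify against the SDK.
    message = f"{url}.{nonce}".encode()
    digest = hmac.new(auth_token.encode(), message, hashlib.sha256).digest()
    return base64.b64encode(digest).decode()


def is_valid(auth_token: str, url: str, nonce: str, signature: str) -> bool:
    """Compare the expected signature with the one from the request header."""
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(sign(auth_token, url, nonce), signature)
```

Here `signature` and `nonce` would come from the `X-Plivo-Signature-V3` and `X-Plivo-Signature-V3-Nonce` headers of the incoming connection request.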
Using the Plivo SDK
Using the Node.js Stream SDK
The `plivo-stream-sdk-node` package handles signature validation automatically. When `validateSignature` is enabled, connections with invalid signatures are rejected with a 1008 WebSocket close code.
WebSocket Events
All communication over the WebSocket uses JSON messages. Here are the essential events you need to handle.

Events from Plivo (Input)
| Event | Description |
|---|---|
| `start` | Sent once when stream begins. Contains call metadata (callId, streamId, mediaFormat). |
| `media` | Sent continuously. Contains base64-encoded audio chunks (~20ms each). |
| `dtmf` | Sent when caller presses keys. Contains the digit pressed. |
| `playedStream` | Confirmation that audio with a checkpoint finished playing. |
| `clearedAudio` | Confirmation that the audio queue was cleared. |
Events to Plivo (Output)
| Event | Description |
|---|---|
| `playAudio` | Send audio to the caller. Include base64 payload matching stream contentType. |
| `checkpoint` | Mark a point in audio queue. Receive `playedStream` when reached. |
| `clearAudio` | Clear all queued audio. Use for interruption handling. |
Quick Example
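
The three outbound events can be sketched as small JSON builders. The event names come from the table above, but the remaining field names (`media.contentType`, `media.sampleRate`, `media.payload`, `streamId`, `name`) are assumptions about the message schema, and the helper names are hypothetical.

```python
import base64
import json


def play_audio(audio: bytes, content_type: str = "audio/x-mulaw",
               sample_rate: int = 8000) -> str:
    """Build a playAudio message carrying base64-encoded audio."""
    # Assumed field names under "media".
    return json.dumps({
        "event": "playAudio",
        "media": {
            "contentType": content_type,
            "sampleRate": sample_rate,
            "payload": base64.b64encode(audio).decode(),
        },
    })


def checkpoint(stream_id: str, name: str) -> str:
    """Build a checkpoint message; playedStream should follow when reached."""
    return json.dumps({"event": "checkpoint", "streamId": stream_id, "name": name})


def clear_audio(stream_id: str) -> str:
    """Build a clearAudio message that empties the playback queue."""
    return json.dumps({"event": "clearAudio", "streamId": stream_id})
```

Each function returns a string ready to be sent as a text frame on the open WebSocket connection.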
X-Headers
Pass custom metadata from your Stream XML to your WebSocket server.

Usage
Parsing
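
Given the documented `key1=value1;key2=value2` format of `extraHeaders`, parsing could look like the sketch below. Where the header string appears inside the start event is not shown in this section, so the function takes the raw string directly; `parse_extra_headers` is a hypothetical helper.

```python
def parse_extra_headers(raw: str) -> dict:
    """Parse "key1=value1;key2=value2" into a dict, skipping malformed pairs."""
    headers = {}
    for pair in raw.split(";"):
        # partition splits on the first "=" only, so values may contain "=".
        key, sep, value = pair.partition("=")
        if sep and key.strip():
            headers[key.strip()] = value.strip()
    return headers
```

For example, `parse_extra_headers("userId=12345;language=en")` yields a dict with the keys `userId` and `language`.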
Limits
WebSocket and Stream Limits
| Limit | Value |
|---|---|
| Maximum WebSocket URL length | 2048 characters |
| Maximum concurrent streams per call | 1 |
| Maximum stream duration | Same as call duration |
| Audio buffer size (playback queue) | ~60 seconds of audio |
| Maximum WebSocket message size | 64 KB |
| Recommended audio chunk size | 16 KB base64-encoded or less |
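
One way to respect the recommended chunk size is to split raw audio before encoding it. The sketch below assumes "16 KB" means 16 × 1024 base64 characters; since base64 turns every 3 raw bytes into 4 characters, that allows 12,288 raw bytes per chunk. `chunk_audio` is a hypothetical helper.

```python
import base64

MAX_B64_CHARS = 16 * 1024  # assumed meaning of the recommended 16 KB
# Base64 turns every 3 raw bytes into 4 characters.
MAX_RAW_BYTES = MAX_B64_CHARS // 4 * 3  # 12288


def chunk_audio(audio: bytes, max_raw: int = MAX_RAW_BYTES) -> list:
    """Split raw audio into base64 strings that stay within the limit."""
    return [
        base64.b64encode(audio[i:i + max_raw]).decode()
        for i in range(0, len(audio), max_raw)
    ]
```

Each returned string can be used as the payload of one `playAudio` message.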
Best Practices
Use mu-law 8000Hz
Why mu-law at 8kHz is recommended:

- Native Telephony Format: No transcoding required, lowest latency
- Bandwidth Efficient: Compresses 16-bit audio to 8-bit while maintaining voice quality
- Broad Compatibility: Most major STT/TTS services accept mu-law input
- Sufficient for Voice: Human speech is well-represented at 8kHz
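
The bandwidth difference between the supported content types can be worked out with a little arithmetic. This sketch assumes mono audio and the roughly 20 ms chunk duration mentioned for `media` events; `bytes_per_chunk` is a hypothetical helper.

```python
def bytes_per_chunk(sample_rate_hz: int, bits_per_sample: int,
                    chunk_ms: int = 20) -> int:
    """Raw bytes in one mono audio chunk of the given duration."""
    return sample_rate_hz * bits_per_sample // 8 * chunk_ms // 1000


mulaw_8k = bytes_per_chunk(8000, 8)    # 160 bytes per 20 ms
l16_8k = bytes_per_chunk(8000, 16)     # 320 bytes per 20 ms
l16_16k = bytes_per_chunk(16000, 16)   # 640 bytes per 20 ms
```

At the same duration, 16-bit linear PCM at 16kHz carries four times as many bytes as mu-law at 8kHz.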
Minimize Latency
For a responsive Voice AI experience, aim for under 1 second total response time:

| Component | Target Latency |
|---|---|
| Speech-to-Text | < 200ms |
| LLM Processing | < 500ms |
| Text-to-Speech | < 200ms |
| Network (round trip) | < 100ms |
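
The targets above add up to a 1,000 ms ceiling, so every component has to come in under its own target for the total to stay under 1 second. A small sketch for comparing measured timings against the budget follows; the component keys and the sample measurements are hypothetical.

```python
# Per-component targets in milliseconds, taken from the table above.
BUDGET_MS = {"stt": 200, "llm": 500, "tts": 200, "network": 100}


def over_budget(measured_ms: dict) -> dict:
    """Return the components whose measured latency meets or exceeds its target."""
    return {
        name: value
        for name, value in measured_ms.items()
        if value >= BUDGET_MS.get(name, float("inf"))
    }


total_target = sum(BUDGET_MS.values())  # 1000 ms
# Made-up measurements for one conversational turn.
slow = over_budget({"stt": 150, "llm": 620, "tts": 90, "network": 40})
```

In this made-up turn only the LLM step exceeds its target, which points at where to optimize first.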

Host your WebSocket server close to where your calls originate:

| Traffic Source | Recommended Server Location |
|---|---|
| US-focused | US East (Virginia) or US West (Oregon) |
| Europe-focused | Frankfurt or London |
| Asia-Pacific | Singapore or Mumbai |
| Global | Deploy in multiple regions with geographic routing |
Handle Interruptions
Always support user interruption using `clearAudio`:
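
One possible shape for this logic is a small state tracker that emits a `clearAudio` message only when the caller speaks while bot audio is still queued. How speech is detected (for example, from the STT provider) is outside this sketch; the `Playback` class and the `streamId` field name are assumptions.

```python
import json


class Playback:
    """Tracks whether bot audio is queued so caller speech can interrupt it."""

    def __init__(self, stream_id: str):
        self.stream_id = stream_id
        self.speaking = False

    def on_audio_sent(self) -> None:
        # Call after sending playAudio messages.
        self.speaking = True

    def on_played(self) -> None:
        # Call when a playedStream confirmation arrives.
        self.speaking = False

    def on_caller_speech(self):
        """Return a clearAudio message if the caller talks over playback."""
        if not self.speaking:
            return None
        self.speaking = False
        # Assumed field name: streamId.
        return json.dumps({"event": "clearAudio", "streamId": self.stream_id})
```

When `on_caller_speech` returns a message, send it on the WebSocket and cancel any in-flight LLM or TTS work for the interrupted reply.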
Integration Guides
For complete code examples and step-by-step tutorials:

- Plivo Stream SDK: Official SDKs for Python, Node.js, and Java with full examples using Deepgram, OpenAI, and ElevenLabs
- Pipecat: Build with the Pipecat framework for simplified voice AI pipelines
Next Steps
- Protocol Reference: Complete JSON schemas, TypeScript types, and advanced patterns
- Plivo Stream SDK: Production-ready SDKs with examples
Support
For questions, issues, or feature requests:

- Documentation: https://www.plivo.com/docs/
- Support: [email protected]
- GitHub Issues: For SDK-specific issues
Last updated: January 2026