Introduction
Universal Voice AI Provider Switching without vendor lock-in
RealtimeSwitch is a WebSocket API that lets you connect to any realtime voice AI provider, such as OpenAI Realtime, Gemini Live, and ElevenLabs Conversational Agents (TBD).
OpenAI, Gemini, and ElevenLabs each provide their own realtime WebSocket API. So you may wonder: what do developers gain by connecting to the RealtimeSwitch WebSocket API instead of connecting to OpenAI Realtime, Gemini, or ElevenLabs directly?
The answer: you avoid vendor lock-in, and you get cost distribution across providers, session resumption, reliable failovers that happen without your application noticing, and a standard API spec that works with any provider without changing the communication code you already have.
Example
If you currently use OpenAI's WebSocket API, your code is tightly coupled to the OpenAI WebSocket spec.
For example, to send a session update event you would write something like:
// OpenAI Realtime API - Session Update
ws.send(JSON.stringify({
  type: "session.update",
  session: {
    modalities: ["text", "audio"],
    voice: "alloy",
    turn_detection: { type: "server_vad" }
  }
}));
// And to receive audio response you would write code like below...
ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  if (data.type === "response.audio.delta") {
    // Handle OpenAI audio format
    playAudio(data.delta);
  }
};
Now imagine that OpenAI is down or raises its prices, or you find that Gemini has launched a more affordable model. You would have to rewrite your integration from scratch, like this:
// Gemini Live API - Session Update (completely different!)
ws.send(JSON.stringify({
  setup: {
    model: "models/gemini-2.0-flash-live-001",
    generation_config: {
      response_modalities: ["AUDIO"]
    }
  }
}));
// Receive audio response (different events!)
ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  if (data.serverContent?.modelTurn?.parts) {
    handleGeminiAudio(data.serverContent.modelTurn.parts);
  }
};
This demonstrates the fundamental incompatibility between provider-specific WebSocket implementations.
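To make the mismatch concrete, here is a minimal sketch of the glue code you would otherwise have to write and maintain yourself in order to treat both message shapes uniformly. The helper name `extractAudioChunks` is hypothetical, and the `inlineData.data` field on Gemini parts is an assumption about where the audio payload sits; neither comes from RealtimeSwitch.

```javascript
// Hypothetical helper: maps either provider's message onto one shape
// (an array of base64 audio chunks). Not part of any RealtimeSwitch API.
function extractAudioChunks(data) {
  // OpenAI Realtime style: one chunk per "response.audio.delta" event.
  if (data.type === "response.audio.delta" && typeof data.delta === "string") {
    return [data.delta];
  }
  // Gemini Live style: audio arrives as parts inside serverContent.modelTurn.
  // ASSUMPTION: each audio part carries its payload in inlineData.data.
  const parts = data.serverContent?.modelTurn?.parts;
  if (Array.isArray(parts)) {
    return parts
      .map((part) => part.inlineData?.data)
      .filter((chunk) => typeof chunk === "string");
  }
  // Any other event carries no audio.
  return [];
}
```

Every additional provider would add another branch like these, which is the maintenance burden a single shared spec is meant to remove.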
RealtimeSwitch WebSocket API Fixes All This and More
With the RealtimeSwitch API, you keep using whichever WebSocket API spec you have already integrated, such as OpenAI's or Gemini's, and can still switch the underlying provider without changing your code.
Your OpenAI code will work with Gemini models, and if you integrated against the Gemini WebSocket spec, it will work with OpenAI as the provider, all without changing the application code.
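As a rough illustration, the OpenAI-style code from the example above could stay as it is, with only the connection target changing. The endpoint URL below is a placeholder and the function names are hypothetical; consult the RealtimeSwitch connection docs for the real URL and authentication details.

```javascript
// ASSUMPTION: placeholder endpoint, not a documented RealtimeSwitch URL.
const REALTIME_URL = "wss://realtimeswitch.example/v1/realtime";

// Builds the same OpenAI-style session.update payload used earlier.
function buildSessionUpdate(voice) {
  return JSON.stringify({
    type: "session.update",
    session: {
      modalities: ["text", "audio"],
      voice,
      turn_detection: { type: "server_vad" },
    },
  });
}

// Wires OpenAI-style handlers onto any WebSocket-like object.
// onAudio receives each audio delta exactly as in the OpenAI example.
function attachOpenAIStyleClient(ws, onAudio) {
  ws.onopen = () => ws.send(buildSessionUpdate("alloy"));
  ws.onmessage = (event) => {
    const data = JSON.parse(event.data);
    if (data.type === "response.audio.delta") {
      onAudio(data.delta);
    }
  };
  return ws;
}

// Usage, in a runtime with a global WebSocket:
// attachOpenAIStyleClient(new WebSocket(REALTIME_URL), playAudio);
```

The handlers know nothing about which provider ultimately serves the session; in this sketch the URL is the only provider-related detail left in the application.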
More Features
- In addition to seamless switching between voice AI engines using the WebSocket spec of your choice, you get extended sessions and automatic reconnects.
- Define custom switch rules that automatically switch providers when a slowdown or outage is detected, so that your customers never see downtime.
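Conceptually, a switch rule compares recent provider health against thresholds and names a fallback. The sketch below only illustrates that idea: the rule fields (`maxLatencyMs`, `maxConsecutiveErrors`, `fallback`) and the `pickProvider` function are invented for this example and are not the RealtimeSwitch rule format.

```javascript
// Hypothetical rule shape; NOT the RealtimeSwitch configuration format.
const exampleRule = {
  maxLatencyMs: 800,        // switch if average latency exceeds this
  maxConsecutiveErrors: 3,  // switch after this many errors in a row
  fallback: "gemini",       // provider to switch to
};

// Returns the provider to use next, given recent health of the current one.
function pickProvider(current, health, rule) {
  const tooSlow = health.avgLatencyMs > rule.maxLatencyMs;
  const failing = health.consecutiveErrors >= rule.maxConsecutiveErrors;
  return tooSlow || failing ? rule.fallback : current;
}
```

For instance, `pickProvider("openai", { avgLatencyMs: 1200, consecutiveErrors: 0 }, exampleRule)` selects the fallback because the latency threshold is exceeded, while a healthy provider is left in place.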