AgentConn

LiveKit Agents

Framework Agnostic, Advanced, Voice & Audio, Open Source

LiveKit Agents is an open-source framework for building real-time voice AI agents. It provides a pipeline for speech-to-text (STT), LLM processing, and text-to-speech (TTS) with sub-second latency, enabling natural conversational voice interactions.

Input / Output

Accepts

audio-stream, text

Produces

audio-response, transcript, actions

Overview

LiveKit Agents enables building voice AI agents that hold natural, real-time conversations. The framework handles the entire pipeline — capturing audio, transcribing, processing with an LLM, and synthesizing speech — with latency low enough for natural flow.

How It Works

  1. Define agent — Set up STT, LLM, and TTS pipeline
  2. Handle conversations — Process speech in real-time
  3. Execute actions — Voice agent triggers tools
  4. Deploy — Scale on LiveKit infrastructure
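The first three steps can be sketched without any LiveKit dependency. The block below is a minimal illustration, not the framework's API: `fake_stt`, `fake_tts`, `VoicePipeline`, and `get_weather` are hypothetical stand-ins for real STT, LLM, TTS, and tool providers, and the "LLM" is a keyword match so the flow stays easy to follow.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# Hypothetical stand-ins for real STT and TTS providers.
def fake_stt(audio: bytes) -> str:
    """Pretend transcription: treat the audio bytes as UTF-8 text."""
    return audio.decode("utf-8")


def fake_tts(text: str) -> bytes:
    """Pretend synthesis: encode the reply text as bytes."""
    return text.encode("utf-8")


@dataclass
class VoicePipeline:
    # Tools the agent may trigger, keyed by a keyword that selects them.
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    transcript: List[str] = field(default_factory=list)

    def fake_llm(self, prompt: str) -> str:
        """Toy 'LLM': call the first tool whose keyword appears in the prompt."""
        for name, tool in self.tools.items():
            if name in prompt.lower():
                return tool(prompt)
        return f"You said: {prompt}"

    def handle_turn(self, audio: bytes) -> bytes:
        """One conversational turn: STT, then LLM (plus tools), then TTS."""
        text = fake_stt(audio)
        self.transcript.append(f"user: {text}")
        reply = self.fake_llm(text)
        self.transcript.append(f"agent: {reply}")
        return fake_tts(reply)


def get_weather(prompt: str) -> str:
    # Hypothetical tool; a real one would call a weather service.
    return "I would look up the weather here."


pipeline = VoicePipeline(tools={"weather": get_weather})
response = pipeline.handle_turn(b"What's the weather in San Francisco?")
print(response.decode("utf-8"))
print(pipeline.transcript)
```

Each turn produces the three outputs listed under Produces: an audio response (`response`), a transcript entry, and, when a keyword matches, a tool action.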

Use Cases

  • Customer service — Natural-sounding voice bots
  • Voice assistants — Custom domain-specific assistants
  • Interview agents — Automated screening
  • Language learning — Conversational practice partners

Getting Started

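# Import path as given here; it may differ between livekit-agents versions.
from livekit.agents.voice_assistant import VoiceAssistant
from livekit.plugins import openai, silero

# Wire the voice pipeline: VAD detects speech, STT transcribes it,
# the LLM generates a reply, and TTS synthesizes the spoken response.
assistant = VoiceAssistant(
    vad=silero.VAD.load(),  # voice activity detection
    stt=openai.STT(),       # speech-to-text
    llm=openai.LLM(),       # language model
    tts=openai.TTS(),       # text-to-speech
)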

Example

User speaks: "What's the weather in San Francisco?"
Pipeline: STT → LLM (+ weather tool) → TTS → Audio response
Total latency: ~800 ms
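A total latency figure like the one above is the sum of per-stage timings. The sketch below shows one way to measure such a breakdown. `time_stages` is a hypothetical helper, and the three lambda stages are trivial placeholders rather than real STT, LLM, or TTS calls, so the numbers it prints say nothing about LiveKit's actual performance.

```python
import time
from typing import Any, Callable, Dict, List, Tuple


def time_stages(
    stages: List[Tuple[str, Callable[[Any], Any]]], payload: Any
) -> Tuple[Any, Dict[str, float]]:
    """Run each stage in order, recording wall-clock time in milliseconds."""
    timings: Dict[str, float] = {}
    for name, stage in stages:
        start = time.perf_counter()
        payload = stage(payload)
        timings[name] = (time.perf_counter() - start) * 1000.0
    # The total is computed from the per-stage entries before it is stored.
    timings["total"] = sum(timings.values())
    return payload, timings


# Placeholder stages standing in for real providers.
stages = [
    ("stt", lambda audio: audio.decode("utf-8")),
    ("llm", lambda text: text.upper()),
    ("tts", lambda text: text.encode("utf-8")),
]

result, timings = time_stages(stages, b"hello")
for name, ms in timings.items():
    print(f"{name}: {ms:.3f} ms")
```

This measures stages run strictly one after another. When a pipeline streams partial results between stages, the stages overlap, so the delay a user perceives (time until the first audio plays) can be shorter than the sum computed here.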

Alternatives

  • Vapi — Voice AI platform (managed)
  • Retell AI — Voice agent platform
  • Bland AI — Phone call AI agents

Tags

#voice #audio #real-time #speech #conversational-ai

Compatible Agents

AI agents that work well with LiveKit Agents.