{"id":4088,"library":"livekit-plugins-turn-detector","title":"LiveKit Turn Detector Plugin","description":"livekit-plugins-turn-detector provides end-of-utterance detection for LiveKit Agents, leveraging machine learning to differentiate between genuine interruptions and incidental background noises. It is an integral part of LiveKit Agents' adaptive interruption handling introduced in v1.5.0. The current version is 1.5.2, and it typically releases in conjunction with major livekit-agents updates.","status":"active","version":"1.5.2","language":"en","source_language":"en","source_url":"https://github.com/livekit/agents","tags":["livekit","audio","vad","turn-detection","machine-learning","ai-agents"],"install":[{"cmd":"pip install livekit-plugins-turn-detector","lang":"bash","label":"Install Plugin"}],"dependencies":[{"reason":"Core framework this plugin extends.","package":"livekit-agents"},{"reason":"Required for the underlying machine learning model, specifically the 'torch' extra, which is pulled automatically.","package":"transformers","optional":false},{"reason":"Used for audio frame data manipulation.","package":"numpy"},{"reason":"Audio I/O utilities.","package":"soundfile"}],"imports":[{"symbol":"TurnDetector","correct":"from livekit.plugins.turn_detector import TurnDetector"},{"symbol":"TurnStarted","correct":"from livekit.plugins.turn_detector import TurnStarted"},{"symbol":"TurnFinished","correct":"from livekit.plugins.turn_detector import TurnFinished"},{"note":"AudioFrame is part of livekit-agents utilities and should be imported from there.","wrong":"from livekit.plugins.turn_detector.models import AudioFrame","symbol":"AudioFrame","correct":"from livekit.agents.utils import AudioFrame"}],"quickstart":{"code":"import asyncio\nimport numpy as np\nfrom livekit.agents.utils import AudioFrame\nfrom livekit.plugins.turn_detector import TurnDetector, TurnStarted, TurnFinished\n\nasync def quickstart_turn_detector():\n    print(\"Initializing TurnDetector...\")\n    # TurnDetector.create() is an async factory method\n    detector = await TurnDetector.create()\n\n    # Simulate an audio stream (e.g., 16kHz mono audio)\n    sample_rate = 16000\n    num_silent_frames = 50 # 500ms of silence (50 * 10ms frames)\n    num_speech_frames = 100 # 1 second of speech\n    frame_size = int(sample_rate * 0.01) # 10ms frame\n\n    async def simulate_audio():\n        # Silence\n        for _ in range(num_silent_frames):\n            frame = AudioFrame(np.zeros(frame_size, dtype=np.int16), sample_rate, 1)\n            await detector.push_frame(frame)\n            await asyncio.sleep(0.01) # Simulate real-time\n        print(\"Simulated silence.\")\n\n        # Speech (simulated non-zero audio)\n        for i in range(num_speech_frames):\n            t = np.linspace(0, 0.01, frame_size, endpoint=False)\n            sine_wave = (np.sin(2 * np.pi * 440 * t) * 1000).astype(np.int16)\n            frame = AudioFrame(sine_wave, sample_rate, 1)\n            await detector.push_frame(frame)\n            if i == 0:\n                print(\"Simulating speech...\")\n            await asyncio.sleep(0.01)\n\n        # Post-speech silence\n        for _ in range(num_silent_frames):\n            frame = AudioFrame(np.zeros(frame_size, dtype=np.int16), sample_rate, 1)\n            await detector.push_frame(frame)\n            await asyncio.sleep(0.01)\n        print(\"Simulated post-speech silence. Closing detector.\")\n\n        # Signal end of stream\n        await detector.flush()\n\n    # Process events from the detector\n    async def process_events():\n        async for event in detector.detect_turns():\n            if isinstance(event, TurnStarted):\n                print(f\"Turn Started at timestamp {event.timestamp}\")\n            elif isinstance(event, TurnFinished):\n                print(f\"Turn Finished at timestamp {event.timestamp}, duration: {event.duration}s\")\n\n    # Run both concurrently\n    await asyncio.gather(simulate_audio(), process_events())\n    print(\"Quickstart finished.\")\n\nif __name__ == \"__main__\":\n    asyncio.run(quickstart_turn_detector())","lang":"python","description":"Demonstrates how to initialize the `TurnDetector`, feed it simulated audio frames, and listen for `TurnStarted` and `TurnFinished` events. This illustrates the core API for integrating turn detection into an audio processing pipeline."},"warnings":[{"fix":"Review `livekit-agents` documentation on VAD and interruption handling for version 1.5.0+ to understand the new defaults and customization options.","message":"Starting with `livekit-agents` 1.5.0, adaptive interruption handling, powered by this plugin, is enabled by default. This significantly changes the default VAD (Voice Activity Detection) behavior and might override custom VAD configurations if not explicitly managed. Users upgrading from older `livekit-agents` versions should be aware of this behavioral shift.","severity":"gotcha","affected_versions":"livekit-agents >=1.5.0"},{"fix":"Ensure your environment meets the resource requirements for `transformers` and `torch`. If memory or CPU usage is a concern, monitor the agent's performance. Consider installing specific CPU-only versions of `torch` if GPU is not available or desired.","message":"The plugin has significant dependencies, notably `transformers[torch]`. This leads to a large installation size and introduces `torch` as a dependency, which can have performance implications and specific hardware requirements (e.g., GPU for faster inference).","severity":"gotcha","affected_versions":"all"},{"fix":"Upgrade to `livekit-plugins-turn-detector` version 1.5.1 or newer to benefit from relaxed dependency constraints. If upgrading is not possible, carefully manage your `transformers` version to match the requirements.","message":"Versions of `livekit-plugins-turn-detector` prior to 1.5.1 had a stricter upper bound on the `transformers` dependency. This could lead to dependency conflicts if other libraries in your project required a newer or different `transformers` version.","severity":"gotcha","affected_versions":"<1.5.1"},{"fix":"Use this plugin in conjunction with `livekit-agents`. For general-purpose VAD or turn detection outside of LiveKit, consider standalone libraries or direct use of the underlying `transformers` model.","message":"This plugin is specifically designed to work within the LiveKit Agents ecosystem. While the underlying ML model might be general-purpose, the `TurnDetector` class and its event handling are tightly integrated with LiveKit's audio stream processing and agent lifecycle.","severity":"gotcha","affected_versions":"all"}],"env_vars":null,"last_verified":"2026-04-11T00:00:00.000Z","next_check":"2026-07-10T00:00:00.000Z"}