Amazon Transcribe Streaming SDK for Python
The `amazon-transcribe` library is an asynchronous Python SDK for the Amazon Transcribe Streaming service, which converts audio to text in real time. As of version 0.6.4, the SDK is experimental, is no longer actively developed, and is not recommended for new projects. Updates are infrequent, and AWS makes no commitment to ongoing support.
Common errors
- BadRequestException: Your stream is too big.
  cause: The audio stream sent to Amazon Transcribe exceeds the service's size limits or contains malformed audio data.
  fix: Break your audio stream into smaller chunks. Ensure your audio format (PCM, 16-bit) and sample rate (e.g., 16000 Hz) match the parameters passed to `start_stream_transcription`.
- LimitExceededException
  cause: Your client has exceeded one of the Amazon Transcribe limits, typically the audio length limit for a streaming session.
  fix: Review the Amazon Transcribe service quotas. If the audio is too long, split it across separate streaming sessions; if a rate limit was hit, reduce the frequency of requests. The general-purpose AWS SDKs typically retry rate-limit errors automatically, but verify this for this SDK before relying on it.
- InternalFailureException
  cause: An internal problem occurred while Amazon Transcribe was processing the audio, and processing was terminated.
  fix: This usually indicates a transient server-side issue. Retry with exponential backoff. If the problem persists, gather detailed logs and contact AWS Support.
- HTTP/2 stream is abnormally aborted in mid-communication with result code 2
  cause: Possible causes include an incorrect audio format, a mismatched sample rate, an unusually small chunk size, or network connectivity problems between your client and Amazon Transcribe.
  fix: Verify that your PCM audio is 16-bit and that `media_sample_rate_hz` matches your audio source. Try a larger audio chunk size (e.g., 32 KB per chunk). Check the stability of your network connection. Handle errors explicitly, especially while sending the first audio chunks.
- UnrecognizedClientException: The security token included in the request is invalid. (or similar credential errors)
  cause: The AWS credentials (access key, secret key, session token) used by the SDK are invalid, expired, or not configured correctly.
  fix: Ensure the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables are set correctly (plus `AWS_SESSION_TOKEN` for temporary credentials), or that your AWS CLI profile (`~/.aws/credentials`) is valid and accessible. Confirm that the region you pass to the client, for example from `AWS_REGION`, is the one you intend. If you use temporary credentials, ensure they have not expired.
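The retry advice for `InternalFailureException` can be sketched as below. This is a minimal illustration, not part of the SDK: `retry_with_backoff` and all of its parameters are hypothetical names, and the caller chooses which exception types count as transient through `retry_on` (the SDK's own exception classes are not imported here). Because a failed streaming session cannot be resumed, `make_attempt` is assumed to open a fresh session and resend the audio on every call.

```python
import asyncio
import random


async def retry_with_backoff(make_attempt, *, retry_on=(Exception,),
                             max_attempts=5, base_delay=0.5, max_delay=8.0,
                             sleep=asyncio.sleep):
    """Await make_attempt() until it succeeds or max_attempts is reached.

    make_attempt is a zero-argument callable that returns a new coroutine
    each time it is called (hypothetical helper, not an SDK function).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return await make_attempt()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Exponential backoff with full jitter, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            await sleep(random.uniform(0, delay))
```

Narrowing `retry_on` to the errors you have actually observed as transient avoids retrying permanent failures such as invalid credentials.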
Warnings
- breaking This SDK is experimental, no longer actively developed, and is not recommended for new projects. It is provided as-is without a support commitment and is being replaced by other AWS solutions.
- gotcha The standard AWS SDK for Python (Boto3) does NOT support Amazon Transcribe streaming. This `amazon-transcribe` SDK is specifically designed for streaming.
- gotcha In rare cases, the SDK can exhibit high CPU usage.
- gotcha The `awscrt` dependency, built on C libraries, may require manual compilation on non-standard operating systems if precompiled wheels are not available.
- gotcha Amazon Transcribe supports only one audio stream per streaming session. Attempting to send multiple streams within a single session will cause the transcription request to fail.
Install
-
pip install amazon-transcribe
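Since the `awscrt` dependency may need manual compilation on some systems, a quick post-install check can confirm that both packages are visible to the interpreter. The following is a minimal sketch using only the standard library; `check_install` is a hypothetical helper name. It only tests whether the modules can be located, not whether the compiled extension loads correctly.

```python
import importlib.util


def check_install(modules=("amazon_transcribe", "awscrt")):
    """Return {module_name: bool} indicating whether each module can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in check_install().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```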
Imports
- TranscribeStreamingClient
from amazon_transcribe.client import TranscribeStreamingClient
- TranscriptResultStreamHandler
from amazon_transcribe.handlers import TranscriptResultStreamHandler
- TranscriptEvent
from amazon_transcribe.model import TranscriptEvent
Quickstart
import asyncio
import os

# NOTE: aiofile is not a dependency of this SDK but is commonly used in
# examples for asynchronous file reads (`pip install aiofile`).
# This minimal example simulates an audio stream instead.
from amazon_transcribe.client import TranscribeStreamingClient
from amazon_transcribe.handlers import TranscriptResultStreamHandler
from amazon_transcribe.model import TranscriptEvent

# Credentials are read from the environment (AWS_ACCESS_KEY_ID,
# AWS_SECRET_ACCESS_KEY, optional AWS_SESSION_TOKEN) or your AWS CLI profile.
# In a real application, consider using AWS CLI config or IAM roles.


class MyEventHandler(TranscriptResultStreamHandler):
    async def handle_transcript_event(self, transcript_event: TranscriptEvent):
        # Print only final results; partial results are skipped.
        results = transcript_event.transcript.results
        for result in results:
            if not result.is_partial:
                for alternative in result.alternatives:
                    print(f"[Transcription]: {alternative.transcript}")


async def basic_transcribe_stream(client: TranscribeStreamingClient):
    stream = await client.start_stream_transcription(
        language_code="en-US",
        media_sample_rate_hz=16000,
        media_encoding="pcm",
    )

    async def write_chunks():
        # Simulated audio: five 1 KB chunks of silence. Replace with a
        # microphone or file source (e.g., using aiofile) in a real application.
        for _ in range(5):
            await stream.input_stream.send_audio_event(audio_chunk=b"\x00" * 1024)
            await asyncio.sleep(0.1)
        # Signal the end of the audio so the service can finish the session.
        await stream.input_stream.end_stream()
        print("\n--- Audio stream ended ---")

    # Send audio and process transcript events concurrently.
    handler = MyEventHandler(stream.output_stream)
    await asyncio.gather(write_chunks(), handler.handle_events())


async def main():
    # The region is read from AWS_REGION, defaulting to us-east-1.
    aws_region = os.environ.get("AWS_REGION", "us-east-1")
    print(f"Connecting to Amazon Transcribe in region: {aws_region}")
    client = TranscribeStreamingClient(region=aws_region)
    await basic_transcribe_stream(client)


if __name__ == "__main__":
    # Placeholder credentials are rejected by the service, so none are
    # injected here; real credentials must come from your environment.
    if not os.environ.get("AWS_ACCESS_KEY_ID"):
        print("Note: AWS_ACCESS_KEY_ID is not set; relying on other credential sources such as ~/.aws/credentials.")
    asyncio.run(main())
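To stream real audio instead of simulated silence, it helps to confirm that the file matches the parameters passed to `start_stream_transcription` before sending anything, since a format or sample-rate mismatch is a listed cause of aborted streams. The sketch below uses only the standard-library `wave` module. `pcm_chunks_from_wav` is a hypothetical helper, the 8 KB default chunk size is an arbitrary choice, and the mono check is an assumption of this sketch rather than a service requirement.

```python
import io
import wave


def pcm_chunks_from_wav(source, expected_rate=16000, chunk_bytes=8 * 1024):
    """Yield raw PCM chunks from a WAV file after validating its format.

    source is a file path or a binary file object. Validation runs when the
    generator is first iterated and raises ValueError on a mismatch.
    """
    with wave.open(source, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError(f"expected 16-bit samples, got {8 * wav.getsampwidth()}-bit")
        if wav.getnchannels() != 1:
            raise ValueError(f"expected mono audio, got {wav.getnchannels()} channels")
        if wav.getframerate() != expected_rate:
            raise ValueError(f"expected {expected_rate} Hz, got {wav.getframerate()} Hz")
        frames_per_chunk = chunk_bytes // 2  # 2 bytes per 16-bit mono frame
        while True:
            chunk = wav.readframes(frames_per_chunk)
            if not chunk:
                break
            yield chunk
```

Each yielded chunk could then be passed to the stream's input, for example with `send_audio_event(audio_chunk=chunk)` as in the SDK's README example, followed by `end_stream()` once the generator is exhausted.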