{"id":501,"library":"google-cloud-speech","title":"Google Cloud Speech-to-Text Python Client","description":"The `google-cloud-speech` Python client library wraps the Google Cloud Speech-to-Text API, which converts audio to text using neural network models and supports many languages and audio formats. The current version is 2.38.0, and the library is actively maintained with frequent releases, often monthly or bi-monthly.","status":"active","version":"2.38.0","language":"python","source_language":"en","source_url":"https://github.com/googleapis/google-cloud-python/tree/main/packages/google-cloud-speech","tags":["google-cloud","speech-to-text","transcription","ai","machine-learning"],"install":[{"cmd":"pip install google-cloud-speech","lang":"bash","label":"Install latest stable version"}],"dependencies":[{"reason":"Requires Python 3.9 or higher.","package":"python","optional":false},{"reason":"Core dependency for Google's API clients.","package":"protobuf","optional":false},{"reason":"Enables Pythonic wrappers around protocol buffer messages.","package":"proto-plus","optional":false},{"reason":"Common utilities for Google Cloud API clients.","package":"google-api-core","optional":false}],"imports":[{"note":"The direct import from `google.cloud.speech.client` is from older versions and should be avoided. 
Use `from google.cloud import speech` and then `speech.SpeechClient()`.","wrong":"from google.cloud.speech.client import SpeechClient","symbol":"SpeechClient","correct":"from google.cloud import speech"}],"quickstart":{"code":"from google.api_core.exceptions import GoogleAPIError\nfrom google.cloud import speech\n\n# Credentials are resolved through Application Default Credentials.\n# Typically you set the GOOGLE_APPLICATION_CREDENTIALS environment variable\n# before running this script, e.g.:\n#   export GOOGLE_APPLICATION_CREDENTIALS=\"/path/to/your/keyfile.json\"\n\n\ndef transcribe_audio(audio_file_path):\n    \"\"\"Transcribe a short local audio file and print each result's top transcript.\"\"\"\n    client = speech.SpeechClient()\n\n    # Read the raw audio bytes; they are sent inline with the request.\n    with open(audio_file_path, \"rb\") as audio_file:\n        content = audio_file.read()\n\n    audio = speech.RecognitionAudio(content=content)\n    # These settings must match the actual properties of the audio file.\n    config = speech.RecognitionConfig(\n        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,\n        sample_rate_hertz=16000,\n        language_code=\"en-US\",\n    )\n\n    try:\n        response = client.recognize(config=config, audio=audio)\n    except GoogleAPIError as e:\n        print(f\"An error occurred: {e}\")\n        return\n\n    for result in response.results:\n        print(f\"Transcript: {result.alternatives[0].transcript}\")\n\n\nif __name__ == \"__main__\":\n    # Assumes 'test.wav' exists in the current directory and is a LINEAR16\n    # (16-bit PCM), 16000 Hz, mono WAV file. To create a silent placeholder:\n    #   import numpy as np\n    #   from scipy.io.wavfile import write as write_wav\n    #   write_wav('test.wav', 16000, np.zeros(16000, dtype=np.int16))\n    audio_test_file = \"test.wav\"\n    print(f\"Attempting to transcribe: {audio_test_file}\")\n    transcribe_audio(audio_test_file)","lang":"python","description":"This quickstart demonstrates how to transcribe a local audio file using the Google Cloud Speech-to-Text client library. It covers client instantiation, reading audio content, configuring recognition settings, and processing the transcription response. Ensure you have a Google Cloud project with the Speech-to-Text API enabled and your `GOOGLE_APPLICATION_CREDENTIALS` environment variable pointing to a service account key file with appropriate permissions."},"warnings":[{"fix":"Refer to the official documentation for the V2 API client library and update your code to use the new V2 models and request structures. Note that the V1 client is `speech.SpeechClient` (`from google.cloud import speech`), while the V2 client is `speech_v2.SpeechClient` (`from google.cloud import speech_v2`).","message":"The Speech-to-Text V2 API is not a drop-in replacement for V1. It has a redesigned interface, additional features, and different pricing. Existing V1 code will require modification to use V2.","severity":"breaking","affected_versions":"All versions when migrating from V1 to V2 API endpoints."},{"fix":"Ensure the `GOOGLE_APPLICATION_CREDENTIALS` environment variable is correctly set to the path of your service account JSON key file. Alternatively, explicitly pass `credentials` to the `SpeechClient` constructor. 
Make sure the service account has a role that grants access to the Speech-to-Text API, such as Cloud Speech Client (`roles/speech.client`).","message":"A common error is `google.auth.exceptions.DefaultCredentialsError`, raised when the client cannot find valid authentication credentials.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Always match the `RecognitionConfig` parameters (like `encoding` and `sample_rate_hertz`) to the actual properties of your audio file. For GCS files, ensure the URI is correctly formatted. Consider using an audio conversion library if your input format is not directly supported or needs normalization.","message":"Incorrect audio file encoding, sample rate, or format (e.g., trying to transcribe an MP3 with `LINEAR16` config) will lead to transcription errors or poor results. For files stored in Google Cloud Storage, the URI must be in `gs://bucket-name/object-name` format.","severity":"gotcha","affected_versions":"All versions"},{"fix":"For very long audio, consider using asynchronous batch processing via `long_running_recognize` (the `LongRunningRecognize` RPC). For streaming, ensure stable network connectivity and consider breaking the audio into shorter segments, starting a new stream for each. Improve the audio input by speaking clearly and minimizing background noise.","message":"Streaming transcription for longer audio (especially for certain non-English languages) may encounter intermittent failures around the 4-minute mark due to internal streaming limits or processing complexities.","severity":"gotcha","affected_versions":"All versions, more prevalent with streaming non-English audio."},{"fix":"This is an ongoing issue being tracked. Monitor the GitHub repository for updates and potential workarounds. 
You might need to adjust your application's logic to handle delayed interim results or consider alternative streaming patterns if real-time interim results are critical.","message":"When using streaming recognition with `interim_results=True` in the V2 API, the `responses_iterator` might block until all requests are done instead of yielding results immediately, which can be unexpected for real-time applications.","severity":"gotcha","affected_versions":"Versions utilizing V2 streaming with `interim_results=True`."}],"env_vars":null,"last_verified":"2026-05-12T14:25:07.594Z","next_check":"2026-06-26T00:00:00.000Z","problems":[{"fix":"Install the library with pip: `pip install google-cloud-speech`.","cause":"The `google-cloud-speech` library is not installed in the Python environment being used, or the deprecated `google-cloud` package was installed instead of `google-cloud-speech`.","error":"ModuleNotFoundError: No module named 'google.cloud'"},{"fix":"Set up Application Default Credentials by running `gcloud auth application-default login` or by setting the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to the path of your service account key JSON file. If requests are then rejected for lack of permissions, grant the identity a role with Speech-to-Text access, such as Cloud Speech Client (`roles/speech.client`).","cause":"The application cannot find valid Google Cloud credentials to authenticate with the Speech-to-Text API.","error":"The Application Default Credentials are not available. They are available if running in Google Compute Engine. Otherwise, the environment variable GOOGLE_APPLICATION_CREDENTIALS must be defined pointing to a file defining the credentials."},{"fix":"Ensure the `audio_channel_count` in your `RecognitionConfig` (e.g., `speech.RecognitionConfig(audio_channel_count=2, ...)`) accurately reflects the actual number of channels in your audio file. 
Convert the audio to mono if desired, or explicitly specify the correct channel count.","cause":"The `RecognitionConfig` sent with the API request specifies a single audio channel (mono) but the provided audio file has multiple channels (e.g., stereo), or vice-versa, leading to a mismatch.","error":"google.api_core.exceptions.InvalidArgument: 400 Must use single channel (mono) audio, but WAV header indicates 2 channels."}],"ecosystem":"pypi","meta_description":null,"install_score":95,"install_tag":"verified","quickstart_score":0,"quickstart_tag":"stale","pypi_latest":null,"install_checks":{"last_tested":"2026-05-12","tag":"verified","tag_description":"installs cleanly on critical runtimes, fast import, recently tested","results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":1.76,"mem_mb":24.5,"disk_size":"70.7M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":1.06,"mem_mb":21.1,"disk_size":"68M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":2.52,"mem_mb":26.5,"disk_size":"75.7M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":1.57,"mem_mb":23.3,"disk_size":"73M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":2.58,"mem_mb":26.3,"disk_size":"67.1M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim 
(glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":2.06,"mem_mb":22.6,"disk_size":"65M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":2.56,"mem_mb":26.7,"disk_size":"66.7M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":2.44,"mem_mb":23.6,"disk_size":"64M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":1.61,"mem_mb":24.2,"disk_size":"70.9M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":1.24,"mem_mb":20.9,"disk_size":"69M"}]},"quickstart_checks":{"last_tested":"2026-04-23","tag":"stale","tag_description":"widespread failures or data too old to trust","results":[{"runtime":"python:3.10-alpine","exit_code":1},{"runtime":"python:3.10-slim","exit_code":1},{"runtime":"python:3.11-alpine","exit_code":1},{"runtime":"python:3.11-slim","exit_code":1},{"runtime":"python:3.12-alpine","exit_code":1},{"runtime":"python:3.12-slim","exit_code":1},{"runtime":"python:3.13-alpine","exit_code":1},{"runtime":"python:3.13-slim","exit_code":1},{"runtime":"python:3.9-alpine","exit_code":1},{"runtime":"python:3.9-slim","exit_code":1}]}}