Google Cloud Text-to-Speech

2.35.0 verified Tue May 12 auth: no python install: verified quickstart: stale

The `google-cloud-texttospeech` client library for Python wraps the Google Cloud Text-to-Speech API. It converts plain text or SSML into synthesized speech and supports a range of voices, languages, and audio configuration options. As of version 2.35.0, it is actively maintained and released as part of the broader Google Cloud Python client libraries.

pip install google-cloud-texttospeech
error ModuleNotFoundError: No module named 'google.cloud'
cause This error occurs when the `google-cloud-texttospeech` library is not correctly installed in the Python environment, or if an old, deprecated `google-cloud` meta-package is installed instead of the specific client library.
fix
Ensure you have the correct library installed by running `pip install --upgrade google-cloud-texttospeech`. If you previously installed `google-cloud`, uninstall it first with `pip uninstall google-cloud`.
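To confirm which distribution is actually present in the active environment, a small check with the standard library's `importlib.metadata` can help. This is a minimal sketch; the helper name `installed_version` is hypothetical and not part of the client library.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # The specific client library should be present; the deprecated
    # `google-cloud` meta-package should not be.
    for name in ("google-cloud-texttospeech", "google-cloud"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

Running this with the same interpreter that raises the `ModuleNotFoundError` shows whether the package is missing from that particular environment.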
error DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS
cause The application cannot find the necessary Google Cloud credentials to authenticate with the Text-to-Speech API. This typically means the `GOOGLE_APPLICATION_CREDENTIALS` environment variable is not set or points to an invalid or non-existent service account key file.
fix
Set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to the absolute path of your service account JSON key file. On Linux/macOS: `export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/key.json"`. On Windows (Command Prompt): `set GOOGLE_APPLICATION_CREDENTIALS=C:\path\to\your\key.json` (no quotes, because Command Prompt stores them as part of the value). Alternatively, use `gcloud auth application-default login` for local development.
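Before creating a client, it can be useful to check what the environment variable actually points to. The sketch below uses only the standard library; the function name `describe_credentials_env` and its return labels are hypothetical, and it does not validate the key with Google, only that a readable JSON file exists at the path.

```python
import json
import os
from pathlib import Path


def describe_credentials_env(env=None):
    """Return a short diagnosis of GOOGLE_APPLICATION_CREDENTIALS."""
    env = os.environ if env is None else env
    raw = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not raw:
        # Not necessarily an error: ADC from gcloud may still be configured.
        return "unset"
    path = Path(raw)
    if not path.is_file():
        return "missing-file"
    try:
        json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return "invalid-json"
    return "ok"


if __name__ == "__main__":
    print(describe_credentials_env())
```

A path that was stored with literal quote characters around it is reported as `missing-file`, which makes that mistake easy to spot.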
error AttributeError: module 'google.cloud.texttospeech' has no attribute 'types'
cause This error indicates that the code uses the pre-2.0 syntax (e.g., `texttospeech.types.SynthesisInput`) while a 2.x release of the `google-cloud-texttospeech` library is installed. The `types` and `enums` submodules were removed in the 2.0 release.
fix
Update the code to reference API types directly on the module, e.g. `texttospeech.SynthesisInput` instead of `texttospeech.types.SynthesisInput`, and enums as `texttospeech.AudioEncoding.MP3` instead of `texttospeech.enums.AudioEncoding.MP3`. Upgrading the library alone (`pip install --upgrade google-cloud-texttospeech`) does not resolve this error, because newer releases do not restore the `types` submodule. Check the official client library documentation and its upgrading guide for the full list of changes.
error Forbidden: 403 POST Cloud Text-to-Speech API has not been used in project # before or it is disabled.
cause The Google Cloud Text-to-Speech API is not enabled for the specific Google Cloud project being used, or the service account lacks the necessary permissions to access it.
fix
Go to the Google Cloud Console, select your project, navigate to 'APIs & Services' > 'Enabled APIs & Services', and ensure that the 'Cloud Text-to-Speech API' is enabled. Also, verify that your service account has roles such as 'Cloud Text-to-Speech User' or a custom role with appropriate permissions.
error RESOURCE_EXHAUSTED: Quota exceeded.
cause Your Google Cloud project has exceeded the allocated quota for the Text-to-Speech API, such as the number of characters synthesized per minute or total bytes per request.
fix
Review your API usage in the Google Cloud Console ('IAM & Admin' > 'Quotas') for the Text-to-Speech API. You can request a quota increase if needed. For requests exceeding content limits, use asynchronous methods or process smaller chunks of text.
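Processing smaller chunks can be sketched as a sentence-aware splitter that keeps each piece under a UTF-8 byte budget. The function name `chunk_text` and the default `max_bytes=4500` are assumptions for illustration; check the current Text-to-Speech content limits for the real ceiling. Whitespace inside the text is normalized to single spaces.

```python
import re


def chunk_text(text, max_bytes=4500):
    """Split text into chunks of at most max_bytes UTF-8 bytes.

    Splits on sentence boundaries first, then on words. A single word
    longer than max_bytes is kept whole and will exceed the budget.
    """
    def size(s):
        return len(s.encode("utf-8"))

    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if size(candidate) <= max_bytes:
            current = candidate
            continue
        # The sentence does not fit: flush, then add it word by word.
        if current:
            chunks.append(current)
        current = ""
        for word in sentence.split():
            candidate = f"{current} {word}" if current else word
            if current and size(candidate) > max_bytes:
                chunks.append(current)
                current = word
            else:
                current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("One two. Three four! Five six?", max_bytes=10):
        print(repr(piece))
```

Each chunk would then go into its own `SynthesisInput(text=chunk)` request.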
breaking Google Cloud Text-to-Speech voices are periodically updated or replaced. Existing voice names in your application might be deprecated or behave differently, potentially causing unexpected audio output or errors.
fix Regularly check the Cloud Text-to-Speech release notes for voice updates. Consider using `client.list_voices()` to programmatically discover available voices and their names, and implement a fallback mechanism for voice selection. Test your application's voice output after significant service updates.
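A fallback mechanism along these lines might look like the sketch below. The helper names `pick_voice_name` and `list_voice_names` are hypothetical, and the voice names in the usage section are illustrative only. The client import is deferred so the selection logic can be exercised without the library, credentials, or network access.

```python
def pick_voice_name(available, preferred):
    """Return the first preferred voice name that is available, else None."""
    available = set(available)
    for name in preferred:
        if name in available:
            return name
    return None


def list_voice_names(language_code="en-US"):
    """Fetch voice names from the API (needs credentials and network)."""
    # Imported lazily so pick_voice_name works without the library installed.
    from google.cloud import texttospeech

    client = texttospeech.TextToSpeechClient()
    response = client.list_voices(language_code=language_code)
    return [voice.name for voice in response.voices]


if __name__ == "__main__":
    # Illustrative preference order; replace with names your app relies on.
    preferred = ["en-US-Wavenet-D", "en-US-Standard-C"]
    name = pick_voice_name(list_voice_names("en-US"), preferred)
    print(name or "no preferred voice available; select by language_code only")
```

When no preferred voice is found, one option is to omit `name` from `VoiceSelectionParams` and let the service choose a voice for the given `language_code`.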
gotcha Incorrect authentication setup is a common issue. If `GOOGLE_APPLICATION_CREDENTIALS` is not set or Application Default Credentials (ADC) are not configured, the client will fail to connect to the API.
fix Ensure you have enabled the Text-to-Speech API in your Google Cloud project and set up authentication. For local development, use `gcloud auth application-default login` or set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to the path of your service account key file. For production, refer to the ADC documentation.
gotcha Using an invalid `language_code`, `name`, or `ssml_gender` in `VoiceSelectionParams` can lead to `InvalidArgument` errors.
fix Always refer to the official documentation or use the `client.list_voices()` method to get a current list of supported voices, their language codes, and SSML genders. This ensures compatibility and avoids errors due to incorrect voice parameters.
gotcha The minimum supported Python version for `google-cloud-texttospeech` is `3.9`. Using older Python versions can lead to installation issues or runtime errors.
fix Ensure your development and deployment environments use Python 3.9 or newer. Upgrade your Python installation if necessary.
python  os / libc      wheel  install  import  disk
3.10    alpine (musl)  wheel  -        1.87s   69.2M
3.10    alpine (musl)  -      -        1.84s   68.1M
3.10    slim (glibc)   wheel  6.2s     1.13s   67M
3.10    slim (glibc)   -      -        1.09s   66M
3.11    alpine (musl)  wheel  -        2.33s   73.8M
3.11    alpine (musl)  -      -        2.66s   72.7M
3.11    slim (glibc)   wheel  5.2s     1.66s   72M
3.11    slim (glibc)   -      -        1.57s   70M
3.12    alpine (musl)  wheel  -        2.43s   65.3M
3.12    alpine (musl)  -      -        2.76s   64.1M
3.12    slim (glibc)   wheel  4.3s     2.04s   63M
3.12    slim (glibc)   -      -        2.52s   62M
3.13    alpine (musl)  wheel  -        2.30s   65.0M
3.13    alpine (musl)  -      -        2.71s   63.8M
3.13    slim (glibc)   wheel  4.6s     1.86s   63M
3.13    slim (glibc)   -      -        2.31s   62M
3.9     alpine (musl)  wheel  -        1.66s   69.2M
3.9     alpine (musl)  -      -        1.57s   68.1M
3.9     slim (glibc)   wheel  7.1s     1.37s   67M
3.9     slim (glibc)   -      -        1.15s   66M

This quickstart synthesizes a given text into an MP3 audio file. It demonstrates client initialization, defining input text, selecting a voice, configuring audio output, and saving the generated speech. Ensure you have authenticated to Google Cloud, typically via the `GOOGLE_APPLICATION_CREDENTIALS` environment variable or `gcloud auth application-default login`.

import os
from google.cloud import texttospeech_v1 as texttospeech

def synthesize_text(text):
    """Synthesizes speech from the input string of text.

    Ensure GOOGLE_APPLICATION_CREDENTIALS environment variable is set
    or authenticate using `gcloud auth application-default login`.
    """

    # Instantiates a client
    client = texttospeech.TextToSpeechClient()

    # Set the text input to be synthesized
    synthesis_input = texttospeech.SynthesisInput(text=text)

    # Build the voice request, select the language code ('en-US') and the SSML voice gender ('NEUTRAL')
    voice = texttospeech.VoiceSelectionParams(
        language_code="en-US",
        ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL,
    )

    # Select the type of audio file you want returned
    audio_config = texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.MP3
    )

    # Perform the text-to-speech request
    response = client.synthesize_speech(
        input=synthesis_input,
        voice=voice,
        audio_config=audio_config
    )

    # The response's audio_content is binary. Write it to a file.
    output_filename = "output.mp3"
    with open(output_filename, "wb") as out:
        out.write(response.audio_content)
    print(f'Audio content written to file "{output_filename}"')

if __name__ == "__main__":
    # For local development, set the GOOGLE_APPLICATION_CREDENTIALS environment variable.
    # For example: os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '/path/to/your/key.json'
    # Or use `gcloud auth application-default login`
    if not os.environ.get('GOOGLE_APPLICATION_CREDENTIALS'):
        print("Warning: GOOGLE_APPLICATION_CREDENTIALS not set. Assuming gcloud ADC is configured.")

    text_to_synthesize = "Hello, world! This is a test of the Google Cloud Text-to-Speech API."
    synthesize_text(text_to_synthesize)