Google Cloud Text-to-Speech

2.35.0 · active · verified Sun Mar 29

The `google-cloud-texttospeech` client library for Python wraps the Google Cloud Text-to-Speech API. It converts text into natural-sounding speech and supports a wide range of voices, languages, and customization options. As of version 2.35.0, it is actively maintained and receives frequent updates as part of the broader Google Cloud Python client libraries.

Warnings

Install

Imports

Quickstart

This quickstart synthesizes a text string into an MP3 audio file. It shows how to initialize the client, define the input text, select a voice, configure the audio output, and save the generated speech. Authenticate to Google Cloud first, typically by setting the `GOOGLE_APPLICATION_CREDENTIALS` environment variable or by running `gcloud auth application-default login`.

import os
from google.cloud import texttospeech_v1 as texttospeech

def synthesize_text(text):
    """Synthesizes speech from the input string of text.

    Ensure GOOGLE_APPLICATION_CREDENTIALS environment variable is set
    or authenticate using `gcloud auth application-default login`.
    """

    # Instantiates a client
    client = texttospeech.TextToSpeechClient()

    # Set the text input to be synthesized
    synthesis_input = texttospeech.SynthesisInput(text=text)

    # Build the voice request: select the language code ('en-US')
    # and the SSML voice gender ('NEUTRAL')
    voice = texttospeech.VoiceSelectionParams(
        language_code="en-US",
        ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL,
    )

    # Select the type of audio file you want returned
    audio_config = texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.MP3
    )

    # Perform the text-to-speech request
    response = client.synthesize_speech(
        input=synthesis_input,
        voice=voice,
        audio_config=audio_config
    )

    # The response's audio_content is binary. Write it to a file.
    output_filename = "output.mp3"
    with open(output_filename, "wb") as out:
        out.write(response.audio_content)
    print(f'Audio content written to file "{output_filename}"')

if __name__ == "__main__":
    # For local development, set the GOOGLE_APPLICATION_CREDENTIALS environment variable.
    # For example: os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '/path/to/your/key.json'
    # Or use `gcloud auth application-default login`
    if not os.environ.get('GOOGLE_APPLICATION_CREDENTIALS'):
        print("Warning: GOOGLE_APPLICATION_CREDENTIALS not set. Assuming gcloud ADC is configured.")

    text_to_synthesize = "Hello, world! This is a test of the Google Cloud Text-to-Speech API."
    synthesize_text(text_to_synthesize)
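
The quickstart only prints a warning when `GOOGLE_APPLICATION_CREDENTIALS` is unset, and it does not notice a path that points to a missing file. A slightly stricter pre-flight check might look like the sketch below. It uses only the standard library, and `credentials_hint` is a hypothetical helper name, not part of `google-cloud-texttospeech`. It does not validate the key itself; it only reports which credential source the environment appears to point at.

```python
import os
from pathlib import Path


def credentials_hint(env=None):
    """Describe how credentials will likely be resolved.

    Hypothetical helper for illustration; not part of the client library.
    `env` defaults to os.environ and can be replaced with a dict in tests.
    """
    env = os.environ if env is None else env
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")

    if not key_path:
        # No explicit key file: the client falls back to Application
        # Default Credentials from another source (for example gcloud).
        return "unset: relying on Application Default Credentials"

    if not Path(key_path).is_file():
        # The variable is set but points nowhere useful.
        return f"missing: {key_path} does not exist"

    return f"key file: {key_path}"


if __name__ == "__main__":
    print(credentials_hint())
```

Calling this before `synthesize_text` surfaces a mistyped key path early, instead of leaving it to fail later when the client tries to load credentials.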
