YouTube Transcript API
This is a Python API that allows you to retrieve transcripts and subtitles for a given YouTube video. It supports both manually created and automatically generated subtitles, as well as subtitle translation. Unlike older solutions, it does not require a headless browser. The library is actively maintained, currently at version 1.2.4, and receives regular updates to address YouTube's API changes.
Warnings
- breaking The static methods `YouTubeTranscriptApi.get_transcript`, `YouTubeTranscriptApi.get_transcripts`, and `YouTubeTranscriptApi.list_transcripts` were removed in `v1.2.0` after being deprecated in `v1.0.0`. Create a `YouTubeTranscriptApi()` instance and call its `fetch` and `list` methods instead.
- breaking In `v1.1.0`, the library refactored caption retrieval from scraping the `/watch` HTML to fetching from the Innertube API. This change initially did not support authentication, potentially causing issues for users relying on authenticated requests.
- gotcha YouTube frequently blocks requests from IP addresses that belong to known cloud providers, and from any single IP that sends a high volume of requests. Either case can raise an `IpBlocked` exception.
- gotcha Ensure you pass only the YouTube video ID, not the full video URL, to the API methods. For example, for `https://www.youtube.com/watch?v=VIDEO_ID`, the video ID is `VIDEO_ID`.
- gotcha `TranscriptsDisabled` means captions are turned off for the video. `NoTranscriptFound` means no transcript exists in any of the requested languages, so check which languages are available before assuming the video has no captions at all. Automatically generated captions are not available for every video.
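The video-ID gotcha above can be guarded against with a small normalizing helper. This is a minimal sketch using only the standard library; `extract_video_id` is a hypothetical name and is not part of youtube-transcript-api, and the URL shapes it recognizes (`/watch?v=`, `youtu.be/`, `/embed/`, `/shorts/`, `/live/`) are assumptions about common link formats rather than an exhaustive list.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url_or_id: str) -> Optional[str]:
    """Return the video ID from a YouTube URL, or the input itself if it looks like a bare ID.

    Hypothetical helper: not provided by youtube-transcript-api.
    """
    candidate = url_or_id.strip()
    # No path separator or query marker: treat the input as a bare ID.
    if "/" not in candidate and "?" not in candidate:
        return candidate or None

    # Allow scheme-less input such as "youtube.com/watch?v=...".
    parsed = urlparse(candidate if "://" in candidate else "https://" + candidate)
    host = (parsed.hostname or "").lower()

    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        return parsed.path.lstrip("/").split("/")[0] or None

    if host == "youtube.com" or host.endswith(".youtube.com"):
        if parsed.path == "/watch":
            # Standard watch links carry the ID in the "v" query parameter.
            return parse_qs(parsed.query).get("v", [None])[0]
        parts = [p for p in parsed.path.split("/") if p]
        if len(parts) >= 2 and parts[0] in ("embed", "shorts", "live"):
            return parts[1]

    # Unrecognized host or URL shape.
    return None
```

Passing the result to the API only when it is not `None` avoids sending a full URL where an ID is expected.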
Install
pip install youtube-transcript-api
Imports
- YouTubeTranscriptApi
from youtube_transcript_api import YouTubeTranscriptApi
- NoTranscriptFound
from youtube_transcript_api import NoTranscriptFound
- TranscriptsDisabled
from youtube_transcript_api import TranscriptsDisabled
- NoManualTranscriptFound
from youtube_transcript_api import NoManualTranscriptFound
- IpBlocked
from youtube_transcript_api import IpBlocked
Quickstart
from youtube_transcript_api import YouTubeTranscriptApi, NoTranscriptFound, TranscriptsDisabled, IpBlocked

video_id = "dQw4w9WgXcQ"  # Example: 'Never Gonna Give You Up' by Rick Astley

try:
    # Instantiate the API client
    ytt_api = YouTubeTranscriptApi()
    # Fetch the transcript for the video.
    # By default, it attempts to get English. Pass languages=['de', 'en'] to set a priority order.
    transcript = ytt_api.fetch(video_id)
    # Join the snippet texts into one string (each snippet exposes .text, .start, and .duration)
    full_text = " ".join(snippet.text for snippet in transcript)
    print(f"Transcript for video ID {video_id}:\n{full_text[:500]}...")
    # You can also list available transcripts and their languages
    # transcript_list = ytt_api.list(video_id)
    # for t in transcript_list:
    #     print(f"Available transcript: {t.language} ({'generated' if t.is_generated else 'manual'})")
except NoTranscriptFound:
    print(f"No transcript found for video ID: {video_id}")
except TranscriptsDisabled:
    print(f"Transcripts are disabled for video ID: {video_id}")
except IpBlocked:
    print("Your IP was blocked by YouTube. Consider using a proxy.")
except Exception as e:
    print(f"An unexpected error occurred: {e}")
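Beyond joining the text, each fetched snippet also carries a `start` time and a `duration` in seconds, which is enough to produce a timestamped listing. The sketch below is illustrative only: `Snippet` is a hypothetical stand-in that mirrors those three attributes so the example runs without network access, and `format_timestamp` and `to_timestamped_lines` are helper names invented here, not library functions.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Snippet:
    """Hypothetical stand-in mirroring the text/start/duration attributes of a fetched snippet."""
    text: str
    start: float     # seconds from the beginning of the video
    duration: float  # seconds the caption stays on screen


def format_timestamp(seconds: float) -> str:
    """Format a second offset as HH:MM:SS, dropping the fractional part."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def to_timestamped_lines(snippets: Iterable[Snippet]) -> List[str]:
    """Prefix each snippet's text with its start time."""
    return [f"[{format_timestamp(s.start)}] {s.text}" for s in snippets]


if __name__ == "__main__":
    sample = [Snippet("hello", 0.0, 1.5), Snippet("world", 61.2, 2.0)]
    print("\n".join(to_timestamped_lines(sample)))
```

Because the helper only reads `.text` and `.start`, the object returned by `ytt_api.fetch(video_id)` could be passed in place of `sample`, assuming its snippets expose those attributes as in the quickstart.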