{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreicbhmb7eztxdwazsg537d7dnijckpvfkiqko5p6nmaee5hlrexcuy",
"uri": "at://did:plc:25rdn5elo5izoxrmtis34zuk/app.bsky.feed.post/3moxxanm2y6v2"
},
"coverImage": {
"$type": "blob",
"ref": {
"$link": "bafkreifjnnvnr3z3trgrt7vs6ul6lkcdze2m47kv7cqswfrmw53g3iahhm"
},
"mimeType": "image/webp",
"size": 64174
},
"path": "/walnutserv/how-to-fetch-youtube-transcripts-for-ai-summarization-and-rag-4137",
"publishedAt": "2026-06-23T17:35:26.000Z",
"site": "https://dev.to",
"tags": [
"python",
"ai",
"api",
"webdev",
"YouTube Transcript API on RapidAPI",
"https://rapidapi.com/wrt/api/get-youtube-transcript"
],
"textContent": "If you're building AI apps that summarize YouTube videos, power RAG over video content, or generate subtitles, you need **reliable transcript text** — not browser scraping that breaks every other week.\n\nThis guide shows a simple REST approach that returns JSON with timestamps, plain text, or raw timed cues.\n\n## Why transcripts matter for AI\n\n * **Summarization** — feed transcript text to GPT/Claude instead of sending video\n * **RAG** — chunk timed cues into a vector database for semantic search\n * **Accessibility** — build caption tools without manual SRT editing\n * **Content indexing** — search across a channel's spoken content\n\n\n\n## Quick start with curl\n\n\n curl \"https://get-youtube-transcript.p.rapidapi.com/transcript?video_id=jNQXAC9IVRw&format=json\" \\\n -H \"X-RapidAPI-Key: YOUR_KEY\" \\\n -H \"X-RapidAPI-Host: get-youtube-transcript.p.rapidapi.com\"\n\n\nReplace `YOUR_KEY` with a key from the YouTube Transcript API on RapidAPI. The Basic plan includes **100 free requests/month**.\n\n## Python example\n\n\n import requests\n\n API_URL = \"https://get-youtube-transcript.p.rapidapi.com/transcript\"\n headers = {\n \"X-RapidAPI-Key\": \"YOUR_KEY\",\n \"X-RapidAPI-Host\": \"get-youtube-transcript.p.rapidapi.com\",\n }\n params = {\"video_id\": \"jNQXAC9IVRw\", \"format\": \"json\"}\n\n data = requests.get(API_URL, headers=headers, params=params, timeout=60).json()\n\n for cue in data[\"transcript\"]:\n print(f\"[{cue['start']:.1f}s] {cue['text']}\")\n\n\n## Response formats\n\nformat | Use case\n---|---\n`json` | LLM pipelines, metadata + timestamps\n`text` | Simple summarization\n`raw` | Subtitle/SRT workflows\n\n## Parameters\n\n * `video_id` — 11-char YouTube ID **or** pass `url` with full YouTube link\n * `languages` — comma-separated codes, e.g. `en,pt`\n\n\n\n## Production tip\n\nFor batch jobs (indexing hundreds of videos), upgrade to the Ultra plan on RapidAPI — 100k requests/month at $9. Test in the playground first with any public video ID.\n\n_Disclosure: I built this API. Feedback welcome — especially on languages, latency, and batch endpoints._\n\n**Try it:** https://rapidapi.com/wrt/api/get-youtube-transcript",
"title": "How to Fetch YouTube Transcripts for AI Summarization and RAG"
}