Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreicbhmb7eztxdwazsg537d7dnijckpvfkiqko5p6nmaee5hlrexcuy",
    "uri": "at://did:plc:25rdn5elo5izoxrmtis34zuk/app.bsky.feed.post/3moxxanm2y6v2"
  },
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreifjnnvnr3z3trgrt7vs6ul6lkcdze2m47kv7cqswfrmw53g3iahhm"
    },
    "mimeType": "image/webp",
    "size": 64174
  },
  "path": "/walnutserv/how-to-fetch-youtube-transcripts-for-ai-summarization-and-rag-4137",
  "publishedAt": "2026-06-23T17:35:26.000Z",
  "site": "https://dev.to",
  "tags": [
    "python",
    "ai",
    "api",
    "webdev",
    "YouTube Transcript API on RapidAPI",
    "https://rapidapi.com/wrt/api/get-youtube-transcript"
  ],
  "textContent": "If you're building AI apps that summarize YouTube videos, power RAG over video content, or generate subtitles, you need **reliable transcript text** — not browser scraping that breaks every other week.\n\nThis guide shows a simple REST approach that returns JSON with timestamps, plain text, or raw timed cues.\n\n##  Why transcripts matter for AI\n\n  * **Summarization** — feed transcript text to GPT/Claude instead of sending video\n  * **RAG** — chunk timed cues into a vector database for semantic search\n  * **Accessibility** — build caption tools without manual SRT editing\n  * **Content indexing** — search across a channel's spoken content\n\n\n\n##  Quick start with curl\n\n\n    curl \"https://get-youtube-transcript.p.rapidapi.com/transcript?video_id=jNQXAC9IVRw&format=json\" \\\n      -H \"X-RapidAPI-Key: YOUR_KEY\" \\\n      -H \"X-RapidAPI-Host: get-youtube-transcript.p.rapidapi.com\"\n\n\nReplace `YOUR_KEY` with a key from the YouTube Transcript API on RapidAPI. The Basic plan includes **100 free requests/month**.\n\n##  Python example\n\n\n    import requests\n\n    API_URL = \"https://get-youtube-transcript.p.rapidapi.com/transcript\"\n    headers = {\n        \"X-RapidAPI-Key\": \"YOUR_KEY\",\n        \"X-RapidAPI-Host\": \"get-youtube-transcript.p.rapidapi.com\",\n    }\n    params = {\"video_id\": \"jNQXAC9IVRw\", \"format\": \"json\"}\n\n    data = requests.get(API_URL, headers=headers, params=params, timeout=60).json()\n\n    for cue in data[\"transcript\"]:\n        print(f\"[{cue['start']:.1f}s] {cue['text']}\")\n\n\n##  Response formats\n\nformat | Use case\n---|---\n`json` | LLM pipelines, metadata + timestamps\n`text` | Simple summarization\n`raw` | Subtitle/SRT workflows\n\n##  Parameters\n\n  * `video_id` — 11-char YouTube ID **or** pass `url` with full YouTube link\n  * `languages` — comma-separated codes, e.g. `en,pt`\n\n\n\n##  Production tip\n\nFor batch jobs (indexing hundreds of videos), upgrade to the Ultra plan on RapidAPI — 100k requests/month at $9. Test in the playground first with any public video ID.\n\n_Disclosure: I built this API. Feedback welcome — especially on languages, latency, and batch endpoints._\n\n**Try it:** https://rapidapi.com/wrt/api/get-youtube-transcript",
  "title": "How to Fetch YouTube Transcripts for AI Summarization and RAG"
}