Podcast Transcriber handles end-to-end podcast transcription with speaker identification, timestamps, and automatic show notes generation. It distinguishes between multiple speakers, labels them by name when provided, and generates searchable transcripts with chapter markers and key topic tags.
Podcasts are one of the richest sources of unstructured information, yet they’re nearly impossible to search, skim, or reference without a transcript. Podcast Transcriber addresses this by producing speaker-labeled, timestamped transcripts along with automatically generated show notes.
The skill combines speech recognition with speaker diarization: it works out who’s talking in each segment and labels the segment accordingly. When you provide speaker names, it maps the detected voices to those names. The output includes the full transcript, chapter markers placed at topic shifts, and condensed show notes with the key discussion points.
It suits podcast producers who need transcripts for SEO and accessibility, as well as listeners who want to find specific discussions quickly.
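To make the speaker-name mapping described above more concrete, here is a minimal Python sketch. It is not the skill's actual implementation: the segment shape, the `SPEAKER_00`-style labels, and the rule that names are assigned in order of first appearance are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical segment shape: (start_seconds, diarization_label, text).
Segment = Tuple[float, str, str]


def map_speakers(segments: List[Segment], names: List[str]) -> Dict[str, str]:
    """Assign provided names to diarization labels in order of first appearance.

    Assumption: the first voice heard corresponds to the first name given.
    Labels with no matching name keep their generic diarization label.
    """
    mapping: Dict[str, str] = {}
    for _, label, _ in segments:
        if label not in mapping:
            index = len(mapping)
            mapping[label] = names[index] if index < len(names) else label
    return mapping


def label_segments(segments: List[Segment], names: List[str]) -> List[Segment]:
    """Return the segments with diarization labels replaced by speaker names."""
    mapping = map_speakers(segments, names)
    return [(start, mapping[label], text) for start, label, text in segments]


# Illustrative data only; the labels are made up for this sketch.
segments = [
    (15.0, "SPEAKER_00", "Welcome back to the show."),
    (22.0, "SPEAKER_01", "Thanks for having me, Alice."),
    (31.0, "SPEAKER_00", "Let's get started."),
]
labeled = label_segments(segments, "Alice,Bob".split(","))
```

With two names and two detected voices, every segment ends up attributed to "Alice" or "Bob". If fewer names than voices are supplied, the extra voices keep their generic labels in this sketch.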
# Transcribe a podcast episode
podcast-transcribe --input episode-42.mp3 --speakers "Alice,Bob"
# From RSS feed (latest episode)
podcast-transcribe --feed "https://podcast.example.com/feed.xml"
# Batch transcribe a season
podcast-transcribe --dir ./season-3/ --format srt --output ./transcripts/
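For readers unfamiliar with the SRT format requested by `--format srt`, the following Python sketch shows how timed segments are conventionally rendered as SRT cues (a cue number, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, then the text). The function names and the cue tuple shape are hypothetical, and the exact cue layout the tool writes may differ.

```python
from typing import List, Tuple

# Hypothetical cue shape: (start_seconds, end_seconds, text).
Cue = Tuple[float, float, str]


def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: List[Cue]) -> str:
    """Render cues as SRT text: numbered blocks separated by blank lines."""
    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{number}\n{timing}\n{text}")
    return "\n\n".join(blocks) + "\n"
```

For example, `to_srt([(15.0, 22.0, "Alice: Welcome back to the show.")])` yields a single cue numbered 1 that spans `00:00:15,000 --> 00:00:22,000`.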
Transcribing: episode-42-ai-agents.mp3 (58:32)
Speakers detected: 2
Speaker 1 → "Alice" (host)
Speaker 2 → "Bob" (guest)
[00:00:15] Alice: Welcome back to the show. Today we're
talking about AI agents with Bob from AgentConn.
[00:00:22] Bob: Thanks for having me, Alice. I'm excited
to dive into where the agent ecosystem is heading.
[00:02:45] — Chapter: The Current State of AI Agents
[00:15:30] — Chapter: Multi-Agent Systems
[00:32:10] — Chapter: Security Challenges
[00:48:20] — Chapter: Predictions for 2027
Output: transcript.md, episode-42.srt, show-notes.md
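If you want to search the result programmatically, the sample output above can be parsed into structured records. The Python sketch below assumes the transcript uses exactly the line shapes shown in the sample (`[HH:MM:SS] Speaker: text`, wrapped continuation lines, and `[HH:MM:SS]`, a dash, then `Chapter: title`); the real files may be laid out differently, and all function names here are hypothetical.

```python
import re
from bisect import bisect_right
from typing import List, Optional, Tuple

TIMESTAMP = r"\[(\d{2}):(\d{2}):(\d{2})\]"
# \u2014 is the long dash that precedes "Chapter:" in the sample output.
CHAPTER_RE = re.compile(TIMESTAMP + r" \u2014 Chapter: (.+)")
SPEECH_RE = re.compile(TIMESTAMP + r" ([^:]+): (.*)")


def to_seconds(hours: str, minutes: str, seconds: str) -> int:
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds)


def parse_transcript(text: str):
    """Split transcript text into (start, speaker, text) segments and (start, title) chapters."""
    segments: List[list] = []
    chapters: List[Tuple[int, str]] = []
    in_speech = False  # True while the previous line belonged to a spoken segment
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            in_speech = False
            continue
        chapter = CHAPTER_RE.match(line)
        speech = None if chapter else SPEECH_RE.match(line)
        if chapter:
            chapters.append((to_seconds(*chapter.groups()[:3]), chapter.group(4)))
            in_speech = False
        elif speech:
            segments.append([to_seconds(*speech.groups()[:3]), speech.group(4), speech.group(5)])
            in_speech = True
        elif in_speech:
            segments[-1][2] += " " + line  # wrapped continuation of the last segment
        # Any other line (headers, the "Output:" summary) is ignored.
    return [tuple(s) for s in segments], chapters


def chapter_at(chapters: List[Tuple[int, str]], seconds: int) -> Optional[str]:
    """Return the title of the chapter containing the given time, if any."""
    starts = [start for start, _ in chapters]
    index = bisect_right(starts, seconds) - 1
    return chapters[index][1] if index >= 0 else None


SAMPLE = """\
Speakers detected: 2
[00:00:15] Alice: Welcome back to the show. Today we're
talking about AI agents with Bob from AgentConn.
[00:00:22] Bob: Thanks for having me, Alice. I'm excited
to dive into where the agent ecosystem is heading.
[00:02:45] \u2014 Chapter: The Current State of AI Agents
[00:15:30] \u2014 Chapter: Multi-Agent Systems
Output: transcript.md, episode-42.srt, show-notes.md
"""

segments, chapters = parse_transcript(SAMPLE)
```

Once parsed, `chapter_at(chapters, 600)` looks up which chapter covers the ten-minute mark, and a simple filter over `segments` finds everything a given speaker said.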
AI agents that work well with Podcast Transcriber.
Transcribe and summarize YouTube videos with timestamps, key points, and chapter-based navigation.
AI-assisted video editing with scene detection, auto-cuts, transitions, and caption generation.
Generate click-worthy video thumbnails with AI-optimized text placement, color contrast, and emotion analysis.