Overview

Podcasts are one of the richest sources of unstructured information, yet without a transcript they are hard to search, skim, or reference. Podcast Transcriber addresses this by producing accurate, speaker-labeled transcripts with timestamps, along with automatically generated show notes.

The skill combines speech recognition with speaker diarization: it detects who is talking and labels each segment accordingly. When you provide speaker names, it maps the detected voices to those names. The output includes a full transcript, chapter markers based on topic shifts, and condensed show notes with key discussion points.

It suits podcast producers who need transcripts for SEO and accessibility, and listeners who want to find specific discussions quickly.

How It Works

Input — Upload audio file or provide podcast RSS feed URL
Transcribe — Speech-to-text with word-level timestamps
Diarize — Identify and label different speakers
Structure — Generate chapters based on topic shifts
Export — Output as transcript, SRT subtitles, or formatted show notes
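
To make the Export step concrete, here is a minimal sketch of how speaker-labeled segments could be rendered as SRT subtitles. It does not show the skill's actual internals: the `Segment` class and both function names are hypothetical, and the sketch assumes each segment carries start and end times in seconds.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical shape of one diarized segment (not the skill's real type)."""
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio
    speaker: str
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp that SRT expects."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as numbered SRT cues, prefixing each cue with its speaker."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.speaker}: {seg.text}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(cues)
```

Calling `to_srt([Segment(15.0, 22.0, "Alice", "Welcome back to the show.")])` yields a single cue that runs from `00:00:15,000` to `00:00:22,000`.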

Use Cases

Podcast production — Generate transcripts for show notes and SEO
Accessibility — Provide text alternatives for audio content
Research — Search and reference specific discussions
Content repurposing — Turn podcast episodes into blog posts or social clips
Meeting recordings — Transcribe multi-person meetings with speaker labels

Getting Started

# Transcribe a podcast episode
podcast-transcribe --input episode-42.mp3 --speakers "Alice,Bob"

# From RSS feed (latest episode)
podcast-transcribe --feed "https://podcast.example.com/feed.xml"

# Batch transcribe a season
podcast-transcribe --dir ./season-3/ --format srt --output ./transcripts/
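
For the RSS case, the sketch below illustrates one way the latest episode's audio URL could be found in a feed. This is an assumption about the approach, not the tool's documented behavior: the function name `latest_enclosure_url` is hypothetical, and the code assumes a standard RSS 2.0 feed in which each `item` has a `pubDate` and an `enclosure` with a `url` attribute.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def latest_enclosure_url(feed_xml: str):
    """Return the enclosure URL of the most recently published item, or None.

    Assumes RSS 2.0 and that every pubDate carries a timezone, so the
    parsed dates can be compared with each other.
    """
    root = ET.fromstring(feed_xml)
    candidates = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        pub_date = item.findtext("pubDate")
        if enclosure is None or not pub_date:
            continue  # skip items without audio or without a date
        url = enclosure.get("url")
        if url:
            candidates.append((parsedate_to_datetime(pub_date), url))
    if not candidates:
        return None
    # Pick by publication date instead of trusting the feed's item order.
    return max(candidates, key=lambda pair: pair[0])[1]
```

Fetching the feed itself is left out so that the sketch stays free of network calls; the XML text can come from any HTTP client.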

Example

Transcribing: episode-42-ai-agents.mp3 (58:32)

Speakers detected: 2
  Speaker 1 → "Alice" (host)
  Speaker 2 → "Bob" (guest)

[00:00:15] Alice: Welcome back to the show. Today we're
talking about AI agents with Bob from AgentConn.

[00:00:22] Bob: Thanks for having me, Alice. I'm excited
to dive into where the agent ecosystem is heading.

[00:02:45] — Chapter: The Current State of AI Agents
[00:15:30] — Chapter: Multi-Agent Systems
[00:32:10] — Chapter: Security Challenges
[00:48:20] — Chapter: Predictions for 2027

Output: transcript.md, episode-42.srt, show-notes.md
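
For searching or referencing a finished transcript, the sketch below parses text laid out like the example above into structured records. It assumes the transcript file uses exactly the `[HH:MM:SS] Speaker: text` and `[HH:MM:SS] ... Chapter: title` line shapes shown here, which the example suggests but does not guarantee. The name `parse_transcript` is hypothetical.

```python
import re

STAMP = r"\[(\d{2}):(\d{2}):(\d{2})\]"
# Chapter lines: timestamp, a separator token, then "Chapter: <title>".
CHAPTER_RE = re.compile(rf"^{STAMP}\s+\S+\s+Chapter:\s*(.+)$")
# Utterance lines: timestamp, speaker name, colon, then the spoken text.
UTTERANCE_RE = re.compile(rf"^{STAMP}\s+([^:]+):\s*(.*)$")


def _seconds(h: str, m: str, s: str) -> int:
    return int(h) * 3600 + int(m) * 60 + int(s)


def parse_transcript(text: str):
    """Split transcript text into (utterances, chapters) lists of dicts."""
    utterances, chapters = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        chapter = CHAPTER_RE.match(line)
        if chapter:
            h, m, s, title = chapter.groups()
            chapters.append({"start": _seconds(h, m, s), "title": title})
            continue
        utterance = UTTERANCE_RE.match(line)
        if utterance:
            h, m, s, speaker, words = utterance.groups()
            utterances.append(
                {"start": _seconds(h, m, s), "speaker": speaker.strip(), "text": words}
            )
        elif utterances:
            # Lines without a timestamp are treated as wrapped continuation text.
            utterances[-1]["text"] += " " + line
    return utterances, chapters
```

With the records in hand, a keyword search reduces to a filter such as `[u for u in utterances if "agents" in u["text"].lower()]`, and each hit keeps its start time for jumping to that point in the audio.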

Alternatives

Otter.ai — Real-time transcription with speaker identification
Whisper — OpenAI’s open-source speech recognition model
Descript — Transcription with text-based audio/video editing

Compatible Agents

Claude

ElevenLabs
