Batch Audio Transcription

Transcribe meetings, podcasts, interviews, and customer calls at scale using the Runcrate Whisper API. This guide covers single-file transcription, batch processing across a folder, and output format options.

What you’ll build

A pipeline that transcribes multiple audio files using Whisper large-v3 through the Runcrate API. The pipeline reads a folder of recordings, transcribes each one, and writes the results in the format you choose: plain text, SRT or VTT subtitles, or structured JSON.

Single file transcription

curl https://api.runcrate.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer rc_live_YOUR_API_KEY" \
  -F model="openai/whisper-large-v3" \
  -F file=@meeting-2025-05-19.mp3

Batch processing

Transcribe every audio file in a folder and write the results to disk. This example uses the Runcrate Python SDK and outputs SRT subtitle files.

from pathlib import Path
from runcrate import Runcrate

client = Runcrate(api_key="rc_live_YOUR_API_KEY")

audio_dir = Path("./recordings")
output_dir = Path("./transcripts")
output_dir.mkdir(exist_ok=True)

SUPPORTED = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".webm"}

for audio_file in sorted(audio_dir.iterdir()):
    if audio_file.suffix.lower() not in SUPPORTED:
        continue

    with open(audio_file, "rb") as f:
        result = client.models.transcribe(
            model="openai/whisper-large-v3",
            file=f,
            filename=audio_file.name,
            response_format="srt",
        )

    srt_path = output_dir / f"{audio_file.stem}.srt"
    # Write as UTF-8 explicitly so non-ASCII transcripts survive on any platform.
    srt_path.write_text(result.text, encoding="utf-8")
    print(f"Transcribed: {audio_file.name} → {srt_path.name}")
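When a batch is interrupted and re-run, it helps to skip recordings that already have a transcript. The file-selection step of the loop above can be factored into a small helper, sketched here with the standard library only. The name `pending_audio_files` and the skip-if-exists behavior are illustrative assumptions, not part of the Runcrate SDK.

```python
from pathlib import Path

# Same extension set as the batch example above.
SUPPORTED = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".webm"}


def pending_audio_files(audio_dir, output_dir, out_ext=".srt"):
    """Return supported audio files in audio_dir that have no transcript yet.

    A file counts as done when output_dir already contains a file with the
    same stem and the extension out_ext (hypothetical convention).
    """
    pending = []
    for path in sorted(Path(audio_dir).iterdir()):
        # Skip directories and unsupported file types.
        if not path.is_file() or path.suffix.lower() not in SUPPORTED:
            continue
        # Skip recordings whose transcript was written by an earlier run.
        if (Path(output_dir) / f"{path.stem}{out_ext}").exists():
            continue
        pending.append(path)
    return pending
```

Iterating over `pending_audio_files(audio_dir, output_dir)` instead of `audio_dir.iterdir()` would then transcribe only the files that still need work.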

Batch processing with the OpenAI SDK

The same pattern works with the OpenAI Python SDK pointed at the Runcrate API. This example transcribes only .mp3 files and writes plain-text output:

from pathlib import Path
from openai import OpenAI

client = OpenAI(
    base_url="https://api.runcrate.ai/v1",
    api_key="rc_live_YOUR_API_KEY",
)

audio_dir = Path("./recordings")
output_dir = Path("./transcripts")
output_dir.mkdir(exist_ok=True)

for audio_file in sorted(audio_dir.glob("*.mp3")):
    # Open the file in a context manager so the handle is closed after upload.
    with open(audio_file, "rb") as f:
        transcript = client.audio.transcriptions.create(
            model="openai/whisper-large-v3",
            file=f,
        )

    txt_path = output_dir / f"{audio_file.stem}.txt"
    txt_path.write_text(transcript.text, encoding="utf-8")
    print(f"Transcribed: {audio_file.name} → {txt_path.name}")

Output formats

Format	Extension	Use case
`text`	`.txt`	Plain transcript — search, summarization, RAG ingestion
`json`	`.json`	Structured output with word-level timestamps
`srt`	`.srt`	Subtitle file for video editors (Premiere, DaVinci, Final Cut)
`vtt`	`.vtt`	Web video subtitles (HTML5 `<track>` element)

Pass the format via the response_format parameter:

result = client.models.transcribe(
    model="openai/whisper-large-v3",
    file=f,
    filename="episode-42.mp3",
    response_format="srt",  # or "text", "json", "vtt"
)
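If the format is chosen at run time, the output file extension should follow it. The sketch below maps each `response_format` value from the table above to its extension and builds the output path. The helper name `transcript_path` is hypothetical and only illustrates the idea.

```python
from pathlib import Path

# response_format value -> file extension, taken from the table above.
EXTENSIONS = {"text": ".txt", "json": ".json", "srt": ".srt", "vtt": ".vtt"}


def transcript_path(audio_file, output_dir, response_format):
    """Build the output path for a transcript in the given format."""
    if response_format not in EXTENSIONS:
        raise ValueError(f"Unsupported response_format: {response_format!r}")
    return Path(output_dir) / (Path(audio_file).stem + EXTENSIONS[response_format])
```

For example, `transcript_path("recordings/episode-42.mp3", "transcripts", "vtt")` yields a path ending in `episode-42.vtt` inside `transcripts`.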

Language hints

Whisper auto-detects the spoken language, but you can improve accuracy on non-English audio by passing a language hint:

result = client.models.transcribe(
    model="openai/whisper-large-v3",
    file=f,
    filename="interview-tokyo.mp3",
    language="ja",  # ISO 639-1 code
)

Use cases

Meeting recordings — transcribe and feed into an LLM for searchable notes and action items.
Podcast episodes — generate full transcripts for show notes, blog posts, and SEO.
Customer support calls — bulk-transcribe for quality analysis and compliance review.
Lecture recordings — produce study materials and make content accessible.
Video content — generate SRT/VTT subtitle files for automatic captioning.

Tips

Supported formats. MP3, WAV, M4A, FLAC, OGG, and WebM.
Large files. For recordings longer than ~2 hours, split them into chunks before uploading. For example, this ffmpeg command cuts a file into 30-minute (1800-second) segments without re-encoding: `ffmpeg -i long-meeting.mp3 -f segment -segment_time 1800 -c copy chunk_%03d.mp3`
SRT for video editing. SRT is the most widely supported subtitle format across video editors and media players. Use VTT only if you need web-native <track> elements.
Combine with chat models. Pipe transcripts into a Runcrate chat model for summarization, action-item extraction, or translation — see the RAG Pipeline guide for the pattern.
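One consequence of the chunking tip above: each chunk is transcribed on its own, so SRT timestamps restart at zero in every chunk. Before combining chunk transcripts, the timestamps of chunk N need to be shifted by N times the segment length. The following is a minimal sketch of that shift, assuming every chunk is exactly `-segment_time` seconds long (with `-c copy` the real cut points can differ slightly, so measure actual durations if precision matters). The function name `shift_srt` is hypothetical, and cue renumbering is not handled.

```python
import re

# Matches one SRT timestamp such as 00:01:02,345.
_TIMESTAMP = re.compile(r"(\d{2,}):(\d{2}):(\d{2}),(\d{3})")


def shift_srt(srt_text, offset_seconds):
    """Return srt_text with every cue timestamp moved later by offset_seconds."""
    offset_ms = round(offset_seconds * 1000)

    def _shift(match):
        h, m, s, ms = (int(g) for g in match.groups())
        total = ((h * 60 + m) * 60 + s) * 1000 + ms + offset_ms
        h, rem = divmod(total, 3_600_000)
        m, rem = divmod(rem, 60_000)
        s, ms = divmod(rem, 1000)
        return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

    shifted = []
    for line in srt_text.splitlines(keepends=True):
        # Only timing lines contain "-->"; leave cue numbers and text untouched.
        if "-->" in line:
            line = _TIMESTAMP.sub(_shift, line)
        shifted.append(line)
    return "".join(shifted)
```

With 1800-second segments, the transcript of `chunk_001.mp3` would be passed through `shift_srt(text, 1800)`, `chunk_002.mp3` through `shift_srt(text, 3600)`, and so on.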
