Lumen's transcription is powered by Scribe v2 — the same engine ElevenLabs uses for real-time captioning. It handles 90+ languages with state-of-the-art accuracy.
Pick the right format
- Plain text — for notes, summaries, sharing.
- SubRip (.srt) — the universal subtitle format. Use for video editing.
- WebVTT (.vtt) — for HTML5 video players.
- JSON — for programmatic processing.
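The two subtitle formats differ mainly in small details: SubRip numbers each cue and uses a comma before the milliseconds, while WebVTT starts with a WEBVTT header line and uses a period. The sketch below illustrates that difference. It is not Lumen's exporter; the `(start, end, text)` segment shape and the function names are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical segment shape (an assumption): (start_seconds, end_seconds, text).
Segment = Tuple[float, float, str]


def format_timestamp(seconds: float, separator: str) -> str:
    """Render seconds as HH:MM:SS<separator>mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def to_srt(segments: List[Segment]) -> str:
    """SubRip: numbered cues, comma before the milliseconds."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{format_timestamp(start, ',')} --> {format_timestamp(end, ',')}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)


def to_vtt(segments: List[Segment]) -> str:
    """WebVTT: 'WEBVTT' header, period before the milliseconds."""
    blocks = ["WEBVTT\n"]
    for start, end, text in segments:
        timing = f"{format_timestamp(start, '.')} --> {format_timestamp(end, '.')}"
        blocks.append(f"{timing}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [(0.0, 1.5, "Hello"), (1.5, 3.0, "Welcome to the show")]
    print(to_srt(demo))
    print(to_vtt(demo))
```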
Timestamps
Sentence-level is enough for editing and summarising. Word-level is what you need for karaoke captions, music videos, or alignment work.
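If you export word-level timestamps, you can still derive sentence-level ones yourself. The sketch below groups words into sentences at closing punctuation. The word shape (`text`, `start`, `end` keys) is an assumption for illustration, not Lumen's documented JSON schema.

```python
from typing import Dict, List

# Hypothetical word shape (an assumption): {"text": str, "start": float, "end": float}
Word = Dict[str, object]


def merge_words(words: List[Word]) -> Word:
    """Collapse a run of words into one segment spanning first start to last end."""
    return {
        "text": " ".join(str(w["text"]) for w in words),
        "start": words[0]["start"],
        "end": words[-1]["end"],
    }


def words_to_sentences(words: List[Word]) -> List[Word]:
    """Group word-level entries into sentence-level segments."""
    sentences: List[Word] = []
    current: List[Word] = []
    for word in words:
        current.append(word)
        # A word ending in '.', '?' or '!' closes the current sentence.
        if str(word["text"]).endswith((".", "?", "!")):
            sentences.append(merge_words(current))
            current = []
    if current:  # trailing words without closing punctuation
        sentences.append(merge_words(current))
    return sentences
```

Going the other way is not possible: sentence-level output does not carry enough information to recover per-word timing, so choose word-level up front if you might need it.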
Speaker diarisation
Turn this on for interviews, panels, and any audio with multiple voices. Lumen labels each speaker (Speaker 1, Speaker 2…), and you can rename them in the project afterwards.
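If you work with an exported transcript outside the project, the same renaming can be done in a few lines. This is a minimal sketch that assumes each exported segment is a dictionary with a `speaker` key; that shape and the function name are hypothetical.

```python
from typing import Dict, List


def rename_speakers(segments: List[dict], names: Dict[str, str]) -> List[dict]:
    """Return copies of the segments with generic speaker labels replaced.

    Labels missing from `names` are left unchanged.
    """
    return [
        {**segment, "speaker": names.get(segment["speaker"], segment["speaker"])}
        for segment in segments
    ]


if __name__ == "__main__":
    transcript = [
        {"speaker": "Speaker 1", "text": "Thanks for joining us."},
        {"speaker": "Speaker 2", "text": "Happy to be here."},
    ]
    for line in rename_speakers(transcript, {"Speaker 1": "Host", "Speaker 2": "Guest"}):
        print(f'{line["speaker"]}: {line["text"]}')
```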
Punctuation
Leave this on unless you plan to process the transcripts further with your own NLP tooling.