Transcribe accurately in 90+ languages

Lumen's transcription is powered by Scribe v2 — the same engine ElevenLabs uses for real-time captioning. It handles 90+ languages with state-of-the-art accuracy.

Pick the right format

Plain text — for notes, summaries, sharing.
SubRip (.srt) — the universal subtitle format. Use it for video editing.
WebVTT (.vtt) — for HTML5 video players.
JSON — for programmatic processing.
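To make the difference between the two subtitle formats concrete, here is a minimal Python sketch that renders the same segments as SubRip and as WebVTT. The segment shape (a list of dicts with "start", "end" and "text", times in seconds) is an assumption made for illustration and is not Lumen's documented JSON schema; the function names are hypothetical too. The formats themselves differ mainly in that SubRip numbers each cue and uses a comma before the milliseconds, while WebVTT starts with a WEBVTT header line and uses a full stop.

```python
def format_timestamp(seconds: float, separator: str) -> str:
    """Format a time in seconds as HH:MM:SS<separator>mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def to_srt(segments: list[dict]) -> str:
    """Render segments as SubRip: numbered cues, comma before milliseconds."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"], ",")
        end = format_timestamp(seg["end"], ",")
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text']}\n")
    return "\n".join(blocks)


def to_vtt(segments: list[dict]) -> str:
    """Render segments as WebVTT: header line, full stop before milliseconds."""
    blocks = ["WEBVTT\n"]
    for seg in segments:
        start = format_timestamp(seg["start"], ".")
        end = format_timestamp(seg["end"], ".")
        blocks.append(f"{start} --> {end}\n{seg['text']}\n")
    return "\n".join(blocks)


# Hypothetical segments, standing in for a JSON export.
segments = [
    {"start": 0.0, "end": 2.5, "text": "Hello and welcome."},
    {"start": 2.5, "end": 5.0, "text": "Let's get started."},
]

print(to_srt(segments))
print(to_vtt(segments))
```

If you only need one of these formats, exporting it directly is simpler; a conversion like this is mostly useful when you keep JSON as the master copy and generate subtitles from it.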

Timestamps

Sentence-level is enough for editing and summarising. Word-level is what you need for karaoke captions, music videos, or alignment work.
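The relationship between the two granularities can be sketched in a few lines of Python: word-level timestamps can always be collapsed into sentence-level ones, but not the other way round. The word shape used here (dicts with "text", "start" and "end", times in seconds) is an assumption for illustration, not Lumen's documented output, and splitting on closing punctuation is a deliberately simple heuristic.

```python
SENTENCE_ENDINGS = (".", "?", "!")


def merge_words(words: list[dict]) -> dict:
    """Combine consecutive words into one segment spanning first start to last end."""
    return {
        "text": " ".join(w["text"] for w in words),
        "start": words[0]["start"],
        "end": words[-1]["end"],
    }


def words_to_sentences(words: list[dict]) -> list[dict]:
    """Group word-level entries into sentence-level segments."""
    sentences = []
    current = []
    for word in words:
        current.append(word)
        # Close the sentence when a word ends with terminal punctuation.
        if word["text"].endswith(SENTENCE_ENDINGS):
            sentences.append(merge_words(current))
            current = []
    if current:  # trailing words without closing punctuation
        sentences.append(merge_words(current))
    return sentences


# Hypothetical word-level data.
words = [
    {"text": "Hello", "start": 0.0, "end": 0.4},
    {"text": "there.", "start": 0.5, "end": 0.9},
    {"text": "Ready?", "start": 1.2, "end": 1.6},
]

for sentence in words_to_sentences(words):
    print(sentence)
```

Because the word-level data contains everything the sentence-level data does, choosing word-level keeps both options open at the cost of a larger file.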

Speaker diarisation

Turn this on for interviews, panels and any audio with multiple voices. Lumen labels each speaker (Speaker 1, Speaker 2, and so on), and you can rename them in the project afterwards.
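If you work with an exported transcript outside the project, the same renaming can be done in a short script. The sketch below assumes a hypothetical export in which each segment is a dict with "speaker" and "text" keys; the field names and the function names are illustrative and not taken from Lumen's documentation.

```python
def rename_speakers(segments: list[dict], names: dict[str, str]) -> list[dict]:
    """Return new segments with generic speaker labels replaced by real names.

    Labels missing from `names` are left unchanged.
    """
    return [
        {**seg, "speaker": names.get(seg["speaker"], seg["speaker"])}
        for seg in segments
    ]


def to_dialogue(segments: list[dict]) -> str:
    """Render segments as 'Speaker: text' lines."""
    return "\n".join(f"{seg['speaker']}: {seg['text']}" for seg in segments)


# Hypothetical diarised segments.
segments = [
    {"speaker": "Speaker 1", "text": "Thanks for joining us."},
    {"speaker": "Speaker 2", "text": "Happy to be here."},
    {"speaker": "Speaker 3", "text": "Shall we begin?"},
]

renamed = rename_speakers(segments, {"Speaker 1": "Host", "Speaker 2": "Guest"})
print(to_dialogue(renamed))
```

Leaving unmapped labels untouched means a partial mapping is safe: you can name the speakers you recognise and fix the rest later.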

Punctuation

Leave this on, unless you plan to process the transcripts further with your own NLP tools.

Try it now.