Lip-sync any audio to any face

Precision vs Speed mode: when to pick each one, plus the small audio prep step that noticeably improves lip-sync.

Intermediate · 5 min read · Updated 2026-05-25

Lip Sync takes a video that contains a face and a separate audio file, then re-renders the mouth shapes so they match the new audio.

Use cases

You re-recorded narration and need the face to match the new audio.
You want a real person's face to deliver a script they didn't record.
You translated audio and the original mouth shapes don't match the new language.

Precision vs Speed

Precision mode uses HeyGen's avatar-grade lip-sync model. It gives higher quality than Speed mode, but it is roughly 2× slower and costs roughly 2× the credits. Use it for hero content.

Speed mode is for drafts and batch work. Quality is still high, and for content that won't be watched on a big screen it is often indistinguishable from Precision.
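To make the trade-off concrete, here is a minimal Python sketch that scales a known Speed-mode cost to Precision mode. Only the two ~2× ratios come from the text above. The function names and the example credit and minute figures are hypothetical, and nothing here calls a real API.

```python
from dataclasses import dataclass

# Approximate ratios from the tutorial text: Precision is ~2x slower, ~2x the credits.
PRECISION_TIME_FACTOR = 2.0
PRECISION_CREDIT_FACTOR = 2.0


@dataclass
class Estimate:
    mode: str
    credits: float
    render_minutes: float


def pick_mode(is_hero_content: bool) -> str:
    """Hero content gets Precision; drafts and batch work get Speed."""
    return "precision" if is_hero_content else "speed"


def estimate(mode: str, speed_credits: float, speed_minutes: float) -> Estimate:
    """Scale a measured Speed-mode cost (hypothetical inputs) to the chosen mode."""
    if mode == "speed":
        return Estimate(mode, float(speed_credits), float(speed_minutes))
    if mode == "precision":
        return Estimate(
            mode,
            speed_credits * PRECISION_CREDIT_FACTOR,
            speed_minutes * PRECISION_TIME_FACTOR,
        )
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # Hypothetical numbers: a clip that costs 10 credits and 3 minutes in Speed mode.
    print(estimate(pick_mode(is_hero_content=True), 10, 3))
```

For a batch, run the estimate once per clip and sum the credits before deciding which clips are worth Precision.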

Audio prep

The cleaner the audio, the better the sync. If there's background noise, run Voice Isolator on your audio first. Aim for a tight, dry voice track: speech only, without music, reverb, or room noise.
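If you are not sure whether a track is clean enough, a rough check before uploading can help. The sketch below is not part of the product. It uses only the Python standard library to compare the quietest and loudest stretches of a 16-bit PCM WAV file. The 50 ms window, the 10th and 90th percentile choice, and the 30 dB threshold are all assumptions to tune by ear.

```python
import array
import math
import sys
import wave


def window_levels_db(path: str, window_ms: int = 50) -> list[float]:
    """Return the RMS level in dBFS of each short window of a 16-bit PCM WAV."""
    levels = []
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        frames_per_window = max(1, wav.getframerate() * window_ms // 1000)
        while True:
            raw = wav.readframes(frames_per_window)
            if not raw:
                break
            samples = array.array("h", raw)
            if sys.byteorder == "big":
                samples.byteswap()  # WAV sample data is little-endian
            rms = math.sqrt(sum(s * s for s in samples) / len(samples))
            # Clamp to 1 so digital silence does not produce log10(0).
            levels.append(20 * math.log10(max(rms, 1.0) / 32768.0))
    return levels


def rough_snr_db(levels: list[float]) -> float:
    """Gap between loud windows (speech) and quiet windows (pauses), in dB."""
    ordered = sorted(levels)
    noise_floor = ordered[int(0.10 * (len(ordered) - 1))]
    speech_level = ordered[int(0.90 * (len(ordered) - 1))]
    return speech_level - noise_floor


def needs_cleanup(levels: list[float], min_snr_db: float = 30.0) -> bool:
    """True when the pauses are not much quieter than the speech (assumed threshold)."""
    return rough_snr_db(levels) < min_snr_db
```

A recording with no pauses at all will look noisy to this check, so treat a low score as a prompt to listen, not as a verdict.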

Captions option

Lumen can burn captions from the new audio directly into the synced video. Good for accessibility and for sound-off feeds.
