How to Make Someone Speak/Sing Slower or Faster?

The Problem

You have a voice recording and you want it to fit a different duration, for example a tighter timeline.

The naive approach, resampling the entire signal, changes the speed, but it changes the pitch too. Slow it down and you get an underwater effect. Speed it up and you get chipmunks.
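To make the coupling between speed and pitch concrete, here is a minimal sketch using only NumPy. The helper names `naive_resample` and `dominant_frequency` are made up for this illustration, and the input is a synthetic 220 Hz tone rather than a real voice.

```python
import numpy as np

def naive_resample(signal, speed):
    """Change speed by reading the signal `speed` input samples per output sample."""
    n_out = int(round(len(signal) / speed))
    positions = np.arange(n_out) * speed
    # Linear interpolation between the original samples
    return np.interp(positions, np.arange(len(signal)), signal)

def dominant_frequency(signal, sr):
    """Return the frequency (Hz) of the strongest FFT bin."""
    spectrum = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    return freqs[np.argmax(spectrum)]

sr = 16000
t = np.arange(sr) / sr                   # one second of audio
tone = np.sin(2 * np.pi * 220.0 * t)     # synthetic 220 Hz "voice"

fast = naive_resample(tone, 2.0)         # half the duration
slow = naive_resample(tone, 0.5)         # double the duration

print(len(fast) / sr, dominant_frequency(fast, sr))  # shorter, and about an octave higher
print(len(slow) / sr, dominant_frequency(slow, sr))  # longer, and about an octave lower
```

Played back at the same sample rate, the duration and the pitch scale together, which is exactly the chipmunk and underwater effect described above.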

So, how do you change the speed while preserving everything else about the voice?

Quick Answer

Use an overlap-add (OLA) method: split the audio into overlapping frames.
Duplicate or drop frames as needed to change the duration, much like old-school PSOLA (Pitch-Synchronous Overlap-Add).
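The idea can be sketched as a plain overlap-add stretch in NumPy. This is an illustration only, not Pitchmeld's implementation: the function name `ola_time_stretch`, the frame length, and the hop size are assumptions chosen for the example. Frames are read from the input at a hop scaled by the playback factor and written to the output at a fixed hop, so slowing down re-reads overlapping input (in effect duplicating frames) and speeding up skips over it (dropping frames).

```python
import numpy as np

def ola_time_stretch(signal, pbf, frame_len=1024, synth_hop=256):
    """Plain overlap-add time stretch: pbf > 1 speeds up, pbf < 1 slows down."""
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    n_out = int(round(len(signal) / pbf))

    out = np.zeros(n_out + frame_len)
    norm = np.zeros(n_out + frame_len)   # running sum of windows, for gain correction
    padded = np.concatenate([signal, np.zeros(frame_len)])

    for out_pos in range(0, n_out, synth_hop):
        # The input read position advances pbf times as fast as the output position
        in_pos = int(round(out_pos * pbf))
        if in_pos >= len(signal):
            break
        frame = padded[in_pos:in_pos + frame_len]
        out[out_pos:out_pos + frame_len] += frame * window
        norm[out_pos:out_pos + frame_len] += window

    norm[norm < 1e-8] = 1.0              # avoid dividing by zero where no window landed
    return (out / norm)[:n_out]

sr = 16000
tone = np.sin(2 * np.pi * 220.0 * np.arange(sr) / sr)
slowed = ola_time_stretch(tone, 0.5)     # about twice as long
faster = ola_time_stretch(tone, 2.0)     # about half as long
```

Because the frames here are placed without regard to the pitch periods of the voice, neighbouring frames overlap out of phase and the result sounds rough on pitched material. Pitch-synchronous methods such as PSOLA align the frames to the pitch periods to avoid this, which is part of the care the next paragraph refers to.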

That covers the basics, but doing it well takes a lot of care, plus a bit of AI applied at the right joints.

How you do it with Pitchmeld

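import soundfile as sf
import pitchmeld as pm

# Load the recording and its sample rate
audio, sr = sf.read("input.wav")

# pbf=0.5 plays back at half speed, doubling the duration
slowed = pm.transform(audio, sr, pbf=0.5)

# Write the result at the original sample rate
sf.write("output.wav", slowed, sr)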

The pbf parameter is the playback factor (see doc):

0.5 slows to half speed (double the duration)
1.5 speeds up by 50%
2 doubles the speed (half the duration)
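The values above imply that the output duration is the input duration divided by the playback factor. Assuming that relationship holds, the arithmetic for fitting a recording into a given time slot can be sketched as follows. The helper names `stretched_duration` and `pbf_for_target` are hypothetical and are not part of Pitchmeld.

```python
def stretched_duration(duration_s, pbf):
    """Duration after time stretching, assuming duration scales as 1 / pbf."""
    if pbf <= 0:
        raise ValueError("pbf must be positive")
    return duration_s / pbf

def pbf_for_target(duration_s, target_s):
    """Playback factor that makes a recording of duration_s last target_s."""
    if duration_s <= 0 or target_s <= 0:
        raise ValueError("durations must be positive")
    return duration_s / target_s

print(stretched_duration(10.0, 0.5))   # 20.0 seconds: half speed
print(stretched_duration(10.0, 2.0))   # 5.0 seconds: double speed
print(pbf_for_target(12.0, 8.0))       # 1.5: speed up by 50% to fit 12 s into 8 s
```

The second helper gives the value you would pass as pbf to squeeze a recording into a tighter timeline.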

Audio Examples

Below are examples of time stretching applied to both sung and spoken recordings. The original pitch is preserved.


If you want to know more, have a look at the post about transfo