Manipulating audio files with PyDub

Spoken Language Processing in Python

Daniel Bourke

Machine Learning Engineer/YouTube Creator

Turning it down to 11

# Import AudioSegment and the audio file
from pydub import AudioSegment
wav_file = AudioSegment.from_file("wav_file.wav")
# Lower the volume by 60 dB
quiet_wav_file = wav_file - 60
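# Try to recognize quiet audio
# (slide shorthand: recognize_google expects speech_recognition AudioData,
# so in practice export the segment and read it back with sr.AudioFile)
recognizer.recognize_google(quiet_wav_file)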
UnknownValueError:

Increasing the volume

# Increase the volume by 10 dB
louder_wav_file = wav_file + 10
# Try to recognize
recognizer.recognize_google(louder_wav_file)
this is a wav file
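
Why 60 dB is enough to defeat the recognizer while 10 dB helps: decibels are logarithmic. The sketch below is plain Python rather than PyDub itself, and the helper names are hypothetical. It converts a gain in dB to the linear factor applied to sample amplitudes, assuming the usual amplitude convention of 20 dB per factor of ten.

```python
import math

def db_to_amplitude_ratio(db):
    """Convert a gain in decibels to a linear amplitude multiplier."""
    return 10 ** (db / 20)

def amplitude_ratio_to_db(ratio):
    """Convert a linear amplitude multiplier back to decibels."""
    return 20 * math.log10(ratio)

# The two gains used on the slides
quiet_ratio = db_to_amplitude_ratio(-60)   # roughly one thousandth
louder_ratio = db_to_amplitude_ratio(10)   # a little over three times

print(f"-60 dB scales amplitude by {quiet_ratio:.4f}")
print(f"+10 dB scales amplitude by {louder_ratio:.2f}")
```

So `wav_file - 60` leaves about a thousandth of the original amplitude, which is effectively silence to a speech recognizer.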

This all sounds the same

# Import AudioSegment and normalize
from pydub import AudioSegment
from pydub.effects import normalize
from pydub.playback import play
# Import uneven sound audio file
loud_quiet = AudioSegment.from_file("loud_quiet.wav")
# Normalize the sound levels
normalized_loud_quiet = normalize(loud_quiet)
# Check the sound
play(normalized_loud_quiet)
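
To make the idea of normalizing concrete, here is a rough sketch of peak normalization on a plain list of samples. It is not PyDub's implementation. The function name, the 16-bit ceiling of 32767 and the 0.1 dB headroom are assumptions chosen for illustration.

```python
def peak_normalize(samples, max_amplitude=32767, headroom_db=0.1):
    """Scale samples so the loudest one sits headroom_db below full scale."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        # Pure silence: nothing to scale
        return list(samples)
    # Target level for the loudest sample, slightly under the ceiling
    target_peak = max_amplitude * 10 ** (-headroom_db / 20)
    gain = target_peak / peak
    # One gain for every sample; int() truncates toward zero
    return [int(s * gain) for s in samples]

# Hypothetical samples: a quiet stretch followed by a loud one
uneven = [100, -200, 50, 8000, -16000, 4000]
normalized = peak_normalize(uneven)
print(normalized)
```

In this sketch a single gain is applied to the whole clip, so the loudest point is brought close to full scale while the ratio between quiet and loud samples is preserved.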

Remixing your audio files

# Import audio with static at start
static_at_start = AudioSegment.from_file("static_at_start.wav")
# Remove the static via slicing
no_static_at_start = static_at_start[5000:]

# Check the new sound
play(no_static_at_start)
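
PyDub slices are indexed in milliseconds, so `[5000:]` drops the first five seconds. The sketch below shows the same idea on a plain list of samples. The helper name and the 8000 Hz frame rate are assumptions for illustration, not part of PyDub.

```python
def slice_by_milliseconds(samples, frame_rate, start_ms=0, end_ms=None):
    """Slice mono samples using millisecond offsets instead of sample indexes."""
    start = int(start_ms * frame_rate / 1000)
    end = None if end_ms is None else int(end_ms * frame_rate / 1000)
    return samples[start:end]

frame_rate = 8000                              # samples per second (assumed)
eight_seconds = list(range(8 * frame_rate))    # stand-in for real audio samples

# Drop the first 5000 ms, as on the slide
trimmed = slice_by_milliseconds(eight_seconds, frame_rate, start_ms=5000)
print(len(trimmed) / frame_rate, "seconds remain")
```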

Remixing your audio files

# Import two audio files
wav_file_1 = AudioSegment.from_file("wav_file_1.wav")
wav_file_2 = AudioSegment.from_file("wav_file_2.wav")
# Combine the two audio files
wav_file_3 = wav_file_1 + wav_file_2

# Check the sound
play(wav_file_3)
# Combine two wav files, then make the combination 10 dB louder
louder_wav_file_3 = wav_file_1 + wav_file_2 + 10

Splitting your audio

# Import phone call audio
phone_call = AudioSegment.from_file("phone_call.wav")
# Find number of channels
phone_call.channels
2
# Split stereo to mono
phone_call_channels = phone_call.split_to_mono()
phone_call_channels
[<pydub.audio_segment.AudioSegment>, <pydub.audio_segment.AudioSegment>]
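
What splitting to mono amounts to: stereo PCM audio stores its samples interleaved (left, right, left, right, ...), so each channel can be recovered by taking every second sample. This is a minimal sketch on a small hand-made array rather than a real phone call, and `split_interleaved` is a hypothetical helper, not a PyDub function.

```python
from array import array

def split_interleaved(samples, channels=2):
    """Return one sample sequence per channel from interleaved samples."""
    return [samples[i::channels] for i in range(channels)]

# Interleaved 16-bit stereo: positive values on the left, negative on the right
stereo = array("h", [1, -1, 2, -2, 3, -3])
left, right = split_interleaved(stereo)

print(list(left))
print(list(right))
```

Each resulting sequence holds one speaker's side of the call when the two speakers were recorded on separate channels, which is what makes transcribing one channel at a time possible.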

Splitting your audio

# Find number of channels of first list item
phone_call_channels[0].channels
1
# Recognize the first channel
phone_call_channel_1 = phone_call_channels[0]
recognizer.recognize_google(phone_call_channel_1)
the pydub library is really useful

Let's code!
