Corran Webster : Sound in Python

Sound in Python

2023-02-04

I've been playing around with producing sound in Python using the scientific Python toolset for a number of years. In fact, rather than buying a white noise generator for my wife about 10 years ago, I just generated a 15-minute sample of white noise with NumPy and SciPy and threw it on an old iPod we had hanging around - and she's still using it. One of the salesfolk at work described this as "the most Enthought thing" she'd ever heard.

Although the tools have changed over the years, there are some reasonably accessible libraries for Python that let you handle sound at a low level.

NumPy and SciPy

When you're working at a low level with sound, NumPy and SciPy are likely to be your bread and butter: NumPy provides the basic array data structure that holds the digitized sound data, along with basic signal generation (via numpy.random and standard functions like sin and cos) and frequency tools (via numpy.fft); SciPy provides more sophisticated filtering and signal analysis (via scipy.signal).

But perhaps most importantly, SciPy gives you an easy way to access and write data in .wav file format using scipy.io.wavfile.
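As a small illustration of the scipy.signal filtering tools mentioned above, here is a minimal sketch that low-pass filters one second of white noise. The sample rate, cutoff frequency and filter order are arbitrary choices for the example, not anything prescribed by the libraries.

```python
import numpy as np
from scipy import signal

rate = 44100  # samples per second (assumed for this sketch)
rng = np.random.default_rng(0)
white = rng.standard_normal(rate)  # one second of white noise

# 4th-order Butterworth low-pass at 1 kHz, expressed as second-order sections
sos = signal.butter(4, 1000, btype='lowpass', fs=rate, output='sos')
filtered = signal.sosfilt(sos, white)

# compare the energy above 5 kHz before and after filtering
freqs = np.fft.rfftfreq(len(white), 1 / rate)
high = freqs > 5000
energy_before = (np.abs(np.fft.rfft(white))[high] ** 2).sum()
energy_after = (np.abs(np.fft.rfft(filtered))[high] ** 2).sum()
```

After filtering, almost all of the energy above 5 kHz is gone, while the array keeps its original length and can be written out or played like any other sample buffer.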

For example, to generate noise with a given frequency distribution, you can do something like this:

import numpy as np

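def noise(curve):
    # generate random phases for the signal
    phases = np.exp(1j * np.random.uniform(0.0, 2*np.pi, curve.shape))
    # give each frequency bin the requested magnitude and a random phase
    spectrum = curve * phases
    # pass the full complex spectrum so the random phases are kept
    return np.fft.irfft(spectrum, norm='ortho')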

This can be used to generate coloured noise in the human-audible range:

def colored_noise(duration, rate, power=0.0):
    samples = int(duration * rate)
    freqs = np.fft.rfftfreq(samples, 1/rate)
    curve = np.zeros(samples // 2 + 1, dtype='float32')
    audible = (freqs >= 20) & (freqs < 20000)  # restrict to human audible range
    curve[audible] = freqs[audible] ** power  # compute curve
    curve *= np.sqrt(0.5 * rate / (curve**2).sum())  # normalize
    return noise(curve)
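The idea underneath these functions is that the magnitude of each frequency bin is fixed by the curve while only the phase is random. A minimal, stand-alone sketch of that idea follows; the flat curve, the length n and the zeroed DC and Nyquist bins are assumptions made to keep the round trip exact.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1024                       # number of output samples (even)
curve = np.ones(n // 2 + 1)    # flat magnitude curve, one entry per rfft bin
curve[0] = 0.0                 # no DC offset
curve[-1] = 0.0                # the Nyquist bin must be real, so zero it here

# unit-magnitude complex numbers with uniformly random phase
phases = np.exp(1j * rng.uniform(0.0, 2 * np.pi, curve.shape))
x = np.fft.irfft(curve * phases, n=n, norm='ortho')

# transforming back recovers the magnitudes we asked for
recovered = np.abs(np.fft.rfft(x, norm='ortho'))
```

The result x is a real array of n samples, and recovered matches curve to within floating-point error, so different random phases give different-sounding samples with the same spectrum.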

So you could create 10 seconds of pink noise and save it to a file as follows:

from scipy.io import wavfile

rate = 44100  # standard CD-quality sample rate
samples = colored_noise(10, rate, -0.5)
wavfile.write('noise.wav', rate, samples)
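wavfile.write picks the on-disk sample format from the dtype of the array, and wavfile.read gives back the rate and the data. The sketch below round-trips a short test tone; the 440 Hz tone, the temporary directory and the file names are just choices for the example. Converting to float32 or int16 first is often worthwhile, since 64-bit float files are not understood by every player.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile

rate = 44100
t = np.arange(rate) / rate
tone = (0.5 * np.sin(2 * np.pi * 440 * t)).astype(np.float32)

folder = tempfile.mkdtemp()

# float32 data is stored as 32-bit floating-point samples
float_path = os.path.join(folder, 'tone_float.wav')
wavfile.write(float_path, rate, tone)
read_rate, data = wavfile.read(float_path)

# int16 data is stored as ordinary 16-bit PCM
pcm = (tone * 32767).astype(np.int16)
pcm_path = os.path.join(folder, 'tone_pcm.wav')
wavfile.write(pcm_path, rate, pcm)
pcm_rate, pcm_data = wavfile.read(pcm_path)
```

In both cases the array read back has the same dtype and values as the array that was written.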

Sounddevice

For real-time access to the speakers and microphones on a computer, I've found Sounddevice to work reasonably well. It is a CFFI-based wrapper around the PortAudio cross-platform audio library and so can be pip-installed without needing further compilation. It can be used without NumPy, but for anything other than unmodified playback and recording you are likely to want the Python scientific libraries, and Sounddevice interoperates with these out of the box.

While it has commands for simple playback and recording, the most useful mode of operation is to write a callback function which either receives (for input) or fills in (for output) a buffer of sound data. This callback then gets called periodically by the audio stream.

For example, we could write a callback that generates pink noise like this:

volume = 0.1

def callback(outdata, frames, time, status):
    # outdata has shape (frames, channels); fill the single output channel
    outdata[:, 0] = volume * colored_noise(frames / rate, rate, -0.5)

This can then be sent to an output device using:

import sounddevice as sd

device = sd.query_devices(kind='output')
device_id = device['index']
rate = int(device['default_samplerate'])  # reported as a float; blocksize needs an integer
with sd.OutputStream(
    device=device_id,
    channels=1,
    blocksize=rate,
    callback=callback,
    samplerate=rate,
):
    print("Press enter to stop")
    input()
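Because the callback is called once per block, anything that has to be continuous across blocks needs to remember where it got to. Here is a minimal sketch of that for a steady sine tone. ToneGenerator and play are hypothetical names made up for this example, and the sample rate, frequency and volume are arbitrary.

```python
import numpy as np


class ToneGenerator:
    """Output callback that produces a continuous sine wave across blocks."""

    def __init__(self, rate, frequency=440.0, volume=0.1):
        self.rate = rate
        self.frequency = frequency
        self.volume = volume
        self.position = 0  # number of samples generated so far

    def __call__(self, outdata, frames, time, status):
        # sample times for this block, continuing from the previous block
        t = (self.position + np.arange(frames)) / self.rate
        outdata[:, 0] = self.volume * np.sin(2 * np.pi * self.frequency * t)
        self.position += frames


def play(generator):
    """Play until enter is pressed (needs sounddevice and an output device)."""
    import sounddevice as sd

    with sd.OutputStream(
        channels=1,
        samplerate=generator.rate,
        callback=generator,
    ):
        print("Press enter to stop")
        input()
```

Calling play(ToneGenerator(44100)) should produce a quiet 440 Hz tone; because the generator keeps track of its position, there is no click at the block boundaries.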

You can download the code for this blog post.