Python and Sound


Topics

  1. Exploring Sound
  2. Changing the Volume
  3. Splicing Sounds
  4. References
  5. Exercise

Please get the code examples and sounds used in this lab by clicking here.


1. Exploring Sound

In order to understand what we are doing in this lab, it is helpful to go through some terminology and learn about sound and how it is stored. Once you understand a little about sound, we will take a look at JES code for working with sound, and an "explorer" tool, which allows you to play sound and view the values. Finally, to record sound, we will provide a brief overview of Audacity.

1.1 What is Sound?

The question "what is sound?" is really a physics question. Have you ever struck a tuning fork and placed it in water, or dropped something into water? If you have, you will have seen waves. Sounds are waves in the air, which are picked up by sensors in our ears. Because sounds are waves, let us take a look at a basic sine wave (which we would hear as a pure single tone).

Sine graph

Picture was taken from: http://www.purplemath.com/modules/grphtrig.htm

Some important definitions are:

Questions:

  1. Would the following diagram represent a sound with a higher or lower pitch than the original sine wave diagram?
    DoubleFreq
  2. Would we perceive this sound to be louder or quieter than the original sine wave diagram?

    doubleAmplitude

The sounds that we hear are not typically pure tones: they are composed of waves of different frequencies. When you look at sound waves in the following sections, you will see more rough edges than the smooth and regular sine waves. This is the result of several frequencies combining together.

1.2 How is Sound Stored?

If you want to capture a sine wave such as the one above, you could use an array. For instance, taking a sample at x = (π/2)t for t = 0, 1, 2, ... (from a wave with an amplitude of 3), your array would look something like this:

0 3 0 -3 0 3 0

The resulting "wave" would look very triangular. Ideally, more samples could be taken. However, you can understand the idea of representing the wave.
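As a rough illustration of the array above, the sketch below samples a sine wave in plain Python (run outside JES). The amplitude of 3 and the step of π/2 are read off the array; the function name sample_sine is made up for this example.

```python
import math

def sample_sine(amplitude, step, count):
    """Return `count` samples of amplitude*sin(x), taken every `step` radians."""
    samples = []
    for k in range(count):
        # round() removes the tiny floating-point error at multiples of pi
        samples.append(int(round(amplitude * math.sin(k * step))))
    return samples

coarse = sample_sine(3, math.pi / 2, 7)   # the seven samples shown above
fine = sample_sine(3, math.pi / 8, 25)    # same stretch of wave, four times as many samples
print(coarse)
print(len(fine))
```

The second call covers the same stretch of the wave with a smaller step, which is what "taking more samples" means in practice.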

Two questions come in to play when storing sound:

  1. What is the maximum amplitude to be stored? This will determine how many bytes in memory you will use for each sound sample or array element. For instance, if you want to capture amplitudes from 32 767 (2^15 - 1) to -32 768 (-2^15), then you will need 16 bits (2 bytes) for each element.

  2. How many samples or array elements will you have for every second of recording? For instance, the array above could have more samples to "smooth" the wave. The rate at which samples are collected is the sample rate. Some typical sample rates are below:

Now that you understand a little more about sound, let us dive in with JES.

1.3 JES Code for Working with Sound

You will find that some of the functions we use for sound are not part of Python; they only work inside JES. Examples of JES-specific functions are:

If you are not sure which functions are part of Python and which are part of JES, you can look in the JES menu under Help > Understanding Sound. Click on "Sound Functions in JES". You might find some cool stuff in there. (As an aside, playAtRate is not in the Help documentation, but it is mentioned in the book Computing and Programming in Python, A Multimedia Approach.)

The following is a compilation of all of the JES functions from above:

def soundExplore():
  fileName=pickAFile()
  aSound=makeSound(fileName)
  print "Information about aSound", aSound
  print "Getting First Sample:", getSampleValueAt(aSound, 0)
  print "Getting the Length:", getLength(aSound)
  print "Getting Sampling Rate:", getSamplingRate(aSound)
  play(aSound)
  #blockingPlay(aSound)
  playAtRate(aSound,2.0)
  explore(aSound)
  return aSound

Notice that when you run the code and select "always.wav" the following is output:

>>> soundExplore()
Information about aSound Sound file: always.wav number of samples: 15394
Getting First Sample: 513
Getting the Length: 15394
Getting Sampling Rate: 22050.0
Sound file: always.wav number of samples: 15394

You might hear two versions of "always" being played. One version is twice as fast and sounds like "Mickey Mouse". Try commenting out the play (put a # in front of it) and uncommenting the blockingPlay to see what happens.

The other two sounds included in this lab do not work with 2.0 sent as an argument to playAtRate, but you can try them with 0.5 as an argument instead.

What difference do you notice in the sampling rate between "CS325.wav" and "always.wav"?
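The length and sampling rate reported for "always.wav" earlier in this section are enough to estimate how long the recording lasts and how much memory its samples need. The sketch below is plain Python arithmetic (it does not need JES). The 2 bytes per sample is an assumption based on the 16-bit range discussed in section 1.2, and halving the duration for playAtRate(aSound, 2.0) assumes that "twice as fast" means exactly half the playing time.

```python
num_samples = 15394        # "Getting the Length" from the output for always.wav
sampling_rate = 22050.0    # "Getting Sampling Rate" from the same output
bytes_per_sample = 2       # assumption: 16-bit samples, as in section 1.2

duration = num_samples / sampling_rate       # seconds of sound at normal speed
fast_duration = duration / 2.0               # assumed playing time at rate 2.0
storage = num_samples * bytes_per_sample     # bytes for the sample values alone

print("Duration in seconds:", round(duration, 3))
print("Duration at double rate:", round(fast_duration, 3))
print("Bytes of sample data:", storage)
```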

1.4 MediaTools Explore

After you run the code above, you will notice that an additional window has appeared. This is the "Explorer" tool for sound, which is bundled together with JES as part of the MediaTools application. This is what it should look like for you (with "always.wav"):

Sound Explorer

Notice in the above diagram that the current index and sample value are encircled in red. At index 0, the sample value is 513.

You can listen to excerpts out of a recording such as this. For instance, if you want "way" from "always", you can click and drag between approximately 4296 and 9000. Click on "Play Selection" to hear that range. See below for an example of selecting "way" from the recording.

Explorer Ranges

You might be asking, "How do you know where to choose your selections?" Mostly it comes through experimentation. Look for places where the waveform changes. Small jags in the graph are either background noise (from silence between words) or sounds like "s".

1.5 Audacity

In the exercise for this lab, you will be asked to make your own sound recordings. One way of doing this is by using Audacity. Audacity is a free, easy-to-use audio editor and recorder that works on Windows, Mac OS X and other operating systems. If you would like to try it at home, you can get it here: http://audacity.sourceforge.net/download/

This section is a crash course in Audacity. The focus is on recording, trimming, and exporting.

1.5.1 Recording

The interface for Audacity looks like the following:

Audacity Recording

Before we record, we should adjust the two settings encircled in red above:

  1. Change the sampling rate to 22050
  2. Change to Mono Input Channel

Now, all we have to do is hit the "record" button: Record

When we are done recording, we can click on the "stop" button: Stop Recording

To listen to our recording, we can click on the "play" button: Play

1.5.2 Trimming

Audacity probably caught you unaware! Before you realized that it was recording, it had captured a few seconds of background noise. If you would like to trim your selection:

  1. Click and drag on the waveform to select part of your recording.
  2. Play the selection to make sure that you have everything that you need.
  3. Click on the "Trim" button shown below:

    Trimming

1.5.3 Exporting

To "Save" the file in a format that you can use for JES, you will "Export" your file as a "wav". The steps are:

  1. Under the main menu, select File > Export...
  2. In the dialog box that appears:
  3. Click on the Save button
  4. On the next dialog box, click on OK

1.5.4 Removing Tracks

Once you have recorded something in Audacity, it keeps that recording as a "track". Any additional recordings are added to your sound. This is not ideal if you would like to record something new. To delete old tracks, click on the X in the upper left hand corner of the track as highlighted in red below:

Remove Track


2. Changing the Volume

As discussed in the above sections, sound is a wave of air pressure and it can be sampled and stored in an array. If you change the values stored in the array, the sound will change. One change we can make is to multiply all the elements by some ratio or value. Effectively, this will change the amplitude of the wave. If we change the amplitude, we change the...what?

The question becomes: "How do we travel through all of the elements?" There are two ways; both involve a for loop. The first way is to generate a list of all of the samples, using a function called getSamples(). The second way is to use the for loop to travel through the indices of the sound samples. The following sub-sections show the two ways of travelling through the sound samples, as well as different ways to modify the volume and a side effect of modifying volume.

2.1 Looping through Sound Samples

One way you can travel through the sound is by using getSamples, which returns a list of all the samples (Sample objects) in the sound. The code to travel through this list is below:

#Program 56, page 161
def increaseVolume(sound):
  for sample in getSamples(sound):
    value = getSampleValue(sample)
    setSampleValue(sample,value*2)

Other new functions used in this code are:

To try out this code and listen to the results, type the following:

>>> mySound=returnSound()
>>> explore(mySound)
>>> increaseVolume(mySound)
>>> explore(mySound)

Here, returnSound() is a helper function included in this lab's sample code file. The helper function calls pickAFile() and makeSound() and returns the sound. We call the explore() function twice so that you can see the waveform and hear the change in volume for the original versus the modification.
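If JES is not at hand, the effect of doubling every sample can be tried on an ordinary Python list standing in for the sound. This is only a sketch: the list of integers and the name increase_volume_list are assumptions for illustration, not JES functions. Plain integers do not point back to the list the way JES Sample objects point back to their sound, so the sketch assigns by index.

```python
def increase_volume_list(samples):
    """Double every value in a list of sample values, in place."""
    for i in range(len(samples)):
        samples[i] = samples[i] * 2

wave = [0, 3, 0, -3, 0, 3, 0]   # the small array from section 1.2
increase_volume_list(wave)
print(wave)
```

Note that, like increaseVolume(), the sketch returns nothing: it changes the list that was passed in.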

2.2 Looping through a Range

The other way of travelling through the sound is by accessing the sample values using an index. Notice that the code below uses the range() function to generate a sequence (or list), which goes from 0 to (getLength(sound)-1).

#program 61, modified from page 175
def increaseVolume2(sound):
  for index in range(0, getLength(sound)):
    value = getSampleValueAt(sound, index)
    setSampleValueAt(sound,index,value*2)

To access/modify the samples at an index, two additional functions are used in the above code:

This increaseVolume2() function does exactly the same thing as the increaseVolume() function in section 2.1. Why would we choose to use this version of the for loop instead?

2.3 Creating a Generic Volume Modifier

Instead of having a function that only doubles the values of the sound samples, we can create a more generic volume function with a factor sent as an argument:

#program 58, page 167
def changeVolume(sound, factor):
  for sample in getSamples(sound):
    value = getSampleValue(sample)
    setSampleValue(sample, value*factor)

Notice that our sample values are multiplied by factor. What would the following calls do to the volume?

Be aware that "mySound" will be modified after each call to changeVolume().
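The warning about the sound being modified after each call can be seen with a plain Python list in place of a JES sound. In this sketch (run outside JES) the name change_volume_list and the sample values are made up for illustration; the point is only that the second call starts from the already-changed values.

```python
def change_volume_list(samples, factor):
    """Multiply every value in the list by factor, in place."""
    for i in range(len(samples)):
        samples[i] = int(samples[i] * factor)

my_sound = [0, 300, 0, -300]
change_volume_list(my_sound, 0.5)    # halves the values
after_first = list(my_sound)         # copy, so later calls do not change it
change_volume_list(my_sound, 0.5)    # halves the already-halved values
print(after_first)
print(my_sound)
```

If you want to compare several factors against the same original, one option is to reload the sound from its file before each call.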

2.4 Normalizing Sound

Playing with volume is pretty rewarding! What if you want to make the sound as loud as possible? You could try to find a "multiplier" through experimentation. That would, however, be tedious. What if you had some code that would calculate that "multiplier"? The following code does just that. It finds the multiplier that gives the loudest volume possible, based on the largest sample value in the sound, and then boosts the sound values by that amount.

#program 59, page 168
def normalize(sound):
  largest=0
  for s in getSamples(sound):
    largest=max(largest,getSampleValue(s))
  multiplier=32767.0/largest
  print "Largest sample value in original sound was", largest
  print "Multiplier is", multiplier
  
  for s in getSamples(sound):
    louder = multiplier * getSampleValue(s)
    setSampleValue(s, louder)

The algorithm is as follows:

Could we have used our changeVolume() function in this code?
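For experimenting outside JES, the same idea can be sketched on a plain Python list. This is a variant, not the book's program: it compares absolute values so that a large negative peak also counts, and it returns early when the list is all zeros so that it never divides by zero. The names normalize_list and MAX_SAMPLE are assumptions for this example.

```python
MAX_SAMPLE = 32767   # largest positive 16-bit sample value

def normalize_list(samples):
    """Scale a list of sample values so its largest peak reaches MAX_SAMPLE."""
    largest = 0
    for value in samples:
        # abs() lets a negative peak count too (a change from normalize() above)
        largest = max(largest, abs(value))
    if largest == 0:
        return 1.0                      # all silence: nothing to scale
    multiplier = float(MAX_SAMPLE) / largest
    for i in range(len(samples)):
        samples[i] = int(round(multiplier * samples[i]))
    return multiplier

quiet = [0, 800, -2000, 400]
m = normalize_list(quiet)
print("Multiplier is", m)
print(quiet)
```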

2.5 Note on Clipping

You might have noticed that when you increase the volume of "always.wav", some strange sounds result: it might sound like your speakers are breaking. If you run the normalize() function on this sound sample, you will notice that the multiplier is approximately 1.4 (less than double). Because the increaseVolume() function multiplies everything by 2, the largest samples would exceed 32 767, the largest value a sample can hold. The result is referred to as clipping. In other words, "the normal curves of the sound are broken by the limitations of the sample size" (page 169, Computing and Programming in Python, A Multimedia Approach by Mark Guzdial). If you look at the wave in signal view, it looks like someone has taken scissors and clipped off the peaks of the waves. Watch out for this effect in your recordings! You will see many wave peaks extending out to the edges.
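The effect can be imitated on a plain Python list. The sketch below (run outside JES) assumes that a value outside the 16-bit range is simply limited to the nearest value that fits; the helper name limit_to_range and the sample values are made up for illustration.

```python
MAX_SAMPLE = 32767
MIN_SAMPLE = -32768

def limit_to_range(value):
    """Limit a value to the 16-bit sample range (assumed clipping behaviour)."""
    return max(MIN_SAMPLE, min(MAX_SAMPLE, value))

original = [10000, 20000, -25000, 5000]
doubled = [limit_to_range(v * 2) for v in original]
print(doubled)
```

The two large values end up pinned at the limits, so the shape of the wave between them is lost, while the smaller values are doubled as expected.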


3. Splicing Sounds

And now for the section that you all have been anticipating!! How do you put pieces of sound together? For instance, you want to insert words that were not in an original recording. This section answers that question.

By definition, this is referred to as splicing sounds, "a term that dates back to when sounds were recorded on tape--juggling the order of things on the tape involved literally cutting the tape into segments and then gluing it back together in the right order." (page 177 of Computing and Programming in Python, A Multimedia Approach by Mark Guzdial)

Bundled together with the Python code for this lab are three sound samples. We will splice these sound samples in the following subsections. First, we will look at putting two sounds together by copying the values at specific ranges. Then, we will create some functions that will help us extract pieces of sound and copy them together.

3.1 Combining Two Sounds Together

The following code creates a new sound that splices "cs325" with "is fun" from "what_is_fun.wav".

#program 63, modified from page 177
#call setMediaPath() before calling this function
def merge():
  cs325Sound = makeSound(getMediaPath("CS325.wav"))
  isFunSound = makeSound(getMediaPath("what_is_fun.wav"))
  #target = makeSound(getMediaPath("sec3silence.wav"))
  samplingRate = int(getSamplingRate(cs325Sound))
  cs325Len = getLength(cs325Sound)
  silenceLen = int (0.1*samplingRate)
  isFunLen = getLength(isFunSound)
 
  target=makeEmptySound(cs325Len + silenceLen + isFunLen)
  #target=makeEmptySound(cs325Len + silenceLen + isFunLen, samplingRate)
  print "CS325 Sampling Rate is ", getSamplingRate(cs325Sound)
  print "Target Sampling Rate is ", getSamplingRate(target)
  index=0
  #Copy in "CS325"
  for source in range(0, getLength(cs325Sound)):
    value=getSampleValueAt(cs325Sound, source)
    setSampleValueAt(target, index, value)
    index = index + 1
  #Copy in 0.1 second pause (silence) (0)
  for source in range (0, int(0.1*getSamplingRate(target))):
    setSampleValueAt(target, index, 0)
    index = index + 1
  #Copy in "is fun"
  for source in range (24703, getLength(isFunSound)):
    value = getSampleValueAt(isFunSound,source)
    setSampleValueAt(target, index, value)
    index = index + 1
  normalize(target)
  play(target)
  return target

To figure out what range we had to copy from, we used the Explorer tool to find the index (approximately 24703) where "is fun" started in the "what_is_fun.wav" file.
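The three copy steps (first sound, pause, tail of the second sound) can be tried on small Python lists outside JES. This is a sketch under assumptions: the function name splice and the toy values are made up, and unlike merge() the target here is sized to exactly what gets copied.

```python
def splice(first, second, second_start, silence_len):
    """Return first, then silence, then second[second_start:], as a new list."""
    target = [0] * (len(first) + silence_len + (len(second) - second_start))
    index = 0
    # Copy in all of the first sound
    for value in first:
        target[index] = value
        index = index + 1
    # Skip over the pause; those entries are already 0
    index = index + silence_len
    # Copy in the second sound from second_start to its end
    for source in range(second_start, len(second)):
        target[index] = second[source]
        index = index + 1
    return target

result = splice([1, 2, 3], [9, 9, 7, 8], 2, 2)
pause_samples = int(0.1 * 22050)   # a 0.1 second pause at 22050 samples per second
print(result)
print(pause_samples)
```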

An overview of the code is as follows:

Before we run this code, we have to call:

When we run this code, something strange happens. What is happening? How can we fix it?

3.2 Creating General Clip and Copy Functions

This sub-section will combine all three sounds so that we get "cs325 is always fun". In order to do that, we need to extract the "is" and the "fun" from the "what_is_fun.wav" file. We used the explorer tool to find the approximate beginning and ending of the words:

  word   start    end
  what     929  14109
  is     24703  36955
  fun    37169  60705

To make splicing easier, two functions were created:

Our final function that makes use of both clip and copy is below:

def merge2():
  cs325Sound = makeSound(getMediaPath("CS325.wav"))
  isSound = makeSound(getMediaPath("what_is_fun.wav"))
  alwaysSound = makeSound(getMediaPath("always.wav"))
  isClip=clip(isSound,24703,36955)
  funClip=clip(isSound,37169,60705)
  #named totalLen so that it does not hide Python's built-in len() function
  totalLen=getLength(cs325Sound)+getLength(isClip)+getLength(alwaysSound)+getLength(funClip)
  samplingRate= int(getSamplingRate(cs325Sound))
  newSound=makeEmptySound(totalLen,samplingRate)
  copy(cs325Sound,newSound,0)
  copy(isClip,newSound,getLength(cs325Sound))
  copy(alwaysSound,newSound,getLength(cs325Sound)+getLength(isClip))
  copy(funClip,newSound,getLength(cs325Sound)+getLength(isClip)+getLength(alwaysSound))
  play(newSound)
  return newSound

The idea of this code is:

There is still a problem with how the result sounds. How would you fix it?
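As a rough idea of what helpers like clip and copy do, the sketch below works on plain Python lists outside JES. These are not the lab's functions: the names clip_list and copy_list, the argument order, and the choice to leave out the sample at the end index are all assumptions made for illustration.

```python
def clip_list(source, start, end):
    """Return a new list holding source[start] up to, but not including, source[end]."""
    target = []
    for index in range(start, end):
        target.append(source[index])
    return target

def copy_list(source, target, start):
    """Copy all of source into target, beginning at index start of target."""
    target_index = start
    for value in source:
        target[target_index] = value
        target_index = target_index + 1

recording = list(range(100, 110))       # stand-in for a recorded sound
first = clip_list(recording, 0, 3)
second = clip_list(recording, 6, 9)
new_sound = [0] * (len(first) + len(second))
copy_list(first, new_sound, 0)
copy_list(second, new_sound, len(first))   # starts where the first clip ended
print(new_sound)
```

As in merge2(), each piece is copied in at an offset equal to the combined length of the pieces before it.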

3.3 Making a Library

Let us say that you like the clip and copy functions and want to use them again in other projects. To reuse them, you can create a library. The steps are below:

  1. Cut and paste the clip and copy functions into a separate file (let us call it soundLib.py)

  2. At the top of soundLib.py, add the following line:
    from media import *
    This will allow you to use the JES-provided functions like getMediaPath, makeSound, etc.

  3. In the second file, where you would like to call the clip and copy functions, add the following as the first line:
    from soundLib import *
    This is like copying the clip and copy (or all the) functions from soundLib.py into your code

  4. Find the directory where soundLib.py is stored, and use the following to set the library path:
    setLibPath("/Users/you/yourDirectory/PythonSound/")
    Of course, you will want to use your own directory as an argument. This function tells Python where to look for the files that you are importing.

4. References


5. Exercise

This exercise is taken from problem 7.8 on page 191 of Computing and Programming in Python, a Multimedia Approach:

Make an audio collage. Make it at least 5 seconds long, and include at least two different sounds (i.e., they come from different files). Make a copy of one of those different sounds and modify it [by changing the volume or some other creative approach (maybe not in the lab)]. Splice together the original two sounds and the modified sound to make the complete collage.

The sounds that you use should be your own recordings.

Keep the clip and copy functions in a library (see section 3.3) and use them from there.