(0) Before we get started¶
You need a Google account to edit and run code here.
- Sign in to your Google account and open this .ipynb again.
- From the menu, choose Save a copy in Drive, open that one and close this one.
- Download https://github.com/machinelistening/machinelistening.github.io/blob/master/Piano1-1.wav
- On the left side here, click the folder icon, then upload Piano1-1.wav (root/content/Piano1-1.wav).
OR, if you don't have or don't want a Google account, run the .ipynb locally:
- Download the .ipynb and Piano1-1.wav from the course github.
- Install miniconda (https://docs.conda.io/en/latest/miniconda.html)
- Open a terminal/Anaconda prompt
- "conda create -n [name] python=3.7"
- "conda activate [name]"
- "pip install numpy matplotlib librosa ipython jupyter"
- "jupyter notebook Machine_Listening_Seminar_1.ipynb"
- A browser tab should open with URL "http://localhost:8888/notebooks/Machine_Listening_Seminar_1.ipynb" and you should see the document.
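After a local install, a quick check along these lines can confirm that the packages are importable. This is a minimal sketch, not part of the course material: the helper name `check_packages` is hypothetical, and the import names (note the capitalised `IPython`) are assumed to correspond to the pip packages listed above.

```python
import importlib.util


def check_packages(module_names):
    """Return a dict mapping each module name to True if it can be located."""
    return {name: importlib.util.find_spec(name) is not None
            for name in module_names}


# Import names assumed for the pip packages listed in the install step.
required = ["numpy", "matplotlib", "librosa", "IPython", "jupyter"]
status = check_packages(required)
for name, available in status.items():
    print("{}: {}".format(name, "ok" if available else "MISSING"))
```

Any entry reported as MISSING would need to be installed inside the activated conda environment before opening the notebook.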
(1) Hello world¶
Welcome to your first MIR seminar! Topics today:
- Python basics and library usage
- How to create a sine wave
- Disassemble the sine wave into its STFT-spectrum, Mel-spectrum and chromagram
- See the spectral differences between a sine wave and piano chords
Let's get started with some elemental Python basics!
part1 = 'Hello,'
part2 = 'AST world!'
print(...) # FILL IN: Concatenate the two strings and insert a space
print('\nLet\'s do a headcount!')
for ... in ...: # FILL IN: Start and end of the loop. Start at 1, end at 10!
    print(i)
print('Great, everyone is here.')
Expected outputs:
- 'Hello, AST world!'
- 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 (line breaks omitted here)
(2) Python & numpy basics¶
- A quick look at some functionalities of the Python numpy package
- importing Python libraries
- see: https://numpy.org/doc/stable/reference/index.html
import numpy as np
a = np.array([1, 2]) # vector/"1D-array" with values 1 and 2
b = np.array([3, 4]) # same, values 3 and 4
ab1 = np.vstack((a, b)) # stacks them into a matrix
ab2 = np.array([[1, 2], [3, 4]]) # same as ab1
c = np.ones((2, 3)) # 2D-array with 2 rows and 3 columns, filled with ones.
d = np.log(ab2) # natural logarithm
e = [1, 2, 3, 4, 5] # a list (not numpy)
f = list(np.arange(3, 8)) # [3, 4, 5, 6, 7]
g = np.sum(e) # sum of the elements in e
h = np.sum((e, f)) # sum of the elements in both e and f
i = np.sum(ab1, axis=0) # sum per column of the matrix
ef = set(e).intersection(set(f)) # intersection of e and f
m = e[0] # first element of e
n = e[1:3] # second and third elements of e (indices 1 and 2; the stop index 3 is excluded)
o = e[-1] # last element of e
p = len(e) # number of elements in e
q = [1, 2, [3, 4, 4, 6, 8, 9, 0], 5] # nested lists --> length of q?
print('a: {}'.format(a))
print('b: {}'.format(b))
print('ab1:\n {}'.format(ab1))
print('ab2:\n {}'.format(ab2))
print('c:\n {}'.format(c))
print('d:\n {}'.format(d))
print('g: {}'.format(g))
print('h: {}'.format(h))
print('i: {}'.format(i))
print('ef: {}'.format(ef))
print('m: {}, n: {}, o: {}'.format(m, n, o))
print('p: {}, len(q): {}'.format(p, len(q)))
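Two points from the cell above tend to cause confusion: what `axis` means in `np.sum`, and that slices exclude their stop index. The following sketch uses separate example values (not the variables above) to make both explicit.

```python
import numpy as np

m = np.array([[1, 2],
              [3, 4]])

col_sums = np.sum(m, axis=0)  # collapse the rows: one sum per column
row_sums = np.sum(m, axis=1)  # collapse the columns: one sum per row
total = np.sum(m)             # no axis given: sum over all elements

values = [10, 20, 30, 40, 50]
middle = values[1:3]          # indices 1 and 2; index 3 is not included

nested = [1, 2, [3, 4], 5]    # the inner list counts as a single element
nested_len = len(nested)

print(col_sums, row_sums, total)
print(middle, nested_len)
```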
(3) Sine wave¶
Next, we create a sine wave, which helps us get introduced to more Python concepts. Topics:
- formula for a sine wave
- get to know matplotlib for plotting graphs
Your task: Create a 1-second sinusoid with an amplitude of 3 and the frequency of the note E3 (164.81 Hz), at a sample rate (fs) of 22050 Hz. Then plot only the first 500 samples.
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import IPython.display as ipd
### START CODING HERE ###
amp = 3
freq = 164.81
fs = 22050
t = ... # Set up time samples.
pha = 0
sine_wave = ... # Formula for sine wave. Use np where applicable.
print(np.round(..., decimals=2)) # FILL IN: Print the first 10 samples of your sine wave.
fig, ax = plt.subplots()
ax.plot(..., ...) # FILL IN: Plot the first 500 samples of your sine wave. Takes parameters time and data.
ax.set(xlabel='...', ylabel='...', title='Simple sine wave') # FILL IN: Axis labels
ax.grid()
plt.show()
### END CODING HERE ###
ipd.Audio(sine_wave, rate=fs)
Expected output: [0. 0.14 0.28 0.42 0.56 0.7 0.83 0.97 1.1 1.23]
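For reference, a sampled sinusoid follows y[n] = A * sin(2 * pi * f * n / fs + phase). The sketch below applies that formula to different example values (1000 Hz at 8000 Hz, amplitude 2), chosen so that one period spans exactly 8 samples. The helper `make_sine` is a hypothetical name for illustration, not something the notebook defines.

```python
import numpy as np


def make_sine(amplitude, frequency_hz, sample_rate_hz, duration_s, phase_rad=0.0):
    """Return (time axis in seconds, sine samples) for the given parameters."""
    n_samples = int(round(sample_rate_hz * duration_s))
    t = np.arange(n_samples) / sample_rate_hz  # sample times: n / fs
    y = amplitude * np.sin(2 * np.pi * frequency_hz * t + phase_rad)
    return t, y


# Example values only; these are not the values asked for in the task.
t_demo, y_demo = make_sine(amplitude=2.0, frequency_hz=1000.0,
                           sample_rate_hz=8000, duration_s=0.5)
print(len(y_demo))
print(np.round(y_demo[:5], 2))
```

With these values the peak of the first period falls on sample 2, a quarter of the 8-sample period.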
(4) Path operations with os library¶
- os library (https://docs.python.org/3/library/os.html)
- Path operations useful for datasets, file checks, system operations
- Playground for parameters
import os
current_os = os.name # name of the current operating system: 'nt' for Windows, 'posix' for Linux/macOS, among others
current_path = os.getcwd() # get the current working directory
print("Current operating system: " + current_os)
print("Current path: " + current_path)
new_folder = "data" # name of the new subdirectory
abs_path = os.path.join(current_path, new_folder) # concatenate path components
head, tail = os.path.split(abs_path) # Split absolute path
base_name = os.path.basename(abs_path)
print("Absolute path: " + abs_path)
print("head: " + head)
print("tail: " + tail)
print("basename: " + base_name)
os.mkdir(abs_path) # create a new folder
os.chdir(new_folder) # enter folder
for i in range(4): # create 4 text files in the folder (example_0.txt to example_3.txt)
    filename = "example_" + str(i) + ".txt"
    new_file = open(filename, 'w')
    new_file.close()
os.chdir("..") # change back to parent directory
example_file = os.path.join(abs_path, "example_0.txt")
path_exists = os.path.exists(abs_path) # check if path exists
is_file = os.path.isfile(example_file) # check if path is a file
is_dir = os.path.isdir(abs_path) # check if path is a directory
print("\n")
print("Does the path exist?: " + str(path_exists))
print("Is the path a file?: " + str(is_file))
print("Is the path a directory?: " + str(is_dir))
os.remove(example_file)
path_exists = os.path.exists(abs_path) # check if path exists
is_file = os.path.isfile(example_file) # check if path is a file
is_dir = os.path.isdir(abs_path) # check if path is a directory
print("\n")
print("Does the path exist?: " + str(path_exists))
print("Is the path a file?: " + str(is_file))
print("Is the path a directory?: " + str(is_dir))
# Create 2 subfolders "speech" and "music", each with 3 .txt files inside named
# speech_file_00x.txt or music_file_00x.txt respectively (replace '00x' with
# the actual file number).
import os
### Start code here
### End code here
os.chdir('..')
# walk through folder structure and remove content
for root, dirs, files in os.walk(abs_path, topdown=False):
    for name in files:
        print("Removing: " + os.path.join(root, name))
        os.remove(os.path.join(root, name))
    for name in dirs:
        os.rmdir(os.path.join(root, name))
        print("Removing: " + os.path.join(root, name))
os.rmdir(abs_path)
path_exists = os.path.exists(abs_path) # check if path exists
is_file = os.path.isfile(example_file) # check if path is a file
is_dir = os.path.isdir(abs_path) # check if path is a directory
print("\n")
print("Does the path exist?: " + str(path_exists))
print("Is the path a file?: " + str(is_file))
print("Is the path a directory?: " + str(is_dir))
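The cells above create and delete files in the current working directory, so `os.mkdir` fails if the folder is left over from an earlier run. As a hedged alternative sketch, the same path operations can be tried inside a temporary directory that cleans itself up. The folder and file names here are made up for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    nested = os.path.join(tmp_dir, "data", "raw")
    os.makedirs(nested, exist_ok=True)  # creates intermediate folders too
    os.makedirs(nested, exist_ok=True)  # calling it again raises no error

    # Zero-padded file number via string formatting (7 -> "007").
    file_path = os.path.join(nested, "clip_{:03d}.txt".format(7))
    with open(file_path, "w") as handle:  # the file is closed automatically
        handle.write("placeholder")

    head, tail = os.path.split(file_path)  # directory part, file name
    stem, ext = os.path.splitext(tail)     # name without extension, extension
    created = sorted(os.listdir(nested))
    existed_inside = os.path.isfile(file_path)

# The temporary directory and its content are removed on leaving the block.
existed_after = os.path.exists(file_path)
print(tail, stem, ext, created, existed_inside, existed_after)
```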
(5) Short-Time Fourier Transform (STFT) & Chromagram (I): Sine wave¶
Let's investigate the STFT. Topics:
- librosa library (https://librosa.org/doc/latest/index.html)
- Analysis of our sine wave using STFT and Chromagram
- Playground for parameters
import librosa
import librosa.display
import numpy as np
### Create a magnitude STFT from the sine wave
stft_sine = np.abs(librosa.stft(y=sine_wave, n_fft=1024, hop_length=1024, win_length=None, window='hann', center=True, pad_mode='reflect'))
stft_sine_db = librosa.amplitude_to_db(stft_sine, ref=np.max)
fig, ax = plt.subplots()
img = librosa.display.specshow(stft_sine_db, y_axis='log', x_axis='time', ax=ax)
ax.set(title='Power spectrogram')
#plt.colorbar(img, ax=ax, format="%+2.0f dB")
plt.show()
### Create a chromagram from the STFT. Check the librosa doc!
chroma = ...
fig, ax = plt.subplots()
img = librosa.display.specshow(chroma, y_axis='chroma', x_axis='time', ax=ax)
ax.set(title='Chromagram')
#plt.colorbar(img, ax=ax)
plt.show()
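Before playing with the parameters, it helps to work out what the STFT settings used above imply. The sketch below assumes the 1-second, 22050 Hz sine wave from section (3) and the values `n_fft=1024`, `hop_length=1024`, `center=True`. The cross-check uses a single Hann-windowed frame in plain NumPy rather than librosa, so it only approximates what `librosa.stft` computes per frame.

```python
import numpy as np

fs = 22050          # sample rate of the sine wave
n_fft = 1024        # FFT size used above
hop_length = 1024   # hop size used above (frames do not overlap)
freq = 164.81       # E3
n_samples = fs      # one second of audio

bin_spacing_hz = fs / n_fft                        # width of one frequency bin
n_bins = 1 + n_fft // 2                            # rows of the STFT matrix
n_frames = 1 + n_samples // hop_length             # columns when center=True
expected_bin = int(round(freq / bin_spacing_hz))   # bin closest to E3

# Cross-check: magnitude spectrum of one Hann-windowed frame.
t = np.arange(n_fft) / fs
frame = 3 * np.sin(2 * np.pi * freq * t)
spectrum = np.abs(np.fft.rfft(frame * np.hanning(n_fft)))
peak_bin = int(np.argmax(spectrum))

print(round(bin_spacing_hz, 2), n_bins, n_frames, expected_bin, peak_bin)
```

Because E3 lies between two bin centres, its energy leaks into the neighbouring bins, which is why the line in the spectrogram looks wider than a single frequency.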
(6) Short-Time Fourier Transform (STFT) & Chromagram (II): Piano chords¶
Topics in this section:
- Get to know the STFT-spectrogram
- Spectrum analysis of piano overtones, in comparison to the sine wave
- Behaviour of Chromagram for notes over several octaves
import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np
import IPython.display as ipd
### Example audio: Load 'Piano1-1.wav' using librosa. It has a sample rate of 44100 Hz. Convert it to mono.
piano_data, piano_sr = ...
### Turn into STFT, then dB, then plot
piano_stft = ...
piano_stft_db = ...
fig, ax = plt.subplots()
img = librosa.display.specshow(..., y_axis='log', x_axis='time', ax=ax)
ax.set(title='Power spectrogram')
#plt.colorbar(img, ax=ax, format="%+2.0f dB")
plt.show()
### Turn the STFT into a chromagram, then dB, then plot
...
...
fig, ax = plt.subplots()
img = librosa.display.specshow(..., y_axis='chroma', x_axis='time', ax=ax)
ax.set(title='Chromagram')
#plt.colorbar(img, ax=ax)
plt.show()
Expected output:
- 1st note block: [C E G] in succession, but overlapping
- 2nd note block: [E G C+1 E+1] simultaneously
- 3rd note block: [C E G C+1] simultaneously
- Chromagram: notice the same notes from different octaves are grouped together
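The grouping across octaves can be illustrated without any audio. A chromagram folds all octaves onto 12 pitch classes. The sketch below maps a frequency to its nearest MIDI note (A4 = 440 Hz = MIDI 69, equal temperament) and takes that number modulo 12. The helper `pitch_class` and the example frequencies are illustrative assumptions, not taken from the piano recording.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(frequency_hz, a4_hz=440.0):
    """Return the pitch-class name of the equal-tempered note nearest to frequency_hz."""
    midi = 69 + 12 * math.log2(frequency_hz / a4_hz)
    return NOTE_NAMES[int(round(midi)) % 12]


# E3, C4, E4, G4 and C5 (approximate equal-tempered frequencies in Hz)
for f in (164.81, 261.63, 329.63, 392.00, 523.25):
    print(f, pitch_class(f))
```

C4 and C5 are an octave apart but land in the same chroma bin, which is the behaviour described in the last bullet above.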
(7) Mel-Spectrogram¶
The Mel-spectrogram is based on the Mel scale, a perceptual scale of pitch. Topics in this section:
- Mel-spectrogram with 128 bands and an f_max of 8000 Hz
- Analyse how it differs from an STFT-spectrogram
import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np
### Create the Mel-spectrogram from the piano data, then convert it from power to dB
piano_mel = ...
piano_mel_dB = ...
fig, ax = plt.subplots()
img = ... # FILL IN: librosa's specshow()
ax.set(title='Mel-Spectrogram')
#plt.colorbar(img, ax=ax, format="%+2.0f dB")
plt.show()
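To see why the Mel axis looks different from the STFT axis, it can help to compute the scale by hand. The sketch below uses the common HTK-style formula mel = 2595 * log10(1 + f / 700). Note that this is an assumption for illustration: librosa's default (`htk=False`) uses a slightly different, Slaney-style Mel scale, so the exact band edges in your plot will differ. The helper names are hypothetical.

```python
import math


def hz_to_mel_htk(frequency_hz):
    """Convert Hz to mel using the HTK-style formula."""
    return 2595.0 * math.log10(1.0 + frequency_hz / 700.0)


def mel_to_hz_htk(mel):
    """Inverse of hz_to_mel_htk."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


n_bands = 128
f_max = 8000.0
mel_max = hz_to_mel_htk(f_max)

# n_bands + 2 points equally spaced in mel between 0 Hz and f_max,
# converted back to Hz (band edges of triangular filters).
edges_hz = [mel_to_hz_htk(mel_max * k / (n_bands + 1)) for k in range(n_bands + 2)]

first_width = edges_hz[1] - edges_hz[0]    # spacing at the low end, in Hz
last_width = edges_hz[-1] - edges_hz[-2]   # spacing at the high end, in Hz
print(round(first_width, 1), round(last_width, 1))
```

Equal steps in mel correspond to narrow steps in Hz at low frequencies and wide steps at high frequencies, so the Mel-spectrogram spends more of its 128 bands on the low range than a linear STFT axis does.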
(Coming up) Building a sound classification system - First steps¶
ESC-50 dataset¶
- https://github.com/karolpiczak/ESC-50
- Dataset for Environmental Sound Classification
- We choose 5 classes, 2 of which should be very similar to each other
In the next AST seminar, we will use the tools from today to build our first simple sound classification system.