capital-g / musikinformatik-sose2021

Course materials for Musikinformatik course SoSe 2021 at RSH Düsseldorf

Home Page: https://capital-g.github.io/musikinformatik-sose2021/

Jupyter Notebook 99.96% Dockerfile 0.01% Makefile 0.01% Shell 0.01% Python 0.01% TeX 0.01% SuperCollider 0.01% PureBasic 0.03%

machine-learning music-informatics

musikinformatik-sose2021's Issues

Change MIDI dataset

Switch to the dataset https://colinraffel.com/projects/lmd

This could take a while. [really half an hour?]

pip3 install -r requirements.txt

after a long chain of successes:

INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. If you want to abort this run, you can press Ctrl + C to do so. To improve how pip performs, tell us what happened here: https://pip.pypa.io/surveys/backtracking
INFO: pip is looking at multiple versions of cloudpickle to determine which version is compatible with other requirements. This could take a while.
Collecting cloudpickle>=1.1.1
  Downloading cloudpickle-1.5.0-py3-none-any.whl (22 kB)
  Downloading cloudpickle-1.4.1-py3-none-any.whl (26 kB)
  Downloading cloudpickle-1.4.0-py3-none-any.whl (25 kB)
  Downloading cloudpickle-1.3.0-py2.py3-none-any.whl (26 kB)
  Downloading cloudpickle-1.2.2-py2.py3-none-any.whl (25 kB)
  Downloading cloudpickle-1.2.1-py2.py3-none-any.whl (25 kB)
  Downloading cloudpickle-1.2.0-py2.py3-none-any.whl (24 kB)
  Downloading cloudpickle-1.1.1-py2.py3-none-any.whl (17 kB)

no network traffic, just silent …

Add bibliography

See https://sphinxcontrib-bibtex.readthedocs.io/en/latest/quickstart.html

python setup

At least in the context of this course, it would be useful to link to instructions on how to get python3 (e.g. via Homebrew) and on how to upgrade pip3 if the installed version is not the right one.

some intermediate steps

I think it would help the students if we add some more basic transformations of the FFT before doing the sorting via ML techniques.

here are some suggestions:

## reconstruction of the original
data_inverted = librosa.istft(data_fft, hop_length=HOP_LENGTH, win_length=WIN_LENGTH)
display(Audio(data_inverted, rate=sr))

## backwards
data_fft_reversed = np.flip(data_fft, axis=1)  # reverse the order of the time frames
data_reversed = librosa.istft(data_fft_reversed, hop_length=HOP_LENGTH, win_length=WIN_LENGTH)
display(Audio(data_reversed, rate=sr))

## inverted spectrum
data_fft_inverted = np.flip(data_fft, axis=0)  # reverse the order of the frequency bins
data_inverted_spectrum = librosa.istft(data_fft_inverted, hop_length=HOP_LENGTH, win_length=WIN_LENGTH)
display(Audio(data_inverted_spectrum, rate=sr))

## scrambled spectrum
data_fft_shuffled = data_fft.copy()
# random.shuffle duplicates rows of 2-D NumPy arrays; np.random.shuffle permutes the rows (frequency bins) correctly
np.random.shuffle(data_fft_shuffled)
data_shuffled = librosa.istft(data_fft_shuffled, hop_length=HOP_LENGTH, win_length=WIN_LENGTH)
display(Audio(data_shuffled, rate=sr))

then it would be nice to have something similar to the following sclang transformations in python, of course only if there is a simple equivalent:

n = data_fft.size;
Array.fill(n, { if(0.3.coin) { 1 } { 0 } }) * data_fft
Array.fill(n, { |i| if(i.linlin(0, n, 0, 1).coin) { 1 } { 0 } }) * data_fft
data_fft.rotate(n div: 2)
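
A possible NumPy counterpart to the three sclang lines above, as a rough sketch only. It assumes a one-dimensional array; the stand-in values for data_fft, the seed and all variable names other than data_fft and n are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for one STFT frame (or a flattened STFT); the values are arbitrary.
data_fft = np.arange(1, 11, dtype=float)
n = data_fft.size

# Array.fill(n, { if(0.3.coin) { 1 } { 0 } }) * data_fft
# Keep each bin with probability 0.3 and zero it otherwise.
mask_flat = rng.random(n) < 0.3
thinned = data_fft * mask_flat

# Array.fill(n, { |i| if(i.linlin(0, n, 0, 1).coin) { 1 } { 0 } }) * data_fft
# Keep bin i with probability i / n, so higher indices survive more often.
keep_probability = np.arange(n) / n
mask_ramp = rng.random(n) < keep_probability
ramped = data_fft * mask_ramp

# data_fft.rotate(n div: 2)
# np.roll with a positive shift moves items towards higher indices, like rotate.
rotated = np.roll(data_fft, n // 2)
```

For the two-dimensional output of librosa.stft, the masks would need the full data_fft.shape and np.roll would need an axis argument.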

How to activate venv in windows

The current source-based activation does not work on Windows

Fix doc build warnings

Currently the build process is quite noisy

WARNING: while setting up extension jupyter_sphinx: node class 'JupyterWidgetViewNode' is already registered, its visitors will be overridden
WARNING: while setting up extension jupyter_sphinx: node class 'JupyterWidgetStateNode' is already registered, its visitors will be overridden
...
WARNING: Execution Failed with traceback saved in /Users/scheiba/github/musikinformatik_sose2021/docs/_build/html/reports/01_midi_drums.log
WARNING: Notebook code has no file extension metadata, defaulting to `.txt`
...
/Users/***/github/musikinformatik_sose2021/docs/_build/jupyter_execute/docs/_build/jupyter_execute/docs/_build/jupyter_execute/docs/_build/jupyter_execute/docs/_build/jupyter_execute/docs/_build/jupyter_execute/docs/_build/jupyter_execute/docs/_build/jupyter_execute/docs/_build/jupyter_execute/docs/_build/jupyter_execute/docs/_build/jupyter_execute/docs/_build/jupyter_execute/01_midi_drums/01_midi_drums.ipynb: WARNING: document isn't included in any toctree

DoD

Docs build w/o warning

Add installation of virtualenv to setup

note_seq ModuleNotFoundError

On import note_seq I suddenly get an error:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-17-7a314a24223e> in <module>
----> 1 import note_seq

ModuleNotFoundError: No module named 'note_seq'

I am pretty sure that the setup will run into problems for Windows users: shell scripts do not exist on Windows, the path structure is different, and activating the virtual env works differently. I currently do not have a Windows machine, so it may be best to wait until someone tries it on Windows so it can be fixed step by step.

Build and publish docs in CI/CD

The current setup to update the docs is to build them manually and to execute the command

ghp-import -n -p -f "docs/_build/html"

which requires ghp-import to be installed.

However, the Jupyter Book docs describe a way to use GitHub Actions to update the docs automatically after a push, which should be preferred.

Provide update scenario for repo

As the repo is under steady development, there needs to be a guide on how to stay up to date while keeping one's own changes: either use stashing (probably too advanced) or simply rename files when experimenting.

Motivation on MIDI vs PCM

Add motivation on why MIDI is easier than PCM for a first project

Exchange from python to SC

There are multiple ways to transfer results from Python to SC; it should be discussed which one to use here.

Format	Remarks	SC support	Python Support
SDIF	Exchange format from IRCAM	Beta version seems to exist	https://github.com/gesellkammer/pysdif
CSV	Standard exchange format in data science - has no types	CSVReader	built-in/pandas
JSON	Like CSV but supports some types	Quark JSON Parser	built-in/pandas
MIDI	Quite a linear format	Quark SimpleMIDIFile	music21
OSC	Standard communication protocol in SC - I wrote a tutorial on this	built-in	pyosc
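
To make the CSV and JSON rows of the table more concrete, here is a small sketch using only the Python standard library. The file names, the temporary output folder and the metadata keys (sr, hop_length) are hypothetical; the point is only that JSON can carry typed metadata next to the matrix, while CSV carries untyped cells.

```python
import csv
import json
import tempfile
from pathlib import Path

# A small matrix standing in for any result computed in Python.
matrix = [[0.0, 1.5, 2.0], [1.5, 0.0, 0.5]]

out_dir = Path(tempfile.mkdtemp())  # hypothetical output location

# CSV: rows of plain numbers, no type information and no metadata.
csv_path = out_dir / "matrix.csv"
with open(csv_path, "w", newline="") as f:
    csv.writer(f).writerows(matrix)

# JSON: the same matrix plus some typed metadata in one file.
json_path = out_dir / "matrix.json"
with open(json_path, "w") as f:
    json.dump({"sr": 44100, "hop_length": 512, "matrix": matrix}, f)

# Reading back: CSV cells arrive as strings and must be converted by hand.
with open(csv_path, newline="") as f:
    from_csv = [[float(value) for value in row] for row in csv.reader(f)]
with open(json_path) as f:
    from_json = json.load(f)
```

On the SC side, a CSV file written this way should be readable with CSVFileReader.readInterpret, which is what the Markov chain issue in this tracker uses.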

Introduction to Python

I think it would be a good idea to include some basic Python here. Of course there are other resources that cover this, but it would help people get past the basics.

I have led a course on this at RSH and the course material is available at https://github.com/capital-G/programmierkurs - so either just link it or include it here, and also mention some other resources for learning Python. realpython.com, for example, is a resource I can recommend.

Add introduction on which algorithm to choose

Probably based around this map https://scikit-learn.org/stable/tutorial/machine_learning_map/index.html

Math 101

Mean
Dimensions
Variance
"A space is called at most n-dimensional if every point is contained in arbitrarily small neighborhoods with at most (n - 1)-dimensional boundaries."

minor simplification in plt.scatter

in 01_spect_resynth/02_spect.ipynb

instead of
plt.scatter(x=data_tsne, y=np.zeros(data_tsne.shape), c=list(range(len(data_tsne))))

can be simplified to
plt.scatter(x=data_tsne, y=np.zeros(len(data_tsne)), c=range(len(data_tsne)))

(and in all the other examples)

Reorder repo

I think it would be good to re-arrange the folder structure and the chapters on the website

Meta
  Setup
  Contribute
  Bib

Introduction to Python
  Python basics
  Dimensionality in SuperCollider
  Dimensionality in Python
  Generating sounds in Python
  Communicating between SuperCollider and Python

Machine Learning
  Math basics
  Machine Learning basics
  Introduction to NN
  CNNs
  Autoencoders
  RNNs

Resynth Sound
  Spectrogram
  Matrix Decomposition
  Wavesets

Working with datasets
  Drums

Those chapters should also be reflected by the folder structure.
Assets should get their own subfolder within each folder.

This will also require adjusting the paths within the notebooks. Maybe the change can be kept to a minimum to avoid too big a commit.

Take a look at flucoma

https://github.com/flucoma/flucoma-sc allows for PCA/NMF on Buffers in SC.

As Flucoma itself is a C++ library, the algorithms are implemented as UGens in SC, which gets quite quirky sometimes, but it is impressive and worth taking a look at for sure.
I think the Python introduction is still valid, as it allows for a more experimental approach and one understands the inner workings of the algorithms better.

Is there a possibility to exchange Buffer information between scsynth <-> sclang, like with WaveTables?
Flucoma uses the Eigen library, which allows for performant PCA/NMF. I have not yet checked the solver in MathLib.

Dim reduction to markov chain

@telephon one remix of the Markov version shown today is to calculate the PCA of a spectrogram.
For each vector of this representation we can calculate the distance to every other vector, resulting in an n x n distance matrix which can be used as the transition matrix for a Markov chain.
The handover is done via CSV.

Python

import numpy as np
import librosa
import librosa.display
import matplotlib.pyplot as plt
import soundfile

data, sr = librosa.load('chief.wav', sr=None, mono=True)

N_FFT = 10000
WIN_LENGTH = 10000
HOP_LENGTH = 10000

stft = librosa.stft(data, n_fft=N_FFT, hop_length=HOP_LENGTH, win_length=WIN_LENGTH)

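# recent librosa versions require the audio to be passed as the keyword argument y
spect = librosa.feature.melspectrogram(y=data, sr=sr, n_fft=N_FFT, hop_length=HOP_LENGTH, win_length=WIN_LENGTH)

plt.figure(figsize=(15, 10))
# melspectrogram returns a power spectrogram on a mel frequency axis
librosa.display.specshow(librosa.power_to_db(spect, ref=np.max), sr=sr, hop_length=HOP_LENGTH, y_axis='mel', x_axis='s')

from sklearn.manifold import TSNE
from sklearn.decomposition import PCA

pca = PCA(n_components=10)

# one row per time frame, reduced to 10 principal components
spect_pca = pca.fit_transform(spect.T)

# plot the first two components only
plt.scatter(x=spect_pca[:, 0], y=spect_pca[:, 1])

from scipy.spatial import distance_matrix

d = distance_matrix(spect_pca, spect_pca)

# turn distances into similarities: within each row the largest distance becomes 0
d = d.max(axis=1, keepdims=True) - d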

import pandas as pd

pd.DataFrame(d).to_csv('foo.csv', index=False, header=False)

d.shape

(966, 966)
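
Before handing the matrix over to SuperCollider, the transition idea can be tried out in Python alone. The following is a sketch on a toy point set rather than on the spectrogram above; the frame data, the seed, the walk length and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the PCA-reduced frames: 6 frames with 3 components each.
frames = rng.standard_normal((6, 3))

# Pairwise Euclidean distances via broadcasting, shape (6, 6).
dist = np.linalg.norm(frames[:, None, :] - frames[None, :, :], axis=-1)

# Similar frames should be likely successors: invert the distances per row ...
similarity = dist.max(axis=1, keepdims=True) - dist
# ... and normalise each row to sum to 1 (a row-stochastic transition matrix).
transitions = similarity / similarity.sum(axis=1, keepdims=True)

# Random walk over the frames, starting at frame 0.
state = 0
walk = [state]
for _ in range(20):
    state = int(rng.choice(len(transitions), p=transitions[state]))
    walk.append(state)
```

Because a frame has distance 0 to itself, staying on the same frame is always the most likely transition with this weighting; setting the diagonal to 0 before normalising would be one way to force movement.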

SuperCollider

s.boot;

-> localhost

b = Buffer.read(s, "/Users/scheiba/github/musikinformatik_sose2021/datasets/specto_cluster/expo.flac");

-> Buffer(2, nil, nil, nil, /Users/scheiba/github/musikinformatik_sose2021/datasets/specto_cluster/expo.flac)

SynthDef(\bplaySection, {|out, bufnum, start, end, rate=1.0, sustain=1.0, amp=0.1, attack=0.001|
    var sig, env;
    env = EnvGen.kr(Env.linen(
        attackTime: attack,
        sustainTime: (end-start)/BufSampleRate.kr(bufnum),
        releaseTime: 0.001,
    ), doneAction: Done.freeSelf);
    sig = PlayBuf.ar(
        numChannels: 2,
        bufnum: bufnum,
        rate: BufRateScale.kr(bufnum) * rate,
        startPos: start,
    );
    sig = sig*env*amp;
    Out.ar(out, sig);
}).add;

-> a SynthDef

Synth(\bplaySection, [
    \bufnum, b,
    \start, 2000,
    \end, 40000,
]);

-> Synth('bplaySection' : 1121)

t = CSVFileReader.readInterpret("/Users/scheiba/github/musikinformatik_sose2021/fftkov/foo.csv")

-> [ [ 93228.044750679, 93227.989116289, 60061.208920211, 82305.4775241, 81328.446330002, 56999.69536449, 80908.997353898, 80240.290440599, 85232.945631371, 85866.381053318, 79157.063936866, 87463.297267627, 80287.238484691, 79686.530098955, 87165.471127, 75130.627055807, 82063.928728668, 82161.697344702, 70609.019307724, 83910.493911634, 77055.910753639, 75979.706831109, 80027.464007242, 73534.218585885, 81777.190208451, 71039.294910442, 75133.168903054, 81515.328022714, 52909.621543427, 77970.088271363, 8595...etc...

Tdef(\x, {
    var curState=0;
    var winSize = 10000;
    var hopSize = 10000;
    var sampleRate = 44100;
    loop {
        // one index per row of the transition matrix: 0 .. n-1
        curState = (0..(t.shape[0] - 1)).wchoose(t[curState].normalizeSum);
        Synth(\bplaySection, [
            \bufnum, b,
            \start, curState*hopSize,
            \end, curState*hopSize + winSize,
            \amp, 0.5,
            \attack, 0.1,
        ]);
        ((winSize/sampleRate)*0.2).wait;
    }
}).play;

-> Tdef('x')

-> CmdPeriod

one-to-one comparison

It would be nice to have a one-to-one comparison of the array indexing / manipulation functions in numpy and sclang.

Good sources for sclang:

J concepts in SC
Syntax Shortcuts
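
As a starting point for such a comparison, here is a short NumPy listing with the sclang counterpart in the comment of each line. It is a sketch, not a complete mapping, and the sclang expressions in the comments should be double-checked against the two help files listed above.

```python
import numpy as np

a = np.array([10, 20, 30, 40, 50])

first = a[0]              # sclang: a[0] or a.first
last = a[-1]              # sclang: a.last
middle = a[1:4]           # sclang: a[1..3]  (sclang ranges include the end index)
head = a[:3]              # sclang: a[..2]
tail = a[2:]              # sclang: a[2..]
backwards = a[::-1]       # sclang: a.reverse
rotated = np.roll(a, 1)   # sclang: a.rotate(1)

m = np.array([[1, 2, 3], [4, 5, 6]])

shape = m.shape           # sclang: m.shape
transposed = m.T          # sclang: m.flop
flattened = m.ravel()     # sclang: m.flat
column = m[:, 1]          # sclang: m.flop[1]
scaled = m * 2            # sclang: m * 2
```

The most common trap is the end index: NumPy slices exclude it, sclang ranges include it.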

dl_url not

An error in the section "Getting the dataset":

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-4-a68ad214e112> in <module>
     38         os.remove(dl_files["midi"]["path"])
     39 
---> 40 download_dataset()

<ipython-input-4-a68ad214e112> in download_dataset(download_path)
     22             continue
     23         print(f"Start downloading {dl_name} to {dl['path']} - this can take multiple minutes!")
---> 24         urllib.request.urlretrieve(dl_url, dl_path)
     25         print(f"Finished downloading")
     26 

NameError: name 'dl_url' is not defined

Adding multiplied NMF plot

During the session it turned out that it would be helpful to also plot the product WH as a spectrogram.
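
To illustrate what the product WH is, here is a sketch that factorises a small non-negative matrix and rebuilds it. It uses a plain NumPy multiplicative-update NMF as a stand-in, not the NMF implementation used in the course notebooks; the function name nmf, the toy matrix V and all parameters are assumptions.

```python
import numpy as np

def nmf(V, n_components, n_iter=200, seed=0):
    """Multiplicative-update NMF minimising the Frobenius error of V - W @ H."""
    rng = np.random.default_rng(seed)
    n_bins, n_frames = V.shape
    W = rng.random((n_bins, n_components)) + 1e-3
    H = rng.random((n_components, n_frames)) + 1e-3
    eps = 1e-9  # avoids division by zero
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy "magnitude spectrogram": 64 bins x 40 frames, built from 3 components.
rng = np.random.default_rng(1)
V = rng.random((64, 3)) @ rng.random((3, 40))

W, H = nmf(V, n_components=3)
V_hat = W @ H  # the product WH has the same shape as the spectrogram
rel_error = np.linalg.norm(V - V_hat) / np.linalg.norm(V)
```

V_hat can be passed to the same plotting call as the original spectrogram, so both can be compared side by side.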

Add MrKov

There is already a Python implementation in the Dropbox and also a wrong implementation of the algorithm, which is interesting nonetheless IMO.
But I would be really interested in an SC implementation of this, as I am not familiar with FFT in SC and having a live signal makes this interesting.

One basic idea that differs from the Python implementation could be to use a locality-sensitive hash instead of a representation in a vector space. Here is a sketch of a suggestion:

Start

|1| 2
|-| -
|A| B

We record and hash 2 grains (A and B) and delay playback by 1 grain, so we have a look-ahead of 1 sample. The grain being played back is indicated by | |.

Hash "collision"

1  2  3 |4| 5
-  -  - |-| -
A  B  C |D| B

After recording and playing back n samples we encounter a hash collision in our look-ahead - for now we will continue of the playback of

Let's say we jumped back to sample 2

1  |2|  3  4  5  6
-  |-|  -  -  -  -
A  |B|  C  D  B  E
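
The collision detection in the diagrams above can be sketched offline on a list of grain hashes. This is only an illustration of the bookkeeping, with letters standing in for hashes; the function name find_jump_targets and its lookahead parameter are made up, and positions are 0-based, unlike the 1-based diagrams.

```python
def find_jump_targets(hashes, lookahead=1):
    """For each playback position, list the earlier positions whose hash
    equals the hash of the grain `lookahead` steps ahead of playback."""
    seen = {}      # hash -> positions at which it was recorded
    targets = {}   # playback position -> earlier positions we could jump to
    for position, grain_hash in enumerate(hashes):
        if grain_hash in seen:
            playback_position = position - lookahead
            if playback_position >= 0:
                targets[playback_position] = list(seen[grain_hash])
        seen.setdefault(grain_hash, []).append(position)
    return targets

grains = ["A", "B", "C", "D", "B"]
jumps = find_jump_targets(grains)
# jumps == {3: [1]}: while D (index 3) plays, the look-ahead grain B
# collides with the B recorded at index 1.
```

Whether to jump, and to which of several candidates, is left open here; that is where the jump probability discussed further down would come in.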

Transitions

IIRC Markov chains were first used to determine the next character in a book given a certain character (they are also used in the PageRank algorithm, which spawned the company Google). In mathematical formalism we speak of the transition from one state to another. We can also consider the last n states for our prediction of state n+1; this is called a Markov chain of order n.

In the example above we do not use the state of the current sample to calculate the next state, so we do not really account for the defining characteristic of a Markov chain.
Using the transition probability from grain A to grain D is possible, but it requires all possible grains to be in memory and therefore needs a different design.
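
The character example mentioned above makes a compact illustration of a first-order Markov chain. In this sketch the training text, the seed and the function names are arbitrary choices.

```python
import random
from collections import Counter, defaultdict

def transition_counts(text):
    """Count how often each character is followed by each other character."""
    counts = defaultdict(Counter)
    for current, following in zip(text, text[1:]):
        counts[current][following] += 1
    return counts

def next_char(counts, current, rng):
    """Draw the next character with probability proportional to its count.
    Assumes `current` has at least one recorded follower."""
    followers = counts[current]
    chars = list(followers)
    weights = [followers[c] for c in chars]
    return rng.choices(chars, weights=weights, k=1)[0]

counts = transition_counts("abracadabra")
rng = random.Random(0)
generated = "a"
for _ in range(10):
    generated += next_char(counts, generated[-1], rng)
```

For a chain of order n, the dictionary keys would be the last n characters instead of a single one.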

Performance parameters

Hash bit resolution

Reducing the bit size of our hash would limit our dictionary size and increase the number of jumps, although it would be good to create high-resolution hashes and allow reducing them at a later stage for more performance.
It would be good to use locality-sensitive hashing, so that reducing the bit size of our hash results in a confusion of similar-sounding grains.
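
One common locality-sensitive scheme is random-hyperplane hashing (each bit is the sign of a random projection). The issue does not name a specific scheme, so the following is just one possible sketch; the feature vector, its length, the bit counts and the function names are assumptions.

```python
import numpy as np

def make_hasher(n_features, n_bits, seed=0):
    """Return a hash function that maps a feature vector to a bit string."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((n_bits, n_features))  # one hyperplane per bit

    def hash_grain(features, bits=n_bits):
        # Each bit records on which side of a hyperplane the vector lies.
        signs = planes[:bits] @ np.asarray(features, dtype=float) > 0
        return "".join("1" if s else "0" for s in signs)

    return hash_grain

hash_grain = make_hasher(n_features=8, n_bits=16)

# Hypothetical grain features, e.g. a coarse magnitude spectrum.
grain = np.array([0.9, 0.1, 0.4, 0.0, 0.3, 0.7, 0.2, 0.5])

full = hash_grain(grain)            # 16-bit hash
coarse = hash_grain(grain, bits=4)  # reduced resolution: the first 4 bits
louder = hash_grain(grain * 2.0)    # same spectral shape, twice the level
```

With this scheme a lower-resolution hash is a prefix of the full hash, so the resolution can be reduced after the fact, and a grain that only differs in level gets the same hash.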

Length of Buffer

At some point we have to clear the Buffer in a FIFO way. The size of the buffer correlates with the number of available hashes, so a larger buffer allows us to jump more throughout the signal.
As the playback speed is 1 (?), the erasing of our buffer cannot catch up with the playback as long as we do not jump to a grain which is currently being deleted, so it would be good procedure to delete the hash before freeing samples from the buffer.

Skewness of distribution

As described above, we used a uniform distribution for the likelihood of jumps. Maybe it would be good to introduce a gradual skewness as a parameter using the Box-Muller transform, which allows us to transform a uniform distribution into a normal distribution.

BUT this contradicts the
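
The Box-Muller transform named in this section can be sketched as follows; the sample count and seed are arbitrary.

```python
import numpy as np

def box_muller(u1, u2):
    """Turn two independent uniform samples into two independent
    standard-normal samples. u1 must be greater than 0."""
    radius = np.sqrt(-2.0 * np.log(u1))
    angle = 2.0 * np.pi * u2
    return radius * np.cos(angle), radius * np.sin(angle)

rng = np.random.default_rng(0)
u1 = 1.0 - rng.random(20000)  # values in (0, 1], which avoids log(0)
u2 = rng.random(20000)
z0, z1 = box_muller(u1, u2)
```

z0 and z1 have mean 0 and standard deviation 1; scaling and shifting them gives a normal distribution around any preferred jump distance.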

Using characters as hashes

If we use characters as hash symbols this allows us to interchange text and music hashes => To explore further

Exchange transition matrix

Applying the transition matrix of signal A to signal B could yield interesting results.

Stationary Markov process

One of the interesting results of Markov chains is the stationary distribution, which gives us the expected share of occurrences of each object if we were to run the Markov chain infinitely.
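
For a small chain the stationary distribution can be approximated by applying the transition matrix repeatedly. A sketch with a made-up two-state matrix P (rows sum to 1):

```python
import numpy as np

def stationary_distribution(P, n_iter=1000):
    """Approximate the stationary distribution of a row-stochastic matrix P
    by repeatedly applying it to a uniform start distribution."""
    P = np.asarray(P, dtype=float)
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(n_iter):
        pi = pi @ P
    return pi

# Hypothetical two-state chain.
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
pi = stationary_distribution(P)
# pi is approximately [5/6, 1/6] and satisfies pi @ P == pi
```

This simple iteration assumes the chain actually converges to a single stationary distribution, which holds for this example but not for every transition matrix.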

What to do with GM MIDI

MIDI Glitching
Extreme MIDI / Mod Files

Sanitize 2 commits

There are currently 2 commits on the main branch (1527729 and aafbaa3) which are not really tidied up for distribution and introduce ambiguous notebooks.
As those notebooks are quite big in filesize I think those should not be included in the history and should be reintroduced in a clean state.

This is a bit problematic as I want to add some stuff for tomorrow, maybe I will re-write the history and move the notebooks into a branch.

Add convolutions and autoencoder

In the course I demonstrated

convolutional neural networks (CNN) and therefore "deep" learning
autoencoders w/ latent space which allow for the generation of new examples

which are currently missing from the online course materials.

Multi-dimensionality in SuperCollider

It was discussed to start the course with multi dimensional arrays and the slicing of them in SC and then transfer this to python (via numpy).

Topics to cover:

Slicing
Multi-dim array and multi channel
Multiplication of multi dim arrays
Introduction of tensors (?) interesting but rather advanced

Add RNN chapter

Today RNNs were discussed but the material is not online yet - one should fix this

FFT?

might be interesting?
https://sidsite.com/posts/fourier-nets/

Working with wavesets

Wavesets, used as discrete snippets, are a manageable representation for easy machine learning projects. There is also a waveset implementation in SC, see https://github.com/musikinformatik/WavesetsEvent

This issue is for discussion what this topic could cover
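
As one possible starting point for the discussion, here is a sketch of how a signal could be cut into wavesets in Python. It takes a waveset to be the stretch between two consecutive rising zero crossings, which is a simplification and not necessarily what WavesetsEvent does; the test signal and the function name are made up.

```python
import numpy as np

def split_wavesets(signal):
    """Split a mono signal at its rising zero crossings. Material before the
    first and after the last crossing is dropped."""
    signal = np.asarray(signal)
    # Index of the first positive sample after each non-positive one.
    rising = np.flatnonzero((signal[:-1] <= 0) & (signal[1:] > 0)) + 1
    return [signal[a:b] for a, b in zip(rising[:-1], rising[1:])]

# Test signal: 5 Hz sine with a phase offset, 1 second at 1000 samples/s.
t = np.arange(1000) / 1000
signal = np.sin(2 * np.pi * 5 * t + 0.3)

wavesets = split_wavesets(signal)
lengths = [len(w) for w in wavesets]
```

Each waveset is an array of a different length in general, so features such as length, peak amplitude or RMS would be needed to turn them into fixed-size inputs for a machine learning model.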

Add information on how to show contextual help in Jupyter Lab

As discussed in class and shown in

which mimics the SC layout

Add more information on bias

There were questions regarding bias which could be explained in more detail

Improve performance of MIDI extraction

Currently the extraction of the MIDI files takes a couple of hours, which is a bit much because we only have 100k examples to load.
I tried to improve the speed by using https://github.com/jmcarpenter2/swifter, which promises to parallelize the code. It does indeed use 100% of the CPU, but it does not seem to really speed up the process, and it introduces a couple of problematic dependencies.

It is also worth to take a look at

The problem of headless SC remains if everything of this should be used