
pno-ai's People

Contributors

chathasphere


pno-ai's Issues

MIDI Controller support

Wouldn't it be cool if you could play something on a MIDI controller and use it to seed the generator?

Fix "split_sequences" method

It seems to struggle with the edge case of extra-long notes.

Check out this line:

new_length = note.end - sample_start

This seems problematic if new_length is greater than 30 seconds (the clip length).
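One possible fix, sketched below under the assumption of 30-second clips and pretty_midi-style notes with .end attributes (names taken from the issue, not verified against the repo), is to clamp the computed length to the clip window:

```python
# Hypothetical sketch of a fix: a note's length inside the current clip
# should never exceed the clip duration, even if the note itself is longer.
MAX_CLIP_SECONDS = 30.0

def clipped_length(note_end, sample_start, clip_seconds=MAX_CLIP_SECONDS):
    """Length of the note inside the current clip, capped at the clip length."""
    return min(note_end - sample_start, clip_seconds)
```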

A few things I had to do to get this working.

Hi. I was curious to try your Music Transformer implementation, but I had to make a few small changes to get it working.
Perhaps others may find this information useful, and you might have insight.

First, in "train.py" line 147, there is a previously undeclared variable mask, which I believe should be x_mask.

Second, "generate.py" expects a "model.yaml" file that "run.py" does not generate. I bypassed this by constructing the MusicTransformer model with the same settings as in "run.py" and pointing model_path directly at one of the checkpoints saved by "run.py".

Third, in "helpers.py", I changed the line:

input_tensor = torch.LongTensor(input_sequence).unsqueeze(0)

to

model.cuda()
input_tensor = torch.cuda.LongTensor(input_sequence).unsqueeze(0)

to fix a CPU/CUDA device mismatch that arose in transformer.forward().
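A more portable variant of that fix would be to move the model and inputs onto a single device rather than hard-coding CUDA tensors; the following is a sketch, with input_sequence as a placeholder for the repo's token list:

```python
import torch

# Pick whichever device is available, then keep model and inputs on it.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

input_sequence = [1, 5, 9]  # placeholder token ids, not the repo's data
input_tensor = torch.LongTensor(input_sequence).unsqueeze(0).to(device)
# model = model.to(device)  # model and inputs then agree in transformer.forward()
```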

And lastly, because I was running this locally, I decreased the batch size in run.py to fit my GPU.

Thanks.

Finish Preprocess Pipeline

Successfully extracted note sequences w/ pedal action.
According to the research, the next steps should be:

  • Stretch music in time by factors 0.95, 0.975, 1, 1.025, 1.05 (5x as many samples)
  • Split MIDIs into 30-second clips
  • Quantize samples (what exactly does this entail... a sample rate of 125 Hz?)
  • Transpose all samples by all intervals up to a major third (9x as many samples)
  • Simplify dynamics (MIDI velocity) to 32 bins
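The stretching and transposition steps above can be sketched as follows, assuming notes are simplified to (pitch, start, end, velocity) tuples rather than pretty_midi.Note objects:

```python
# Augmentation sketch: 5 time-stretch factors x 9 transpositions = 45 variants.
STRETCH_FACTORS = [0.95, 0.975, 1.0, 1.025, 1.05]
TRANSPOSITIONS = range(-4, 5)  # up to a major third in each direction

def stretch(notes, factor):
    """Scale all start/end times by a constant factor."""
    return [(p, s * factor, e * factor, v) for p, s, e, v in notes]

def transpose(notes, semitones):
    """Shift all pitches, clamped to the MIDI range 0..127."""
    return [(min(max(p + semitones, 0), 127), s, e, v) for p, s, e, v in notes]

def augment(notes):
    for f in STRETCH_FACTORS:
        for t in TRANSPOSITIONS:
            yield transpose(stretch(notes, f), t)
```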

Access saved models

Currently I dump a dictionary of parameters in the generate.py file. This could be solved more elegantly with a .yaml file.
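A minimal sketch of that idea using PyYAML; the parameter names below are placeholders, not the repo's actual config keys:

```python
import yaml  # PyYAML

# Serialize the hyperparameter dict that generate.py currently hard-codes.
params = {"n_states": 413, "d_model": 64, "n_layers": 3}  # placeholder keys
text = yaml.safe_dump(params)     # write this string to saved_models/model.yaml

# generate.py can then rebuild the model from the same file:
restored = yaml.safe_load(text)
```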

Data Encoding

Given the preprocessed data, convert these MIDI snippets to "Event Sequences."

Under the Time-shift representation, 413 events are possible:

  • 128 note on events (each corresponding to a MIDI pitch)
  • 128 note off events
  • 125 time shift events (advance time in multiples of 8 ms, up to 1 second)
  • 32 velocity events (change velocity applied to all subsequent events until another change)

A one-hot encoding turns each event into a 413-dimensional vector. A sequence of these vectors is fed into a neural network.
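The vocabulary layout above can be sketched with explicit index offsets; the ordering of the four event families here is an assumption for illustration:

```python
import numpy as np

# Index layout for the 413-event vocabulary:
# 0..127 note-on, 128..255 note-off, 256..380 time-shift, 381..412 velocity.
NOTE_ON, NOTE_OFF, TIME_SHIFT, VELOCITY = 0, 128, 256, 381
VOCAB_SIZE = 413

def one_hot(event_index):
    """Turn a single event index into a 413-dimensional one-hot vector."""
    vec = np.zeros(VOCAB_SIZE)
    vec[event_index] = 1.0
    return vec
```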

Decode generated event sequences

Adapt the Sequence Encoder class to generate Note sequences from poorly-behaved, generated event sequences. For instance, set "note off" automatically if no note off event is provided for a pitch.
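The robustness rule described above can be sketched as follows, with events simplified to ("on"/"off", pitch, time) tuples rather than the repo's encoded representation:

```python
# Decode events to (pitch, start, end) notes; any pitch left without a
# note-off is closed automatically at the end of the sequence.
def events_to_notes(events, end_time):
    open_notes, notes = {}, []
    for kind, pitch, t in events:
        if kind == "on":
            open_notes[pitch] = t
        elif kind == "off" and pitch in open_notes:
            notes.append((pitch, open_notes.pop(pitch), t))
    # dangling notes from a poorly-behaved generated sequence
    for pitch, start in open_notes.items():
        notes.append((pitch, start, end_time))
    return notes
```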

documentation

Docstrings for modules, plus an updated README/wiki. Links to a clean Colab notebook could be useful for teaching how to train a model.

Generate Midis

  • Use softmax sampling with temperature / top-k to generate sequences
  • Decode numeric sequence into event sequence
  • Decode event sequence as MIDI object
  • Write out MIDIs
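The first step above can be sketched as follows, with the generation loop and model call omitted:

```python
import numpy as np

def sample_next(logits, temperature=1.0, top_k=None):
    """Sample one event index from model logits with temperature and top-k."""
    logits = np.asarray(logits, dtype=float) / temperature
    if top_k is not None:
        # zero out (via -inf) everything below the k-th largest logit
        cutoff = np.sort(logits)[-top_k]
        logits = np.where(logits < cutoff, -np.inf, logits)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return np.random.choice(len(probs), p=probs)
```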

Web deployment

Embed a trained model in a flask application, generate MIDI sequences on the fly
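A minimal sketch of that deployment idea; the model call is stubbed out, and the endpoint name and response shape are assumptions, not the project's API:

```python
from flask import Flask, jsonify

app = Flask(__name__)

def generate_sequence():
    # Stand-in for sampling an event sequence from the trained model.
    return [60, 64, 67]

@app.route("/generate")
def generate():
    # Return the generated sequence; a real endpoint might render a MIDI file.
    return jsonify(events=generate_sequence())
```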

Pretrained model?

Hi, would it be possible to provide a link to a pre-trained model? Thanks!

Optimize preprocessing pipeline

Consider using numpy arrays instead of a list of notes?

Will likely lead to efficiency gains in transposing/stretching samples.
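A sketch of what that could look like: store notes as an (N, 4) array with columns (pitch, start, end, velocity), so both augmentations become vectorized arithmetic instead of per-note loops. The column layout is an assumption for illustration:

```python
import numpy as np

notes = np.array([[60, 0.0, 1.0, 80],
                  [64, 0.5, 2.0, 90]])  # columns: pitch, start, end, velocity

def transpose(notes, semitones):
    out = notes.copy()
    out[:, 0] = np.clip(out[:, 0] + semitones, 0, 127)  # shift pitches
    return out

def stretch(notes, factor):
    out = notes.copy()
    out[:, 1:3] *= factor  # scale start and end times in one operation
    return out
```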

no such file: saved_models/model.yaml

After running run.py and then generate.py, I got the error 'no such file: saved_models/model.yaml'.
I am doing research based on the Google Magenta project (optimizing the Music Transformer); your code is valuable and useful for my research, so please help me solve this problem.

Decode Event Sequences

Have the ability to convert encoded event sequences back into pretty_midi Note sequences and write them out to a MIDI file.

This will be useful both for checking the fidelity of the encoding process as well as being able to listen to the model's eventual output!

Validate Padding

There should not be an appreciable difference in training results when padded (sub-max-length) sequences are thrown in; this will require validating that the padding masks work as expected.
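One quick sanity check, sketched here with plain numpy rather than the repo's mask code: after applying a padding mask to attention scores, padded positions should receive exactly zero attention weight.

```python
import numpy as np

def masked_softmax(scores, pad_mask):
    """Softmax over scores; pad_mask is True where a position is padding."""
    scores = np.where(pad_mask, -np.inf, scores.astype(float))
    e = np.exp(scores - scores.max())
    return e / e.sum()

weights = masked_softmax(np.array([1.0, 2.0, 3.0]),
                         np.array([False, False, True]))
# the masked (padded) position gets zero weight; the rest still sum to 1
```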

Write Training Logs

Useful to have a record of how training went; logs also survive an instance shutting off.

Pre-trained model?

Hi,

Is there a way to access the weights to obtain a pre-trained model? I can't train it on my machine but would still like to do fine-tuning.
The 'saved-model' folder seems to be empty...

Model Checkpointing

Perhaps necessary if training on Colab. Save a model to gdrive every epoch.
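A sketch of what per-epoch checkpointing could look like with torch.save; the checkpoint keys are placeholders, and on Colab the path would point into a mounted Drive folder (e.g. under /content/drive):

```python
import torch

def save_checkpoint(model, optimizer, epoch, path):
    """Save model and optimizer state so training can resume after a shutdown."""
    torch.save({"epoch": epoch,
                "model_state": model.state_dict(),
                "optimizer_state": optimizer.state_dict()}, path)
```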
