
musictransformer-tensorflow2.0's Introduction

Music Transformer: Generating Music with Long-Term Structure

Abstract

  1. This repository is fully compatible with TensorFlow 2.0.
  2. If you want the PyTorch version, see here.

Contribution

  • Domain: dramatically reduces the memory footprint, allowing the model to scale to musical sequences on the order of minutes.
  • Algorithm: reduces the space complexity of the Transformer's relative attention from O(N^2 D) to O(ND) (see the sketch below).
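
The bullet above refers to the "skewing" trick from the paper. Below is a minimal sketch of it, with illustrative tensor names and shapes; the repository's actual implementation is the _skewing method of RelativeGlobalAttention in model.py.

    import tensorflow as tf

    # Sketch of the O(ND) relative-attention trick: instead of materializing
    # an O(N^2 D) tensor of pairwise relative embeddings, multiply the queries
    # by one embedding per distance, then shift ("skew") the result so that
    # entry (i, j) holds the logit for relative distance j - i.
    def relative_logits(q, e):
        # q: queries, shape (batch, heads, L, d_head)
        # e: relative embeddings, shape (L, d_head)
        qe = tf.einsum('bhld,md->bhlm', q, e)                   # (batch, heads, L, L)
        padded = tf.pad(qe, [[0, 0], [0, 0], [0, 0], [1, 0]])  # pad one column on the left
        b, h, l = tf.shape(qe)[0], tf.shape(qe)[1], tf.shape(qe)[2]
        reshaped = tf.reshape(padded, [b, h, l + 1, l])         # (batch, heads, L+1, L)
        return reshaped[:, :, 1:, :]                            # drop the first row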

Preprocessing

  • This repository uses the single-track method (the 2nd method in the paper).

  • If you want an implementation of method 1, see here.

  • The preprocessing code is adapted from the Performance RNN re-built repository.

  • The preprocessing implementation repository is here; a quick round-trip sketch follows.
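
As a quick sanity check of the preprocessing pipeline, the cloned midi_processor package can be exercised directly. A minimal sketch, assuming illustrative file paths; the decode_midi keyword is assumed from the midi-neural-processor repository, not confirmed here.

    # Round-trip sketch: encode a MIDI file into event tokens, then decode
    # the tokens back into a MIDI file. encode_midi/decode_midi are the
    # entry points that preprocess.py relies on.
    from midi_processor.processor import encode_midi, decode_midi

    events = encode_midi('dataset/midi/some_piece.mid')  # list of event token ids
    print(len(events), 'events')
    decode_midi(events, file_path='roundtrip.mid')       # file_path keyword is assumed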

Repository Setting

$ git clone https://github.com/jason9693/MusicTransformer-tensorflow2.0.git
$ cd MusicTransformer-tensorflow2.0
$ git clone https://github.com/jason9693/midi-neural-processor.git
$ mv midi-neural-processor midi_processor

Midi Download

$ sh dataset/script/{ecomp_piano_downloader, midi_world_downloader, ...}.sh
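
The braces mean: pick one of the download scripts under dataset/script/. For example, to fetch the e-competition piano set (the destination directory is illustrative; the scripts take it as their first argument, as in the issues below):

    $ sh dataset/script/ecomp_piano_downloader.sh dataset/midi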

Prepare Dataset

$ python preprocess.py {midi_load_dir} {dataset_save_dir}

Training

  • Train with the Encoder & Decoder architecture (the original Transformer architecture)

    -> The original Transformer model is not well suited to the music generation task (its attention maps become entangled).

    -> If you want to inspect this model, see the MusicTransformer class in model.py.

  • Train with the decoder only (self-attention, autoregressive)

    $ python train.py --epochs={NUM_EPOCHS} --load_path={NONE_OR_LOADING_DIR} --save_path={SAVING_DIR} --max_seq={SEQ_LENGTH} --pickle_dir={DATA_PATH} --batch_size={BATCH_SIZE} --l_r={LEARNING_RATE}
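
A hypothetical invocation, assuming the dataset was preprocessed into dataset/processed and that the optional flags fall back to their defaults when omitted (all values here are illustrative, chosen to match the hyperparameters below):

    $ python train.py --epochs=100 --save_path=ckpt --max_seq=2048 --pickle_dir=dataset/processed --batch_size=2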

Hyper Parameters

  • learning rate : scheduled learning rate (see CustomSchedule, sketched below)
  • number of attention heads : 4
  • number of layers : 6
  • sequence length : 2048
  • embedding dim : 256 (d_h = 256 / 4 = 64)
  • batch size : 2
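
For reference, a sketch of the scheduled learning rate, following the standard Transformer schedule; the repository's CustomSchedule in custom/callback.py is presumably equivalent, and the warmup value here is an assumption, not taken from the source.

    import tensorflow as tf

    # Standard Transformer learning-rate schedule:
    # lr = d_model^-0.5 * min(step^-0.5, step * warmup^-1.5)
    class CustomSchedule(tf.keras.optimizers.schedules.LearningRateSchedule):
        def __init__(self, d_model, warmup_steps=4000):  # warmup default is assumed
            super().__init__()
            self.d_model = tf.cast(d_model, tf.float32)
            self.warmup_steps = warmup_steps

        def __call__(self, step):
            step = tf.cast(step, tf.float32)
            arg1 = tf.math.rsqrt(step)
            arg2 = step * (self.warmup_steps ** -1.5)
            return tf.math.rsqrt(self.d_model) * tf.math.minimum(arg1, arg2)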

Result

  • Baseline Transformer (green, gray) vs. Music Transformer (blue, red)
  • Loss

    [loss graph]

  • Accuracy

    [accuracy graph]

Generate Music

  • mt.generate() can generate music from a given prior sequence.

    from model import MusicTransformerDecoder
    import params as par   # repository hyperparameter module (defines vocab_size)

    max_seq = 2048         # should match the sequence length used in training

    mt = MusicTransformerDecoder(
        embedding_dim=256, vocab_size=par.vocab_size,
        num_layer=6,
        max_seq=max_seq,
        dropout=0.1,
        debug=False
    )
    mt.generate(prior=[64], length=2048)
  • If you want to generate from the shell, see this:

    $ python generate.py --load_path={CKPT_CONFIG_PATH} --length={GENERATE_SEQ_LENGTH} --beam={NONE_OR_BEAM_SIZE}

Generated Samples (YouTube Link)

  • Click the image.

TF2.0 Trouble Shooting

1. tf.keras

You can't use tf.keras directly in the alpha version, so import from tensorflow.python import keras instead, then use keras.{METHODS}.

  • example :

    from tensorflow.python import keras 
    dropout = keras.layers.Dropout(0.3)

2. tf.keras.optimizers.Adam()

The TF 2.0 alpha does not yet ship keras.optimizers as version 2, so you can't use optimizer.apply_gradients() there. Instead, import from tensorflow.python.keras.optimizer_v2.adam import Adam first.

  • example:

    from tensorflow.python.keras.optimizer_v2.adam import Adam
    optimizer = Adam(0.0001)

3. Keras Model Subclassing

In the current TF 2.0 (alpha), a subclassed Keras model can't use the methods save(), summary(), or fit(), nor save_weights() with the .h5 format.
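
A common workaround, sketched here rather than taken from this repository: save subclassed-model weights in the TF checkpoint format, which subclassed models do support (paths are illustrative; mt is the MusicTransformerDecoder instance built in the Generate Music section).

    # No .h5 suffix -> TF checkpoint format, which works for subclassed models.
    mt.save_weights('ckpt/my_model')
    mt.load_weights('ckpt/my_model')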

4. Distribute Training

Since distributed training (with 4 GPUs) was slower than using a single GPU, I trained this model on a single GPU. Nonetheless, if you want to see how to distribute training, see dist_train.py.

musictransformer-tensorflow2.0's People

Contributors

adamoudad, dependabot[bot], jason9693


musictransformer-tensorflow2.0's Issues

generate error

When I run generate.py, the following error happens:

    Exception ignored in: <bound method _CheckpointRestoreCoordinator.__del__ of <tensorflow.python.training.tracking.util._CheckpointRestoreCoordinator object at 0x7f30cc3c29e8>>
    Traceback (most recent call last):
      File "/home/anaconda3/envs/musictransformer/lib/python3.6/site-packages/tensorflow/python/training/tracking/util.py", line 240, in __del__
    TypeError: 'NoneType' object is not callable

train.py----ValueError: Arg specs do not match

    ValueError: Arg specs do not match: original=FullArgSpec(args=['x', 'y', 'name'], varargs=None, varkw=None, defaults=(None,), kwonlyargs=[], kwonlydefaults=None, annotations={'return': tensorflow.security.fuzzing.py.annotation_types.TensorFuzzingAnnotation[tensorflow.security.fuzzing.py.annotation_types.Bool], 'x': tensorflow.security.fuzzing.py.annotation_types.TensorFuzzingAnnotation[~TV_Greater_T], 'y': tensorflow.security.fuzzing.py.annotation_types.TensorFuzzingAnnotation[~TV_Greater_T]}), static=FullArgSpec(args=['x', 'y', 'name'], varargs=None, varkw=None, defaults=(None,), kwonlyargs=[], kwonlydefaults=None, annotations={}), fn=<function greater at 0x7fe44acf8700>

For Seq2seq training

What should I do to train seq2seq tasks with this?
In particular, how do I feed the source and target data to the model?

Multi-GPU Programming

Hello, thank you for your code contribution. Do you have any plans for multi-GPU parallel computing? I have encountered some problems when trying to adapt the code for parallel computing.

    Traceback (most recent call last):
      File "train.py", line 72, in <module>
        mt.compile(optimizer=opt, loss=callback.transformer_dist_train_loss)
      File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/tracking/base.py", line 456, in _method_wrapper
        result = method(self, *args, **kwargs)
      File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 263, in compile
        'We currently do not support distribution strategy with a '
    ValueError: We currently do not support distribution strategy with a Sequential model that is created without input_shape/input_dim set in its first layer or a subclassed model.

Project dependencies may have API risk issues

Hi, in MusicTransformer-tensorflow2.0, inappropriate dependency version constraints can introduce risks.

Below are the dependencies and version constraints that the project is using:

absl-py==0.7.1
alembic==1.0.11
appdirs==1.4.3
asciimatics==1.11.0
asn1crypto==0.24.0
astor==0.8.0
bc-dvc-init==0.3.0
boto3==1.9.115
botocore==1.12.180
certifi==2018.1.18
chardet==3.0.4
Click==7.0
cloudpickle==1.1.1
colorama==0.4.1
config==0.4.2
configobj==5.0.6
configparser==3.7.4
contextlib2==0.5.5
cryptography==2.3
databricks-cli==0.8.7
decorator==4.4.0
distro==1.4.0
docker==4.0.2
docutils==0.14
dvc==0.50.1
entrypoints==0.3
Flask==1.0.3
funcy==1.12
future==0.17.1
gast==0.2.2
git-url-parse==1.2.2
gitdb2==2.0.5
GitPython==2.1.11
google-pasta==0.1.6
grandalf==0.6
grpcio==1.21.1
gunicorn==19.9.0
h5py==2.9.0
humanize==0.5.1
idna==2.6
inflect==2.1.0
itsdangerous==1.1.0
Jinja2==2.10.1
jmespath==0.9.4
jsonpath-ng==1.4.3
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.0
keyring==10.6.0
keyrings.alt==3.0
Mako==1.0.12
Markdown==3.1.1
MarkupSafe==1.1.1
mido==1.2.9
mlflow==1.0.0
nanotime==0.5.2
networkx==2.3
numpy==1.16.4
pandas==0.24.2
pathspec==0.5.9
pbr==5.3.1
Pillow==6.2.0
ply==3.11
pretty-midi==0.2.8
progress==1.5
protobuf==3.8.0
psutil==5.6.6
pyasn1==0.4.5
pycrypto==2.6.1
pyfiglet==0.8.post1
pygobject==3.26.1
pyparsing==2.4.0
python-apt==1.6.4
python-dateutil==2.8.0
python-editor==1.0.4
pytz==2019.1
pyxdg==0.26
PyYAML==5.1.1
querystring-parser==1.2.3
requests==2.22.0
ruamel.yaml==0.15.97
s3transfer==0.2.1
schema==0.7.0
SecretStorage==2.3.1
shortuuid==0.5.0
simplejson==3.16.0
six==1.12.0
smmap2==2.0.5
SQLAlchemy==1.3.5
sqlparse==0.3.0
ssh-import-id==5.7
tabulate==0.8.3
tb-nightly==1.14.0a20190603
tensorflow-gpu==2.0.0b1
termcolor==1.1.0
tf-estimator-nightly==1.14.0.dev2019060501
tfp-nightly==0.8.0.dev20190807
treelib==1.5.5
urllib3==1.24.2
wcwidth==0.1.7
websocket-client==0.56.0
Werkzeug==0.15.4
wrapt==1.11.1
zc.lockfile==1.4

The version constraint == introduces a risk of dependency conflicts, because the dependency scope is too strict.
The version constraints "no upper bound" and * introduce a risk of missing-API errors, because the latest version of a dependency may remove some APIs.

After further analysis, in this project:
The version constraint of dependency Flask can be changed to >=0.11,<=0.12.5.
The version constraint of dependency future can be changed to >=0.12.0,<=0.18.2.
The version constraint of dependency networkx can be changed to >=2.0,<=2.8.4.
The version constraint of dependency pyasn1 can be changed to >=0.4.1,<=0.4.8.

The above suggestions can reduce dependency conflicts as much as possible,
and introduce the latest versions as much as possible without triggering errors in the project.

The current project invokes all of the following methods.

The calling methods from Flask:
json.load
json.dump
The calling methods from future:
datetime.datetime.now
The calling methods from networkx:
max
The calling methods from pyasn1:
open
The calling methods from all the other modules:
ControlSeq.feat_dims
TransformerLoss
self.TransformerLoss.super.__init__
argparse.ArgumentParser.parse_args
tensorflow.equal
preprocess_midi_files_under
preprocess_midi
NoteSeq.from_midi_file
Encoder
f.join
tensorflow.cast
DecoderLayer
predictions.y.metric.numpy
result.sequence.EventSeq.from_array.to_note_seq
sum
datetime.datetime.now
tf.argmax
itertools.chain
tf.executing_eagerly
tensorflow.python.keras.optimizer_v2.adam.Adam
model.MusicTransformer.train_on_batch
self.layernorm3
math.sqrt
self._set_metrics
data.Data.slide_seq2seq_batch
numpy.roll
pickle.dump
random.sample
c.numpy
kwargs.kargs.self.to_midi.write
numpy.append
es.to_array.EventSeq.from_array.to_note_seq.to_midi_file
predictions.tf.nn.softmax.out_tar.metric.numpy
os.makedirs
numpy.shape
numpy.float32
self.sanity_check
rga
self._distribution_strategy.experimental_run_v2
split_last_dimension
self.Encoder
model.MusicTransformerDecoder.compile
self.FFN_pre
NoteSeq.from_midi
MusicTransformer.__prepare_train_data
EventSeq.from_array
tf.transpose
tensorflow.constant
self.MusicTransformerDecoder.super.__init__
tensorflow.matmul
list.append
numpy.ones
RelativeGlobalAttention
PositionEmbeddingV2.__get_angles
self._skewing
self.TransformerLoss.super.call
pretty_midi.Note
MusicTransformer.save_weights
math.sin
tf.distribute.MirroredStrategy.scope
tensorflow.range
midi.instruments.append
self.__load_config
self.layernorm1
model.MusicTransformer.reset_metrics
model.MusicTransformerDecoder.generate
ControlSeq
result_metric.append
weights.append
math.log
copy.deepcopy
range
self.Encoder.super.__init__
self.Decoder.super.__init__
deprecated.sequence.EventSeq.from_array
self.rga
tf.summary.scalar
metric.reset_states
self.PositionEmbeddingV2.super.__init__
self._qe_masking
tf.ones
tensorflow.pad
min
tensorflow.add
progress.bar.Bar.iter
super
tensorflow.logical_not
self.CustomSchedule.super.__init__
EventSeq.from_note_seq.to_array
ctrl_seq_list.append
name.lower.lower
tf.nn.top_k
self.FFN_suf
d.items
tensorflow.math.minimum
math.exp
self.model.save
name.lower.endswith
path.split
numpy.ones.note_count.pitch_count.note_count.tolist
self.rga2
tape.gradient
Decoder
EventSeq.from_note_seq
utils.attention_image_summary
tensorflow.summary.image
ControlSeq.feat_dims.values
tf.concat.numpy
self.layernorm2
pickle.load
deprecated.sequence.EventSeq.from_array.to_note_seq
tensorflow.ones_like
self.fc
get_masked_with_pad_tensor
progress.bar.Bar
es_seq_list.append
self.Wk
tensorflow.expand_dims
tensorflow.python.keras.layers.Dropout
tensorflow.io.TFRecordWriter
enumerate
tf.reshape
data.Data
datetime.datetime.now.strftime
len
self.process_midi_from_dir
numpy.concatenate
DynamicPositionEmbedding
model.MusicTransformerDecoder
json.load
self.__prepare_train_data
max
tensorflow.argmax
i.self.enc_layers
tf.summary.image
self.EncoderLayer.super.__init__
collections.OrderedDict
EventSeq.feat_dims.values
self.notes.sort
tensorflow.sequence_mask
phist.np.array.astype
ControlSeq.feat_dims.items
tensorflow.train.BytesList
argparse.ArgumentParser.add_argument
model.MusicTransformer.evaluate
self.__dist_train_step
shape_list
utils.fill_with_placeholder
deprecated.sequence.EventSeq.feat_ranges
MusicTransformer.load_weights
note_events.append
i.self.dec_layers
print
metric
EventSeq.feat_ranges
res.append
tensorflow.reduce_max
self.Wq
controls.append
argparse.ArgumentParser
join
tensorflow.train.Int64List
self.CustomSchedule.super.get_config
eval
tensorflow.math.equal
model.MusicTransformerDecoder.evaluate
numpy.power
filter.append
tensorflow.python.keras.metrics.SparseCategoricalAccuracy
tensorflow.python.keras.layers.LayerNormalization
self.call
numpy.sin
tensorflow.math.mod
es.to_array.EventSeq.from_array.to_note_seq
tf.constant
self.batch
self.get_config
tf.summary.create_file_writer.as_default
isinstance
model.MusicTransformerDecoder.sanity_check
self.save_weights
tensorflow.math.rsqrt
tensorflow.ones
tf.summary.histogram
tf.reduce_mean.numpy
numpy.arange
midi_processor.processor.decode_midi
self.add_weight
model.MusicTransformerDecoder.train_on_batch
tensorflow.math.logical_not
NoteSeq
self._distribution_strategy.reduce
note_events.sort
MusicTransformer
self.Wv
self.dropout2
EventSeq
self.add_notes
_rel_pitch
p.split
numpy.array
tf.nn.softmax
numpy.searchsorted
midi_processor.processor.encode_midi
self.loss
open
self.load_weights
tensorflow.train.Feature
self.Decoder
int
deprecated.sequence.EventSeq.dim.events.events.all
p.grad.data.norm
model.MusicTransformer
tensorflow.transpose
utils.find_files_by_extensions
zip
numpy.zeros
self.MTFitCallback.super.__init__
self.pos_encoding
self.MusicTransformer.super.__init__
self.DecoderLayer.super.__init__
tensorflow.nn.softmax
utils.get_masked_with_pad_tensor
self._get_left_embedding
format
tensorflow.one_hot
os.walk
tf.cast
result_array.append
MusicTransformer.generate
tensorflow.nn.softmax_cross_entropy_with_logits
tensorflow.python.keras.layers.Dense
self.dropout1
EventSeq.feat_dims
self._get_seq
self._get_seq.append
tf.concat
Event
random.uniform
tf.name_scope
tensorflow.print
model.MusicTransformer.compile
event_seq.to_note_seq.to_midi_file
tensorflow.math.pow
self.__train_step
self.dropout
EventSeq.feat_ranges.items
tensorflow.summary.histogram
model.MusicTransformerDecoder.reset_metrics
r.numpy
predictions.tf.argmax.numpy
random.randrange
tf.summary.create_file_writer
tensorflow.einsum
self.optimizer.apply_gradients
numpy.cos
predictions.target.metric.numpy
deprecated.sequence.EventSeq.dim
EventSeq.get_velocity_bins
self.load_ckpt_file
pretty_midi.Instrument
EncoderLayer
str
numpy.uint8.ndens.np.array.reshape
self.embedding
custom.callback.CustomSchedule
tf.expand_dims
tensorflow.math.sqrt
pretty_midi.PrettyMIDI
tensorflow.reshape
filter
list
tensorflow.maximum
x.get_shape
x.get_shape.as_list
self.load_config_file
data.Data.seq2seq_batch
tensorflow.shape
tf.GradientTape
json.dump
os.path.join
tf.print
predictions.tf.nn.softmax.y.metric.numpy
super.__init__
result.sequence.EventSeq.from_array.to_note_seq.to_midi_file
model.MusicTransformerDecoder.save
EventSeq.dim
tensorflow_probability.distributions.Categorical
tf.distribute.MirroredStrategy
item.split.split
self.to_midi
array.astype
Control
tensorflow_probability.distributions.Categorical.sample
self.dropout3
tensorflow.concat
tensorflow.python.keras.layers.Embedding
os.path.exists
model.MusicTransformer.save
_has_ext
tf.reduce_mean
deprecated.sequence.ControlSeq.feat_ranges
EventSeq.feat_dims.items

@developer
Could you please help me check this issue?
May I open a pull request to fix it?
Thank you very much.

Cannot use custom midi

There is a processing error when I use my own collection of ~2k MIDI files. I tried changing the extension from .mid to .midi, but it still gives "��]q" as output. The MIDI dataset was generated with the wave2midi2wave method from Magenta.

Implement QKV logic in terms of einsum

You could reimplement the QKV / dense logic in terms of einsum for faster computation. An example layer is here and its use here. This is how it is now implemented in the TF2 version of BERT / Transformer.
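
A minimal sketch of what the suggestion could look like, assuming a hypothetical fused weight tensor rather than the repository's actual code:

    import tensorflow as tf

    # Fused QKV projection via a single einsum. x: (batch, L, d_model);
    # w: (3, d_model, heads, d_head) is a hypothetical weight holding the
    # Q, K and V projection matrices stacked along the first axis.
    def qkv_einsum(x, w):
        # Project with all three matrices at once and split the output
        # into heads in the same operation.
        qkv = tf.einsum('bld,pdhk->pblhk', x, w)
        q, k, v = qkv[0], qkv[1], qkv[2]   # each (batch, L, heads, d_head)
        return q, k, v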

error: ecomp_piano_downloader.sh

When I run the script in the title, I get the following error:

    No URLs found in -.
    rm: missing operand
    Try 'rm --help' for more information.
    rm: cannot remove '*': No such file or directory

Does anybody know how to fix this?

generate error

    Exception ignored in: <bound method _CheckpointRestoreCoordinatorDeleter.__del__ of <tensorflow.python.training.tracking.util._CheckpointRestoreCoordinatorDeleter object at 0x63ecb5470>>
    Traceback (most recent call last):
      File "/Users/lun/.conda/envs/test/lib/python3.6/site-packages/tensorflow_core/python/training/tracking/util.py", line 140, in __del__
    TypeError: 'NoneType' object is not callable

Unable to reproduce loss/accuracy result

Hi jason9693 and community,

Thanks for the wonderful implementation of Music Transformer! Really appreciate your work.

I have tried training the model with the preset hyperparameter values, but unfortunately I can't get the same results as claimed in the graphs shown in README.md.

I hereby attach my loss and accuracy graphs: by epoch 200 the loss has already converged around 5.0 and the accuracy around 0.04, not the 2.2 and 0.34 claimed in the graphs.

Loss:
[loss graph]

Accuracy:
[accuracy graph]

Would appreciate any advice from this community on measures to take to reproduce the results. Thank you =)

Is there a pretrained model?

Hello @jason9693 !

First of all, thank you for sharing this project under the MIT license.

Now, I'd like to ask if there are any pretrained models for this project of yours?

I'd appreciate your answer very much!
-Milos


Why not encoder?

I tried the encoder-decoder model and found that my loss doesn't converge. I wonder why the extra information (the encoder input) results in worse performance?

Vocabulary size mismatch

preprocess.py uses midi_processor.processor to encode my MIDI files (PIANO-E-COMPETITION), and its vocabulary size is 388, but sequence.py strips out the notes unused in my MIDI scores, so the vocabulary size in the model decreased to 243.
When I run train.py, it reports:

    tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0,1920] = 361 is not in [0, 243) [Op:ResourceGather] name: encoder/embedding/embedding_lookup/

So how do I deal with the vocabulary size mismatch?

Problem about the process of generate

Hello. Thank you for your contributions to this code. After I finished training the model, the loss decreased normally. But when I generate, following your suggestions, the model tends to output a series of identical numbers. I am confused about this. Could you please tell me something about it, or upload a pretrained model? Thank you very much.

I've got a problem about sh

Thank you for your code and the opportunity to see what's going on with Music Transformer!!

However, I'm a newbie to this field, so I have no idea how to download the MIDI dataset:

$ sh dataset/script/{ecomp_piano_downloader, midi_world_downloader, ...}.sh

I've tried several times, but I can't figure it out on my own.
Also:

$ python preprocess.py {midi_load_dir} {dataset_save_dir}

I imitated
https://si-partners.net/blog/music-transformer.html
but I still don't know how to deal with it.
Please help me, I feel suffocated!

TODO List

  • Resource-reduced model
  • Distribute strategy
  • Korean README
  • Preprocessing with method 1
  • Add sample audio URL
