davidadsp / gdl_code

The official code repository for examples in the O'Reilly book 'Generative Deep Learning'

License: GNU General Public License v3.0

Jupyter Notebook 95.22% Python 4.73% Shell 0.06%

gdl_code's Introduction

🦜 Generative Deep Learning

[NEW] 🚀 2nd edition codebase now live!

⚠️ This repository is no longer maintained as the codebase for the 2nd edition of Generative Deep Learning is now live!

Please head over to https://github.com/davidADSP/Generative_Deep_Learning_2nd_Edition to check it out.

Branches

The master branch of this repo contains the TensorFlow 1.14 code that is present in the original book (first edition).

The tensorflow_2 branch contains updated code that runs using TensorFlow 2.

However, I recommend switching over to the repository for the 2nd edition (see above), as it contains many new examples and improvements to the overall structure.

gdl_code's Issues

Could you please add the MIDI files generated on your machine?

So we can listen to the results.

03_01_autoencoder_train.ipynb and 03_03_vae_digits_train.ipynb both crash with GraphViz error

I set up the conda environment according to instructions; all notebooks in chapter 2 work, but as soon as I get to chapter 3 I'm stopped by this error in 03_01_autoencoder_train.ipynb and 03_03_vae_digits_train.ipynb:

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
~/anaconda3/envs/generative/lib/python3.6/site-packages/pydot.py in create(self, prog, format, encoding)
   1914                 arguments=arguments,
-> 1915                 working_dir=tmp_dir,
   1916             )

~/anaconda3/envs/generative/lib/python3.6/site-packages/pydot.py in call_graphviz(program, arguments, working_dir, **kwargs)
    135         stdout=subprocess.PIPE,
--> 136         **kwargs
    137     )

~/anaconda3/envs/generative/lib/python3.6/subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors)
    728                                 errread, errwrite,
--> 729                                 restore_signals, start_new_session)
    730         except:

~/anaconda3/envs/generative/lib/python3.6/subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, restore_signals, start_new_session)
   1363                             err_msg += ': ' + repr(err_filename)
-> 1364                     raise child_exception_type(errno_num, err_msg, err_filename)
   1365                 raise child_exception_type(err_msg)

FileNotFoundError: [Errno 2] No such file or directory: 'dot': 'dot'

During handling of the above exception, another exception occurred:

FileNotFoundError                         Traceback (most recent call last)
~/anaconda3/envs/generative/lib/python3.6/site-packages/keras/utils/vis_utils.py in _check_pydot()
     25         # to check the pydot/graphviz installation.
---> 26         pydot.Dot.create(pydot.Dot())
     27     except OSError:

~/anaconda3/envs/generative/lib/python3.6/site-packages/pydot.py in create(self, prog, format, encoding)
   1921                     prog=prog)
-> 1922                 raise OSError(*args)
   1923             else:

FileNotFoundError: [Errno 2] "dot" not found in path.

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
<ipython-input-4-2cbc3320019b> in <module>
     11 
     12 if MODE == 'build':
---> 13     AE.save(RUN_FOLDER)
     14 else:
     15     AE.load_weights(os.path.join(RUN_FOLDER, 'weights/weights.h5'))

~/repo/GDL_code/models/AE.py in save(self, folder)
    153                 ], f)
    154 
--> 155         self.plot_model(folder)
    156 
    157 

~/repo/GDL_code/models/AE.py in plot_model(self, run_folder)
    182 
    183     def plot_model(self, run_folder):
--> 184         plot_model(self.model, to_file=os.path.join(run_folder ,'viz/model.png'), show_shapes = True, show_layer_names = True)
    185         plot_model(self.encoder, to_file=os.path.join(run_folder ,'viz/encoder.png'), show_shapes = True, show_layer_names = True)
    186         plot_model(self.decoder, to_file=os.path.join(run_folder ,'viz/decoder.png'), show_shapes = True, show_layer_names = True)

~/anaconda3/envs/generative/lib/python3.6/site-packages/keras/utils/vis_utils.py in plot_model(model, to_file, show_shapes, show_layer_names, rankdir)
    130             'LR' creates a horizontal plot.
    131     """
--> 132     dot = model_to_dot(model, show_shapes, show_layer_names, rankdir)
    133     _, extension = os.path.splitext(to_file)
    134     if not extension:

~/anaconda3/envs/generative/lib/python3.6/site-packages/keras/utils/vis_utils.py in model_to_dot(model, show_shapes, show_layer_names, rankdir)
     53     from ..models import Sequential
     54 
---> 55     _check_pydot()
     56     dot = pydot.Dot()
     57     dot.set('rankdir', rankdir)

~/anaconda3/envs/generative/lib/python3.6/site-packages/keras/utils/vis_utils.py in _check_pydot()
     27     except OSError:
     28         raise OSError(
---> 29             '`pydot` failed to call GraphViz.'
     30             'Please install GraphViz (https://www.graphviz.org/) '
     31             'and ensure that its executables are in the $PATH.')

OSError: `pydot` failed to call GraphViz.Please install GraphViz (https://www.graphviz.org/) and ensure that its executables are in the $PATH.

I did some searching and people are saying to fix it by running conda install graphviz followed by conda install python-graphviz, but I tried that and it doesn't change anything. The book doesn't mention GraphViz anywhere.
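
The traceback ends with pydot failing to launch the dot executable, so one quick check is whether dot is visible on the PATH of the Python process that runs the notebook. The following is a minimal sketch using only the standard library; the helper name find_graphviz_dot is illustrative and not part of the repository.

```python
import shutil


def find_graphviz_dot(executable="dot"):
    """Return the full path of the GraphViz executable, or None if it is not on PATH."""
    return shutil.which(executable)


dot_path = find_graphviz_dot()
if dot_path is None:
    print("'dot' is not on PATH for this Python process; pydot cannot call GraphViz.")
else:
    print("Found GraphViz 'dot' at", dot_path)
```

If this reports that dot is missing inside the notebook while dot works in a terminal, the kernel may be running in a different environment from the one where GraphViz was installed. That is a possibility to rule out, not a confirmed diagnosis.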

Question: 05_01_cyclegan_train extremely slow (TF 2.0)

Hi there,

Training the 05_01_cyclegan notebook (apple2orange) is taking me ~30 hours on a system with an RTX 2060 Super.
Is anyone else having similar (large) training times?

I tried Google Colab - it runs, but it takes so long it just times out.

What are your experiences and thoughts?

Thanks!

CycleGan error 'ListWrapper' object has no attribute 'name'

When executing the line that creates the GAN, I receive the following error:

AttributeError Traceback (most recent call last)
in
12
13 if mode == 'build':
---> 14 gan.save(RUN_FOLDER)
15 else:
16 gan.load_weights(os.path.join(RUN_FOLDER, 'weights/weights.h5'))

~/Projects/GDL_code/models/cycleGAN.py in save(self, folder)
410 ], f)
411
--> 412 self.plot_model(folder)
413
414

~/Projects/GDL_code/models/cycleGAN.py in plot_model(self, run_folder)
388
389 def plot_model(self, run_folder):
--> 390 plot_model(self.combined, to_file=os.path.join(run_folder ,'viz/combined.png'), show_shapes = True, show_layer_names = True)
391 plot_model(self.d_A, to_file=os.path.join(run_folder ,'viz/d_A.png'), show_shapes = True, show_layer_names = True)
392 plot_model(self.d_B, to_file=os.path.join(run_folder ,'viz/d_B.png'), show_shapes = True, show_layer_names = True)

/usr/lib/python3.8/site-packages/tensorflow/python/keras/utils/vis_utils.py in plot_model(model, to_file, show_shapes, show_layer_names, rankdir, expand_nested, dpi)
276 This enables in-line display of the model plots in notebooks.
277 """
--> 278 dot = model_to_dot(model,
279 show_shapes=show_shapes,
280 show_layer_names=show_layer_names,

/usr/lib/python3.8/site-packages/tensorflow/python/keras/utils/vis_utils.py in model_to_dot(model, show_shapes, show_layer_names, rankdir, expand_nested, dpi, subgraph)
141
142 # Append a wrapped layer's label to node's label, if it exists.
--> 143 layer_name = layer.name
144 class_name = layer.__class__.__name__
145

AttributeError: 'ListWrapper' object has no attribute 'name'

some tiny issues in GAN

Hey great read and great code.
I stumbled on some tiny issues when running the GAN:
in def train_discriminator
the:

else:
            idx = np.random.randint(0, x_train.shape[0], batch_size)
            true_imgs = x_train[idx]

should be

else:
            idx = np.random.randint(0, x_train[0].shape[0], batch_size)
            true_imgs = x_train[0][idx]

as it comes back with x at [0] and y at [1]
when running the train I had to use:

graph = tf.get_default_graph()
with graph.as_default():
     gan.train(........

as it was complaining about tensors not being part of the graph. That seems to be a Keras issue.

How can I start the dockerfile?

When I start the dockerfile, I get the error:

Unable to find image 'gdl-image:latest' locally
docker: Error response from daemon: pull access denied for gdl-image, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.

No module named 'models.MuseGAN_old'

What a nice repo! And an excellent book!

When running 07_05_musegan_analysis I get the following error. Toggling the comment on both lines helps. Is this the way to go?

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-1-582e35342597> in <module>()
      7 from music21 import converter
      8 
----> 9 from models.MuseGAN_old import MuseGAN
     10 # from models.MuseGAN import MuseGAN
     11 

ModuleNotFoundError: No module named 'models.MuseGAN_old'

02_01: Sequential and Functional variants do not give the same summary

Hi David,

thanks a lot for the book, which I very much enjoy.

I noticed that the Functional example in 02_01 and the corresponding Sequential one (only given in the book) do not give the same model summary. Is that expected?

Functional:

input_layer = Input((32,32,3))

x = Flatten()(input_layer)

x = Dense(200, activation = 'relu')(x)
x = Dense(150, activation = 'relu')(x)

output_layer = Dense(NUM_CLASSES, activation = 'softmax')(x)

model = Model(input_layer, output_layer)

model.summary()

Output:

Model: "model_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_9 (InputLayer)         (None, 32, 32, 3)         0         
_________________________________________________________________
flatten_15 (Flatten)         (None, 3072)              0         
_________________________________________________________________
dense_48 (Dense)             (None, 200)               614600    
_________________________________________________________________
dense_49 (Dense)             (None, 150)               30150     
_________________________________________________________________
dense_50 (Dense)             (None, 10)                1510      
=================================================================
Total params: 646,260
Trainable params: 646,260
Non-trainable params: 0

Sequential:

model = Sequential([
    Dense(200, activation="relu", input_shape=(32, 32, 3)),
    Flatten(),
    Dense(150, activation="relu"),
    Dense(10, activation="softmax")
])

model.summary()

Output:

Model: "sequential_11"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_45 (Dense)             (None, 32, 32, 200)       800       
_________________________________________________________________
flatten_14 (Flatten)         (None, 204800)            0         
_________________________________________________________________
dense_46 (Dense)             (None, 150)               30720150  
_________________________________________________________________
dense_47 (Dense)             (None, 10)                1510      
=================================================================
Total params: 30,722,460
Trainable params: 30,722,460
Non-trainable params: 0

Kind regards,
Axel
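
The two summaries differ because the Sequential version applies the first Dense layer before Flatten, while the Functional version flattens first. A Dense layer acts only on the last axis of its input, so the parameter counts in both summaries can be reproduced by hand. The sketch below is plain arithmetic; the helper dense_params is illustrative and not part of Keras.

```python
def dense_params(in_features, units):
    """Weights plus biases of a Dense layer acting on the last axis."""
    return in_features * units + units


# Functional version: Flatten first, so the first Dense sees 32*32*3 = 3072 features.
functional = [
    dense_params(32 * 32 * 3, 200),
    dense_params(200, 150),
    dense_params(150, 10),
]

# Sequential version as posted: the first Dense sees only the 3 channels,
# and Flatten then produces 32*32*200 = 204800 features for the next Dense.
sequential = [
    dense_params(3, 200),
    dense_params(32 * 32 * 200, 150),
    dense_params(150, 10),
]

print(functional, sum(functional))
print(sequential, sum(sequential))
```

Moving Flatten to the front of the Sequential list (and putting the input shape on it) would make the layer order match the Functional example.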

pickle error in GAN.py save_model function

Has anyone else run into this pickle error?

I am using tensorflow-gpu 2.0.0 installed from PyPI.

~/GDL_code/models/GAN.py in save_model(self, run_folder)
    332         self.discriminator.save(os.path.join(run_folder, 'discriminator.h5'))
    333         self.generator.save(os.path.join(run_folder, 'generator.h5'))
--> 334         pkl.dump(self, open( os.path.join(run_folder, "obj.pkl"), "wb" ))
    335 
    336     def load_weights(self, filepath):

TypeError: can't pickle _thread.RLock objects

Do you know where it comes from?
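
The error message says that something reachable from the pickled object holds a thread lock, which pickle cannot serialise. Which attribute of the GAN object is responsible is not stated in the thread. The sketch below only reproduces the error class and shows one common workaround, dropping the offending attribute in __getstate__; the class Holder and its attributes are hypothetical.

```python
import pickle
import threading


class Holder:
    """Hypothetical object that owns an unpicklable attribute."""

    def __init__(self):
        self.name = "gan"
        self.lock = threading.RLock()  # stands in for whatever holds the lock

    def __getstate__(self):
        # Copy the instance dict and leave out what pickle cannot handle.
        state = self.__dict__.copy()
        state.pop("lock", None)
        return state

    def __setstate__(self, state):
        # Restore the plain attributes and recreate the lock.
        self.__dict__.update(state)
        self.lock = threading.RLock()


try:
    pickle.dumps(threading.RLock())
    lock_error = None
except TypeError as exc:
    lock_error = exc

restored = pickle.loads(pickle.dumps(Holder()))
print(type(lock_error).__name__, restored.name)
```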

utils module

Hi,
I'm trying to execute the notebooks as I read the book on the O'Reilly platform, but the utils folder (which corresponds to a module used in chapter 3) is empty.
Is it too early to try executing the notebooks?
Thanks.

WGAN-GP implementation incompatible with TensorFlow 2.x

FWIW the WGAN-GP implementation in this repo, as the issue title suggests, breaks when you try to use TensorFlow 2.x. When you try to use the current implementation K.gradients() returns None when computing the gradient penalty loss.

I am not sure if this book is intentionally restricted to TF 1.x, but it might be nice to mention this breakage somewhere?

There's an implementation of a WGAN-GP using Keras and TensorFlow here (this code has the WGAN-GP extend the keras.models.Model class but I don't think that is actually necessary).

any model weights available?

It seems Colab cannot load all of the celeb pictures from Google Drive; it always times out. Are trained weights available for the different models, to ease the training step and especially the data loading?

ValueError: operands could not be broadcast together with shapes (32,) (7,) (32,)

I'm getting the following error at the end of the first epoch of the 03_05_vae_faces_train notebook. Training runs fine throughout the first epoch and fails at the transition to epoch 2.

ValueError: operands could not be broadcast together with shapes (32,) (7,) (32,)

- tensorflow-gpu 2
- Ubuntu 18.04
- NVIDIA 1050Ti

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in
5 , run_folder = RUN_FOLDER
6 , print_every_n_batches = PRINT_EVERY_N_BATCHES
----> 7 , initial_epoch = INITIAL_EPOCH
8 )

~/Documents/Generative/tf2/GDL_code/models/VAE.py in train_with_generator(self, data_flow, epochs, steps_per_epoch, run_folder, print_every_n_batches, initial_epoch, lr_decay)
249 , initial_epoch = initial_epoch
250 , callbacks = callbacks_list
--> 251 , steps_per_epoch=steps_per_epoch
252 )
253

~/anaconda3/envs/generativetf2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py in _method_wrapper(self, *args, **kwargs)
64 def _method_wrapper(self, *args, **kwargs):
65 if not self._in_multi_worker_mode(): # pylint: disable=protected-access
---> 66 return method(self, *args, **kwargs)
67
68 # Running inside run_distribute_coordinator already.

~/anaconda3/envs/generativetf2/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
853 context.async_wait()
854 logs = tmp_logs # No error, now safe to assign to logs.
--> 855 callbacks.on_train_batch_end(step, logs)
856 epoch_logs = copy.copy(logs)
857

~/anaconda3/envs/generativetf2/lib/python3.6/site-packages/tensorflow/python/keras/callbacks.py in on_train_batch_end(self, batch, logs)
388 if self._should_call_train_batch_hooks:
389 logs = self._process_logs(logs)
--> 390 self._call_batch_hook(ModeKeys.TRAIN, 'end', batch, logs=logs)
391
392 def on_test_batch_begin(self, batch, logs=None):

~/anaconda3/envs/generativetf2/lib/python3.6/site-packages/tensorflow/python/keras/callbacks.py in _call_batch_hook(self, mode, hook, batch, logs)
296 for callback in self.callbacks:
297 batch_hook = getattr(callback, hook_name)
--> 298 batch_hook(batch, logs)
299 self._delta_ts[hook_name].append(time.time() - t_before_callbacks)
300

~/anaconda3/envs/generativetf2/lib/python3.6/site-packages/tensorflow/python/keras/callbacks.py in on_train_batch_end(self, batch, logs)
882
883 def on_train_batch_end(self, batch, logs=None):
--> 884 self._batch_update_progbar(logs)
885
886 def on_test_batch_end(self, batch, logs=None):

~/anaconda3/envs/generativetf2/lib/python3.6/site-packages/tensorflow/python/keras/callbacks.py in _batch_update_progbar(self, logs)
926 add_seen = num_steps if self.use_steps else num_steps * batch_size
927 self.seen += add_seen
--> 928 self.progbar.update(self.seen, list(logs.items()), finalize=False)
929
930 def _finalize_progbar(self, logs):

~/anaconda3/envs/generativetf2/lib/python3.6/site-packages/tensorflow/python/keras/utils/generic_utils.py in update(self, current, values, finalize)
570 self._values[k] = [v * value_base, value_base]
571 else:
--> 572 self._values[k][0] += v * value_base
573 self._values[k][1] += value_base
574 else:

ValueError: operands could not be broadcast together with shapes (32,) (7,) (32,)
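
The shapes in the message suggest that a vector of 32 values is being updated in place with a vector of 7 values. That would happen if the logged loss is a per-sample vector rather than a scalar and the last batch of the epoch holds only 7 images. This reading is an assumption and is not confirmed in the thread. The NumPy sketch below reproduces the same in-place pattern as the progress-bar line in the traceback and shows that reducing each batch to a scalar first avoids the mismatch.

```python
import numpy as np

BATCH_SIZE = 32       # taken from the shapes in the error message
LAST_BATCH_SIZE = 7   # taken from the shapes in the error message

running_total = np.zeros(BATCH_SIZE)          # per-sample values accumulated from full batches
last_batch_values = np.ones(LAST_BATCH_SIZE)  # per-sample values from the short final batch

try:
    running_total += last_batch_values        # in-place add with mismatched shapes
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Reducing each batch to a single number before accumulating works for any batch size.
scalar_total = 0.0
for batch_values in (np.ones(BATCH_SIZE), last_batch_values):
    scalar_total += float(np.mean(batch_values))
```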

GAN.py does not work when using tf.keras instead of the standalone keras

I get:
InvalidArgumentError: You must feed a value for placeholder tensor 'discriminator_input' with dtype float and shape [?,28,28,1]
[[{{node discriminator_input}}]]
[[{{node loss_1/model_loss/broadcast_weights/assert_broadcastable/is_valid_shape/has_valid_nonscalar_shape/has_invalid_dims/concat}}]]

I have not been able to trace this down other than it is in the generator model.

I ran into some errors when installing requirements.txt

Hello.
I need some help.
I got the following error messages when installing requirements.txt.

I look forward to your reply with a solution.

Help trying to adapt attention model to my data.

My code is almost a copy paste of the attention model. Even though the original code/data works fine, when I tweak it a bit for my data, it doesn't.

While this code works with music notation, my data consists of very small images (5 by 5 pixels). And they already have values between 0 and 1.

My input has a shape of 257000, 240, 50, so my sequences are 240 long, and I am concatenating and flattening two 5x5 images to get 50 points (I know this is not the best strategy, but this is only the first try). The output is 257000, 25, so just one of the images. The idea is to input sequences of pairs of images and output the next image. This code works well, and produces nice results, when using stacked LSTMs.

My code for attention, following the link before, is as follows:

def create_network(n_in, embed_size = 100, rnn_units = 256, use_attention = True):
    """ create the structure of the neural network """

    inputs = Input(shape = (n_in.shape[1],n_in.shape[2]))

    # we will use a dense layer as embedding
    x = Dense(embed_size, activation='relu')(inputs)

    x = LSTM(rnn_units, return_sequences=True)(x)

    if use_attention:

        x = LSTM(rnn_units, return_sequences=True)(x)

        e = Dense(1, activation='tanh')(x)
        e = Reshape([-1])(e)
        alpha = Activation('softmax')(e)

        alpha_repeated = Permute([2, 1])(RepeatVector(rnn_units)(alpha))

        c = Multiply()([x, alpha_repeated])
        c = Lambda(lambda xin: K.sum(xin, axis=1), output_shape=(rnn_units,))(c)
    
    else:
        c = LSTM(rnn_units)(x)
                                    
    bz_out = Dense(25, activation = 'relu', name = 'gen_oscs')(c)
   
    model = Model(inputs, bz_out)
    

    if use_attention:
        att_model = Model(inputs, alpha)
    else:
        att_model = None

    opti = RMSprop(lr = 0.001)
    model.compile(loss='mae', optimizer=opti)

    return model, att_model

And my code to train the network:


def trainRNNGen(model, generator):

    randomize = np.arange( len(generator) - 1 ) # remove last post we filled with 0s
    np.random.shuffle(randomize)
    trainLimit = int( 0.9*len(randomize) )
    valsteps = int( 0.1*len(randomize) )

    folderpath = "/home/juanma/data/RNN_BZ/RNN_weights/"
    filepath = folderpath+"weights-{epoch:03d}-{loss:.4f}-{val_loss:.4f}.hdf5"    

    checkpoint1 = ModelCheckpoint(
        filepath, monitor='loss',
        verbose=0,
        save_best_only=True,
        period=5,
        mode='min'
    )

    checkpoint2 = ModelCheckpoint(
        os.path.join(folderpath, "weights.h5"),
        monitor='loss',
        verbose=0,
        save_best_only=True,
        mode='min'
    )

    early_stopping = EarlyStopping(
        monitor='loss'
        , restore_best_weights=True
        , patience = 100
    )

    callbacks_list = [
        checkpoint1
        , checkpoint2
        , early_stopping
     ]

    model.save_weights(os.path.join(folderpath, "weights.h5"))
    model.fit(x = customGenerator(generator, randomize[:trainLimit]), y = None,
        validation_data = customGenerator(generator, randomize[trainLimit:]),
        epochs=1000, steps_per_epoch = trainLimit, 
        validation_steps =  valsteps, 
        use_multiprocessing = False, callbacks=callbacks_list)

    return model

When I run both these functions on my dataset with use_attention set to False, so that the NN is just stacked LSTMs, it works fine and the loss goes down. But when I set use_attention to True, it does not learn anything, and the loss does not go down even in the first iterations.

I think the attention model somehow is destroying the data, but at the moment I have no idea how.
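
To inspect what the attention branch does to the data, it can help to write out the same computation in NumPy and look at the shapes and the attention weights directly. The sketch below mirrors the Dense(1, tanh), Reshape, softmax, Multiply and sum steps of create_network. The function name attention_pool, the random inputs and the weight vector w are hypothetical stand-ins, not values from the model.

```python
import numpy as np


def attention_pool(x, w, b=0.0):
    """x: (batch, timesteps, units); w: (units,). Returns (context, alpha)."""
    e = np.tanh(x @ w + b)                    # Dense(1, tanh) + Reshape -> (batch, timesteps)
    e = e - e.max(axis=1, keepdims=True)      # shift for numerical stability
    weights = np.exp(e)
    alpha = weights / weights.sum(axis=1, keepdims=True)   # softmax over timesteps
    context = (x * alpha[:, :, None]).sum(axis=1)           # weighted sum -> (batch, units)
    return context, alpha


rng = np.random.default_rng(0)
x = rng.normal(size=(2, 240, 256))   # hypothetical LSTM outputs: 2 sequences, 240 steps, 256 units
w = rng.normal(size=256) * 0.01      # hypothetical attention weights
context, alpha = attention_pool(x, w)

print(context.shape, alpha.shape, alpha.max(), alpha.min())
```

One property worth checking on real data: because tanh bounds the scores to [-1, 1], no attention weight can exceed another by more than a factor of e^2. Over 240 timesteps the weights therefore stay fairly close to uniform, and the context vector behaves much like an average over time. Whether that explains the flat loss here is not established.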

Error: MuseGan.py

The import from keras.layers.merge import _Merge on the second line of MuseGan.py fails.
It seems that _Merge is no longer supported in Keras.

Will you update this file?

VAE Analysis - Faces dataset

ValueError: Invalid class_mode: other; expected one of: {'sparse', 'binary', 'raw', 'categorical', None, 'input', 'multi_output'}

Question about WGAN-GP and 0 dummy vector as output

Hi, this is a question about a mismatch between the book and the code. Page 125, point 7, says "The model has three outputs: 1..., -1... and a dummy 0 vector."

But then in the code (example 4.10) the third output is validity_interpolated, which is not a 0 vector.

Is this a mistake, or is there something I don't understand?

Thanks for your help

Error : def load_music() in utils/loaders.py

def load_music(data_name, filename, n_bars, n_steps_per_bar):
    file = os.path.join("./data", data_name, filename)

    with np.load(file, encoding='bytes', allow_pickle=True) as f:       #Fix this line.
        data = f['train']

You have to add "allow_pickle=True" to use this function.
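
The reason this flag matters is that recent NumPy versions refuse to unpickle object arrays unless allow_pickle=True is passed to np.load. The sketch below reproduces that behaviour with an in-memory archive; the array contents are a hypothetical stand-in for the music data, and encoding='bytes' from the original call is left out because the stand-in does not need it.

```python
import io

import numpy as np

# Hypothetical stand-in for the music archive: an object array is stored via pickle.
buffer = io.BytesIO()
np.savez(buffer, train=np.array([[1, 2, 3], [4, 5]], dtype=object))

# Loading with the default settings fails once the object array is accessed.
buffer.seek(0)
try:
    with np.load(buffer) as f:
        f["train"]
    default_load_error = None
except ValueError as exc:
    default_load_error = str(exc)

# Loading with allow_pickle=True succeeds.
buffer.seek(0)
with np.load(buffer, allow_pickle=True) as f:
    data = f["train"]

print(default_load_error)
print(data)
```

Since allow_pickle=True executes pickled content, it is best used only on archives from a trusted source.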

Too many epochs in training LSTM for Music generation

In the 7-2 notebook, why is the default number of epochs set to 200000?

Some problems with libproj library when loading

I got the GDL source to compile on a fairly plain-vanilla Ubuntu 18.04 system, but only after solving some loader problems with unresolved names, in particular "pj_init_plus" --- this name appears in "projections.cpp", and that routine does seem to have some conditional compilation commands that circumvent the appearance of "pj_init_plus", but I couldn't get those to work.

What did work was removing all versions of "libproj" and "libgeotiff" and associated files from my system, then going to github and getting the source code for the current versions, compiling and installing "libproj" from proj-7.1.0.tar.gz then compiling and installing "libgeotiff" from libgeotiff-master.zip, and finally compiling GDL. "pj_init_plus" is provided in recent versions of libproj, and the current version of libgeotiff depends on it (and not the deprecated "pj_init"). I got libproj.so.19, whereas the standard version for Ubuntu 18.04 is libproj.so.12

Could you explain what this line does in load_music()?

    if n_bars * n_steps_per_bar < x.shape[0]:
        data_ints.append(x[counter:(counter + (n_bars * n_steps_per_bar)),:])
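
Read on its own, the condition keeps only pieces that have more time steps than one training example needs (n_bars * n_steps_per_bar), and the append takes a window of exactly that many rows starting at counter. The sketch below runs the same two lines on made-up arrays; the values of n_bars, n_steps_per_bar and counter and the array shapes are hypothetical.

```python
import numpy as np

n_bars, n_steps_per_bar = 2, 16        # hypothetical values
window = n_bars * n_steps_per_bar      # 32 time steps per training example

x = np.arange(40 * 4).reshape(40, 4)   # hypothetical piece: 40 time steps, 4 parts
counter = 4                            # hypothetical start offset into the piece

data_ints = []
if n_bars * n_steps_per_bar < x.shape[0]:
    # Rows counter .. counter + window - 1, all columns.
    data_ints.append(x[counter:(counter + (n_bars * n_steps_per_bar)), :])

# A piece with only 20 time steps is too short and would be skipped.
short_piece = np.zeros((20, 4))
short_piece_kept = n_bars * n_steps_per_bar < short_piece.shape[0]

print(data_ints[0].shape, short_piece_kept)
```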

Problem with the 03_03_vae_digits_train.ipynb

Thanks for the book. When running this notebook the program fails on the training step

TypeError: An op outside of the function building code is being passed a "Graph" tensor. It is possible to have Graph tensors leak out of the function building context by including a tf.init_scope in your function building code.
For example, the following function will fail:
  @tf.function
  def has_init_scope():
    my_constant = tf.constant(1.)
    with tf.init_scope():
      added = my_constant * 2
The graph tensor has name: log_var/Identity:0

As far as I understand, the issue lies somewhere in the code below; however, being a novice in TensorFlow, I am not able to work out how to resolve it.

def sampling(args):
    mu, log_var = args
    epsilon = K.random_normal(shape=K.shape(mu), mean=0., stddev=1.)
    return mu + K.exp(log_var / 2) * epsilon

encoder_output = Lambda(sampling, name='encoder_output')([self.mu, self.log_var])

I would appreciate any help.

AttributeError: 'str' object has no attribute 'shape' in 07_04 Musegan notebook

/content/drive/My Drive/mommy/GDL_code/utils/loaders.py in load_music(data_name, filename, n_bars, n_steps_per_bar)
269 counter += 4
270
--> 271 if n_bars * n_steps_per_bar < x.shape[0]:
272 data_ints.append(x[counter:(counter + (n_bars * n_steps_per_bar)),:])
273

AttributeError: 'str' object has no attribute 'shape'

What could be the problem?

The datasets are missing

Several datasets are missing, such as ./data/celeb/ or the GAN one, and I cannot find instructions for downloading them in the book or the README.

any hints?

There's no legend to explain which color corresponds to what in 07_04_musegan_train.ipynb

The final figure plots how various components of losses change throughout epochs, but there's no legend, and therefore it's difficult to understand what black, green, red, and orange lines correspond to.

A simple legend that shows the meaning of each color will make this figure much easier to interpret.

DOUBT: Did not understand the concept of 'models'.

Please note that I am not a very good programmer. I understand that my doubt might be silly to a few people.

Why has the author made a separate file for models? [If I am not wrong, 'models' is not actually a well-known package but a file that contains a lot of other python files that contain classes we need to perform the task. Considering, I am still on Chapter 3, models' folder's AE.py file has the class Autoencoder, which has all the custom functions required to build the model].
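
As the question itself guesses, a folder of .py files next to the notebooks can be imported like any installed package, which lets every notebook reuse the same class instead of redefining it. The sketch below builds a throwaway folder with the same layout and imports a class from it. The folder name demo_models and the stub class are hypothetical, chosen so as not to clash with the real models folder.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as repo_root:
    # Lay out repo_root/demo_models/AE.py, mirroring a models/AE.py style layout.
    package_dir = os.path.join(repo_root, "demo_models")
    os.makedirs(package_dir)
    open(os.path.join(package_dir, "__init__.py"), "w").close()
    with open(os.path.join(package_dir, "AE.py"), "w") as f:
        f.write(
            "class Autoencoder:\n"
            "    def describe(self):\n"
            "        return 'autoencoder stub'\n"
        )

    # A notebook started in repo_root would find the folder automatically;
    # here the folder is added to the import path by hand.
    sys.path.insert(0, repo_root)
    importlib.invalidate_caches()
    try:
        module = importlib.import_module("demo_models.AE")
        description = module.Autoencoder().describe()
    finally:
        sys.path.remove(repo_root)

print(description)
```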

How can I make my own dataset for MuseGAN?

using tensorflow2 branch getting error

set_session is not available when using TensorFlow 2.0

Error installing requirements.txt

Hi I've read halfway through your awesome book and now want to start diving into the code.
Unfortunately, I cannot get past setting things up.
After running 'pip install requirements' in the virtualenv i made, the install is aborted and the following errors appear:

ERROR: tensorflow-tensorboard 0.4.0 has requirement bleach==1.5.0, but you'll have bleach 3.1.0 which is incompatible.
ERROR: tensorboard 1.14.0 has requirement setuptools>=41.0.0, but you'll have setuptools 39.0.1 which is incompatible.

I have the feeling the interdependencies of the required modules are broken... could you check it out and fix the requirements.txt accordingly?

Thank you.

Why do you leave out batch norm on the first iteration of creating the discriminator for GAN.py?

I'm curious why you left out batch normalization on iteration 0.

https://github.com/davidADSP/GDL_code/blob/master/models/GAN.py#L105

File name error in 07_05_musegan_analysis.ipynb, because RUN_ID is different from 07_04_musegan_train.ipynb

File name error in 07_05_musegan_analysis.ipynb, because RUN_ID is different from 07_04_musegan_train.ipynb:

Unable to open file (unable to open file: name = 'run/compose/0016_chorales/weights/weights-g.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

That's because the RUN_ID in 07_04_musegan_train.ipynb differs from the one in 07_05_musegan_analysis.ipynb.
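
The path in the error message appears to be assembled from a section name, a run ID and a dataset name, so a different run ID in the analysis notebook points at a folder the training notebook never wrote. The sketch below rebuilds the path under that assumption and checks whether the file exists. The function weights_path and its parameter names are illustrative, inferred from the error message rather than copied from the notebooks.

```python
import os


def weights_path(section, run_id, data_name, filename="weights-g.h5"):
    """Build run/<section>/<run_id>_<data_name>/weights/<filename>."""
    run_folder = os.path.join("run", section, "{}_{}".format(run_id, data_name))
    return os.path.join(run_folder, "weights", filename)


path = weights_path("compose", "0016", "chorales")
status = "exists" if os.path.exists(path) else "is missing"
print(path, status)
```

Running this check with the run ID from each notebook shows quickly which of the two folders actually contains the saved weights.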

magnitude of loss values when training the variational autoencoder

Hey there - thanks for writing such a great book, and releasing the code! I'm looking forward to your next book on reinforcement learning too!

I trained your variational autoencoder (vae) and was a bit surprised at the magnitude of the losses when running the vae, which was much higher than the standard autoencoder. Could you please post your loss data, so I could compare it to mine.

This is from my run:

Train on 48000 samples, validate on 12000 samples
Epoch 1/500
   32/48000 [..............................] - ETA: 50:31 - loss: 231.1299 - vae_r_loss: 231.1293 - vae_kl_loss: 5.8350e-04
WARNING: Logging before flag parsing goes to stderr.
W0902 23:57:59.011679 139854072612672 callbacks.py:243] Method (on_train_batch_end) is slow compared to the batch update (0.172191). Check your callbacks.
47840/48000 [============================>.] - ETA: 0s - loss: 59.9847 - vae_r_loss: 56.8435 - vae_kl_loss: 3.1413
Epoch 00001: val_loss improved from inf to 53.52105, saving model to /models/VariationalAutoencoder/model_checkpoint.h5
48000/48000 [==============================] - 17s 364us/sample - loss: 59.9680 - vae_r_loss: 56.8250 - vae_kl_loss: 3.1430 - val_loss: 53.5210 - val_vae_r_loss: 49.5327 - val_vae_kl_loss: 3.9883
Epoch 2/500
47872/48000 [============================>.] - ETA: 0s - loss: 52.4285 - vae_r_loss: 48.5862 - vae_kl_loss: 3.8424
Epoch 00002: val_loss improved from 53.52105 to 50.94505, saving model to /models/VariationalAutoencoder/model_checkpoint.h5
48000/48000 [==============================] - 15s 321us/sample - loss: 52.4225 - vae_r_loss: 48.5799 - vae_kl_loss: 3.8426 - val_loss: 50.9451 - val_vae_r_loss: 47.0550 - val_vae_kl_loss: 3.8900
Epoch 3/500
47936/48000 [============================>.] - ETA: 0s - loss: 50.9089 - vae_r_loss: 46.7227 - vae_kl_loss: 4.1862
Epoch 00003: val_loss improved from 50.94505 to 49.68459, saving model to /models/VariationalAutoencoder/model_checkpoint.h5
48000/48000 [==============================] - 15s 310us/sample - loss: 50.9075 - vae_r_loss: 46.7216 - vae_kl_loss: 4.1860 - val_loss: 49.6846 - val_vae_r_loss: 45.5978 - val_vae_kl_loss: 4.0868
Epoch 4/500
47872/48000 [============================>.] - ETA: 0s - loss: 49.8313 - vae_r_loss: 45.4165 - vae_kl_loss: 4.4148
Epoch 00004: val_loss improved from 49.68459 to 48.86022, saving model to /models/VariationalAutoencoder/model_checkpoint.h5
48000/48000 [==============================] - 15s 317us/sample - loss: 49.8250 - vae_r_loss: 45.4101 - vae_kl_loss: 4.4149 - val_loss: 48.8602 - val_vae_r_loss: 44.5305 - val_vae_kl_loss: 4.3297
Epoch 5/500
47936/48000 [============================>.] - ETA: 0s - loss: 49.1510 - vae_r_loss: 44.6246 - vae_kl_loss: 4.5264
Epoch 00005: val_loss improved from 48.86022 to 48.15245, saving model to /models/VariationalAutoencoder/model_checkpoint.h5
48000/48000 [==============================] - 15s 322us/sample - loss: 49.1478 - vae_r_loss: 44.6210 - vae_kl_loss: 4.5268 - val_loss: 48.1525 - val_vae_r_loss: 43.4562 - val_vae_kl_loss: 4.6962
Epoch 6/500
47872/48000 [============================>.] - ETA: 0s - loss: 48.6135 - vae_r_loss: 44.0106 - vae_kl_loss: 4.6029
Epoch 00006: val_loss improved from 48.15245 to 47.97484, saving model to /models/VariationalAutoencoder/model_checkpoint.h5
48000/48000 [==============================] - 15s 317us/sample - loss: 48.6163 - vae_r_loss: 44.0133 - vae_kl_loss: 4.6030 - val_loss: 47.9748 - val_vae_r_loss: 43.2041 - val_vae_kl_loss: 4.7707
Epoch 7/500

I was surprised to see loss values around 44, and the KL loss seems to be increasing...

Thanks!
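
For readers comparing against the log above, here is a minimal sketch of how the two logged components relate. It assumes the model uses the standard closed-form KL term between a diagonal Gaussian and a standard normal. The function name `kl_to_standard_normal` is hypothetical, and any weighting the repository applies to the reconstruction term is not reproduced here. Under that assumption the KL term starts near zero, when the encoder output carries almost no information, and can rise as the encoder starts using the latent space, so an increasing KL loss early in training is not necessarily a sign of a bug.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, exp(log_var)) || N(0, I) ), summed over latent dimensions."""
    mu = np.asarray(mu, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    return float(-0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var)))

# An encoder that ignores its input (mu = 0, log_var = 0) pays no KL penalty.
kl_uninformative = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])

# Moving one latent mean away from zero makes the term positive.
kl_informative = kl_to_standard_normal([1.0, 0.0], [0.0, 0.0])

# Values copied from the epoch-1 summary line of the log above:
# the reported loss is the sum of the two components.
r_loss, kl_loss, total = 56.8250, 3.1430, 59.9680
components_add_up = abs((r_loss + kl_loss) - total) < 1e-3
```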

Discrepancy between trainable weights and collected trainable weights

When I run the 04_03_wgangp_faces_train notebook, I get this warning:

/home/danvk/GDL_code/venv/lib/python3.7/site-packages/keras/engine/training.py:297: UserWarning: Discrepancy between trainable weights and collected trainable weights, did you set `model.trainable` without calling `model.compile` after ?
  'Discrepancy between trainable weights and collected trainable'
/home/danvk/GDL_code/venv/lib/python3.7/site-packages/keras/engine/training.py:297: UserWarning: Discrepancy between trainable weights and collected trainable weights, did you set `model.trainable` without calling `model.compile` after ?
  'Discrepancy between trainable weights and collected trainable'

Here's what the full cell looks like:

The model seems to progress while training:

Unfortunately, it doesn't seem to checkpoint any of the weights or save sample images. Even trying to save images explicitly does nothing:

gan.sample_images(RUN_FOLDER)
# RUN_FOLDER is empty

I did have to update to tensorflow 2 to get it to recognize my GPU, so perhaps that's the culprit.

What does "using generators" mean when training MuseGAN? (in the 7_04 notebook)

In train_critic(self, x_train, batch_size, using_generator):

    if using_generator:
        true_imgs = next(x_train)[0]
        if true_imgs.shape[0] != batch_size:
            true_imgs = next(x_train)[0]
    else:
        idx = np.random.randint(0, x_train.shape[0], batch_size)
        true_imgs = x_train[idx]
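
Judging only from the snippet above, `using_generator` appears to tell `train_critic` whether `x_train` is an iterator that yields batches (so the next batch is fetched with `next`) or an in-memory array (so a random batch is drawn by index). The sketch below illustrates that distinction. `batch_generator` and `next_true_batch` are hypothetical names, and the original's retry on a short batch is left out for brevity.

```python
import numpy as np

def batch_generator(data, batch_size):
    """Hypothetical generator: yields (images, labels) tuples forever."""
    while True:
        idx = np.random.randint(0, data.shape[0], batch_size)
        yield data[idx], None

def next_true_batch(x_train, batch_size, using_generator):
    """Fetch one batch of real images from either a generator or an array."""
    if using_generator:
        # x_train is an iterator; element [0] of each yielded tuple holds the images.
        return next(x_train)[0]
    # x_train is an array; sample batch_size random rows.
    idx = np.random.randint(0, x_train.shape[0], batch_size)
    return x_train[idx]

data = np.zeros((100, 28, 28, 1))
from_array = next_true_batch(data, 16, using_generator=False)
from_generator = next_true_batch(batch_generator(data, 16), 16, using_generator=True)
```

A generator is typically useful when the dataset is too large to hold in memory, since batches can then be read from disk on demand.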

pip install -r requirements.txt gave Tensorflow version error 1.12.0 not available

My pip (1.19.2) could not find TensorFlow 1.12.0 as specified in requirements.txt. After installing the lowest version I could find (1.13.1), TensorBoard was incompatible. After several tries I had to use 1.14.0 for both, and the further checks in Chapter 1 worked fine. Will I perhaps have trouble running some of the book examples, though? I don't mind the version discrepancy otherwise; I just wanted to report the issue in case someone else runs into the same problem.

I am using Python 3.7.2 on Windows 10.

NameError: name 'chord' is not defined in 07_01_notation_compose.ipynb

NameError: name 'chord' is not defined in 07_01_notation_compose.ipynb:

NameError                                 Traceback (most recent call last)
<ipython-input-5-fbeff2cc0330> in <module>
      4 for element in original_score.flat:
      5 
----> 6     if isinstance(element, chord.Chord):
      7         notes.append('.'.join(n.nameWithOctave for n in element.pitches))
      8         durations.append(element.duration.quarterLength)

NameError: name 'chord' is not defined

because chord is missing from the imports. There's a similar problem with note:

from music21 import converter

Using GPU devices in your code

What is the best way to configure your notebooks to utilize GPU Hardware?

03_03_vae_digits_train: TypeError: unsupported format string passed to numpy.ndarray.__format__

I am running on Ubuntu 18.04 with Python 3.6.9 and when running 03_03_vae_digits_train I encounter the following error:

vae.train(     
    x_train
    , batch_size = BATCH_SIZE
    , epochs = EPOCHS
    , run_folder = RUN_FOLDER
    , print_every_n_batches = PRINT_EVERY_N_BATCHES
    , initial_epoch = INITIAL_EPOCH
)

I installed the dependencies using the newest pip with pip install -r requirements.txt; no errors occurred, though I also had to install graphviz.

BTW numpy is 1.17.2 as required.

$ pip freeze | grep numpy
numpy==1.17.2

Epoch 1/200
1874/1875 [============================>.] - ETA: 0s - loss: 58.4866 - reconstruction_loss: 55.2065 - kl_loss: 3.2801
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-10-a0cdb3ff19b5> in <module>
      5     , run_folder = RUN_FOLDER
      6     , print_every_n_batches = PRINT_EVERY_N_BATCHES
----> 7     , initial_epoch = INITIAL_EPOCH
      8 )

~/GDL_code/models/VAE.py in train(self, x_train, batch_size, epochs, run_folder, print_every_n_batches, initial_epoch, lr_decay)
    224             , epochs = epochs
    225             , initial_epoch = initial_epoch
--> 226             , callbacks = callbacks_list
    227         )
    228 

~/GDL_code/env/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py in _method_wrapper(self, *args, **kwargs)
     64   def _method_wrapper(self, *args, **kwargs):
     65     if not self._in_multi_worker_mode():  # pylint: disable=protected-access
---> 66       return method(self, *args, **kwargs)
     67 
     68     # Running inside `run_distribute_coordinator` already.

~/GDL_code/env/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
    874           epoch_logs.update(val_logs)
    875 
--> 876         callbacks.on_epoch_end(epoch, epoch_logs)
    877         if self.stop_training:
    878           break

~/GDL_code/env/lib/python3.6/site-packages/tensorflow/python/keras/callbacks.py in on_epoch_end(self, epoch, logs)
    363     logs = self._process_logs(logs)
    364     for callback in self.callbacks:
--> 365       callback.on_epoch_end(epoch, logs)
    366 
    367   def on_train_batch_begin(self, batch, logs=None):

~/GDL_code/env/lib/python3.6/site-packages/tensorflow/python/keras/callbacks.py in on_epoch_end(self, epoch, logs)
   1175           self._save_model(epoch=epoch, logs=logs)
   1176       else:
-> 1177         self._save_model(epoch=epoch, logs=logs)
   1178     if self.model._in_multi_worker_mode():
   1179       # For multi-worker training, back up the weights and current training

~/GDL_code/env/lib/python3.6/site-packages/tensorflow/python/keras/callbacks.py in _save_model(self, epoch, logs)
   1194                   int) or self.epochs_since_last_save >= self.period:
   1195       self.epochs_since_last_save = 0
-> 1196       filepath = self._get_file_path(epoch, logs)
   1197 
   1198       try:

~/GDL_code/env/lib/python3.6/site-packages/tensorflow/python/keras/callbacks.py in _get_file_path(self, epoch, logs)
   1242         # `{mape:.2f}`. A mismatch between logged metrics and the path's
   1243         # placeholders can cause formatting to fail.
-> 1244         return self.filepath.format(epoch=epoch + 1, **logs)
   1245       except KeyError as e:
   1246         raise KeyError('Failed to format this callback filepath: "{}". '

TypeError: unsupported format string passed to numpy.ndarray.__format__
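
The last frame of the traceback shows the checkpoint path being built with `self.filepath.format(epoch=epoch + 1, **logs)`. A format spec such as `:.2f` works on plain floats but not on NumPy arrays with one or more dimensions, which is consistent with the error above if a logged metric arrives as an array. The sketch below reproduces the failure and shows one possible workaround, reducing each logged value to a float before formatting. The filename pattern and the helper `format_checkpoint_path` are assumptions for illustration, not the repository's code.

```python
import numpy as np

def format_checkpoint_path(template, epoch, logs):
    """Format a checkpoint path after reducing every logged value to a plain float."""
    scalar_logs = {name: float(np.mean(value)) for name, value in logs.items()}
    return template.format(epoch=epoch + 1, **scalar_logs)

template = "weights-{epoch:03d}-{loss:.2f}.h5"   # hypothetical filename pattern
logs = {"loss": np.array([58.4866])}             # metric logged as a 1-element array

# Formatting the array directly fails in the same way as the traceback above.
try:
    template.format(epoch=1, **logs)
    raised = False
except TypeError:
    raised = True

# Converting to floats first makes the same template usable.
path = format_checkpoint_path(template, 0, logs)
```

Another option under the same assumption would be a checkpoint filename without metric placeholders, so that no numeric formatting is attempted at all.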

Issues with the tf-keras-contrib module on conda

I was trying to set up my conda environment for the TF 2 version of the book, but am not able to install the tf-keras-contrib module. It doesn't seem to be available on conda/conda-forge/pip. I'm using Python 3.6.9 for reference.

I found this issue referencing keras-contrib, but unfortunately couldn't find a similar thread for tf-keras-contrib. Searching for issues related to tf-keras-contrib seems to return results for keras.contrib.

I found this issue on the official repo that talks about this.

03_01_autoencoder_train /run/ folder issue

Hello,

I was trying to run 03_01 code.
However, I got the following error from the second cell.

# run params
SECTION = 'vae'
RUN_ID = '0001'
DATA_NAME = 'digits'
RUN_FOLDER = '/run/{}/'.format(SECTION)
RUN_FOLDER += '_'.join([RUN_ID, DATA_NAME])

if not os.path.exists(RUN_FOLDER):
    os.mkdir(RUN_FOLDER)
    os.mkdir(os.path.join(RUN_FOLDER, 'viz'))
    os.mkdir(os.path.join(RUN_FOLDER, 'images'))
    os.mkdir(os.path.join(RUN_FOLDER, 'weights'))

MODE = 'build' #'load' #

FileNotFoundError                         Traceback (most recent call last)
in
      7
      8 if not os.path.exists(RUN_FOLDER):
----> 9     os.mkdir(RUN_FOLDER)
     10     os.mkdir(os.path.join(RUN_FOLDER, 'viz'))
     11     os.mkdir(os.path.join(RUN_FOLDER, 'images'))

FileNotFoundError: [Errno 2] No such file or directory: './run/vae/0001_digits'
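
`os.mkdir` creates only the last path component, so it raises `FileNotFoundError` when a parent directory such as `./run/vae` does not exist yet. A minimal sketch of one way around this is to use `os.makedirs`, which also creates missing intermediate directories. The helper name `make_run_folders` is hypothetical, and a temporary directory stands in for the repository root.

```python
import os
import tempfile

def make_run_folders(run_folder, subfolders=("viz", "images", "weights")):
    """Create run_folder and its subfolders, including any missing parents."""
    for name in subfolders:
        # exist_ok=True makes the call safe to repeat on later runs.
        os.makedirs(os.path.join(run_folder, name), exist_ok=True)
    return run_folder

# Demonstration in a temporary directory where 'run/vae' does not exist yet.
base = tempfile.mkdtemp()
run_folder = make_run_folders(os.path.join(base, "run", "vae", "0001_digits"))
created = sorted(os.listdir(run_folder))
```

Creating the missing `run/vae` directory by hand before running the cell should have the same effect.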

Issue with 05_01_cyclegan_train using TensorFlow 2.0

I'm trying to run CycleGAN to "Paint Like Monet" (although I used the Ukiyoe dataset rather than Monet) and am running into issues when saving the model. I get the following error during training while executing self.*.save( ... ) inside CycleGAN.save_model (where * is combined, g_BA or g_AB):

NotImplementedError                       Traceback (most recent call last)
in ()
      5 , test_B_file = TEST_B_FILE
      6 , batch_size=BATCH_SIZE
----> 7 , sample_interval=PRINT_EVERY_N_BATCHES)

10 frames
/tensorflow-2.1.0/python3.6/tensorflow_core/python/keras/engine/base_layer.py in get_config(self)
    497     # or that get_config has been overridden:
    498     if len(extra_args) > 1 and hasattr(self.get_config, '_is_default'):
--> 499       raise NotImplementedError('Layers with arguments in __init__ must '
    500                                 'override get_config.')
    501     return config

NotImplementedError: Layers with arguments in __init__ must override get_config.

Attached is the ipynb file used to train the model (just change *.txt to *.ipynb)
05_02_cyclegan_train_ukiyoe2photo.txt

Error in cycleGAN.py: ImportError: cannot import name 'InstanceNormalization' from 'keras_contrib.layers.normalization'

Hello,

When running the 05_01_cyclegan_train.ipynb Jupyter notebook, which in turn runs cycleGAN.py, line 4 produces the following error:

ImportError: cannot import name 'InstanceNormalization' from 'keras_contrib.layers.normalization'

If line 4 is replaced with the following line, the error goes away and the notebook runs:

from keras_contrib.layers.normalization.instancenormalization import InstanceNormalization
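
Since the working import path seems to depend on the installed keras-contrib version, one option is to try both locations in turn. The following is only a sketch: `import_first_available` is a hypothetical helper, and the demonstration uses standard-library modules so that it runs without keras-contrib installed.

```python
import importlib

def import_first_available(candidates):
    """Return the first attribute that can be imported from (module_path, name) pairs."""
    errors = []
    for module_path, name in candidates:
        try:
            return getattr(importlib.import_module(module_path), name)
        except (ImportError, AttributeError) as exc:
            # Remember why this candidate failed and move on to the next one.
            errors.append(f"{module_path}.{name}: {exc}")
    raise ImportError("none of the candidates could be imported:\n" + "\n".join(errors))

# The two import locations mentioned in this issue.
INSTANCE_NORM_CANDIDATES = [
    ("keras_contrib.layers.normalization", "InstanceNormalization"),
    ("keras_contrib.layers.normalization.instancenormalization", "InstanceNormalization"),
]
# With keras-contrib installed, this line would pick whichever path works:
# InstanceNormalization = import_first_available(INSTANCE_NORM_CANDIDATES)

# Demonstration with standard-library modules so the sketch runs anywhere.
OrderedDict = import_first_available([
    ("no_such_module_for_demo", "OrderedDict"),
    ("collections", "OrderedDict"),
])
```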

Issue with loading library keras-contrib

Hello David! Thanks for the great book!

I ran into an error when running the CycleGAN notebook: 05_01_cyclegan_train.ipynb
It is caused by the 'keras_contrib' library not being found.

So I installed the library separately, from the keras-contrib GitHub repository.

import os
import matplotlib.pyplot as plt

from models.cycleGAN import CycleGAN
from utils.loaders import DataLoader

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-6-647ffcce73df> in <module>
      2 import matplotlib.pyplot as plt
      3 
----> 4 from models.cycleGAN import CycleGAN
      5 from utils.loaders import DataLoader

~/GAN/GDL_code/models/cycleGAN.py in <module>
      5 #have replaced above with below line on 2019 7 8
      6 #due to an error
----> 7 from keras_contrib.layers.normalization.instancenormalization import InstanceNormalization
      8 from keras.layers import Input, Dense, Reshape, Flatten, Dropout, Concatenate
      9 from keras.layers import BatchNormalization, Activation, ZeroPadding2D, Add

ModuleNotFoundError: No module named 'keras_contrib'

After installing the library, it seems to work, but the import needs to be changed slightly (following this).
From this:

from keras_contrib.layers.normalization import InstanceNormalization
To this:
from keras_contrib.layers.normalization.instancenormalization import InstanceNormalization

However, it still doesn't work from the notebook.
I have tried both from the same conda environment.
Any ideas?

Getting Error GraphViz's executables not found

This occurs in 03_01_autoencoder_train and 03_03_vae_digits_train. I am running on Windows 10 with Python 3.6.

Will the code from the book ever be in the repo?

Hi David, I am enjoying reading your book, I am around page 115.

But something that is annoying me a lot is that not all the code from the book is in the repo. For example now I wanted to copy and rewrite the GAN code, but it's nowhere to be seen, and you only show how to use the GAN class, while all the other code you showed about how to build a GAN from scratch is not part of the notebook.

It might be only my opinion, but when I am reading a programming book, I expect every bit of code to be available and tested, because that's how people learn. "Hands on machine learning" is an example of this.

Will all the code from the book be part of the notebooks?

Question regarding setting the critic to trainable in W-GAN

I am referring to this script. After compiling the GAN, the critic is set to trainable again. Could you explain why that might be? Most vanilla DCGAN implementations suggest setting it to False.

06_03_qa_analysis.ipynb: could not broadcast input array

I found the following error while running 06_03_qa_analysis.ipynb:

ValueError                                Traceback (most recent call last)
<ipython-input-24-4b8575d1c9c7> in <module>()
     28     counter += 1
     29 
---> 30     word_preds, next_decoder_init_state = question_model.predict([word_tokens, next_decoder_init_state])
     31 
     32     next_decoder_init_state = np.squeeze(next_decoder_init_state, axis = 1)

4 frames
/usr/local/lib/python3.6/dist-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
     83 
     84     """
---> 85     return array(a, dtype, copy=False, order=order)
     86 
     87 

ValueError: could not broadcast input array from shape (200) into shape (1)

The proposed solution is to add word_tokens = np.array(word_tokens) to the following code snippet:

...
ended = False
counter = 0

word_tokens = np.array(word_tokens)  # add this line

while not ended:
...

Divide by zero error when running 04_01_gan_camel_train, missing data?

I opened the Jupyter notebook, clicked "Run all cells". Execution failed at:

(x_train, y_train) = load_safari(DATA_NAME)

...with the error:

---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-5-595832a29a8f> in <module>
----> 1 (x_train, y_train) = load_safari(DATA_NAME)

~/lab/GDL_code/utils/loaders.py in load_safari(folder)
    188 
    189     slice_train = int(80000/len(txt_name_list))  ###Setting value to be 80000 for the final dataset
--> 190     i = 0
    191     seed = np.random.randint(1, 10e6)
    192 

ZeroDivisionError: division by zero

It looks like there's no data in the data directory, which leads txt_name_list to be an empty array. Is there some step necessary to pull the data?
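
The traceback shows `int(80000/len(txt_name_list))` dividing by the number of files found, so an empty data folder surfaces as a `ZeroDivisionError`. A minimal sketch of a guard that fails with a clearer message is given below. The helper names and the `.npy` suffix are assumptions for illustration, not the repository's loader.

```python
import os
import tempfile

def list_data_files(folder, suffix=".npy"):
    """Return the data files in folder, or raise a descriptive error if none exist."""
    if not os.path.isdir(folder):
        raise FileNotFoundError(f"data folder not found: {folder}")
    names = sorted(f for f in os.listdir(folder) if f.endswith(suffix))
    if not names:
        raise FileNotFoundError(
            f"no '{suffix}' files in {folder}; the dataset may need to be downloaded first"
        )
    return names

def samples_per_file(folder, total=80000):
    """Split a fixed sample budget evenly across the files that were found."""
    names = list_data_files(folder)
    return int(total / len(names))

# Demonstration: an empty folder now gives a clear error instead of dividing by zero.
empty_folder = tempfile.mkdtemp()
try:
    samples_per_file(empty_folder)
    empty_error = None
except FileNotFoundError as exc:
    empty_error = str(exc)

# Once a file is present, the division works as before.
open(os.path.join(empty_folder, "camel.npy"), "wb").close()
per_file = samples_per_file(empty_folder)
```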