
awslabs / keras-apache-mxnet

This project is a fork of keras-team/keras

289 stars · 27 watchers · 65 forks · 12.83 MB

[DEPRECATED] Amazon Deep Learning's Keras with Apache MXNet support

Home Page: https://github.com/awslabs/keras-apache-mxnet/wiki

License: Other

Languages: Python 99.76%, Shell 0.13%, Dockerfile 0.07%, Makefile 0.04%
Topics: deep-learning, mxnet, keras, python, apache-mxnet, keras-mxnet, keras-tutorials, keras-neural-networks

keras-apache-mxnet's People

Contributors

abhaikollara, ahundt, carlthome, edersantana, farizrahman4u, fchollet, fuzzythecat, gabrieldemarmiesse, gvtulder, jfsantos, jihobak, kalyc, lukedeo, matsuyamax, maxpumperla, myutwo150, nzw0301, olegsinavski, ozabluda, phreeza, roywei, sandeep-krishnamurthy, staticskies, taehoonlee, tdhd, the-moliver, tleeuwenburg, wxs, yanboliang, yaringal


keras-apache-mxnet's Issues

installation process

It would be nice to have this as a single installation step (a version-check sketch follows the list):

  • If I already have MXNet: keep my version if it is recent enough.
  • If I don't have MXNet: install the latest MXNet version along with this package.
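
A minimal sketch of what such an install-time check might look like (the helper name and the minimum version 1.2.0 are illustrative assumptions, not part of this package):

from distutils.version import LooseVersion

MIN_MXNET_VERSION = '1.2.0'  # hypothetical floor, for illustration only

def mxnet_requirement():
    """Return a pip requirement string, or None to keep the user's existing MXNet."""
    try:
        import mxnet
    except ImportError:
        # No MXNet present: install the latest along with this package.
        return 'mxnet>={}'.format(MIN_MXNET_VERSION)
    if LooseVersion(mxnet.__version__) >= LooseVersion(MIN_MXNET_VERSION):
        return None  # existing version is high enough; keep it
    return 'mxnet>={}'.format(MIN_MXNET_VERSION)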

multi-gpu tutorial: batch size and optimization

When running the tutorial with the default settings, the batch size does not appear to be tuned for the number of GPUs (the learning rate should be adjusted as well), so the tutorial should probably discuss and set both; a scaling sketch follows the nvidia-smi output below. When I tried to tinker with these settings myself, I was not able to raise GPU utilization much.

ubuntu@ip-172-31-9-178:~$ watch -n0.1 nvidia-smi

Every 0.1s: nvidia-smi                                                                                       Fri Feb 16 23:00:41 2018

Fri Feb 16 23:00:41 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.111                Driver Version: 384.111                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:00:1B.0 Off |                    0 |
| N/A   44C    P0    53W / 300W |    786MiB / 16152MiB |     15%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-SXM2...  On   | 00000000:00:1C.0 Off |                    0 |
| N/A   41C    P0    51W / 300W |    786MiB / 16152MiB |     17%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla V100-SXM2...  On   | 00000000:00:1D.0 Off |                    0 |
| N/A   41C    P0    53W / 300W |    786MiB / 16152MiB |     18%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla V100-SXM2...  On   | 00000000:00:1E.0 Off |                    0 |
| N/A   43C    P0    55W / 300W |    786MiB / 16152MiB |     19%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2774      C   python                                       776MiB |
|    1      2774      C   python                                       776MiB |
|    2      2774      C   python                                       776MiB |
|    3      2774      C   python                                       776MiB |
+-----------------------------------------------------------------------------+
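
A common heuristic, sketched below on the assumption that model, x_train and y_train come from the tutorial script, is to scale the global batch size and the learning rate linearly with the number of GPUs (the linear scaling rule is a rule of thumb, not something the tutorial prescribes):

import keras

num_gpus = 4                    # matches the 4 V100s shown above
base_batch_size = 32            # per-GPU batch size (illustrative)
base_lr = 1e-3                  # single-GPU learning rate (illustrative)

batch_size = base_batch_size * num_gpus   # larger global batch keeps each GPU busy
lr = base_lr * num_gpus                   # linear learning-rate scaling heuristic

model.compile(loss='categorical_crossentropy',
              optimizer=keras.optimizers.SGD(lr=lr),
              context=['gpu(%d)' % i for i in range(num_gpus)],
              metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=batch_size, epochs=10)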

mxnet not using the GPU with mxnet-cu90 on Windows

I was using the latest Keras with the TensorFlow GPU backend. I installed MXNet as follows:

pip install keras-mxnet
pip install mxnet-cu90

I changed "backend": "mxnet" in the keras config file. In jupyter I see "Using MXNet backend", but when training only CPU is utilized and not the GPU.

Any advice?
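
Two things worth checking (a sketch, assuming a model object from your training script): whether MXNet itself can see the GPU, and whether the GPU context is passed at compile time, as the keras-mxnet examples elsewhere on this page do.

import mxnet as mx

# 1) Confirm MXNet can allocate on the GPU at all.
a = mx.nd.ones((2, 2), ctx=mx.gpu(0))
print(a.asnumpy())  # raises an MXNetError if mxnet-cu90 / CUDA is not set up correctly

# 2) Pass the GPU context explicitly when compiling with the MXNet backend.
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              context=['gpu(0)'],
              metrics=['accuracy'])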

multi-gpu tutorial: warnings at the start of training

/home/ubuntu/.local/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters

And six of these repeated:

/home/ubuntu/.local/lib/python3.6/site-packages/Keras-2.1.3-py3.6.egg/keras/backend/mxnet_backend.py:4194: SyntaxWarning: assertion is always true, perhaps remove parentheses?

After the model's layers are printed, it then gives these warnings:

[22:54:01] src/operator/././cudnn_algoreg-inl.h:107: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)
/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/module/bucketing_module.py:402: UserWarning: Optimizer created manually outside Module but rescale_grad is not normalized to 1.0/batch_size/num_workers (1.0 vs. 0.03125). Is this intended?
  force_init=force_init)

Input is getting overwritten after for loop mxnet_backend.py:ones_like

The following code:

mx_shape = tuple([0 if x is None else x for x in x.shape])

produces the following error under python2.7:

AttributeError: 'long' object has no attribute 'shape'

The problem is that the list comprehension's loop variable shadows the outer x: in Python 2 the loop variable leaks out of the comprehension, so x ends up rebound to one of its own dimension values (an integer), which has no .shape attribute.

A simple fix that worked for my case was to rename the comprehension variable to y:
mx_shape = tuple([0 if y is None else y for y in x.shape])

Skipped conv3d backend test

In tests/keras/backend/backend_test.py, a few cases of test_conv3d are skipped for the MXNet backend.
@roywei is working on a fix to enable these edge cases.

Fail to install keras with Mxnet

I am trying to install Keras with MXNet, however it fails (a diagnostic sketch follows the environment details below).

pip install git+https://github.com/awslabs/keras-apache-mxnet

  File "/tmp/RtmpIpKgN0/chunk-code-26bb3d563203.txt", line 1, in <module>
    import keras
  File "/home/gpu-server-1/.virtualenvs/r-tensorflow/local/lib/python2.7/site-packages/keras/__init__.py", line 3, in <module>
    from . import utils
  File "/home/gpu-server-1/.virtualenvs/r-tensorflow/local/lib/python2.7/site-packages/keras/utils/__init__.py", line 6, in <module>
    from . import conv_utils
  File "/home/gpu-server-1/.virtualenvs/r-tensorflow/local/lib/python2.7/site-packages/keras/utils/conv_utils.py", line 9, in <module>
    from .. import backend as K
  File "/home/gpu-server-1/.virtualenvs/r-tensorflow/local/lib/python2.7/site-packages/keras/backend/__init__.py", line 36, in <module>
    assert _backend in {'theano', 'tensorflow', 'cntk'}
AssertionError

My environment:

Ubuntu 16.04.4 LTS
python 2.7.12
cuda9.1
mxnet-cu91
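
One possible cause (an assumption, not confirmed in the issue): the stock keras package is still the one being imported, and its backend check only allows theano/tensorflow/cntk. A quick way to see which distributions are actually installed:

import pkg_resources

for pkg in ('keras', 'keras-mxnet', 'mxnet-cu91'):
    try:
        print(pkg, pkg_resources.get_distribution(pkg).version)
    except pkg_resources.DistributionNotFound:
        print(pkg, 'not installed')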

errors when installing keras-mxnet in python 2.7 mxnet conda env

Using a DLAMI Ubuntu v7 in the mxnet_p27 Conda environment, I get the following errors after running pip install --upgrade h5py numpy keras-mxnet:

mkl-random 1.0.1 requires cython, which is not installed.
mkl-fft 1.0.0 requires cython, which is not installed.
botocore 1.10.2 has requirement python-dateutil<2.7.0,>=2.1, but you'll have python-dateutil 2.7.2 which is incompatible.
mxnet-cu90mkl 1.1.0 has requirement numpy<=1.13.3, but you'll have numpy 1.14.3 which is incompatible.

The mxnet_p36 env works fine.

Training acc does not increase when using mxnet mkldnn

I was profiling performance on CPU and found that with the latest mxnet-mkl (mxnet-mkl-1.2.0b20180507, installed via pip install mxnet-mkl --pre), the accuracy does not go up. It works fine with mxnet-mkl 1.1.0 (pip install mxnet-mkl).
However, with a pure MXNet implementation, both packages work fine.

Instance type: C5.18xlarge, Ubuntu

Steps to reproduce:
Script from keras example:
https://github.com/awslabs/keras-apache-mxnet/blob/master/examples/cifar10_cnn.py

Using the latest mxnet-mkl from master:

pip install mxnet-mkl --pre
pip install keras-mxnet --pre
python cifar10_cnn.py

Built from source from release branch v1.2.0 with MKL:

  1. Check out branch v1.2.0
  2. Install MKL following option 2 of the guide: https://software.intel.com/en-us/articles/installing-and-building-mxnet-with-intel-mkl
  3. $ make -j $(nproc) USE_OPENCV=1 USE_MKLDNN=1 USE_BLAS=atlas

Training Logs:
using keras mxnet-mkl 1.2.0b

Using MXNet backend
/usr/local/lib/python2.7/dist-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
x_train shape: (50000, 3, 32, 32)
50000 train samples
10000 test samples
Not using data augmentation.
Train on 50000 samples, validate on 10000 samples
Epoch 1/10
/usr/local/lib/python2.7/dist-packages/mxnet/module/bucketing_module.py:408: UserWarning: Optimizer created manually outside Module but rescale_grad is not normalized to 1.0/batch_size/num_workers (1.0 vs. 0.0078125). Is this intended?
  force_init=force_init)
[18:29:26] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 3456 bytes with malloc directly
[18:29:26] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 36864 bytes with malloc directly
[18:29:26] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 3211264 bytes with malloc directly
  128/50000 [..............................] - ETA: 24s - loss: 2.3026 - acc: 0.0859[18:29:26] src/operator/nn/mkldnn/mkldnn_base.cc:60: Allocate 14745600 bytes with malloc directly
50000/50000 [==============================] - 18s 359us/step - loss: 2.3029 - acc: 0.0993 - val_loss: 2.3562 - val_acc: 0.1000
Epoch 2/10
50000/50000 [==============================] - 16s 312us/step - loss: 2.3077 - acc: 0.0994 - val_loss: 2.3122 - val_acc: 0.1003
Epoch 3/10
50000/50000 [==============================] - 17s 330us/step - loss: 2.3029 - acc: 0.0998 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 4/10
50000/50000 [==============================] - 15s 306us/step - loss: 2.3028 - acc: 0.0973 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 5/10
50000/50000 [==============================] - 17s 332us/step - loss: 2.3027 - acc: 0.0989 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 6/10
50000/50000 [==============================] - 16s 315us/step - loss: 2.3027 - acc: 0.0987 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 7/10
50000/50000 [==============================] - 16s 321us/step - loss: 2.3027 - acc: 0.0967 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 8/10
50000/50000 [==============================] - 16s 330us/step - loss: 2.3027 - acc: 0.0978 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 9/10
50000/50000 [==============================] - 16s 324us/step - loss: 2.3027 - acc: 0.0982 - val_loss: 2.3026 - val_acc: 0.1000
Epoch 10/10
50000/50000 [==============================] - 16s 326us/step - loss: 2.3027 - acc: 0.0970 - val_loss: 2.3026 - val_acc: 0.1000
Saved trained model at /home/ubuntu/examples/saved_models/keras_cifar10_trained_model.h5 
10000/10000 [==============================] - 1s 122us/step
Test loss: 2.3025997383117676

using keras mxnet-mkl 1.1.0

Using MXNet backend
x_train shape: (50000, 3, 32, 32)
50000 train samples
10000 test samples
Not using data augmentation.
Train on 50000 samples, validate on 10000 samples
Epoch 1/10
/usr/local/lib/python2.7/dist-packages/mxnet/module/bucketing_module.py:403: UserWarning: Optimizer created manually outside Module but rescale_grad is not normalized to 1.0/batch_size/num_workers (1.0 vs. 0.0078125). Is this intended?
  force_init=force_init)
MKL Build:20171227
50000/50000 [==============================] - 21s 414us/step - loss: 1.9739 - acc: 0.2748 - val_loss: 1.6251 - val_acc: 0.4083
Epoch 2/10
50000/50000 [==============================] - 18s 364us/step - loss: 1.6319 - acc: 0.4060 - val_loss: 1.4385 - val_acc: 0.4752
Epoch 3/10
50000/50000 [==============================] - 18s 356us/step - loss: 1.4564 - acc: 0.4748 - val_loss: 1.3948 - val_acc: 0.5152
Epoch 4/10
50000/50000 [==============================] - 18s 368us/step - loss: 1.3202 - acc: 0.5272 - val_loss: 1.2079 - val_acc: 0.5705
Epoch 5/10
50000/50000 [==============================] - 20s 390us/step - loss: 1.2088 - acc: 0.5703 - val_loss: 1.1262 - val_acc: 0.6073
Epoch 6/10
50000/50000 [==============================] - 20s 406us/step - loss: 1.1309 - acc: 0.5998 - val_loss: 1.0989 - val_acc: 0.6047
Epoch 7/10
50000/50000 [==============================] - 18s 370us/step - loss: 1.0664 - acc: 0.6226 - val_loss: 1.0102 - val_acc: 0.6443
Epoch 8/10
50000/50000 [==============================] - 20s 393us/step - loss: 1.0096 - acc: 0.6437 - val_loss: 0.8982 - val_acc: 0.6853
Epoch 9/10
50000/50000 [==============================] - 20s 395us/step - loss: 0.9638 - acc: 0.6629 - val_loss: 0.9069 - val_acc: 0.6795
Epoch 10/10
50000/50000 [==============================] - 18s 368us/step - loss: 0.9268 - acc: 0.6768 - val_loss: 0.8839 - val_acc: 0.6861
Saved trained model at /home/ubuntu/examples/saved_models/keras_cifar10_trained_model.h5 
10000/10000 [==============================] - 2s 202us/step
Test loss: 0.883913392448
Test accuracy: 0.6861

using native mxnet-mkl 1.2.0b

INFO:root:Epoch[0] Batch [128]  Speed: 5276.20 samples/sec      accuracy=0.118883
INFO:root:Epoch[0] Batch [256]  Speed: 6193.33 samples/sec      accuracy=0.194824
INFO:root:Epoch[0] Batch [384]  Speed: 5896.97 samples/sec      accuracy=0.261047
INFO:root:Epoch[0] Train-accuracy=0.290365
INFO:root:Epoch[0] Time cost=8.733
INFO:root:Epoch[0] Validation-accuracy=0.345827
INFO:root:Epoch[1] Batch [128]  Speed: 6422.00 samples/sec      accuracy=0.314377
INFO:root:Epoch[1] Batch [256]  Speed: 6050.25 samples/sec      accuracy=0.357117
INFO:root:Epoch[1] Batch [384]  Speed: 6316.05 samples/sec      accuracy=0.381226
INFO:root:Epoch[1] Train-accuracy=0.395833
INFO:root:Epoch[1] Time cost=8.000
INFO:root:Epoch[1] Validation-accuracy=0.435225
INFO:root:Epoch[2] Batch [128]  Speed: 6226.67 samples/sec      accuracy=0.408127
INFO:root:Epoch[2] Batch [256]  Speed: 5576.82 samples/sec      accuracy=0.429443
INFO:root:Epoch[2] Batch [384]  Speed: 5928.68 samples/sec      accuracy=0.447205
INFO:root:Epoch[2] Train-accuracy=0.451823
INFO:root:Epoch[2] Time cost=8.462
INFO:root:Epoch[2] Validation-accuracy=0.498418
INFO:root:Epoch[3] Batch [128]  Speed: 6598.50 samples/sec      accuracy=0.475896
INFO:root:Epoch[3] Batch [256]  Speed: 6662.20 samples/sec      accuracy=0.498230
INFO:root:Epoch[3] Batch [384]  Speed: 7016.46 samples/sec      accuracy=0.502808
INFO:root:Epoch[3] Train-accuracy=0.526042
INFO:root:Epoch[3] Time cost=7.391
INFO:root:Epoch[3] Validation-accuracy=0.561511
INFO:root:Epoch[4] Batch [128]  Speed: 6485.46 samples/sec      accuracy=0.527132
INFO:root:Epoch[4] Batch [256]  Speed: 6002.80 samples/sec      accuracy=0.541931
INFO:root:Epoch[4] Batch [384]  Speed: 6302.73 samples/sec      accuracy=0.544739
INFO:root:Epoch[4] Train-accuracy=0.540365
INFO:root:Epoch[4] Time cost=7.992
INFO:root:Epoch[4] Validation-accuracy=0.571796
INFO:root:Epoch[5] Batch [128]  Speed: 7212.56 samples/sec      accuracy=0.560441
INFO:root:Epoch[5] Batch [256]  Speed: 7343.56 samples/sec      accuracy=0.574829
INFO:root:Epoch[5] Batch [384]  Speed: 7326.81 samples/sec      accuracy=0.580383
INFO:root:Epoch[5] Train-accuracy=0.591146
INFO:root:Epoch[5] Time cost=6.850
INFO:root:Epoch[5] Validation-accuracy=0.627670
INFO:root:Epoch[6] Batch [128]  Speed: 7072.83 samples/sec      accuracy=0.591206
INFO:root:Epoch[6] Batch [256]  Speed: 6632.22 samples/sec      accuracy=0.600403
INFO:root:Epoch[6] Batch [384]  Speed: 6356.41 samples/sec      accuracy=0.606262
INFO:root:Epoch[6] Train-accuracy=0.579427
INFO:root:Epoch[6] Time cost=7.475
INFO:root:Epoch[6] Validation-accuracy=0.640032
INFO:root:Epoch[7] Batch [128]  Speed: 6647.45 samples/sec      accuracy=0.613069
INFO:root:Epoch[7] Batch [256]  Speed: 7264.74 samples/sec      accuracy=0.624451
INFO:root:Epoch[7] Batch [384]  Speed: 7258.86 samples/sec      accuracy=0.626160
INFO:root:Epoch[7] Train-accuracy=0.608073
INFO:root:Epoch[7] Time cost=7.089
INFO:root:Epoch[7] Validation-accuracy=0.652690
INFO:root:Epoch[8] Batch [128]  Speed: 6471.52 samples/sec      accuracy=0.632510
INFO:root:Epoch[8] Batch [256]  Speed: 7245.21 samples/sec      accuracy=0.638000
INFO:root:Epoch[8] Batch [384]  Speed: 7245.78 samples/sec      accuracy=0.638794
INFO:root:Epoch[8] Train-accuracy=0.643229
INFO:root:Epoch[8] Time cost=7.166
INFO:root:Epoch[8] Validation-accuracy=0.674644
INFO:root:Epoch[9] Batch [128]  Speed: 6291.10 samples/sec      accuracy=0.652374
INFO:root:Epoch[9] Batch [256]  Speed: 6623.54 samples/sec      accuracy=0.654236
INFO:root:Epoch[9] Batch [384]  Speed: 6954.76 samples/sec      accuracy=0.651367
INFO:root:Epoch[9] Train-accuracy=0.664062
INFO:root:Epoch[9] Time cost=7.549
INFO:root:Epoch[9] Validation-accuracy=0.681171

Broken Unit Tests and Integration Tests

The following unit tests and integration tests are broken with the MXNet backend. These need to be fixed before completion of phase 1.

  1. tests/integration_tests/test_image_data_tasks => avg_pooling
  2. tests/integration_tests/test_temporal_data_tasks => categorical_crossentropy, embedding layer.
  3. tests/keras/applications/imagenet_utils_test (test_preprocess_input_symbolic) => getitem of KerasSymbol is broken.
  4. tests/keras/applications/applications_test => InceptionV3
  5. tests/keras/applications/applications_test => DenseNet
  6. tests/test_topology/test_recursion_with_bn_and_loss => MXNet batchnorm should perform updates.
  7. tests/test_topology/test_shared_layer_depth_is_correct => MXNet does not fully support the Embedding layer.
  8. tests/test_topology => the MXNet Model.predict() API has issues; predict() cannot be called without calling compile().
  9. tests/test_training => MXNet does not support compiling multi-input networks.
  10. tests/layers/convolutional_test => MXNet does not support pooling with "SAME" mode.
  11. tests/layers/convolutional_test => MXNet does not support Cropping.
  12. tests/layers/embeddings_test => MXNet does not support Embedding layers.
  13. tests/layers/noise_test => MXNet does not support GaussianNoise, GaussianDropout, AlphaDropout.
  14. tests/layers/normalization => MXNet uses the native batchnorm operator.
  15. tests/layers/wrappers => MXNet does not support TimeDistributed.
  16. tests/wrappers/scikit_learn_test => MXNet does not support linear regression.
  17. tests/model_saving => MXNet does not support multi-metrics output.
  18. tests/model_saving => MXNet tolerance on batch_retrain is 1e-2 rather than 1e-5.
  19. tests/optimizer => the MXNet backend does not support NAdam, Adam_Amsgrad, Adamax.
  20. tests/optimizer => MXNet backend accuracy on train_on_batch does not meet the required tolerance.
  21. tests/test_model_saving => test_loading_weights_by_name_and_reshape

Performance Improvement - Keras with MXNet backend

  1. Avoid the transpose operation.
    Keras passes the conv kernel in channels_last format.
    We can avoid transposing the kernel by making a change in the Keras conv layer, something like:

        if self.data_format == 'channels_first':
            kernel_shape = (self.filters, input_dim) + self.kernel_size
        else:
            kernel_shape = self.kernel_size + (input_dim, self.filters)

Not supported keras/examples with MXNet backend

Checked examples have been tested and work with the MXNet backend.
Examples marked not supported raise a clear error message specifying the exact functionality MXNet does not support yet.

  • addition_rnn.py
  • antirectifier.py
  • babi_memnn.py
  • babi_rnn.py
  • cifar10_cnn.py
  • cifar10_cnn_capsule.py [Custom CNN infer shape Not supported]
  • cifar10_cnn_tfaugment2d.py [TF specific Not supported]
  • cifar10_resnet.py
  • conv_filter_visualization.py [Gradient Not supported]
  • conv_lstm.py [Conv2DLSTM Not supported]
  • deep_dream.py [Gradient Not supported]
  • image_ocr.py [CTC Not supported]
  • imdb_bidirectional_lstm.py
  • imdb_cnn.py
  • imdb_cnn_lstm.py
  • imdb_fasttext.py
  • imdb_lstm.py
  • lstm_seq2seq.py
  • lstm_seq2seq_restore.py
  • lstm_stateful.py
  • lstm_text_generation.py
  • mnist_acgan.py
  • mnist_cnn.py
  • mnist_dataset_api.py [TF Specific Not supported]
  • mnist_denoising_autoencoder.py
  • mnist_hierarchical_rnn.py
  • mnist_irnn.py
  • mnist_mlp.py
  • mnist_net2net.py
  • mnist_siamese.py
  • mnist_sklearn_wrapper.py
  • mnist_swwae.py [Gradient Not supported]
  • mnist_tfrecord.py [TF Specific Not supported]
  • mnist_transfer_cnn.py
  • neural_doodle.py [Gradient Not supported]
  • neural_style_transfer.py [Gradient Not supported]
  • pretrained_word_embeddings.py
  • reuters_mlp.py
  • reuters_mlp_relu_vs_selu.py
  • variational_autoencoder.py [Custom Loss Not supported]
  • variational_autoencoder_deconv.py [Custom Loss Not supported]

MXNet backend uses native batch_norm operator

The MXNet backend uses the MXNet batchnorm operator directly, without going through Keras' batch normalization layer.
Reason: MXNet does not support

  • the moving_average_update() API
  • batch_update
  • add_update

Changes made on Keras code

Using this issue to track all the changes made to Keras code, to avoid confusion for Keras developers and in future work.

  1. Override the Keras model with an MXNet model.
  2. Provide the kernel shape in Conv layers in both channels_last and channels_first format. This needs to be implemented for every new Conv layer, in build() and compute_output_shape().
  3. Change multi_gpu_model to pass the MXNet context.
  4. Batch norm.
  5. Embedding layers.

Unable to use Linear Regression with MXNet backend

tests/keras/wrappers/scikit_learn_test.py is broken for the linear-regression use cases.

The following code should work (a wrapper-level usage sketch follows the snippet).

from keras.models import Sequential
from keras.layers import Dense, Activation

input_dim, hidden_dims = 5, 10  # example dimensions

model = Sequential()
model.add(Dense(input_dim, input_shape=(input_dim,)))
model.add(Activation('relu'))
model.add(Dense(hidden_dims))
model.add(Activation('relu'))
model.add(Dense(1))
model.add(Activation('linear'))
model.compile(optimizer='sgd', loss='mean_absolute_error',
              metrics=['accuracy'])
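
For context, the scikit-learn wrapper tests drive such a model through KerasRegressor; a minimal usage sketch under the same assumptions (the dimensions and random data are illustrative only):

import numpy as np
from keras.wrappers.scikit_learn import KerasRegressor

def build_model(input_dim=5, hidden_dims=10):
    from keras.models import Sequential
    from keras.layers import Dense, Activation
    model = Sequential()
    model.add(Dense(input_dim, input_shape=(input_dim,)))
    model.add(Activation('relu'))
    model.add(Dense(hidden_dims))
    model.add(Activation('relu'))
    model.add(Dense(1))
    model.add(Activation('linear'))
    model.compile(optimizer='sgd', loss='mean_absolute_error', metrics=['accuracy'])
    return model

x = np.random.rand(100, 5).astype('float32')   # illustrative regression inputs
y = np.random.rand(100).astype('float32')      # illustrative targets
reg = KerasRegressor(build_fn=build_model, epochs=2, batch_size=16, verbose=0)
reg.fit(x, y)
print(reg.predict(x[:3]))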

[Feature Requests] Keras2 with MXNet backend

  • Allow users to choose the bucket in the save_mxnet_model() API, e.g. save the 'train' bucket symbol rather than always saving the 'pred' bucket symbol. This lets users save checkpoints and continue retraining. Contact @dmadeka for more details.
  • A keras.models.get_mxnet_model(model) API that returns the symbols and params rather than storing them on disk, so MXNet can be used for inference in the same session.
  • MXNet provides an easy-to-use interface for running large-scale model training jobs across multiple machines - http://mxnet.incubator.apache.org/faq/multi_devices.html?highlight=distributed%20training. We should provide such functionality for Keras users. We can start with the MXNet backend and extend it to other backends.

Does not utilize all the GPUs on a multi-gpu machine

Hello,

I have noticed that while running keras/examples/cifar10_cnn.py (uses the Sequential API) and keras/examples/lstm_text_generation.py (uses the Sequential API) on a multi-GPU machine such as an Amazon AWS p3.8xlarge instance, the code does not utilize all the GPUs; it uses only a single GPU.

I used the keras-apache-mxnet/benchmark template scripts and modified cifar10_cnn.py and lstm_text_generation.py based on that template.

Command:

sh run_mxnet_backend.sh 4_gpu_config

Here, the test only uses 1 GPU instead of 4 GPUs.

However, when I ran the benchmark_resnet.py (uses the functional API) test with the 4_gpu_config option, it utilizes all the GPUs.

I think the problem is with setting the model context when the test uses the Keras Sequential API.

More information:

(Pdb) self
<keras.models.Sequential object at 0x7f3002ad9630>

(Pdb) self._context
[gpu(0), gpu(1), gpu(2), gpu(3)]

(Pdb) self.model
<keras.backend.mxnet_backend.get_model.<locals>.Model object at 0x7f2ffb4ec9b0>

(Pdb) self.model._context
[gpu(0)]

Thank-You.

[Checklist] Phase 1 - Support CNN on Keras with MXNet

This issue is to track all the pending tasks to support CNN on Keras with MXNet backend.
(Note this is a running list)

  • bias_add
  • batch_norm
  • batch_dot
  • Conv2D
  • Conv2D_Transpose
  • Pooling2D
  • Able to create and stack layers sequentially
  • Able to compile the model
  • Able to fit
  • Supports CPU
  • Supports one GPU
  • Supports multi-GPU
  • Able to train Resnet50 on CIFAR10
  • SGD Optimizer
  • RMSProp Optimizer
  • Adam Optimizer
  • AdaDelta Optimizer
  • Able to train VGGNet on CIFAR10
  • Able to train InceptionV3 on CIFAR10
  • Able to save the model
  • Able to load the saved model (pre-trained with MXNet)
  • Able to run inference from pre-trained model
  • Keras tests are passing
  • Keras lint checks are passing
  • Able to load the saved model (from other backend?)

Better messaging for handling channels_first and channels_last

  • There is a need to show code samples on how to do this for your own dataset, not just the standard built-in ones, for the to_channels_first API (see the sketch after this list).

  • I'd still like the CLI output for the examples we highlight to show the number of GPUs, the current channel-order config, and the images/sec stats. Could be something to bake in as well.
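
A sketch of the kind of sample that would help, using a plain NumPy transpose on a hypothetical user dataset (to_channels_first mentioned above is the API the docs refer to; the transpose below shows the equivalent operation for a 4-D image batch):

import numpy as np

# x: your own images in channels_last layout, (num_samples, height, width, channels)
x = np.random.rand(8, 28, 28, 1).astype('float32')

# Rearrange to channels_first, (num_samples, channels, height, width),
# which is the layout the MXNet backend prefers.
x_cf = np.transpose(x, (0, 3, 1, 2))
print(x_cf.shape)  # (8, 1, 28, 28)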

Operators to be implemented/fixed in MXNet Backend

Below is the checklist of operators to be implemented or revisited for fixing issues in MXNet backend for Keras:

  • Batchnorm
  • BatchDot
  • Binary CrossEntropy
  • Categorical CrossEntropy
  • logsumexp
  • Sparse operators
  • bias_add
  • temporal_padding - MXNet does not support 3D tensor padding
  • conv1d/2d/3d
  • rnn

Loading CNN trained with TF backend produces redefinition error

Please make sure that the boxes below are checked before you submit your issue. If your issue is an implementation question, please ask your question on StackOverflow or join the Keras Slack channel and ask there instead of filing a GitHub issue.

Thank you!

  • Check that you are up-to-date with the master branch of Keras. You can update with:
    pip install git+git://github.com/keras-team/keras.git --upgrade --no-deps

  • If running on TensorFlow, check that you are up-to-date with the latest version. The installation instructions can be found here.

  • If running on Theano, check that you are up-to-date with the master branch of Theano. You can update with:
    pip install git+git://github.com/Theano/Theano.git --upgrade --no-deps

  • Provide a link to a GitHub Gist of a Python script that can reproduce your issue (or just copy the script here if it is short).

Trying to load a pretrained ResNet50, I get:

AssertionError: Redefinition of variable conv1/kernel1

Cannot slice axis on tensor with MXNet backend

Minimum reproducible code:

import numpy as np

import keras
from keras import backend as K

# In NumPy
data = np.array([[[1, 2, 3], [4, 5, 6]]])
print(data.shape)            # (1, 2, 3)
print(data[:, -1, :].shape)  # (1, 3)

# In Keras with the MXNet backend
var1 = K.variable(data)
print(var1.shape)            # (1, 2, 3)
K.eval(var1[:, -1, :])       # <<ERROR>>

Check failed: b < e (1 vs. 0) slicing with begin=[1]=1, end[1]=0, and step[1]=1 is invalid

Reason:

MXNet does not support slicing this axis with the mx.sym.slice operator.
In KerasSymbol's __getitem__ method, we need to special-case this kind of slicing and use mx.sym.slice_axis() (see the sketch below).
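
A sketch of the suggested workaround on the NDArray side; the symbolic fix inside __getitem__ would follow the same pattern with mx.sym.slice_axis:

import numpy as np
import mxnet as mx

data = mx.nd.array(np.array([[[1, 2, 3], [4, 5, 6]]]))     # shape (1, 2, 3)
last = mx.nd.slice_axis(data, axis=1, begin=-1, end=None)  # shape (1, 1, 3)
print(last.reshape((1, 3)).asnumpy())                      # [[4. 5. 6.]]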

Can't converge when switching to MXnet backend

My Keras-TensorFlow training pipeline works fine, but when switching to keras-mxnet I get the warning:
Optimizer created manually outside Module but rescale_grad is not normalized to 1.0/batch_size/num_workers (1.0 vs. 0.125). Is this intended? force_init=force_init).
Could this be the reason my network can't converge? And what are the possible mistakes that produce the rescale_grad warning?
I'm not using multi-GPU.

multi-gpu tutorial: model parameter (n and version)

The comments say that if we're running ResNet110 - which I think we are - then n should be set to 18 or 12, depending on which version we're running. We're still running n=3, which is ResNet20 on version 1 (the example's depth formula is recapped below). Also, should we be running version 2 instead of 1?
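
For reference, the depth formula from the cifar10_resnet example's comments, so the relationship between n, version and depth is explicit:

# depth = n * 6 + 2 for version 1, depth = n * 9 + 2 for version 2
n, version = 18, 1                                # ResNet110 v1 (use n=12, version=2 for v2)
depth = n * 6 + 2 if version == 1 else n * 9 + 2
print(depth)                                      # 110; with n=3, version=1 this is ResNet20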

save mxnet model not working for channels last format and on GPU

Following this tutorial for saving MXNet native models works fine if data_format is channels_first in ~/.keras/keras.json:
https://github.com/awslabs/keras-apache-mxnet/blob/master/docs/mxnet_backend/save_mxnet_model.md

If I change to channels_last,

data_names, data_shapes = save_mxnet_model(model=model, prefix='mnist_cnn', epoch=0)
print(data_names)
print(data_shapes)

gives the following output:

['/conv2d_1_input1'] 
[DataDesc[/conv2d_1_input1,(128L, 28L, 28L, 1L),float32,NCHW]]

In DataDesc, the layout is still NCHW (channels_first).
This leads to an error when loading back into MXNet code (note: the data shape was changed to channels_last):

import numpy as np
import mxnet as mx

# Step1: Load the model in MXNet

# Use the same prefix and epoch parameters we used in save_mxnet_model API.
sym, arg_params, aux_params = mx.model.load_checkpoint(prefix='mnist_cnn', epoch=0)

# We use the data_names and data_shapes returned by save_mxnet_model API.
mod = mx.mod.Module(symbol=sym, 
                    data_names=['/conv2d_1_input1'], 
                    context=mx.cpu(), 
                    label_names=None)
mod.bind(for_training=False, 
         data_shapes=[('/conv2d_1_input1', (1, 28, 28, 1))], 
         label_shapes=mod._label_shapes)

Error Message:

 from ._conv import register_converters as _register_converters
infer_shape error. Arguments:
  /conv2d_1_input1: (1, 1L, 28L, 28L)
Traceback (most recent call last):
  File "test_mnist.py", line 27, in <module>
    result = mod.predict(data_iter)
  File "/usr/local/lib/python2.7/dist-packages/mxnet/module/base_module.py", line 371, in predict
    self.forward(eval_batch, is_train=False)
  File "/usr/local/lib/python2.7/dist-packages/mxnet/module/module.py", line 610, in forward
    self.reshape(new_dshape, new_lshape)
  File "/usr/local/lib/python2.7/dist-packages/mxnet/module/module.py", line 471, in reshape
    self._exec_group.reshape(self._data_shapes, self._label_shapes)
  File "/usr/local/lib/python2.7/dist-packages/mxnet/module/executor_group.py", line 382, in reshape
    self.bind_exec(data_shapes, label_shapes, reshape=True)
  File "/usr/local/lib/python2.7/dist-packages/mxnet/module/executor_group.py", line 358, in bind_exec
    allow_up_sizing=True, **dict(data_shapes_i + label_shapes_i))
  File "/usr/local/lib/python2.7/dist-packages/mxnet/executor.py", line 402, in reshape
    arg_shapes, _, aux_shapes = self._symbol.infer_shape(**kwargs)
  File "/usr/local/lib/python2.7/dist-packages/mxnet/symbol/symbol.py", line 990, in infer_shape
    res = self._infer_shape_impl(False, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/mxnet/symbol/symbol.py", line 1120, in _infer_shape_impl
    ctypes.byref(complete)))
  File "/usr/local/lib/python2.7/dist-packages/mxnet/base.py", line 149, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: Error in operator conv2d_1/conv2d2: Shape inconsistent, Provided = [32,1,3,3], inferred shape=(32,28,3,3)

Operators missing for MXNet backend

Variables and Placeholders

  • Support for constraints in Keras variables and Placeholders

Update Operators

  • update
  • update_add
  • update_sub

Graph Manipulations

  • gradients

Layers

  • Embedding
  • noise.GaussianNoise
  • noise.GaussianDropout
  • noise.AlphaDropout
  • ConvLSTM2D

RNNs

  • rnn

CNNs

  • conv1d
  • conv2d_transpose
  • separable_conv2d
  • depthwise_conv2d
  • conv3d
  • conv3d_transpose
  • local_conv1d
  • local_conv2d
  • pooling with SAME mode
  • conv1d with CAUSAL mode
  • separable_conv1D

Higher Order Functions

  • map_fn
  • foldl
  • foldr

Sparse Tensors

  • Sparse tensors are supported
  • sparse sum
  • sparse mean
  • sparse concat
  • sparse dot
  • sparse embedding

NN Operators

  • sparse_categorical_crossentropy

Optimizers

  • Adam_AMSGrad

Others

  • truncated_normal
  • cumsum
  • cumprod
  • logsumexp
  • stack
  • slice
  • ctc
  • module - #37
  • gather operator does not work with Embedding Layer - https://github.com/awslabs/keras-apache-mxnet/issues/6300000000000
  • Pool2D with SAME mode.
  • Depthwise and Separable Conv2D with multiplier != 1 and stride != 1
  • Partial Loss is not supported. See here for more details - pytest tests/keras/engine/test_training.py -k "test_model_with_partial_loss"
  • External Loss is not supported. See here for more details - pytest tests/keras/engine/test_training.py -k "test_model_with_external_loss"
  • Does not support clone model

Dropout not working for LSTM layer

The LSTM layer does not work if dropout parameters are specified:

model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2, input_shape=input_shape))

Removing the dropout parameters, i.e.

model.add(LSTM(128, input_shape=input_shape))

works.

Move skip tests list to a nosetest config file

Currently, for the MXNet backend, we skip around 190 of ~650 tests. We skip tests with the @skip_if decorator, which requires changes in the code, and pulling the latest from Keras and merging into this repo then creates merge conflicts. It would be very useful if we could remove the MXNet-specific skip-test code from the test files and put it in a config, or have a test runner that manages the list of tests to run or skip (a sketch follows below).

The closest reference is the ONNX test runner - https://github.com/onnx/onnx/blob/master/onnx/backend/test/runner/__init__.py
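
A rough sketch of what a config-driven runner could look like (the file name skip_tests_mxnet.txt and the use of pytest's --deselect flag are assumptions, not an existing part of this repo):

import subprocess

# skip_tests_mxnet.txt: one pytest node id per line, e.g.
# tests/keras/layers/embeddings_test.py::test_embedding
with open('skip_tests_mxnet.txt') as f:
    skips = [line.strip() for line in f if line.strip() and not line.startswith('#')]

cmd = ['pytest', 'tests/'] + ['--deselect=%s' % node_id for node_id in skips]
raise SystemExit(subprocess.call(cmd))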

load_weights() is not loading weights if a pre-trained model is used

I am using keras-mxnet 2.1.6.1.

I trained a model on CIFAR-10 data using DenseNet121 (code below). There is no issue in training. However, if I load the weights to continue training or call predict, the weights do not appear to be loaded: training starts again from the same loss/accuracy as the first run, and predict results are all NaN. (The reload path is sketched after the training script below.)

from __future__ import print_function
import keras
from keras.applications.densenet import DenseNet121
from keras.layers.pooling import GlobalAveragePooling2D
from keras.layers.core import Dense
from keras.layers import Input
from keras.models import Model
from keras.regularizers import *


def get_model():
        aliases = {}
        Input_1 = Input(shape=(3, 221, 221), name='Input_1')
        DenseNet121_1_model = DenseNet121(include_top= False, input_tensor = Input_1)
        DenseNet121_1 = DenseNet121_1_model(Input_1)
        aliases['DenseNet121_1'] = DenseNet121_1_model.name
        num_layers = len(DenseNet121_1_model.layers)
        for i, layer in enumerate(DenseNet121_1_model.layers):
                if ((i * 100) / (num_layers - 1)) <= (100 - 10):
                        layer.trainable = False
        GlobalAveragePooling2D_1 = GlobalAveragePooling2D(name='GlobalAveragePooling2D_1')(DenseNet121_1)
        Dense_1 = Dense(name='Dense_1',units= 10,activation= 'softmax' )(GlobalAveragePooling2D_1)

        model = Model([Input_1],[Dense_1])
        return model


from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
import os
from skimage.transform import resize
import numpy as np

batch_size = 16
num_classes = 10
epochs = 2

# The data, split between train and test sets:
(x_train1, y_train1), (x_test1, y_test1) = cifar10.load_data()

y_train = y_train1[:x_train1.shape[0]//5]
y_test = y_test1[:x_test1.shape[0]//5]  

x_train = np.ndarray((x_train1.shape[0]//5, 3,221,221), dtype=np.float32)
x_test = np.ndarray((x_test1.shape[0]//5, 3,221,221), dtype=np.float32)

for i in range(x_train.shape[0]):
    x_train[i] = resize(x_train1[i], (3,221,221), anti_aliasing=True)
   
for i in range(x_test.shape[0]):
    x_test[i] = resize(x_test1[i], (3,221,221), anti_aliasing=True)


# Convert class vectors to binary class matrices.
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

# initiate RMSprop optimizer
opt = keras.optimizers.rmsprop(lr=0.0001, decay=1e-6)

model=get_model()
#model = keras.models.load_model("cifar.h5")
# Let's train the model using RMSprop
model.compile(loss='categorical_crossentropy',
              optimizer=opt, context=["gpu(0)"],
              metrics=['accuracy'])

x_train /= 255
x_test /= 255

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(x_test, y_test),
          shuffle=True)
model.save("cifar.h5")

Conv2d_transpose and conv3d_transpose test cases failing

A change in the constant() operator causes failures in the conv2d_transpose() and conv3d_transpose() test cases with the following error:

    def _sync_weights(self):
        if self._weights_dirty:
            args, auxs = self._module.get_params()
            for name in self._arg_names:
>               self._args[name][:] = args[name]
E               KeyError: 'conv2d_transpose_2/kernel1'

K.gather not working for Embedding Layer using MXNet backend

The Keras Embedding layer uses the K.gather operator (the MXNet backend implements it with mx.sym.take; the TensorFlow backend uses tf.gather). However, K.gather raises an error with the MXNet backend, and the issue occurs only in the Embedding layer use case, not in other use cases.
Test case:
tests/keras/layers/embeddings_test.py
Error message:
Error in operator broadcast_mul0: [13:05:25] src/operator/tensor/./elemwise_binary_broadcast_op.h:67: Check failed: l == 1 || r == 1 operands could not be broadcast together with shapes [3,2] [3]
This PR fixes it by using mx.sym.Embedding directly instead of K.gather to implement the Keras Embedding layer. Note that this fix diverges from the original Keras code and will conflict in future merges.
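
For reference, a minimal NDArray illustration of the gather/take semantics involved (the shapes are illustrative only; this is not the failing Keras test itself):

import mxnet as mx

table = mx.nd.array([[0.0, 0.1], [1.0, 1.1], [2.0, 2.1]])  # (3, 2) embedding table
idx = mx.nd.array([2, 0])                                   # indices to gather
print(mx.nd.take(table, idx).asnumpy())                     # rows 2 and 0 -> shape (2, 2)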

Keras-MXNet CNN for training on CPU is relatively slower

Hi,

I am running keras/examples/mnist_cnn.py and keras/examples/cifar10_cnn.py, and the training time per epoch on Keras with the MXNet backend is higher than on Keras with the TensorFlow backend when run on CPU.

The image data is already channels_first when using the MXNet backend and channels_last when using the TensorFlow backend, which means there is no transpose overhead.

Below are the pieces of information:

Machine: MacBook Pro(2.5GHz intel core i7 and 16 GB 2133 MHz RAM)

Python version: 2.7.14

MXNet version: 1.1.0

Tensorflow version: 1.5.0

Keras version: 2.1.4

Results:

Backend                                  mnist_cnn.py     cifar10_cnn.py
Keras + MXNet training performance       272 sec/epoch    388 sec/epoch
Keras + TensorFlow training performance  150 sec/epoch    239 sec/epoch

Thank-You!

Keras with MxNet running only on CPU

@sandeep-krishnamurthy Great work! I used MXNet as the backend for Keras 1.2 in your previous repository, https://github.com/dmlc/keras, and I really loved it! Congratulations!

I came across your new forked repository for Keras 2.x, but I am having trouble using the GPU. More specifically, I followed the instructions from https://github.com/deep-learning-tools/keras/wiki/Installation-Guide---Keras-with-MXNet-backend and installed mxnet-cu80 since I have CUDA 8. MXNet loads and works fine, but when I run a Keras example, https://github.com/deep-learning-tools/keras/blob/master/examples/cifar10_cnn.py, the experiment runs on the CPU and not the GPU. What am I doing wrong? Thanks
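
One thing to verify (grounded in how other issues on this page compile their models; whether it applies to your setup is an assumption): with the MXNet backend the GPU context can be passed explicitly at compile time, otherwise training may default to the CPU. Assuming the model object from cifar10_cnn.py:

model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              context=['gpu(0)'],   # or a list of several GPUs
              metrics=['accuracy'])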
