tensorflow / model-optimization

A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.

Home Page: https://www.tensorflow.org/model_optimization

License: Apache License 2.0

Python 81.47% Shell 0.40% Jupyter Notebook 13.53% Starlark 4.52% Dockerfile 0.08%
tensorflow machine-learning deep-learning optimization quantized-neural-networks quantized-networks quantized-training keras model-compression compression

model-optimization's Introduction

TensorFlow Model Optimization Toolkit

The TensorFlow Model Optimization Toolkit is a suite of tools that users, both novice and advanced, can use to optimize machine learning models for deployment and execution.

Supported techniques include quantization and pruning for sparse weights. There are APIs built specifically for Keras.

For an overview of this project and individual tools, the optimization gains, and our roadmap refer to tensorflow.org/model_optimization. The website also provides various tutorials and API docs.

The toolkit provides stable Python APIs.
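For a quick feel for the Keras APIs, a minimal sketch of magnitude-based weight pruning is shown below; the toy model, schedule values, and commented-out training call are illustrative assumptions, while prune_low_magnitude, PolynomialDecay, and UpdatePruningStep are the toolkit's public Keras APIs.

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Placeholder model; any supported Keras Sequential/Functional model works.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10, activation='softmax'),
])

# Ramp sparsity from 50% to 80% over the first 1000 steps (example values).
pruning_params = {
    'pruning_schedule': tfmot.sparsity.keras.PolynomialDecay(
        initial_sparsity=0.50, final_sparsity=0.80,
        begin_step=0, end_step=1000),
}
pruned_model = tfmot.sparsity.keras.prune_low_magnitude(model, **pruning_params)
pruned_model.compile(optimizer='adam',
                     loss='sparse_categorical_crossentropy',
                     metrics=['accuracy'])

# UpdatePruningStep must be passed to fit() so the pruning step advances.
callbacks = [tfmot.sparsity.keras.UpdatePruningStep()]
# pruned_model.fit(x_train, y_train, epochs=2, callbacks=callbacks)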

Installation

For installation instructions, see tensorflow.org/model_optimization/guide/install.

Contribution guidelines

If you want to contribute to TensorFlow Model Optimization, be sure to review the contribution guidelines. This project adheres to TensorFlow's code of conduct. By participating, you are expected to uphold this code.

We use GitHub issues for tracking requests and bugs.

Maintainers

Subpackage          Maintainers
tfmot.clustering    Arm ML Tooling
tfmot.quantization  TensorFlow Model Optimization
tfmot.sparsity      TensorFlow Model Optimization

Community

As part of TensorFlow, we're committed to fostering an open and welcoming environment.

  • TensorFlow Blog: Stay up to date on content from the TensorFlow team and best articles from the community.

model-optimization's People

Contributors

abattery, alanchiao, arovir01, benkli01, dansuh17, daverim, dengyinlin, evcu, fchollet, fredrec, johan-gras, jonycgn, liufengdb, liyunlu0618, matteoarm, mohamednourarm, nkovela1, nutsiepully, prachetit, psunn, qlzh727, rino20, ruomei, saoirsearm, tamasarm, teijeong, tensorflower-gardener, wwwind, xhark, yashk2810


model-optimization's Issues

Lottery ticket hypothesis

What about the lottery ticket hypothesis and follow-up work such as the large-scale studies and Uber's Supermasks? Is there something related on the roadmap?

Pruning support for functional api

Hi, can I prune a model built with the functional API? I got an error when I tried to prune such a model. If possible, please point me to the right resources.

Thanks in advance.
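One pattern that generally works for Functional models is to wrap individual layers (or clone the model with a wrapping function) rather than wrapping a whole container; a minimal sketch under that assumption, using a toy Functional model:

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Toy Functional model (placeholder architecture).
inputs = tf.keras.Input(shape=(784,))
x = tf.keras.layers.Dense(128, activation='relu')(inputs)
outputs = tf.keras.layers.Dense(10, activation='softmax')(x)
model = tf.keras.Model(inputs, outputs)

# Wrap every Dense layer with pruning by cloning the model.
def apply_pruning(layer):
    if isinstance(layer, tf.keras.layers.Dense):
        return tfmot.sparsity.keras.prune_low_magnitude(layer)
    return layer

pruned_model = tf.keras.models.clone_model(model, clone_function=apply_pruning)
pruned_model.summary()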

Does the sparsity function only work on Keras Sequential models/layers?

I'd like to use sparsity.keras.prune_low_magnitude on my model, but it failed for an unknown reason, so I was wondering whether it only works on Keras Sequential models.
BTW my Keras model has a lot of skip connections, so can I only apply it to pretrained models (on which it works) or to sequential 'blocks' of my model?

Deprecation of Python 2

See https://python3statement.org/ and other related sites.

Internally, we currently run most unit tests on both Python 2 and 3. We will eventually turn down Python 2 support, with a target release version to be posted here.

Update the API compatibility docs once this is done.

version issue (Python)

Describe the bug
Hello everyone. I have a problem with Python versions due to some inept manipulations; as a result, I can't install libraries using cmd or the PyCharm IDE, nor launch anything through PyCharm.
I tried to remove all Python files from the computer (including from PATH), but I cannot delete the python.exe and python3.exe files at C:\Users\darku\AppData\Local\Microsoft\WindowsApps.
After deleting everything except these two files, running the python command in cmd redirects to the Microsoft Store with a prompt to install Python 3.7.5.

I downloaded and installed Python 3.7.5; as a result, the python command works in cmd and I can write code. The pip version is the latest. Some libraries, for example numpy, can be installed, while others, for example pymorphy2, cannot.


ERROR: Command errored out with exit status 1:
 command: 'c:\users\darku\appdata\local\programs\python\python37\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\darku\\AppData\\Local\\Temp\\pip-install-s6sp3zyv\\pymorphy\\setup.py'"'"'; __file__='"'"'C:\\Users\\darku\\AppData\\Local\\Temp\\pip-install-s6sp3zyv\\pymorphy\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base 'C:\Users\darku\AppData\Local\Temp\pip-install-s6sp3zyv\pymorphy\pip-egg-info'
 cwd: C:\Users\darku\AppData\Local\Temp\pip-install-s6sp3zyv\pymorphy\
Complete output (20 lines):
Traceback (most recent call last):
  File "C:\Users\darku\AppData\Local\Temp\pip-install-s6sp3zyv\pymorphy\pymorphy\morph.py", line 2, in <module>
    from pymorphy_speedups._morph import *
ModuleNotFoundError: No module named 'pymorphy_speedups'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\darku\AppData\Local\Temp\pip-install-s6sp3zyv\pymorphy\setup.py", line 14, in <module>
    from pymorphy.version import __version__
  File "C:\Users\darku\AppData\Local\Temp\pip-install-s6sp3zyv\pymorphy\pymorphy\__init__.py", line 1, in <module>
    from pymorphy.morph import get_morph
  File "C:\Users\darku\AppData\Local\Temp\pip-install-s6sp3zyv\pymorphy\pymorphy\morph.py", line 10, in <module>
    from pymorphy._morph import *
  File "C:\Users\darku\AppData\Local\Temp\pip-install-s6sp3zyv\pymorphy\pymorphy\_morph.py", line 6, in <module>
    from pymorphy.constants import *
  File "C:\Users\darku\AppData\Local\Temp\pip-install-s6sp3zyv\pymorphy\pymorphy\constants.py", line 83, in <module>
    RU_TENSES_STANDARD.items() + RU_VOICES_STANDARD.items())
TypeError: unsupported operand type(s) for +: 'dict_items' and 'dict_items'
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

[Feature request/Proposal] Clustering optimisation techniques in model optimisation techniques

Hi guys,

I was wondering if you have any plans to support clustering in the Model Optimization Toolkit? By clustering I mean clustering the weights of different operators so that only a few unique weight values are used. An example can be found here: https://arxiv.org/abs/1510.00149

This can reduce the storage needed for different networks, and certain types of hardware may benefit from it as well. Clustering can also be combined with pruning and quantization, reducing the network's storage requirements even further.
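For illustration only, a minimal NumPy sketch of the idea: a k-means-style pass that replaces a weight matrix with a small codebook of shared values (the array size and number of clusters are arbitrary assumptions):

import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(64, 64)).astype(np.float32)
n_clusters = 16

# Initialize centroids evenly over the weight range, then run a few Lloyd
# iterations: assign each weight to its nearest centroid and recompute
# each centroid as the mean of its assigned weights.
centroids = np.linspace(weights.min(), weights.max(), n_clusters)
flat = weights.ravel()
for _ in range(10):
    assignments = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
    for k in range(n_clusters):
        members = flat[assignments == k]
        if members.size:
            centroids[k] = members.mean()

# Each weight is replaced by its centroid, so at most 16 unique values remain;
# the matrix can then be stored as an index array plus a tiny codebook.
clustered = centroids[assignments].reshape(weights.shape)
print(np.unique(clustered).size)  # <= 16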

Is this something you see would be useful for this repository? If so, I could look into drafting up an initial pull request so that we could see if this is an acceptable direction early.

What do you think?

Thanks,
Konstantin

P.S. I was trying to work out whether this is the right place to post such proposals. The repo main page says to use this link to make feature requests; however, there is no dedicated Feature Request button, hence I opened a regular issue. If that was the wrong way, could you please suggest a better communication channel for these matters?

Block Sparsity not working for Squeezable Pointwise Convolution Layers

ValueError: Block Sparsity can only be used for layers which have 2-dimensional weights. I checked the source code and saw this comment:
# TODO(pulkitb): Check if squeeze operations should now be removed since we are only accepting 2-D weights.

How is pruning going to work with pointwise convolutional layers of format 1x1xiCxoC if block sparsity is only supported for 2D tensors?
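For context, a pointwise (1x1) convolution kernel of shape 1 x 1 x iC x oC can be viewed as a 2-D iC x oC matrix once the singleton spatial dimensions are squeezed away; a small NumPy illustration of that reshaping (this is not the library's internal code, just the shape argument):

import numpy as np

in_channels, out_channels = 32, 64
# Pointwise convolution weights: 1 x 1 x iC x oC.
pointwise_kernel = np.random.randn(1, 1, in_channels, out_channels)

# Squeezing the two leading singleton dims leaves a 2-D matrix, which is
# the shape that block sparsity expects.
kernel_2d = np.squeeze(pointwise_kernel, axis=(0, 1))
assert kernel_2d.shape == (in_channels, out_channels)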

Make Pruning TF 2.X Ready

Subsequently update https://www.tensorflow.org/model_optimization/guide/pruning to indicate compatibility.

Progress: submitted commits that are relevant:

As a byproduct of this (updating calls to TF to use the tf. namespace instead of from ... import), we will also:

  • ensure we don't mix 1.X and 2.X behavior (we currently do)
  • prevent end user breakages when using the pip package
    • For instance, we used the tf.keras.experimental.load_saved_model and tf.keras.experimental.save_saved_model APIs, but the import path switched from "from tensorflow.python.keras.saving import saved_model" to "from tensorflow.python.keras.saving import saved_model_experimental", making the code no longer compatible with older TF releases. Sticking to just "tf.keras.experimental.load_saved_model" would have prevented this. This can happen in general, since TF only guarantees that its public APIs won't change, not its file paths.
  • improve maintainability

Remaining issues:

  • Consider use case of loading 1.X model into 2.X code.
  • Consider case of importing SavedModel. Our whitelist doesn't include the SavedModel versions of the layers. E.g. "<class 'tensorflow.python.keras.saving.saved_model.load.ZeroPadding2D'>" isn't the same as tf.keras.layers.ZeroPadding2D. Note that SavedModel no longer uses its versions of the layers, as of TF nightly due to tensorflow/tensorflow@568427a

prune_low_magnitude returns None

I am trying to prune a trained model, but the model is saved as model_architecture.json plus weights.h5 and is loaded as follows:

with open(modeljson) as f:
    self.model = model_from_json(f.read())
self.model.load_weights(modelfile)

But whenever I try to prune it, the output of:
new_pruned_model = sparsity.prune_low_magnitude(model, **new_pruning_params)
is always None, so it crashes at model.compile because NoneType has no attribute compile.
Any help, please?

Pruning not working for tf.keras.Batchnorm

Describe the bug
ValueError: Please initialize Prune with a supported layer. Layers should either be a PrunableLayer instance, or should be supported by the PruneRegistry. You passed: <class 'tensorflow.python.keras.layers.normalization.BatchNormalization'>

System information

TensorFlow installed from (source or binary): binary

TensorFlow version: 2.1.0

TensorFlow Model Optimization version: 0.2.1

Python version: 3.5.6

Use my own class as a layer to build model

Thanks for your great work!
I use my own Upsample class to implement convolutional upsampling. This error occurs:
ValueError: Unknown layer: Upsample

What should I do to deal with it? Or does "model-optimization" only apply to layers written in Keras/TensorFlow?

Pruning: improve custom training loop API

The recommended path for pruning with a custom training loop is not as simple as it could be.

pruned_model = setup_pruned_model()

loss = tf.keras.losses.categorical_crossentropy
optimizer = keras.optimizers.Adam()

log_dir = tempfile.mkdtemp()

# None of this is ordinary Keras training-loop code; it is pruning-specific setup that is easy to miss.
pruned_model.optimizer = optimizer
step_callback = tfmot.sparsity.keras.UpdatePruningStep()
step_callback.set_model(pruned_model)
log_callback = tfmot.sparsity.keras.PruningSummaries(log_dir=log_dir) # optional Tensorboard logging.
log_callback.set_model(pruned_model)

step_callback.on_train_begin()
for _ in range(3):
    # only one batch given batch_size = 20 and input shape.
    step_callback.on_train_batch_begin(batch=unused_arg)
    inp = np.reshape(x_train,
                     [self._BATCH_SIZE, 10])  # original shape: from [10].
    with tf.GradientTape() as tape:
      logits = pruned_model(inp, training=True)
      loss_value = loss(y_train, logits)
      grads = tape.gradient(loss_value, pruned_model.trainable_variables)
      optimizer.apply_gradients(zip(grads, pruned_model.trainable_variables))

    step_callback.on_epoch_end(batch=unused_arg)
    log_callback.on_epoch_end(batch=unused_arg)
...

The set_model and pruned_model.optimizer setting is unusual and could be missed.

Pruning: Training with Near 100% Target Sparsity Fails

Describe the bug
Pruning with a high target sparsity (e.g. 0.99) causes an error.

System information

TensorFlow installed from (source or binary):

TensorFlow version: any

TensorFlow Model Optimization version: 0.2.1

Python version: any

Describe the expected behavior
Target sparsity of 0.99 should work.

Describe the current behavior
Training errors out with something like:

InvalidArgumentError: indices = -1 is not in [0, 40)
[[{{node prune_low_magnitude_dense_1/cond/cond/pruning_ops/GatherV2}}]]

Code to reproduce the issue
testPruneWithHighSparsity_Fails in prune_integration_test.py

You can also search for "model-optimization/issues/215" in the codebase to find the unit test.

pruning_with_keras.ipynb issue

Hi,

When I run the official guide in Colab, it raises an "unsupported op (FusedBatchNormV3)" error during the "Convert the model with TFLiteConverter" step.

It seems that this op was added to TF only days ago. As the instructions say, I installed the nightly preview; maybe a stable version would be more suitable?

Attached is the full log:

---------------------------------------------------------------------------
ConverterError                            Traceback (most recent call last)
<ipython-input-29-62dfa126ef60> in <module>()
      1 tflite_model_file = '/tmp/sparse_mnist.tflite'
      2 converter = tf.lite.TFLiteConverter.from_keras_model_file(pruned_keras_file)
----> 3 tflite_model = converter.convert()
      4 with open(tflite_model_file, 'wb') as f:
      5   f.write(tflite_model)

2 frames
/usr/local/lib/python3.6/dist-packages/tensorflow_core/lite/python/convert.py in toco_convert_protos(model_flags_str, toco_flags_str, input_data_str)
    170       stderr = _try_convert_to_unicode(stderr)
    171       raise ConverterError(
--> 172           "TOCO failed. See console for info.\n%s\n%s\n" % (stdout, stderr))
    173   finally:
    174     # Must manually cleanup files.

ConverterError: TOCO failed. See console for info.
WARNING: Logging before flag parsing goes to stderr.
W0610 10:27:17.025386 140352251447168 deprecation_wrapper.py:118] From /usr/local/lib/python3.6/dist-packages/tensorflow/__init__.py:94: The name tf.AUTO_REUSE is deprecated. Please use tf.compat.v1.AUTO_REUSE instead.

W0610 10:27:17.025773 140352251447168 deprecation_wrapper.py:118] From /usr/local/lib/python3.6/dist-packages/tensorflow/__init__.py:94: The name tf.AttrValue is deprecated. Please use tf.compat.v1.AttrValue instead.

W0610 10:27:17.025909 140352251447168 deprecation_wrapper.py:118] From /usr/local/lib/python3.6/dist-packages/tensorflow/__init__.py:94: The name tf.COMPILER_VERSION is deprecated. Please use tf.version.COMPILER_VERSION instead.

W0610 10:27:17.026037 140352251447168 deprecation_wrapper.py:118] From /usr/local/lib/python3.6/dist-packages/tensorflow/__init__.py:94: The name tf.CXX11_ABI_FLAG is deprecated. Please use tf.sysconfig.CXX11_ABI_FLAG instead.

W0610 10:27:17.026163 140352251447168 deprecation_wrapper.py:118] From /usr/local/lib/python3.6/dist-packages/tensorflow/__init__.py:94: The name tf.ConditionalAccumulator is deprecated. Please use tf.compat.v1.ConditionalAccumulator instead.

2019-06-10 10:27:17.071062: I tensorflow/lite/toco/import_tensorflow.cc:1336] Converting unsupported operation: FusedBatchNormV3
2019-06-10 10:27:17.093824: I tensorflow/lite/toco/graph_transformations/graph_transformations.cc:39] Before Removing unused ops: 35 operators, 57 arrays (0 quantized)
2019-06-10 10:27:17.094147: I tensorflow/lite/toco/graph_transformations/graph_transformations.cc:39] Before general graph transformations: 35 operators, 57 arrays (0 quantized)
2019-06-10 10:27:17.094683: I tensorflow/lite/toco/graph_transformations/graph_transformations.cc:39] After general graph transformations pass 1: 10 operators, 29 arrays (0 quantized)
2019-06-10 10:27:17.118420: I tensorflow/lite/toco/graph_transformations/graph_transformations.cc:39] After general graph transformations pass 2: 9 operators, 28 arrays (0 quantized)
2019-06-10 10:27:17.118589: I tensorflow/lite/toco/graph_transformations/graph_transformations.cc:39] After general graph transformations pass 3: 8 operators, 26 arrays (0 quantized)
2019-06-10 10:27:17.118704: I tensorflow/lite/toco/graph_transformations/graph_transformations.cc:39] Before Group bidirectional sequence lstm/rnn: 8 operators, 26 arrays (0 quantized)
2019-06-10 10:27:17.118783: I tensorflow/lite/toco/graph_transformations/graph_transformations.cc:39] Before dequantization graph transformations: 8 operators, 26 arrays (0 quantized)
2019-06-10 10:27:17.118915: I tensorflow/lite/toco/allocate_transient_arrays.cc:345] Total transient array allocated size: 125440 bytes, theoretical optimal value: 125440 bytes.
2019-06-10 10:27:17.119187: E tensorflow/lite/toco/toco_tooling.cc:462] We are continually in the process of adding support to TensorFlow Lite for more ops. It would be helpful if you could inform us of how this conversion went by opening a github issue at https://github.com/tensorflow/tensorflow/issues/new?template=40-tflite-op-request.md
 and pasting the following:

Some of the operators in the model are not supported by the standard TensorFlow Lite runtime. If those are native TensorFlow operators, you might be able to use the extended runtime by passing --enable_select_tf_ops, or by setting target_ops=TFLITE_BUILTINS,SELECT_TF_OPS when calling tf.lite.TFLiteConverter(). Otherwise, if you have a custom implementation for them you can disable this error with --allow_custom_ops, or by setting allow_custom_ops=True when calling tf.lite.TFLiteConverter(). Here is a list of builtin operators you are using: CONV_2D, DEPTHWISE_CONV_2D, FULLY_CONNECTED, MAX_POOL_2D, SOFTMAX. Here is a list of operators for which you will need custom implementations: FusedBatchNormV3.
Traceback (most recent call last):
  File "/usr/local/bin/toco_from_protos", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/lite/toco/python/toco_from_protos.py", line 59, in main
    app.run(main=execute, argv=[sys.argv[0]] + unparsed)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/lite/toco/python/toco_from_protos.py", line 33, in execute
    output_str = tensorflow_wrap_toco.TocoConvert(model_str, toco_str, input_str)
Exception: We are continually in the process of adding support to TensorFlow Lite for more ops. It would be helpful if you could inform us of how this conversion went by opening a github issue at https://github.com/tensorflow/tensorflow/issues/new?template=40-tflite-op-request.md
 and pasting the following:

Some of the operators in the model are not supported by the standard TensorFlow Lite runtime. If those are native TensorFlow operators, you might be able to use the extended runtime by passing --enable_select_tf_ops, or by setting target_ops=TFLITE_BUILTINS,SELECT_TF_OPS when calling tf.lite.TFLiteConverter(). Otherwise, if you have a custom implementation for them you can disable this error with --allow_custom_ops, or by setting allow_custom_ops=True when calling tf.lite.TFLiteConverter(). Here is a list of builtin operators you are using: CONV_2D, DEPTHWISE_CONV_2D, FULLY_CONNECTED, MAX_POOL_2D, SOFTMAX. Here is a list of operators for which you will need custom implementations: FusedBatchNormV3.

thanks!

How to prevent the output tensor names from being changed?

My model has multiple outputs; when using model-optimization, the output node names get changed like this:

982/982 [==============================] - 434s 442ms/step - loss: 17.4818 - prune_low_magnitude_l1_loss: 3.6453 - prune_low_magnitude_l2_loss: 10.2949 - prune_low_magnitude_l1_p: 0.8551 - prune_low_magnitude_l1_r: 0.6556 - prune_low_magnitude_l2_p: 0.7639 - prune_low_magnitude_l2_r: 0.4423 - val_loss: 16.9125 - val_prune_low_magnitude_l1_loss: 3.4677 - val_prune_low_magnitude_l2_loss: 9.9504 - val_prune_low_magnitude_l1_p: 0.8921 - val_prune_low_magnitude_l1_r: 0.7815 - val_prune_low_magnitude_l2_p: 0.8306 - val_prune_low_magnitude_l2_r: 0.5363

How can I remove the 'prune_low_magnitude' prefix?
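One approach that is typically used at export time is to strip the pruning wrappers, which restores the original layer names (and therefore the metric/output names derived from them); a minimal sketch, assuming the model was wrapped with the tfmot Keras pruning API and using a placeholder toy model:

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Placeholder model with a named output layer.
inputs = tf.keras.Input(shape=(16,))
l1 = tf.keras.layers.Dense(4, name='l1')(inputs)
model = tf.keras.Model(inputs, l1)

pruned_model = tfmot.sparsity.keras.prune_low_magnitude(model)
print([layer.name for layer in pruned_model.layers])  # includes prune_low_magnitude_l1

# After training, strip the wrappers so exported layers/outputs keep their
# original names without the prune_low_magnitude prefix.
final_model = tfmot.sparsity.keras.strip_pruning(pruned_model)
print([layer.name for layer in final_model.layers])   # back to l1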

Channel-wise pruning leading to very poor results

Hello, I'm updating the channel mask method to introduce channel-wise sparsity for pointwise convolution layers (MobileNet V1). I have yet to check the final weights of the model as it's still training; however, the training accuracy is starting to saturate at around 35%. Can anyone tell me whether I made a mistake somewhere in my tensor manipulation?

The reason for channel-wise pruning is to be able to use network surgery to eliminate channels that contribute very little to the inferential ability of the model, allowing for faster inference time with embedded devices.

 def _update_block_channel_mask(self, weights):
      """
      Performs channel wise masking of the weights (function below doesn't work).
      Args:
        weights: The weight tensor that needs to be masked.

      Returns:
        new_threshold: The new value of the threshold based on weights, and sparsity at the current global_step
        new_mask: A numpy array of the same size and shape as the weights containing 0 or 1 to indicate which of the values in weights falls below
        the threshold

      Raises:
        ValueError: if block pooling function is not AVG or MAX

      """
      # The weights should be of shape (1, 1, j, k), representing the shape of a pointwise convolutional layer.

      sparsity = self._pruning_schedule(self._step_fn())[1]
      with ops.name_scope('pruning_ops'):
        abs_weights = math_ops.abs(weights)

        k = math_ops.cast(sparsity * math_ops.cast(abs_weights.shape[-1], dtypes.float32), dtypes.int32)

        # Transpose to have rows as values per output channel tensor
        squeezed_weights = tf.transpose(array_ops.squeeze(abs_weights))

        # Calculating the sum per output channel
        channel_sums = tf.sort(tf.reduce_sum(squeezed_weights, 1))

        # Grab the smallest K magnitude channels
        min_sums = tf.slice(channel_sums, [0], [k])

        current_threshold = array_ops.gather(min_sums, k - 1)

        # If any row sums matches the min K sums, set it to zero (prune it), otherwise set it to 1
        new_mask = tf.map_fn(lambda x: tf.cond(tf.reduce_any(tf.equal(tf.reduce_sum(x), min_sums)), lambda: tf.zeros_like(x),  lambda: tf.ones_like(x)), squeezed_weights)

        # Transpose back to I x O and reshape into 1x1xIxO
        new_mask = tf.transpose(new_mask)
        new_mask = tf.reshape(new_mask, abs_weights.shape)

      return current_threshold, new_mask

Pruning fails for siamese networks

Describe the bug
Follow along with the MNIST siamese example, where one set of weights is used twice in the same network. Try to prune one layer and you get the error: tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [Prune() wrapper requires the UpdatePruningStep callback to be provided during training. Please add it as a callback to your model.fit call.] [Condition x >= y did not hold element-wise:x (assert_greater_equal/ReadVariableOp:0) = ]. This happens even though the pruning callback is supplied to fit():

System information

TensorFlow installed from (source or binary):
binary

TensorFlow version:
2.1.0

TensorFlow Model Optimization version:
0.2.1

Python version:
3.6.10

Describe the expected behavior
Script doesn't crash

Describe the current behavior
Script crashes

Code to reproduce the issue

import random
import tempfile

import numpy as np
import tensorflow as tf

from tensorflow.keras.datasets import mnist
from tensorflow import Variable, float32
from tensorflow.keras import backend as keras_backend
from tensorflow.keras.layers import Input, Flatten, Dense, Dropout, Lambda, Conv2D, MaxPooling2D
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import RMSprop
from tensorflow_model_optimization.sparsity import keras as sparsity

def euclidean_distance(vects):
    x, y = vects
    sum_square = keras_backend.sum(keras_backend.square(x - y), axis=1, keepdims=True)
    return keras_backend.sqrt(keras_backend.maximum(sum_square, keras_backend.epsilon()))


def eucl_dist_output_shape(shapes):
    shape1, shape2 = shapes
    return shape1[0], 1

def random_other_digit(digit, max_num):
    inc = random.randrange(1, max_num)
    other_digit = (digit + inc) % max_num
    return other_digit

def create_pairs(x, digit_indices_in_dataset, num_classes=10): 
    """Massage input from MNIST
    """
    pairs = []
    labels = []
    minimal_digit_set_size = min([len(digit_indices_in_dataset[d]) for d in range(num_classes)]) - 1
    if minimal_digit_set_size <= 0:
        raise ValueError("Impossible ", minimal_digit_set_size)
    for digit in range(num_classes):
        for i in range(minimal_digit_set_size):
            indices_for_digit = digit_indices_in_dataset[digit]

            index_for_digit = indices_for_digit[i]
            digit_image_1 = x[index_for_digit]

            digit_index_same = indices_for_digit[i + 1]
            digit_image_same = x[digit_index_same]
            pair_same_digits = [digit_image_1, digit_image_same]
            pairs.append(pair_same_digits)

            other_digit = random_other_digit(digit, num_classes)
            digit_image_other = x[digit_indices_in_dataset[other_digit][i]]
            pair_different_digits = [digit_image_1, digit_image_other]
            pairs.append(pair_different_digits)

            # [Same, Different]
            labels += [1, 0]
    return np.array(pairs), np.array(labels)

(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = x_train[0:5000]
y_train = y_train[0:5000]

x_test = x_test[0:5000]
y_test = y_test[0:5000]

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')

x_train /= 255
x_test /= 255
input_shape = x_train.shape[1:]

# create training+test positive and negative pairs
num_classes = 10
digit_indices = [np.where(y_train == i)[0] for i in range(num_classes)]

tr_pairs, tr_y = create_pairs(x_train, digit_indices, num_classes)
print("Pairs created")

print("Creating testing pairs...")
digit_indices = [np.where(y_test == i)[0] for i in range(num_classes)]
te_pairs, te_y = create_pairs(x_test, digit_indices, num_classes)
def create_base_network_pruned(input_shape, begin_step, end_step):
    """Base network to be shared (eq. to feature extraction).
    """
    pruning_params = {
        'pruning_schedule': sparsity.PolynomialDecay(initial_sparsity=0.50,
                                                     final_sparsity=0.90,
                                                     begin_step=begin_step,
                                                     end_step=end_step,
                                                     frequency=100)
    }
    input_base = Input(shape=input_shape)
    x = Flatten()(input_base)
    x = Dense(128, activation='relu')(x)
    x = Dropout(0.1)(x)
    x = sparsity.prune_low_magnitude(Dense(128, activation='relu'),
                                     **pruning_params)(x)
    x = Dropout(0.1)(x)
    x = Dense(128, activation='relu')(x)
    return Model(input_base, x)

def siamese_dense_pruned(input_shape, begin_step, end_step):
    base_network = create_base_network_pruned(input_shape, begin_step, end_step)
    input_a = Input(shape=input_shape)
    input_b = Input(shape=input_shape)
    # because we re-use the same instance `base_network`,
    # the weights of the network
    # will be shared across the two branches
    processed_a = base_network(input_a)
    processed_b = base_network(input_b)
    distance = Lambda(euclidean_distance,
                      output_shape=eucl_dist_output_shape)([processed_a, processed_b])
    return Model([input_a, input_b], distance)

epochs = 20
batch_size = 128
num_train_samples = x_train.shape[0]
end_step = np.ceil(1.0 * num_train_samples / batch_size).astype(np.int32) * epochs
begin_step = end_step / 5
model = siamese_dense_pruned(input_shape, begin_step, end_step)
rms = RMSprop()
# contrastive_loss and accuracy are the custom loss/metric from the Keras
# MNIST siamese example this issue follows (not reproduced here).
model.compile(loss=contrastive_loss, optimizer=rms, metrics=[accuracy])
logdir = tempfile.mkdtemp()
callbacks = [
       sparsity.UpdatePruningStep(),
       sparsity.PruningSummaries(log_dir=logdir, profile_batch=0)
]
model.fit([tr_pairs[:, 0], tr_pairs[:, 1]], tr_y,
              batch_size=batch_size,
              epochs=epochs,
              verbose=1,
              validation_data=([te_pairs[:, 0], te_pairs[:, 1]], te_y),
              callbacks=callbacks)

Additional context
Might have something to do with referencing the same layers twice in a model.

hassan

Describe the bug
A clear and concise description of what the bug is.

System information

TensorFlow installed from (source or binary):
TensorFlow version (use command below):
TensorFlow Model Optimization version:
Python version:

Describe the expected behavior

Describe the current behavior

Code to reproduce the issue
Provide a reproducible code that is the bare minimum necessary to generate the
problem.

Screenshots
If applicable, add screenshots to help explain your problem.

Additional context
Add any other context about the problem here.

Make Model Optimization Project Contributor Friendly

Master tracker issue for this.

At the minimum:

  • we need to finish the first bullet point for infrastructure for open source.

Basic Doc for Contributors

  • Pathway for recommending and contributing techniques (see this doc - initial version in PR - though RFC process needs to be finalized)
  • Suggestions for other types of contributions (see this doc)

Basic Communication Channels

  • Templates for technique requests / feature requests / bugs / etc.

Infrastructure for open source

  • To accept pull requests (in the same vein as TensorFlow and other projects) and credit them to the contributor. With the current infrastructure, any accepted pull request would be overridden by the state of the code base within Google. (Done: verified via #125, #203, #171; still need to consider squashing commits.)
  • Open source continuous integration testing - started under https://github.com/tensorflow/model-optimization/tree/master/ci/kokoro.

Out-of-Scope

  • Nightly builds

The quantization example mnist_cnn.py is wrong: array softmax/Softmax is lacking min/max data, which is necessary for quantization

The error:

2019-09-20 17:04:32.110476: I tensorflow/lite/toco/graph_transformations/graph_transformations.cc:39] Before Group bidirectional sequence lstm/rnn: 8 operators, 17 arrays (1 quantized)
2019-09-20 17:04:32.110534: F tensorflow/lite/toco/tooling_util.cc:1709] Array quantize_emulate_wrapper_3/BiasAdd, which is an input to the Softmax operator producing the output array softmax/Softmax, is lacking min/max data, which is necessary for quantization. If accuracy matters, either target a non-quantized output format, or run quantized training with your model from a floating point checkpoint to change the input graph to contain min/max information. If you don't care about accuracy, you can pass --default_ranges_min= and --default_ranges_max= for easy experimentation.
Fatal Python error: Aborted

The example: https://github.com/tensorflow/model-optimization/blob/master/tensorflow_model_optimization/python/examples/quantization/keras/mnist_cnn.py

'prune_low_magnitude' of 'keras.layers.convolutional.Conv2D' returns None

When trying to prune a Conv2D layer in a custom network, the error below is observed:
TypeError: 'NoneType' object is not callable

Below is a snippet of the code:

from tensorflow_model_optimization.sparsity import keras as sparsity
from keras.layers.convolutional import Conv2D, Conv2DTranspose

pruning_prm = {'pruning_schedule': sparsity.PolynomialDecay(initial_sparsity=0, final_sparsity=0.5,begin_step=15000, end_step=75000, frequency=100)}
x = Conv2D(filters=n_filters, kernel_size=(kernel_size, kernel_size), kernel_initializer="he_normal",
padding="same")
x = sparsity.prune_low_magnitude(x, **pruning_prm) --> returns None

Information on versions of different modules:
Keras 2.3.0
tensorflow-estimator (2.0.0)
tensorflow-gpu (2.0.0)
tensorflow-hub (0.6.0)
tensorflow-model-optimization (0.1.3)
tensorflow-probability (0.8.0rc0)
tensorflow-serving-api-gpu (1.14.0)

Any pointers/leads are much appreciated.
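For comparison, the tfmot Keras API is built around tf.keras layers; a hedged sketch of the same wrapping using tf.keras.layers.Conv2D (the concrete filter/kernel/schedule values are placeholders, and whether this resolves the reporter's setup is not verified here):

import tensorflow as tf
from tensorflow_model_optimization.sparsity import keras as sparsity

pruning_prm = {
    'pruning_schedule': sparsity.PolynomialDecay(initial_sparsity=0.5,
                                                 final_sparsity=0.8,
                                                 begin_step=15000,
                                                 end_step=75000,
                                                 frequency=100)
}

# A tf.keras layer (rather than a standalone keras.layers layer), wrapped individually.
conv = tf.keras.layers.Conv2D(filters=16, kernel_size=3,
                              kernel_initializer='he_normal', padding='same')
pruned_conv = sparsity.prune_low_magnitude(conv, **pruning_prm)
print(pruned_conv)  # a pruning wrapper layer, not None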

Pruning: increased support for Keras subclassed models

Currently the pruning API will throw an error when a subclassed model is passed to it. Users can get around this by diving into the subclassed models and applying pruning to individual Sequential/Functional models and tf.keras layers.

Better support is important for various cases (e.g. Object Detection, BERT examples) and issues such as this one.

We can provide better support for pruning an entire subclassed model.

  • Pruning some layers of the model would still require going into the model definition itself, though now you can prune a whole subclassed
    model inside a subclassed model.
  • This would only prune variables that live inside a tf.keras.Layer (whether a built-in layer or a custom layer using the PrunableLayer interface).

Implementation-wise, we can iterate through the layers of a subclassed model (and nested models) and apply pruning to all of them. Replacing a layer in an already created model will be tricky, and we'd have to do this without clone_model.
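A rough sketch of that iteration idea (not the toolkit's implementation): walk the layer attributes of a subclassed model, wrap each supported layer, and recurse into nested models; it assumes layers are plain attributes and the model has not been built yet.

import tensorflow as tf
import tensorflow_model_optimization as tfmot

class SubclassedModel(tf.keras.Model):
    # Placeholder subclassed model used only to illustrate the idea.
    def __init__(self):
        super().__init__()
        self.dense1 = tf.keras.layers.Dense(128, activation='relu')
        self.dense2 = tf.keras.layers.Dense(10)

    def call(self, inputs):
        return self.dense2(self.dense1(inputs))

def prune_subclassed(model):
    # Replace each layer-valued attribute with a pruning wrapper, recursing
    # into nested Keras models; unsupported layers are left untouched.
    for name, attr in list(vars(model).items()):
        if name.startswith('_'):
            continue
        if isinstance(attr, tf.keras.Model):
            prune_subclassed(attr)
        elif isinstance(attr, tf.keras.layers.Layer):
            try:
                setattr(model, name, tfmot.sparsity.keras.prune_low_magnitude(attr))
            except ValueError:
                pass  # not registered in the PruneRegistry
    return model

model = prune_subclassed(SubclassedModel())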

Unreadable Notebook

I was unable to open model-optimization/tensorflow_model_optimization/g3doc/guide/pruning/pruning_with_keras.ipynb

Error:
NotJSONError('Notebook does not appear to be JSON: \'{\\n "cells": [\\n {\\n "cell_typ...',)

Pruned model size is bigger than unpruned?

Describe the bug
A clear and concise description of what the bug is.

System information

TensorFlow installed from: Binary
TensorFlow version (use command below): 2.0.0
TensorFlow Model Optimization version: 0.1.3
Python version: 3.6.5

Describe the expected behavior

The pruned model size should be smaller than the one unpruned

Describe the current behavior

Size of the unpruned model before compression: 12.52 Mb
Size of the unpruned model after compression: 11.58 Mb
Size of the pruned model before compression: 50.02 Mb
Size of the pruned model after compression: 20.16 Mb

Code to reproduce the issue

I did the experiment following the Jupyter notebook; I adjusted the code to my own style, as below.

import tensorflow as tf
import tempfile
import zipfile
import os
import tensorboard
import numpy as np

from tensorflow_model_optimization.sparsity import keras  as sparsity

## global parameters
batch_size = 128
num_classes = 10
epochs = 10
# input image dimensions
img_rows, img_cols = 28, 28
logdir = tempfile.mkdtemp()
print('Writing training logs to ' + logdir)

def prepare_trainval(img_rows, img_cols):
    # the data, shuffled and split between train and test sets
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

    if tf.keras.backend.image_data_format() == 'channels_first':
      x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
      x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
      input_shape = (1, img_rows, img_cols)
    else:
      x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
      x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
      input_shape = (img_rows, img_cols, 1)

    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')
    x_train /= 255
    x_test /= 255
    print('x_train shape:', x_train.shape)
    print(x_train.shape[0], 'train samples')
    print(x_test.shape[0], 'test samples')

    # convert class vectors to binary class matrices
    y_train = tf.keras.utils.to_categorical(y_train, num_classes)
    y_test = tf.keras.utils.to_categorical(y_test, num_classes)
    return x_train,x_test,y_train,y_test


def build_prune_model(input_shape,end_step):
    l = tf.keras.layers
    print('End step: ' + str(end_step))
    pruning_params = {
          'pruning_schedule': sparsity.PolynomialDecay(initial_sparsity=0.50,
                                                       final_sparsity=0.90,
                                                       begin_step=2000,
                                                       end_step=end_step,
                                                       frequency=100)
    }
    pruned_model = tf.keras.Sequential([
        sparsity.prune_low_magnitude(
            l.Conv2D(32, 5, padding='same', activation='relu'),
            input_shape=input_shape,**pruning_params),
        l.MaxPooling2D((2, 2), (2, 2), padding='same'),
        l.BatchNormalization(),
        sparsity.prune_low_magnitude(
            l.Conv2D(64, 5, padding='same', activation='relu'), **pruning_params),
        l.MaxPooling2D((2, 2), (2, 2), padding='same'),
        l.Flatten(),sparsity.prune_low_magnitude(l.Dense(1024, activation='relu'),
                                     **pruning_params),
        l.Dropout(0.4),
        sparsity.prune_low_magnitude(l.Dense(num_classes, activation='softmax'),
                                     **pruning_params)
    ])
    pruned_model.compile(
        loss=tf.keras.losses.categorical_crossentropy,
        optimizer='adam',
        metrics=['accuracy'])

    pruned_model.summary()
    return pruned_model

def train_prune_model(x_train,x_test,y_train,y_test,epochs,prune_model_file):
    input_shape = (img_rows, img_cols,1)
    num_train_samples = x_train.shape[0]
    end_step = np.ceil(1.0 * num_train_samples / batch_size).astype(np.int32) * epochs
    pruned_model = build_prune_model(input_shape,end_step)
    # Add a pruning step callback to peg the pruning step to the optimizer's
    # step. Also add a callback to add pruning summaries to tensorboard
    callbacks = [
        sparsity.UpdatePruningStep(),
        sparsity.PruningSummaries(log_dir=logdir, profile_batch=0)
    ]
    pruned_model.fit(x_train, y_train,
              batch_size=batch_size,
              epochs=10,
              verbose=1,
              callbacks=callbacks,
              validation_data=(x_test, y_test))
    score = pruned_model.evaluate(x_test, y_test, verbose=0)

    print('Saving pruned model to: ', prune_model_file)
    # saved_model() sets include_optimizer to True by default. Spelling it out here
    # to highlight.
    tf.keras.models.save_model(pruned_model, prune_model_file, include_optimizer=True)
    print('Test loss:', score[0])
    print('Test accuracy:', score[1])



def build_clean_model(input_shape):
    l = tf.keras.layers
    model = tf.keras.Sequential([
        l.Conv2D(
            32, 5, padding='same', activation='relu', input_shape=input_shape),
        l.MaxPooling2D((2, 2), (2, 2), padding='same'),
        l.BatchNormalization(),
        l.Conv2D(64, 5, padding='same', activation='relu'),
        l.MaxPooling2D((2, 2), (2, 2), padding='same'),
        l.Flatten(),
        l.Dense(1024, activation='relu'),
        l.Dropout(0.4),
        l.Dense(num_classes, activation='softmax')
    ])
    model.compile(
        loss=tf.keras.losses.categorical_crossentropy,
        optimizer='adam',
        metrics=['accuracy'])
    model.summary()
    return model


def train_clean_model(x_train,x_test,y_train,y_test,epochs,keras_file):
    callbacks = [tf.keras.callbacks.TensorBoard(log_dir=logdir, profile_batch=0)]
    input_shape = (img_rows, img_cols, 1)
    model = build_clean_model(input_shape)
    model.fit(x_train, y_train,
              batch_size=batch_size,
              epochs=epochs,
              verbose=1,
              callbacks=callbacks,
              validation_data=(x_test, y_test))
    score = model.evaluate(x_test, y_test, verbose=0)
    print('Saving model to: ', keras_file)
    tf.keras.models.save_model(model, keras_file, include_optimizer=False)
    print('Test loss:', score[0])
    print('Test accuracy:', score[1])

#x_train,x_test,y_train,y_test = prepare_trainval(img_rows, img_cols)
keras_file = "./ori_mnist_classifier.h5"
#train_clean_model(x_train,x_test,y_train,y_test,epochs,keras_file)
prune_model_file = "./prune_mnist_classifier.h5"
#train_prune_model(x_train,x_test,y_train,y_test,epochs,prune_model_file)

_, zip1 = tempfile.mkstemp('.zip')
with zipfile.ZipFile(zip1, 'w', compression=zipfile.ZIP_DEFLATED) as f:
  f.write(keras_file)
print("Size of the unpruned model before compression: %.2f Mb" %
      (os.path.getsize(keras_file) / float(2**20)))
print("Size of the unpruned model after compression: %.2f Mb" %
      (os.path.getsize(zip1) / float(2**20)))

_, zip2 = tempfile.mkstemp('.zip')
with zipfile.ZipFile(zip2, 'w', compression=zipfile.ZIP_DEFLATED) as f:
  f.write(prune_model_file)
print("Size of the pruned model before compression: %.2f Mb" %
      (os.path.getsize(prune_model_file) / float(2**20)))
print("Size of the pruned model after compression: %.2f Mb" %
      (os.path.getsize(zip2) / float(2**20)))
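For what it's worth, the tutorial's size comparison strips the pruning wrappers and drops the optimizer state before saving the pruned model; a hedged sketch of that final step, written to slot into train_prune_model above (sparsity.strip_pruning is the toolkit API; everything else reuses names from the snippet):

    # At the end of train_prune_model, in place of the save_model call above:
    # remove the pruning wrappers (masks, thresholds, pruning step) and drop
    # the optimizer state so only the sparse weights are saved and zipped.
    final_model = sparsity.strip_pruning(pruned_model)
    tf.keras.models.save_model(final_model, prune_model_file, include_optimizer=False)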


Model optimization without tensorflow lite

I'd like to optimize a pre-trained model without converting it to TensorFlow Lite.

I'm dealing with variable-sized input, which standard TensorFlow handles fine, but this feature (as well as others) is not (yet?) supported by TensorFlow Lite. Can I just optimize the model without converting it to TF Lite?

TensorFlow Lite actually seems to optimize the graph before the conversion (_run_graph_optimizations is called before the actual conversion), but I couldn't find any documentation regarding this.

#175

Describe the bug
A clear and concise description of what the bug is.

System information

TensorFlow installed from (source or binary):
TensorFlow version (use command below):
TensorFlow Model Optimization version:
Python version:

Describe the expected behavior

Describe the current behavior

Code to reproduce the issue
Provide a reproducible code that is the bare minimum necessary to generate the
problem.

Screenshots
If applicable, add screenshots to help explain your problem.

Additional context
Add any other context about the problem here.

Are all keras layers supported for weight pruning?

Hi, thanks for a very convenient package!
Are layers like LSTM, ConvLSTM2D, and TimeDistributed supported for weight pruning? In another issue, someone had a problem with the TimeDistributed layer.
If they are not supported out of the box, would inheriting from PrunableLayer and implementing get_prunable_weights() (perhaps returning an empty list) fix the issue?

Thanks!
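For reference, a minimal sketch of the PrunableLayer route mentioned above: a custom layer implementing get_prunable_weights, where returning an empty list means "wrap me, but prune nothing inside"; the layer body itself is a placeholder.

import tensorflow as tf
import tensorflow_model_optimization as tfmot

class MyUpsample(tf.keras.layers.Layer, tfmot.sparsity.keras.PrunableLayer):
    # Placeholder custom layer; only the PrunableLayer interface matters here.
    def call(self, inputs):
        return tf.keras.backend.resize_images(inputs, 2, 2, 'channels_last')

    def get_prunable_weights(self):
        # Return the weight tensors to prune; an empty list opts the layer
        # out of pruning while still allowing it to be wrapped.
        return []

layer = tfmot.sparsity.keras.prune_low_magnitude(MyUpsample())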

FailedPreconditionError raised while training a model

I encountered this error when training my model:
tensorflow.python.framework.errors_impl.FailedPreconditionError: Error while reading resource variable prune_low_magnitude_embedding/mask from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/prune_low_magnitude_embedding/mask/class tensorflow::Var does not exist.
[[{{node prune_low_magnitude_embedding/Mul/ReadVariableOp_1}}]]

Has anyone else run into the same problem?

Pruning: Checkpointed Models Not Guaranteed To Be Sparse

Models on final export are sparse and pruning generally handles checkpointing correctly.

However, upon inspection, the weights of a checkpoint are not sparse themselves.

Reproduce

See testPruneCheckpoints_CheckpointsNotSparse in prune_integration_test.py

You can also search for "model-optimization/issues/206" in the codebase to find the unit test.

Theory

We sparsify the weights on two occasions: the UpdatePruningStep callback on epoch end and the call to self.layer.add_update(self.pruning_obj.weight_mask_op()) under the Pruning Wrapper. The default ModelCheckpoint callback checkpoints on epoch end.

The following order of events would cause the checkpointed models to not be sparse.

PruningWrapper update (weights are sparse) -> backwards propagation (weights aren't sparse) -> Checkpoint Callback -> UpdatePruningStep Callback (weights are sparse again)

This order seems feasible because we don't know of any guarantees that callbacks execute in a particular order. If both callbacks happen at the same time, then the checkpointed model could be partially sparse (e.g. weights are 30% sparse even when the mask is 50% sparse). If the checkpoint callback happens prior, then the weights wouldn't be sparse at all.

Supporting Evidence

  1. When we save the model at the end after pruning, it has always compressed well.
  2. Previously when checkpointing was done on epoch_begin, the checkpointed models were compressed. The theory is that on_epoch_begin forces checkpointing to happen after UpdatePruningStep as follows:

PruningWrapper update (weights are sparse) -> backwards propagation (weights aren't sparse) -> UpdatePruningStep Callback (weights are sparse again) -> Checkpoint Callback

Notes

Any changes made will have to consider backwards-compatibility (e.g. existing training behavior is unchanged except
for fixing this pathway). Otherwise, it'll have to go into the next major release.

Keras layer wrappers aren't supported

A current Keras layer is not supported:
ValueError: Please initialize Prune with a supported layer. Layers should either be a PrunableLayer instance, or should be supported by the PruneRegistry. You passed: <class 'tensorflow.python.keras.layers.wrappers.TimeDistributed'>

Can we prune pre-trained models like VGG16 etc. using this optimization library?

I tried to create a model like:

def Vgg16():
    vgg16 = VGG16(include_top=False,
                  weights='imagenet',
                  input_shape=(32, 32, 3))
    top_model = Sequential()
    top_model.add(Flatten(input_shape=vgg16.output_shape[1:]))
    top_model.add(Dense(512, activation='relu'))
    top_model.add(Dropout(0.5))
    top_model.add(Dense(256, activation='relu'))
    top_model.add(Dropout(0.5))
    top_model.add(Dense(10, activation='sigmoid'))
    model = Model(vgg16.input, top_model(vgg16.output))
    return model

and when I call

new_pruning_params = {
      'pruning_schedule': sparsity.PolynomialDecay(initial_sparsity=0.5,
                                                   final_sparsity=0.80,
                                                   begin_step=0,
                                                   end_step=end_step,
                                                   frequency=100)
}

pruned_model = sparsity.prune_low_magnitude(loaded_model, **new_pruning_params)

it generates this error:
Please initialize Prune with a supported layer. Layers should either be a PrunableLayer instance, or should be supported by the PruneRegistry. You passed: <class 'tensorflow.python.keras.engine.sequential.Sequential'>
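One workaround that matches the error message is to wrap only the individual supported layers rather than passing a nested Sequential model to prune_low_magnitude; a minimal sketch under that assumption (layer sizes copied from the snippet above, not a verified fix):

from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Flatten, Dense, Dropout
from tensorflow.keras.models import Model
from tensorflow_model_optimization.sparsity import keras as sparsity

def vgg16_pruned(pruning_params):
    vgg16 = VGG16(include_top=False, weights='imagenet', input_shape=(32, 32, 3))
    # Wrap the Dense layers directly instead of a nested Sequential,
    # since prune_low_magnitude does not accept a Sequential passed as a layer.
    x = Flatten()(vgg16.output)
    x = sparsity.prune_low_magnitude(Dense(512, activation='relu'), **pruning_params)(x)
    x = Dropout(0.5)(x)
    x = sparsity.prune_low_magnitude(Dense(256, activation='relu'), **pruning_params)(x)
    x = Dropout(0.5)(x)
    outputs = Dense(10, activation='sigmoid')(x)
    return Model(vgg16.input, outputs)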

Pruned model doesn't work with multi-gpu

Feature Request:
The pruned model isn't working with tensorflow.keras.utils.multi_gpu_model; could you add support for this?

Even when sparsity.UpdatePruningStep() is added in the callbacks list, the following error is shown:

tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: assertion failed: [Prune() wrapper requires the UpdatePruningStep callback to be provided during training. Please add it as a callback to your model.fit call.] [Condition x >= y did not hold element-wise:x (replica_1_1/sequential/prune_low_magnitude_conv2d/cond/assert_greater_equal/ReadVariableOp:0) = ] [-1] [y (replica_1_1/sequential/prune_low_magnitude_conv2d/cond/assert_greater_equal/y:0) = ] [0]
	 [[{{node replica_1_1/sequential/prune_low_magnitude_conv2d/cond/assert_greater_equal/Assert/Assert}}]]
  (1) Invalid argument: assertion failed: [Prune() wrapper requires the UpdatePruningStep callback to be provided during training. Please add it as a callback to your model.fit call.] [Condition x >= y did not hold element-wise:x (replica_1_1/sequential/prune_low_magnitude_conv2d/cond/assert_greater_equal/ReadVariableOp:0) = ] [-1] [y (replica_1_1/sequential/prune_low_magnitude_conv2d/cond/assert_greater_equal/y:0) = ] [0]
	 [[{{node replica_1_1/sequential/prune_low_magnitude_conv2d/cond/assert_greater_equal/Assert/Assert}}]]
	 [[replica_0_1/sequential/prune_low_magnitude_conv2d/cond/Sub/ReadVariableOp/_1251]]

The code below shows an example of errors thrown for pruned_parallel_model.fit.

#!/usr/bin/env python
# coding: utf-8

# ##### Copied from https://www.tensorflow.org/model_optimization/guide/pruning/pruning_with_keras

import numpy as np
import tempfile
import os

import tensorflow as tf
from tensorflow_model_optimization.sparsity import keras as sparsity
import tensorflow.keras.backend as K
from tensorflow.keras.utils import multi_gpu_model

def main():
    K.clear_session()
    config = tf.ConfigProto(allow_soft_placement=True)
    config.gpu_options.allow_growth = True
    session = tf.Session(config=config)

    session.run(tf.compat.v1.global_variables_initializer())
    session.run(tf.compat.v1.tables_initializer())
    K.set_session(session)

    # ## Prepare the training data

    batch_size = 128
    num_classes = 10
    epochs = 10

    # input image dimensions
    img_rows, img_cols = 28, 28

    # the data, shuffled and split between train and test sets
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

    if tf.keras.backend.image_data_format() == 'channels_first':
      x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
      x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
      input_shape = (1, img_rows, img_cols)
    else:
      x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
      x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
      input_shape = (img_rows, img_cols, 1)

    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')
    x_train /= 255
    x_test /= 255
    print('x_train shape:', x_train.shape)
    print(x_train.shape[0], 'train samples')
    print(x_test.shape[0], 'test samples')

    # convert class vectors to binary class matrices
    y_train = tf.keras.utils.to_categorical(y_train, num_classes)
    y_test = tf.keras.utils.to_categorical(y_test, num_classes)


    # ## Train a MNIST model without pruning

    # ### Build the MNIST model

    l = tf.keras.layers

    model = tf.keras.Sequential([
        l.Conv2D(
            32, 5, padding='same', activation='relu', input_shape=input_shape),
        l.MaxPooling2D((2, 2), (2, 2), padding='same'),
        l.BatchNormalization(),
        l.Conv2D(64, 5, padding='same', activation='relu'),
        l.MaxPooling2D((2, 2), (2, 2), padding='same'),
        l.Flatten(),
        l.Dense(1024, activation='relu'),
        l.Dropout(0.4),
        l.Dense(num_classes, activation='softmax')
    ])

    model.summary()

    logdir = tempfile.mkdtemp()
    print('Writing training logs to ' + logdir)

    # ### Train the model to reach an accuracy >99%

    callbacks = [tf.keras.callbacks.TensorBoard(log_dir=logdir, profile_batch=0)]

    #~ model.compile(
        #~ loss=tf.keras.losses.categorical_crossentropy,
        #~ optimizer='adam',
        #~ metrics=['accuracy'])
    parallel_model = multi_gpu_model(model, 2)

    parallel_model.compile(
        loss=tf.keras.losses.categorical_crossentropy,
        optimizer='adam',
        metrics=['accuracy'])

    parallel_model.fit(x_train, y_train,
              batch_size=batch_size,
              epochs=epochs,
              verbose=1,
              callbacks=callbacks,
              validation_data=(x_test, y_test))
    #~ score = model.evaluate(x_test, y_test, verbose=0)
    #~ print('Test loss:', score[0])
    #~ print('Test accuracy:', score[1])


    # ### Save the original model for size comparison later

    # Backend agnostic way to save/restore models
    _, keras_file = tempfile.mkstemp('.h5')
    print('Saving model to: ', keras_file)
    tf.keras.models.save_model(model, keras_file, include_optimizer=False)


    # ### Prune a whole model

    # Load the serialized model
    loaded_model = tf.keras.models.load_model(keras_file)

    epochs = 4
    num_train_samples = x_train.shape[0]
    end_step = np.ceil(1.0 * num_train_samples / batch_size).astype(np.int32) * epochs
    print('End step: ' + str(end_step))

    new_pruning_params = {
          'pruning_schedule': sparsity.PolynomialDecay(initial_sparsity=0.50,
                                                       final_sparsity=0.90,
                                                       begin_step=0,
                                                       end_step=end_step,
                                                       frequency=100)
    }

    new_pruned_model = sparsity.prune_low_magnitude(model, **new_pruning_params)
    new_pruned_model.summary()

    # Add a pruning step callback to peg the pruning step to the optimizer's
    # step. Also add a callback to add pruning summaries to tensorboard
    callbacks = [
        sparsity.UpdatePruningStep(),
        sparsity.PruningSummaries(log_dir=logdir, profile_batch=0)
    ]

    for i in range(2):
        if i:
            cur_model = multi_gpu_model(new_pruned_model, 2)
        else:
            cur_model = tf.keras.models.clone_model(new_pruned_model)

        cur_model.compile(
            loss=tf.keras.losses.categorical_crossentropy,
            optimizer='adam',
            metrics=['accuracy'])

        cur_model.fit(x_train, y_train,
                  batch_size=batch_size,
                  epochs=epochs,
                  verbose=1,
                  callbacks=callbacks,
                  validation_data=(x_test, y_test))

        print("i", i, "ran successfully")

        '''
        Throws errors for multi_gpu_model (when i=1):
        Invalid argument: assertion failed: [Prune() wrapper requires the UpdatePruningStep callback to be provided during training.
        Please add it as a callback to your model.fit call.]
        '''

    #~ score = new_pruned_model.evaluate(x_test, y_test, verbose=0)
    #~ print('Test loss:', score[0])
    #~ print('Test accuracy:', score[1])


if __name__ == "__main__":
    main()

AttributeError: 'Conv2D' object has no attribute 'kernel' on running MNIST Keras pruning Example

When I try to run the MNIST example of Keras pruning (link), I get the following traceback.
TensorFlow version:
Name: tensorflow Version: 2.0.0a0

Traceback

--> 156   layerwise_model = build_layerwise_model(input_shape, **pruning_params)
    157   sequential_model = build_sequential_model(input_shape)
    158   sequential_model = prune.prune_low_magnitude(

<ipython-input-9-8bf64bf9319d> in build_layerwise_model(input_shape, **pruning_params)
     73       l.Dropout(0.4),
     74       prune.prune_low_magnitude(
---> 75           l.Dense(num_classes, activation='softmax'), **pruning_params)
     76   ])
     77 

/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/tracking/base.py in _method_wrapper(self, *args, **kwargs)
    454     self._setattr_tracking = False  # pylint: disable=protected-access
    455     try:
--> 456       result = method(self, *args, **kwargs)
    457     finally:
    458       self._setattr_tracking = previous_value  # pylint: disable=protected-access

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/sequential.py in __init__(self, layers, name)
    106     if layers:
    107       for layer in layers:
--> 108         self.add(layer)
    109 
    110   @property

/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/tracking/base.py in _method_wrapper(self, *args, **kwargs)
    454     self._setattr_tracking = False  # pylint: disable=protected-access
    455     try:
--> 456       result = method(self, *args, **kwargs)
    457     finally:
    458       self._setattr_tracking = previous_value  # pylint: disable=protected-access

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/sequential.py in add(self, layer)
    167           # and create the node connecting the current layer
    168           # to the input layer we just created.
--> 169           layer(x)
    170           set_inputs = True
    171 

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
    592           # Build layer if applicable (if the `build` method has been
    593           # overridden).
--> 594           self._maybe_build(inputs)
    595           # Explicitly pass the learning phase placeholder to `call` if
    596           # the `training` argument was left unspecified by the user.

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py in _maybe_build(self, inputs)
   1711     # Only call `build` if the user has manually overridden the build method.
   1712     if not hasattr(self.build, '_is_default'):
-> 1713       self.build(input_shapes)
   1714     # We must set self.built since user defined build functions are not
   1715     # constrained to set self.built.

/usr/local/lib/python3.6/dist-packages/tensorflow_model_optimization/python/core/sparsity/keras/pruning_wrapper.py in build(self, input_shape)
    173     weight_vars, mask_vars, threshold_vars = [], [], []
    174 
--> 175     self.prunable_weights = self.layer.get_prunable_weights()
    176 
    177     # For each of the prunable weights, add mask and threshold variables

/usr/local/lib/python3.6/dist-packages/tensorflow_model_optimization/python/core/sparsity/keras/prune_registry.py in get_prunable_weights()
    167 
    168     def get_prunable_weights():
--> 169       return [getattr(layer, weight) for weight in cls._weight_names(layer)]
    170 
    171     def get_prunable_weights_rnn():  # pylint: disable=missing-docstring

/usr/local/lib/python3.6/dist-packages/tensorflow_model_optimization/python/core/sparsity/keras/prune_registry.py in <listcomp>(.0)
    167 
    168     def get_prunable_weights():
--> 169       return [getattr(layer, weight) for weight in cls._weight_names(layer)]
    170 
    171     def get_prunable_weights_rnn():  # pylint: disable=missing-docstring

AttributeError: 'Conv2D' object has no attribute 'kernel'

I am not using absl-py to run main(), because I am running this in Colab; for the same reason I have also commented out the eager-execution line.

Can anyone please tell me where I am going wrong, or is this a bug?
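
For reference, here is a condensed sketch of the layer-wise pattern the traceback above exercises, written against the public tfmot API; the layer sizes, input shape and schedule values are placeholders rather than the notebook's, and it assumes a tensorflow / tensorflow-model-optimization pair that are compatible with each other.

import tensorflow as tf
import tensorflow_model_optimization as tfmot

prune = tfmot.sparsity.keras.prune_low_magnitude
pruning_params = {
    'pruning_schedule': tfmot.sparsity.keras.PolynomialDecay(
        initial_sparsity=0.50, final_sparsity=0.90,
        begin_step=0, end_step=1000, frequency=100)
}

# Each prunable layer is wrapped individually; the wrapper looks up the
# layer's weight attributes (e.g. Conv2D.kernel) when the model is built.
model = tf.keras.Sequential([
    prune(tf.keras.layers.Conv2D(32, 5, padding='same', activation='relu'),
          input_shape=(28, 28, 1), **pruning_params),
    tf.keras.layers.MaxPooling2D((2, 2), (2, 2), padding='same'),
    tf.keras.layers.Flatten(),
    prune(tf.keras.layers.Dense(10, activation='softmax'), **pruning_params),
])
model.summary()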

ValueError: Variable <tf.Variable 'conv2d_286/kernel:0' shape=(3, 3, 3, 32) dtype=float32> has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable).

I am pruning an InceptionV3 model.

So I create an end-to-end model:

# Imports assumed by this snippet (omitted in the original report):
from tensorflow.keras import applications
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.regularizers import l2

base_model = applications.InceptionV3(weights='imagenet', include_top=False)#, input_tensor=input_tensor)
x = GlobalAveragePooling2D()(base_model.output)
x = Dense(1024, activation='relu')(x)
predictions = Dense(3, kernel_initializer="glorot_uniform", kernel_regularizer=l2(.0005), activation='softmax')(x)

model = Model(base_model.input, predictions)

Then I prune the model by using this command

ConstantSparsity = pruning_schedule.ConstantSparsity

pruning_params = {
    'pruning_schedule': ConstantSparsity(0.75, begin_step=2000, frequency=100)
}

pruned_model = sparsity.prune_low_magnitude(model, **pruning_params)

opt = SGD(lr=.01, momentum=.9)
pruned_model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])

But When I try to Fit the data for training

callbacks = [
        pruning_callbacks.UpdatePruningStep(),
        pruning_callbacks.PruningSummaries(log_dir = '/content/')#log_dir=FLAGS.output_dir)
    ]

pruned_model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples,
    epochs=epochs,
    validation_data=validation_generator, callbacks=callbacks)

I get the following Error:

Epoch 1/10
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-36-06fb096a2de2> in <module>()
      3     steps_per_epoch=nb_train_samples,
      4     epochs=epochs,
----> 5     validation_data=validation_generator, callbacks=callbacks)#[lr_scheduler, csv_logger, checkpointer])
      6 
      7 pruned_model.save_weights("{}/final_{}_{}.h5".format(job_path, job_name, model_name))

5 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py in get_gradients(self, loss, params)
    396                            "gradient defined (i.e. are differentiable). "
    397                            "Common ops without gradient: "
--> 398                            "K.argmax, K.round, K.eval.".format(param))
    399       if hasattr(self, "clipnorm"):
    400         grads = [clip_ops.clip_by_norm(g, self.clipnorm) for g in grads]

ValueError: Variable <tf.Variable 'conv2d_286/kernel:0' shape=(3, 3, 3, 32) dtype=float32> has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.

Can anybody please help with this? I can't seem to find a solution on Stack Overflow or elsewhere.

Thank you
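
As an aside, when pruning a large functional model such as InceptionV3, the tfmot guide also describes wrapping only selected layers via tf.keras.models.clone_model. A rough sketch of that pattern, reusing the `model` built above and with `apply_pruning_to_dense` as a hypothetical helper name:

import tensorflow as tf
import tensorflow_model_optimization as tfmot

def apply_pruning_to_dense(layer):
    # Wrap only Dense layers; every other layer is returned untouched.
    if isinstance(layer, tf.keras.layers.Dense):
        return tfmot.sparsity.keras.prune_low_magnitude(layer)
    return layer

# clone_model rebuilds the graph, passing each layer through clone_function,
# so only the selected layers end up wrapped for pruning.
model_for_pruning = tf.keras.models.clone_model(
    model,
    clone_function=apply_pruning_to_dense,
)
model_for_pruning.summary()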

Object detection API

Hello all,

I was wondering whether your pruning tools can be used on my object detection models (SSD, Faster R-CNN, ...) trained with the TensorFlow Object Detection API.

When training, I don't use Keras directly; I follow the Object Detection API tutorials.

Thanks for your help :)

What is the difference between tf-mot and tensorrt?

I am currently looking into optimizing a TF 2.0 model.

There is tfmot, which is fairly new but made by the TensorFlow team itself, and there is TensorRT, made by NVIDIA.

I wonder what the difference between them is, and why the TF team started this project.
