
reformers's Introduction

"Affiliates" - Jekyll Template by WowThemes.net

Live Demo   |   Download   |   Buy me a coffee   |   Documentation   |   More Jekyll Themes

affiliates

Copyright

Copyright (C) 2019 WowThemes.net.

Affiliates for Jekyll is designed by Sal and licensed under MIT. If this project helps you reduce development time, or you want to remove the attribution credit, you can buy me a cup of coffee :)

Buy me a coffee


Contribute

  1. Fork the repo.
  2. Clone a copy of your fork to your local machine.
  3. Create a branch off of master and give it a meaningful name (e.g. my-new-mediumish-feature).
  4. Make the necessary changes, then commit, push, and open a pull request on GitHub.

Thank you!


reformers's Issues

TFReformer does not serialize to SavedModel

Using TensorFlow 2.5.1 (installed via Anaconda), the following code raises a TypeError when attempting to persist the model to the TensorFlow SavedModel format per this guide:

from reformers import TFReformerLM
import tensorflow as tf

tflm = TFReformerLM(
    num_tokens=256,
    emb=128,
    depth=5,
    max_seq_len=8192,
    heads=8,
    lsh_dropout=0.1,
    causal=False,
    bucket_size=64,
    ff_chunks=128,
    use_full_attn=False,
)

tflm.build(input_shape=(1, 8192))

tflm(tf.zeros((1, 8192)))
tf.saved_model.save(tflm, "tflm.pb")

The result is


/home/$USER/anaconda3/lib/python3.8/site-packages/tensorflow/python/keras/saving/saving_utils.py:130 _wrapped_model  *
        outputs = model(inputs, training=False)
    /home/$USER/reformers/TFreformers.py:78 call  *
        inputs = self.reformer(inputs)
    /home/$USER/reformers/TFreformers.py:64 call  *
        x = self.model_layers(x)
    /home/$USER/reformers/blocks.py:139 call  *
        h = block(h, training=training)
    /home/$USER/reformers/blocks.py:304 call  *
        f_x2 = self.f(x2, training=training)
    /home/$USER/reformers/TFutils.py:85 call  *
        return self.fn(inputs)
    /home/$USER/reformers/TFefficient_attention.py:280 merge_heads  *
        return tf.reshape(tf.transpose(tf.reshape(v, (b, t, h, -1)), perm=[0, 2, 1, 3]), (b * h, t, -1))
    /home/$USER/anaconda3/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py:206 wrapper  **
        return target(*args, **kwargs)
    /home/$USER/anaconda3/lib/python3.8/site-packages/tensorflow/python/ops/array_ops.py:195 reshape
        result = gen_array_ops.reshape(tensor, shape, name)
    /home/$USER/anaconda3/lib/python3.8/site-packages/tensorflow/python/ops/gen_array_ops.py:8397 reshape
        _, _, _op, _outputs = _op_def_library._apply_op_helper(
    /home/$USER/anaconda3/lib/python3.8/site-packages/tensorflow/python/framework/op_def_library.py:525 _apply_op_helper
        raise err
    /home/$USER/anaconda3/lib/python3.8/site-packages/tensorflow/python/framework/op_def_library.py:511 _apply_op_helper
        values = ops.convert_to_tensor(
    /home/$USER/anaconda3/lib/python3.8/site-packages/tensorflow/python/profiler/trace.py:163 wrapped
        return func(*args, **kwargs)
    /home/$USER/anaconda3/lib/python3.8/site-packages/tensorflow/python/framework/ops.py:1566 convert_to_tensor
        ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
    /home/$USER/anaconda3/lib/python3.8/site-packages/tensorflow/python/framework/constant_op.py:339 _constant_tensor_conversion_function
        return constant(v, dtype=dtype, name=name)
    /home/$USER/anaconda3/lib/python3.8/site-packages/tensorflow/python/framework/constant_op.py:264 constant
        return _constant_impl(value, dtype, shape, name, verify_shape=False,
    /home/$USER/anaconda3/lib/python3.8/site-packages/tensorflow/python/framework/constant_op.py:281 _constant_impl
        tensor_util.make_tensor_proto(
    /home/$USER/anaconda3/lib/python3.8/site-packages/tensorflow/python/framework/tensor_util.py:554 make_tensor_proto
        raise TypeError("Failed to convert object of type %s to Tensor. "

    TypeError: Failed to convert object of type <class 'tuple'> to Tensor. Contents: (None, 8192, 8, -1). Consider casting elements to a supported type.

I have searched the web for other instances of this error and found nothing enlightening. Is this an issue with the computation graph containing a non-tensor object that cannot be converted to a Tensor? Why are two dimensions effectively unknown (the batch dimension is None and the per-head feature dimension is -1)?
I get a TypeError even when running a fully specified tuple through the offending code (in tensor_util.make_tensor_proto):

    str_values = [compat.as_bytes(x) for x in (1, 8192, 8, 1)]

The result is:

TypeError: Expected binary or unicode string, got 1

This suggests that this part of the conversion pipeline expects something other than numeric values ("binary or unicode string") inside the generator, which in turn suggests that the model construction is injecting something the converter doesn't expect at this point. Serialization works fine with other TensorFlow model types.
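For what it's worth, errors of this shape usually come from building a reshape target out of static shapes: under tracing, `v.shape` reports `None` for the batch dimension, and a Python tuple containing `None` cannot be converted to a Tensor. A possible fix, sketched below, is to read the shape dynamically with `tf.shape` (the function name `merge_heads` comes from the traceback; its exact signature here is a guess):

```python
import tensorflow as tf

def merge_heads(v, h):
    """Reshape (b, t, h*d) -> (b*h, t, d) using dynamic shapes.

    tf.shape(v) returns a runtime tensor, so the reshape target never
    contains a Python `None` even when the batch size is unknown at
    trace time (the apparent cause of the TypeError above).
    """
    shape = tf.shape(v)                     # dynamic shape tensor
    b, t = shape[0], shape[1]
    v = tf.reshape(v, (b, t, h, -1))        # split the head dimension
    v = tf.transpose(v, perm=[0, 2, 1, 3])  # (b, h, t, d)
    return tf.reshape(v, (b * h, t, -1))    # fold heads into the batch

x = tf.zeros((2, 16, 8 * 4))   # b=2, t=16, h=8, d=4
y = merge_heads(x, h=8)
print(y.shape)                 # (16, 16, 4)
```

With a dynamic shape the reshape target is itself a tensor, so tracing for SavedModel export never sees a `None`.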

version of the packages

Hi, would you please tell me the versions of the packages (TensorFlow and PyTorch) in the requirement.txt?

Wrong use of tf.gather

I think the use of tf.gather in this implementation is wrong.

tf.gather is different from torch.gather: torch.gather picks individual elements along a dimension with an index tensor, while plain tf.gather selects whole slices along an axis.
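A sketch of the difference the issue points at (not the repo's code): `torch.gather(t, 1, idx)` computes `result[i, j] = t[i, idx[i, j]]`, whereas plain `tf.gather` computes `result[i, j] = t[idx[i, j], :]`. Reproducing torch.gather's behaviour in TensorFlow needs the `batch_dims` argument:

```python
import tensorflow as tf

t = tf.constant([[1, 2, 3],
                 [4, 5, 6]])
idx = tf.constant([[0, 1],
                   [1, 0]])

# Plain tf.gather selects whole ROWS: result[i, j] = t[idx[i, j], :]
rows = tf.gather(t, idx)                         # shape (2, 2, 3)

# torch.gather(t, 1, idx) selects ELEMENTS: result[i, j] = t[i, idx[i, j]]
# The TF equivalent gathers per batch item:
elems = tf.gather(t, idx, axis=1, batch_dims=1)  # [[1, 2], [5, 4]]
```

So any place the implementation uses plain `tf.gather` where the reference PyTorch code uses `torch.gather` will silently produce a tensor of the wrong rank and contents.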

Max Sequence Length?

Do both the torch and tensorflow models require max_seq_len?
If so, and the max sequence length is defined as 1000, for example, can the models accept shorter sequences?

I have a very long tail: the average number of tokens is about 1000, but the maximum is roughly 2 million, and those outliers are exactly what I'm interested in capturing. Do I have to pad all of my data to 2 million?

Multiplying by float('-inf') causes NaN output

dots = tf.math.multiply(dots, tf.cast(mask, tf.float32)) + (1-tf.cast(mask, tf.float32)) * float('-inf')
in TFLSHAttention's call() causes NaN output. Changing the multiplier to -1e9 fixed it.
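The NaN appears at the unmasked positions: there `(1 - mask)` is 0, and `0 * -inf` is NaN under IEEE 754, which then poisons the sum. A minimal sketch of the failure and the fix described above:

```python
import tensorflow as tf

dots = tf.constant([0.5, 0.7])
mask = tf.constant([1.0, 0.0])   # 1 = keep, 0 = mask out

# Original expression: at the KEPT position (1 - mask) == 0,
# and 0 * -inf evaluates to NaN, so the kept score becomes NaN.
bad = dots * mask + (1.0 - mask) * float('-inf')    # [nan, -inf]

# Fix: a large finite negative, so 0 * -1e9 == 0 at kept positions,
# while masked positions still get ~zero weight after softmax.
good = dots * mask + (1.0 - mask) * -1e9
probs = tf.nn.softmax(good)
```

The same effect can be had with `tf.where(mask > 0, dots, -1e9)`, which avoids the multiply entirely.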

All nan when using TFReformerLM

I am using TensorFlow 2.0 on Linux, but when I do a forward pass the output tensor contains only NaN values.

Also, if I try using the predict function, it throws the following error:

(None, 3200)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-3e4eafdb6073> in <module>()
      1 code_vec=np.zeros((1,3200),dtype=np.int8)
----> 2 model_tf.predict(code_vec)

8 frames
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/impl/api.py in wrapper(*args, **kwargs)
    235       except Exception as e:  # pylint:disable=broad-except
    236         if hasattr(e, 'ag_error_metadata'):
--> 237           raise e.ag_error_metadata.to_exception(e)
    238         else:
    239           raise

TypeError: in converted code:

    /content/master/reformers/TFreformers.py:78 call  *
        inputs = self.reformer(inputs)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/base_layer.py:773 __call__
        outputs = call_fn(cast_inputs, *args, **kwargs)
    /content/master/reformers/TFreformers.py:64 call  *
        x = self.model_layers(x)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/base_layer.py:773 __call__
        outputs = call_fn(cast_inputs, *args, **kwargs)
    /content/master/reformers/blocks.py:139 call  *
        h = block(h, training=training)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/base_layer.py:773 __call__
        outputs = call_fn(cast_inputs, *args, **kwargs)
    /content/master/reformers/blocks.py:304 call  *
        f_x2 = self.f(x2, training=training)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/base_layer.py:773 __call__
        outputs = call_fn(cast_inputs, *args, **kwargs)
    /content/master/reformers/TFutils.py:85 call  *
        return self.fn(inputs)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/base_layer.py:773 __call__
        outputs = call_fn(cast_inputs, *args, **kwargs)
    /content/master/reformers/TFefficient_attention.py:271 merge_heads  *
        return tf.reshape(tf.transpose(tf.reshape(v, (b, t, h, -1)), perm=[0, 2, 1, 3]), (b * h, t, -1))
    /tmp/tmpa6sn2593.py:21 merge_heads
        retval__1 = fscope_1.mark_return_value(ag__.converted_call(tf.reshape, (ag__.converted_call(tf.transpose, (ag__.converted_call(tf.reshape, (v, (b, t, h, -1)), None, fscope_1),), dict(perm=[0, 2, 1, 3]), fscope_1), (b * h, t, -1)), None, fscope_1))
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/array_ops.py:193 reshape
        result = gen_array_ops.reshape(tensor, shape, name)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gen_array_ops.py:7443 reshape
        "Reshape", tensor=tensor, shape=shape, name=name)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/op_def_library.py:471 _apply_op_helper
        raise err
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/op_def_library.py:468 _apply_op_helper
        preferred_dtype=default_dtype)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1314 convert_to_tensor
        ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/constant_op.py:317 _constant_tensor_conversion_function
        return constant(v, dtype=dtype, name=name)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/constant_op.py:258 constant
        allow_broadcast=True)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/constant_op.py:296 _constant_impl
        allow_broadcast=allow_broadcast))
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/tensor_util.py:547 make_tensor_proto
        "supported type." % (type(values), values))

    TypeError: Failed to convert object of type <class 'tuple'> to Tensor. Contents: (None, 3200, 8, -1). Consider casting elements to a supported type.
