The paraphrase-datasets-pretrained-models from hetpandya

Fine-Tuning mt5 on tapaco de

Hey, nice work! I tried to run your example script with the German TaPaCo dataset and the mT5 model instead of T5. During training, I get these warnings:

/usr/local/lib/python3.7/dist-packages/transformers/tokenization_utils_base.py:3365: FutureWarning: 
`prepare_seq2seq_batch` is deprecated and will be removed in version 5 of HuggingFace Transformers. Use the regular
`__call__` method to prepare your inputs and the tokenizer under the `as_target_tokenizer` context manager to prepare
your targets.

Here is a short example:

model_inputs = tokenizer(src_texts, ...)
with tokenizer.as_target_tokenizer():
    labels = tokenizer(tgt_texts, ...)
model_inputs["labels"] = labels["input_ids"]

See the documentation of your specific tokenizer for more details on the specific arguments to the tokenizer of choice.
For a more complete example, see the implementation of `prepare_seq2seq_batch`.

  warnings.warn(formatted_warning, FutureWarning)
Using Adafactor for T5
Epoch 1 of 1: 100%
1/1 [4:37:28<00:00, 16648.54s/it]
Epochs 0/1. Running Loss: nan: 100%
9347/9347 [4:37:23<00:00, 1.38s/it]
/usr/local/lib/python3.7/dist-packages/torch/optim/lr_scheduler.py:134: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  "https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
WARNING:root:NaN or Inf found in input tensor.
WARNING:root:NaN or Inf found in input tensor.
WARNING:root:NaN or Inf found in input tensor.
WARNING:root:NaN or Inf found in input tensor.
WARNING:root:NaN or Inf found in input tensor.
WARNING:root:NaN or Inf found in input tensor.
WARNING:root:NaN or Inf found in input tensor.
WARNING:root:NaN or Inf found in input tensor.
WARNING:root:NaN or Inf found in input tensor.
WARNING:root:NaN or Inf found in input tensor.
WARNING:root:NaN or Inf found in input tensor.
(584, nan)
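
Since the running loss was already `nan` and the run still took more than four and a half hours, a small guard that stops at the first non-finite loss might save time when reproducing this. The following is only a sketch: `check_loss`, `first_non_finite`, and `NonFiniteLossError` are hypothetical helpers that I made up for illustration, not part of the example script or of any library, and the loss values are invented.

```python
import math


class NonFiniteLossError(RuntimeError):
    """Raised when a training loss is NaN or +/-Inf (hypothetical helper)."""


def check_loss(loss, step):
    """Return the loss as a float, or raise if it is NaN or infinite."""
    value = float(loss)
    if not math.isfinite(value):
        raise NonFiniteLossError(f"loss became {value} at step {step}")
    return value


def first_non_finite(losses):
    """Return the index of the first NaN/Inf loss, or None if all are finite."""
    for step, loss in enumerate(losses):
        if not math.isfinite(float(loss)):
            return step
    return None


# Made-up loss values, only to show the behaviour.
example_losses = [2.31, 1.87, float("nan"), 1.02]
bad_step = first_non_finite(example_losses)
```

Calling `check_loss(loss, step)` once per step inside a custom training loop would abort the run at the first bad value instead of after the full epoch.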

As you can see, training finished and the model was saved. But when I try to generate paraphrases, I get these weird outputs:

Generating outputs: 100%
1/1 [00:00<00:00, 1.60it/s]
Decoding outputs: 100%
5/5 [00:01<00:00, 1.65s/it]
[['<extra_id_0>.',
  '<extra_id_0>.',
  '<extra_id_0>',
  '<extra_id_0>) <extra_id_36> ein.',
  '<extra_id_0> waren']]
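
The `<extra_id_N>` strings are the sentinel tokens that T5-style models use in their pretraining objective, so seeing them in every output suggests the fine-tuning did not take effect. A quick way to quantify this over a larger batch of generations could look like the sketch below. `has_sentinel` and `sentinel_ratio` are hypothetical helper names, and the input is the output list shown above.

```python
import re

# Matches T5/mT5-style sentinel tokens such as <extra_id_0> or <extra_id_36>.
SENTINEL = re.compile(r"<extra_id_\d+>")


def has_sentinel(text):
    """True if the generated text contains at least one sentinel token."""
    return bool(SENTINEL.search(text))


def sentinel_ratio(outputs):
    """Fraction of generated strings (over all inputs) containing a sentinel."""
    flat = [text for group in outputs for text in group]
    if not flat:
        return 0.0
    return sum(has_sentinel(text) for text in flat) / len(flat)


# The generations from the run above: one input, five candidate outputs.
outputs = [[
    "<extra_id_0>.",
    "<extra_id_0>.",
    "<extra_id_0>",
    "<extra_id_0>) <extra_id_36> ein.",
    "<extra_id_0> waren",
]]
ratio = sentinel_ratio(outputs)
```

Here every one of the five candidates contains a sentinel token, so the ratio is 1.0.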

I trained the model for only one epoch instead of four. Is that the reason for these outputs, or is it related to the warnings during training? Another thing I didn't quite understand is the dataset. In your example (and in the German part of TaPaCo) there are text and paraphrase pairs that are not actually paraphrases. For example, see the second row in your example notebook:

In [ ]:

dataset_df.head()

Out[ ]:

Text	Paraphrase
I ate the cheese.	I eat cheese.
I'm eating a yogurt.	I'm eating cheese.
I'm having some cheese.	I eat some cheese.
It's Monday.	It is Monday today.
It's Monday today.	Today is Monday.
