
hyperparticle / udify

Stars: 219 · Watchers: 12 · Forks: 55 · Size: 968 KB

A single model that parses Universal Dependencies across 75 languages. Given a sentence, jointly predicts part-of-speech tags, morphology tags, lemmas, and dependency trees.

Home Page: https://arxiv.org/abs/1904.02099

License: MIT License

Languages: Python 99.32%, Shell 0.68%
Topics: pytorch, deep-learning, universal-dependencies, allennlp, dependency-parser, neural-network, syntax


udify's Issues

Training seems not to begin

I am trying to reproduce the experiment, but it looks as if the training process stays stuck at the start:

2019-07-09 17:15:45,951 - INFO - allennlp.training.trainer - Beginning training.
2019-07-09 17:15:45,951 - INFO - allennlp.training.trainer - Epoch 0/79
2019-07-09 17:15:45,951 - INFO - allennlp.training.trainer - Peak CPU memory usage MB: 19202.28
2019-07-09 17:15:46,225 - INFO - allennlp.training.trainer - GPU 0 memory usage MB: 1694
2019-07-09 17:15:46,226 - INFO - allennlp.training.trainer - GPU 1 memory usage MB: 37
2019-07-09 17:15:46,231 - INFO - allennlp.training.trainer - Training
0%| | 0/46617 [00:00<?, ?it/s]

After a night, the progress bar has not moved at all.

CPU usage is 100% on one core, memory use is slowly increasing, and the GPUs are idle.

Could you please indicate which versions of Python, allennlp, and pytorch you are using?

Mine are:
python=3.6
allennlp==0.8.4
pytorch-pretrained-bert==0.6.1
pytorch=1.0.0

Out-of-directory calls

Hello, and thanks for making this parser available!

I wrote a shell script which calls predict.py from an external directory, but I encounter the following error:

Traceback (most recent call last):
  File "../udify/predict.py", line 57, in <module>
    batch_size=args.batch_size)
  File "/media/sf_thesis/udify/udify/util.py", line 143, in predict_model_with_archive
    cuda_device=cuda_device)
  File "/usr/local/lib/python3.6/dist-packages/allennlp/models/archival.py", line 230, in load_archive
    cuda_device=cuda_device)
  File "/usr/local/lib/python3.6/dist-packages/allennlp/models/model.py", line 327, in load
    return cls.by_name(model_type)._load(config, serialization_dir, weights_file, cuda_device)
  File "/usr/local/lib/python3.6/dist-packages/allennlp/models/model.py", line 265, in _load
    model = Model.from_params(vocab=vocab, params=model_params)
  File "/usr/local/lib/python3.6/dist-packages/allennlp/common/from_params.py", line 365, in from_params
    return subclass.from_params(params=params, **extras)
  File "/usr/local/lib/python3.6/dist-packages/allennlp/common/from_params.py", line 386, in from_params
    kwargs = create_kwargs(cls, params, **extras)
  File "/usr/local/lib/python3.6/dist-packages/allennlp/common/from_params.py", line 133, in create_kwargs
    kwargs[name] = construct_arg(cls, name, annotation, param.default, params, **extras)
  File "/usr/local/lib/python3.6/dist-packages/allennlp/common/from_params.py", line 229, in construct_arg
    return annotation.from_params(params=subparams, **subextras)
  File "/usr/local/lib/python3.6/dist-packages/allennlp/common/from_params.py", line 365, in from_params
    return subclass.from_params(params=params, **extras)
  File "/media/sf_thesis/udify/udify/modules/text_field_embedder.py", line 163, in from_params
    for name, subparams in token_embedder_params.items()
  File "/media/sf_thesis/udify/udify/modules/text_field_embedder.py", line 163, in <dictcomp>
    for name, subparams in token_embedder_params.items()
  File "/usr/local/lib/python3.6/dist-packages/allennlp/common/from_params.py", line 365, in from_params
    return subclass.from_params(params=params, **extras)
  File "/usr/local/lib/python3.6/dist-packages/allennlp/common/from_params.py", line 388, in from_params
    return cls(**kwargs)  # type: ignore
  File "/media/sf_thesis/udify/udify/modules/bert_pretrained.py", line 589, in __init__
    model = BertModel(BertConfig.from_json_file(bert_config))
  File "/usr/local/lib/python3.6/dist-packages/pytorch_pretrained_bert/modeling.py", line 206, in from_json_file
    with open(json_file, "r", encoding='utf-8') as reader:
FileNotFoundError: [Errno 2] No such file or directory: 'config/archive/bert-base-multilingual-cased/bert_config.json'

The program works fine when called from within the udify directory itself, though. Would it be possible to modify the paths in the source code so they work for external calls as well? I'd be willing to help sift through the source files for this if needed.
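One way this could be fixed (a sketch, not the repo's actual patch) is to resolve relative config paths against the udify package root instead of the current working directory, somewhere like bert_pretrained.py:

import os

# Repo root, assuming this file lives at <root>/udify/modules/bert_pretrained.py
UDIFY_ROOT = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

def resolve_config_path(relative_path):
    """Make configs shipped inside the repo findable from any working directory."""
    if os.path.isabs(relative_path):
        return relative_path
    return os.path.join(UDIFY_ROOT, relative_path)

bert_config = resolve_config_path("config/archive/bert-base-multilingual-cased/bert_config.json")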

Updating conllu library

Hi Dan, I see in the code and in #5 that updating the conllu library is on the agenda.

I have made a few modifications in my forked version of UDify. From what I understand, parser.py contains some source code from the conllu library with a few modifications, mainly to handle multi-word tokens, where the desired output (example from fr_gsd-ud-train.conllu) looks like:

multiword_ids ['3-4', '72-73', '87-88', '105-106', '110-111', '121-122']
multiword_forms ['du', 'des', 'des', 'des', 'du', 'du']

In my forked version, I still use the conllu library to return the annotations but do the MWT processing in a subsequent step, in a process_MWTs function. In this version, I confirmed that the outputs are the same:

multiword_ids ['3-4', '72-73', '87-88', '105-106', '110-111', '121-122']
multiword_forms ['du', 'des', 'des', 'des', 'du', 'du']
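For reference, here is a minimal sketch of such a post-processing step (the fork's actual process_MWTs may differ). It assumes the modern conllu API, where multiword token ids parse as tuples like (3, "-", 4):

from conllu import parse

def process_mwts(sentence):
    """Collect multiword token ranges and their surface forms from one parsed sentence."""
    multiword_ids, multiword_forms = [], []
    for token in sentence:
        token_id = token["id"]
        # Multiword tokens such as "3-4  du" have ids of the form (start, "-", end).
        if isinstance(token_id, tuple) and token_id[1] == "-":
            multiword_ids.append(f"{token_id[0]}-{token_id[2]}")
            multiword_forms.append(token["form"])
    return multiword_ids, multiword_forms

sentences = parse(open("fr_gsd-ud-train.conllu").read())
print(process_mwts(sentences[0]))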

I have done a few more checks to make sure the data is the same, where updated is the forked version and original is the current version, e.g.:

cat fr_gsd_original/vocabulary/tokens.txt | md5sum
e80f1f1e341fc5734c8f3a3d1c779c55 
cat fr_gsd_updated/vocabulary/tokens.txt | md5sum
e80f1f1e341fc5734c8f3a3d1c779c55

There are a few benefits I can see from this:

  1. Supports the most recent conllu library.
  2. Reduces the amount of code needed in parser.py.

There are probably more elegant ways of going about MWT processing, but I thought I'd post it here in case you find it helpful. If you do, I can run more tests, and once I've confirmed the behaviour is exactly the same, I can submit a PR.

Scalar mix

I was not able to use the scalar mix option by changing combine_layers from all to mix. mix_embedding is set to 12. Is there anything else that needs to change in the config file?

Training a udify model only for Korean

Hello, I am training a udify model only for Korean, using only the Korean data from UD 2.3. However, I am running into the following issue; the same code runs fine on other languages from UD.
Traceback (most recent call last):
  File "train.py", line 113, in <module>
    train_model(train_params, serialization_dir, recover=bool(args.resume))
  File "/home/user/anaconda3/envs/dependency_parse/lib/python3.7/site-packages/allennlp/commands/train.py", line 226, in train_model
    cache_prefix)
  File "/home/user/anaconda3/envs/dependency_parse/lib/python3.7/site-packages/allennlp/training/trainer_pieces.py", line 65, in from_params
    model = Model.from_params(vocab=vocab, params=params.pop('model'))
  File "/home/user/anaconda3/envs/dependency_parse/lib/python3.7/site-packages/allennlp/common/from_params.py", line 365, in from_params
    return subclass.from_params(params=params, **extras)
  File "/home/user/anaconda3/envs/dependency_parse/lib/python3.7/site-packages/allennlp/common/from_params.py", line 386, in from_params
    kwargs = create_kwargs(cls, params, **extras)
  File "/home/user/anaconda3/envs/dependency_parse/lib/python3.7/site-packages/allennlp/common/from_params.py", line 133, in create_kwargs
    kwargs[name] = construct_arg(cls, name, annotation, param.default, params, **extras)
  File "/home/user/anaconda3/envs/dependency_parse/lib/python3.7/site-packages/allennlp/common/from_params.py", line 257, in construct_arg
    value_dict[key] = value_cls.from_params(params=value_params, **subextras)
  File "/home/user/anaconda3/envs/dependency_parse/lib/python3.7/site-packages/allennlp/common/from_params.py", line 365, in from_params
    return subclass.from_params(params=params, **extras)
  File "/home/user/anaconda3/envs/dependency_parse/lib/python3.7/site-packages/allennlp/common/from_params.py", line 388, in from_params
    return cls(**kwargs)  # type: ignore
  File "/home/user/udify-master/udify/models/tag_decoder.py", line 105, in __init__
    div_value=4.0)
  File "/home/user/anaconda3/envs/dependency_parse/lib/python3.7/site-packages/torch/nn/modules/adaptive.py", line 133, in __init__
    raise ValueError("cutoffs should be a sequence of unique, positive "
ValueError: cutoffs should be a sequence of unique, positive integers sorted in an increasing order, where each value is between 1 and n_classes-1
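For context, a hedged illustration of the failure mode (all sizes below are hypothetical): torch's adaptive softmax requires every cutoff to lie in [1, n_classes - 1], so a tag or lemma vocabulary that is smaller than the decoder's cutoffs triggers exactly this ValueError.

import torch.nn as nn

n_classes = 50            # hypothetical small vocabulary size for one task
cutoffs = [8000, 20000]   # hypothetical cutoffs that exceed n_classes - 1

try:
    nn.AdaptiveLogSoftmaxWithLoss(768, n_classes, cutoffs=cutoffs, div_value=4.0)
except ValueError as e:
    print(e)  # cutoffs should be a sequence of unique, positive integers ...

# One possible workaround: keep only the cutoffs that fit the vocabulary.
safe_cutoffs = [c for c in cutoffs if c < n_classes - 1] or [n_classes // 2]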

UdifyTextPredictor fails when output_conllu=true

I'm feeding the raw input "Il est assez sûr de lui pour danser et chanter en public ." to predict.py with the --raw_text flag, and since I want the output in CoNLL-U format, I've set output_conllu=True in UdifyTextPredictor.

The dump_line method in UdifyPredictor is erroring out:

  File "udify/udify/predictors/text_predictor.py", line 63, in dump_line
    return self.predictor.dump_line(outputs)
  File "udify/udify/predictors/predictor.py", line 82, in dump_line
    multiword_ids = [[id] + [int(x) for x in id.split("-")] for id in outputs["multiword_ids"]]
  File "udify/udify/predictors/predictor.py", line 82, in <listcomp>
    multiword_ids = [[id] + [int(x) for x in id.split("-")] for id in outputs["multiword_ids"]]
  File "udify/udify/predictors/predictor.py", line 82, in <listcomp>
    multiword_ids = [[id] + [int(x) for x in id.split("-")] for id in outputs["multiword_ids"]]
ValueError: invalid literal for int() with base 10: 'N'

Could you please take a look?

Thanks,
Ranjita

Frozen features and gold tags

Hi,

Is it possible to disable fine-tuning, for a frozen feature-based embedder, by simply changing requires_grad to false in the token_embedder?
Also, is there a preferred approach to evaluating dependency parsing using gold tags for pos/feats/lemmas?
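For the first question, here is a minimal sketch of freezing the embedder after the model is loaded (the parameter-name prefix below is taken from this repo's training logs, but treat the exact path as an assumption):

# model: a loaded udify Model instance
for name, param in model.named_parameters():
    if name.startswith("text_field_embedder.token_embedder_bert"):
        param.requires_grad = False  # freeze BERT; only the decoders keep training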

Thanks.

Training with XLM-RoBERTa

Hi, has anybody looked into training a version of udify with XLM-RoBERTa? It seems like it could help with the low-resource languages in multilingual BERT, so I'm planning on giving it a go if nobody else has already.

How to run the UDify+Lang experiments?

Is there an example config somewhere showing how to fine-tune on a specific treebank using BERT weights saved from fine-tuning on all UD treebanks combined (using the saved pretrained models)? This corresponds to the UDify+Lang experiments in Table 2 of the paper.

Poor/bad scores or metrics when fine-tuning

If you are seeing poor fine-tuning evaluation UAS/LAS scores, then this additional info might help.

It should take about 10 epochs to start seeing good scores coming from all the metrics, and 80 epochs to be competitive with UDPipe Future. If it's still not showing this, then there might be something off about your training.

One caveat: if you fine-tune on a subset of treebanks instead of all 124 UD v2.3 treebanks, you must modify the configuration file so the learning rate scheduler matches the number of training steps. Copy the udify_bert_finetune_multilingual.json config and modify the "warmup_steps" and "start_step" values. A good initial choice is to set both equal to the number of training batches in one epoch (run the training script first to see the number of batches remaining).
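A rough sketch of that rule of thumb (all sizes here are hypothetical): compute one epoch's worth of batches and use it for both values in the copied config.

num_train_sentences = 12_543   # hypothetical size of your fine-tuning treebank
batch_size = 32                # hypothetical batch size from the config
batches_per_epoch = -(-num_train_sentences // batch_size)  # ceiling division

warmup_steps = batches_per_epoch   # value for "warmup_steps" in the config
start_step = batches_per_epoch     # value for "start_step" in the config
print(warmup_steps, start_step)    # 392 392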

"nan" loss on training

Hi!

Thanks for releasing the library. I'm encountering "nan" loss on training with the following commit, which I think is the most recent version: 60f35edc52862109555f4acf66236becc29705ad

Here are instructions to reproduce:

pip install -r ./requirements.txt
bash ./scripts/download_ud_data.sh
python train.py --config config/ud/en/udify_bert_train_en_ewt.json --name en_ewt --dataset_dir data/ud-treebanks-v2.3/

The end of the training log is this:

2020-06-10 16:23:38,177 - INFO - allennlp.training.trainer - Training
  0%|          | 0/392 [00:00<?, ?it/s]Traceback (most recent call last):
  File "train.py", line 110, in <module>
    train_model(train_params, serialization_dir, recover=bool(args.resume))
  File "/home/gneubig/anaconda3/envs/python3/lib/python3.7/site-packages/allennlp/commands/train.py", line 252, in train_model
    metrics = trainer.train()
  File "/home/gneubig/anaconda3/envs/python3/lib/python3.7/site-packages/allennlp/training/trainer.py", line 478, in train
    train_metrics = self._train_epoch(epoch)
  File "/home/gneubig/anaconda3/envs/python3/lib/python3.7/site-packages/allennlp/training/trainer.py", line 323, in _train_epoch
    raise ValueError("nan loss encountered")
ValueError: nan loss encountered

I've attached the full log below as well:
udify-log.txt

My pip environment is also here:
pip-list.txt

Do you have an idea what the issue is? I'd be happy to help debug further (cc: @antonisa and @LeYonan)

Continuing training on new data

I have a udify model trained on one dataset and I want to continue training on a new dataset. I used the --resume option, giving the serialization directory of the model trained on the first dataset. However, that didn't work: even after the first epoch, the model seemed to have reset its parameters and started training from scratch. I also tried a lower learning rate in the same config file, but that didn't work either. Is there anything I am doing wrong?

predict.py to work with raw text files

Hello!

First of all, thank you for the research and shared code, it's immensely helpful.

I wanted to know if there's an easy way to make predict.py work with raw text files, since this seems like the main purpose of the architecture. Is there a reason my input files have to conform to the CoNLL-U format besides calculating evaluation metrics?

Sliding window bug

Hi, there seems to be a bug in the calculation of final_window_start:

# Next, select indices of the sequence such that it will result in embeddings representing the original
# sentence. To capture maximal context, the indices will be the middle part of each embedded window
# sub-sequence (plus any leftover start and final edge windows), e.g.,
# 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
# "[CLS] I went to the very fine [SEP] [CLS] the very fine store to eat [SEP]"
# with max_pieces = 8 should produce max context indices [2, 3, 4, 10, 11, 12] with additional start
# and final windows with indices [0, 1] and [14, 15] respectively.
# Find the stride as half the max pieces, ignoring the special start and end tokens
# Calculate an offset to extract the centermost embeddings of each window
stride = (self.max_pieces - self.start_tokens - self.end_tokens) // 2
stride_offset = stride // 2 + self.start_tokens
first_window = list(range(stride_offset))
max_context_windows = [i for i in range(full_seq_len)
                       if stride_offset - 1 < i % self.max_pieces < stride_offset + stride]
final_window_start = full_seq_len - (full_seq_len % self.max_pieces) + stride_offset + stride
final_window = list(range(final_window_start, full_seq_len))
select_indices = first_window + max_context_windows + final_window

On the test case from your comment, final_window_start is greater than full_seq_len:

full_seq_len = 16
max_pieces = 8
start_tokens = 1
end_tokens = 1

# Next, select indices of the sequence such that it will result in embeddings representing the original
# sentence. To capture maximal context, the indices will be the middle part of each embedded window
# sub-sequence (plus any leftover start and final edge windows), e.g.,
#  0     1 2    3  4   5    6    7     8     9   10   11   12    13 14  15
# "[CLS] I went to the very fine [SEP] [CLS] the very fine store to eat [SEP]"
# with max_pieces = 8 should produce max context indices [2, 3, 4, 10, 11, 12] with additional start
# and final windows with indices [0, 1] and [14, 15] respectively.

# Find the stride as half the max pieces, ignoring the special start and end tokens
# Calculate an offset to extract the centermost embeddings of each window
stride = (max_pieces - start_tokens - end_tokens) // 2
stride_offset = stride // 2 + start_tokens

first_window = list(range(stride_offset))

max_context_windows = [i for i in range(full_seq_len)
                       if stride_offset - 1 < i % max_pieces < stride_offset + stride]

final_window_start = full_seq_len - (full_seq_len % max_pieces) + stride_offset + stride
final_window = list(range(final_window_start, full_seq_len))

select_indices = first_window + max_context_windows + final_window
print(select_indices)

Output is [0, 1, 2, 3, 4, 10, 11, 12] and [14, 15] is missing.
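The cause, in isolation: full_seq_len here is an exact multiple of max_pieces, so the modulo term vanishes and final_window_start overshoots the end of the sequence (values below are from the reproduction above):

full_seq_len, max_pieces, stride_offset, stride = 16, 8, 2, 3
final_window_start = full_seq_len - (full_seq_len % max_pieces) + stride_offset + stride
print(final_window_start)                             # 21, past the end
print(list(range(final_window_start, full_seq_len)))  # [] -- the [14, 15] window is lost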

Using other transformer models

Hello,

I am trying to use the XLMRoberta model instead of BERT and I made the following changes to the bert_pretrained.py:

from transformers import XLMRobertaTokenizer
from transformers import XLMRobertaModel, XLMRobertaConfig

However, I get the following error:

super().__init__(vocab=bert_tokenizer.vocab,
AttributeError: 'XLMRobertaTokenizer' object has no attribute 'vocab'

Any guidance would be much appreciated!
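One hedged fix sketch: XLMRobertaTokenizer is sentencepiece-based and has no .vocab attribute, but transformers exposes the token-to-id mapping through get_vocab(), so you can pass that mapping instead.

from transformers import XLMRobertaTokenizer

tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")
vocab = tokenizer.get_vocab()  # dict mapping token string -> id
# then, in the adapted bert_pretrained.py:
# super().__init__(vocab=vocab, ...)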

Prediction of multi-word expression

Is it possible to predict multi-word expressions (MWEs) from raw text?
I ran predict.py with the --raw_text option and found that MWEs are not predicted.

For example, in Italian, "della" is a contraction of "di la", and UD annotates such a token as follows:

31-32	della	_	_	_	_	_	_	_	_
31	di	di	ADP	E	_	35	case	35:case	_
32	la	il	DET	RD	Definite=Def|Gender=Fem|Number=Sing|PronType=Art	35	det	35:det	_

However, the output of UDify is something like this:

31	della	della	ADP	_	_	3	case	_	_

I hope to obtain CoNLL-U output with proper MWEs. Is there any way to achieve this?

Naming bug for `head_arc` and `child_arc` weights/representations

This is for future reference, for anyone working off of this model's edge score outputs.

There is a semantic naming bug in the dependency decoder that actually switches the head_arc_representation and child_arc_representation and the weights that compute them.

The code that computes the arc logits in dependency_decoder.py is:

head_arc_representation = self._dropout(self.head_arc_feedforward(encoded_text))
child_arc_representation = self._dropout(self.child_arc_feedforward(encoded_text))
attended_arcs = self.arc_attention(head_arc_representation, child_arc_representation)

It looks as if attended_arcs (shape (batch_size, sent_len, sent_len)) is a tensor of scores, with attended_arcs[b, i, j] representing the score for an arc from i to j (head -> child). The rest of the code, however, uses this tensor as if it were child -> head, so the tensors are in effect transposed. This also implies that the weights, head_arc_feedforward and child_arc_feedforward, are named backwards.

This causes no performance bugs for udify itself, but if you use the model's arc scores for something else, you need to transpose them to get actual head -> child scores.
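A minimal sketch of that workaround: transpose the last two dimensions so that scores[b, head, child] really does index head -> child arcs.

# attended_arcs: (batch_size, sent_len, sent_len), effectively child -> head
head_to_child_scores = attended_arcs.transpose(1, 2)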

Can I load the model using HuggingFace AutoModel?

Hello,

Is it possible to load the UDify BERT-based model (udify-bert.tar.gz) using the AutoModel class of the HuggingFace library?
When downloading the model, the vocab.txt file was missing. Is it the same as for bert-base-multilingual-cased?

Thanks in advance

Training a udify model for Russian

Hello, I am training a udify model only for Russian, using only the Russian data from UD 2.3. However, I am running into the following issue; the same code runs fine on other languages from UD.
Traceback (most recent call last):
  File "train.py", line 69, in <module>
    train_model(train_params, serialization_dir, recover=bool(args.resume))
  File "/usr1/home/user/anaconda2/envs/py36/lib/python3.6/site-packages/allennlp/commands/train.py", line 226, in train_model
    cache_prefix)
  File "/usr1/home/user/anaconda2/envs/py36/lib/python3.6/site-packages/allennlp/training/trainer_pieces.py", line 65, in from_params
    model = Model.from_params(vocab=vocab, params=params.pop('model'))
  File "/usr1/home/user/anaconda2/envs/py36/lib/python3.6/site-packages/allennlp/common/from_params.py", line 365, in from_params
    return subclass.from_params(params=params, **extras)
  File "/usr1/home/user/anaconda2/envs/py36/lib/python3.6/site-packages/allennlp/common/from_params.py", line 386, in from_params
    kwargs = create_kwargs(cls, params, **extras)
  File "/usr1/home/user/anaconda2/envs/py36/lib/python3.6/site-packages/allennlp/common/from_params.py", line 133, in create_kwargs
    kwargs[name] = construct_arg(cls, name, annotation, param.default, params, **extras)
  File "/usr1/home/user/anaconda2/envs/py36/lib/python3.6/site-packages/allennlp/common/from_params.py", line 257, in construct_arg
    value_dict[key] = value_cls.from_params(params=value_params, **subextras)
  File "/usr1/home/user/anaconda2/envs/py36/lib/python3.6/site-packages/allennlp/common/from_params.py", line 365, in from_params
    return subclass.from_params(params=params, **extras)
  File "/usr1/home/user/anaconda2/envs/py36/lib/python3.6/site-packages/allennlp/common/from_params.py", line 388, in from_params
    return cls(**kwargs)  # type: ignore
  File "/usr1/home/user/udify/udify/models/tag_decoder.py", line 106, in __init__
    div_value=4.0)
  File "/usr1/home/user/anaconda2/envs/py36/lib/python3.6/site-packages/torch/nn/modules/adaptive.py", line 116, in __init__
    raise ValueError("cutoffs should be a sequence of unique, positive "
ValueError: cutoffs should be a sequence of unique, positive integers sorted in an increasing order, where each value is between 1 and n_classes-1

Integrate UDify into AllenNLP

It would be useful to integrate the UDify model directly into AllenNLP as a PR, as the code merely extends the library to handle a few extra features. Since the release of the UDify code, AllenNLP also has added a multilingual UD dataset reader and a multilingual dependency parser with a corresponding model, which should make things easier.

Here is a list of things that need to be done:

  • Add scripts to download and concatenate the UD data for training/evaluation. Also, add the CoNLL 2018 evaluation script.
  • Create a UDify conllu -> conllu predictor that can handle unseen tokens and multiword ids.
  • Add the sqrt learning rate decay LR scheduler.
  • Add optional dropout to ScalarMix.
  • Modify the multilingual UD dataset reader to handle multiword ids.
  • Add lemmatizer edit script code.
  • Modify the BERT token embedder to be able to return multiple scalar mixes, one per task (or alternatively all the embeddings). Add optional args for internal BERT dropout.
  • Add generic dynamic masking functions.
  • Add the custom sequence tagger and biaffine dependency parser that handles a multi-task setup.
  • Add the UDify main model, wrapping the BERT, dynamic masking, scalar mix, sequence tagger, and dependency parser code. Provide custom metrics for TensorBoard.
  • Add utility code to optionally cache the vocab and grab UD treebank names from files.
  • Add helper script to evaluate conllu predictions and output them to json.
  • Add tests to verify the new UDify model and modules.
  • Add UDify config jsonnet file.

"Training on other datasets" directions missing required flag

This command from the "Training on other datasets" section of the README causes the following error:

(python3) gneubig@ogma:~/work/udify$ python train.py --config config/ud/en/udify_bert_train_en_ewt.json --name en_ewt
Traceback (most recent call last):
  File "train.py", line 46, in <module>
    train_path = glob.glob(pathname).pop()
IndexError: pop from empty list

Adding the --dataset_dir flag resolves the error:

python train.py --config config/ud/en/udify_bert_train_en_ewt.json --name en_ewt --dataset_dir data/ud-treebanks-v2.3/

predict.py to work with .conllu files NOT annotated for dependencies?

Hi there,

I was wondering whether there was a way for me to use predict.py with my corpus data (.conllu), which is not annotated for dependencies but is annotated for POS. My goal is not to calculate evaluation metrics at the moment, but rather to have my pretrained model give me predictions on dependencies, to hopefully get a head start with dependency annotation. I am working on an underdocumented language and would like a first pass of dependency predictions that I can then verify and update to create the gold standard for my language.

Is there a reason my input file has to conform to the CoNLL-U format other than for evaluation metrics? My issue seems to be that my "head" and "deprel" columns are not integers but simply "_", because they're empty. I would prefer to keep the .conllu format of my input file, as it already contains POS information, which could give better predictions.
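One hedged workaround sketch (not a feature of the repo, and the file names below are hypothetical): fill the empty HEAD and DEPREL columns with parseable dummy values so the CoNLL-U reader accepts the file; the model's predicted heads and deprels should then replace them in the output.

def fill_dummy_deps(in_path, out_path):
    """Replace "_" HEAD/DEPREL with dummy values on ordinary token lines."""
    with open(in_path) as fin, open(out_path, "w") as fout:
        for line in fin:
            if line.strip() and not line.startswith("#"):
                cols = line.rstrip("\n").split("\t")
                # Skip multiword ranges (3-4) and empty nodes (3.1).
                if len(cols) == 10 and cols[0].isdigit() and cols[6] == "_":
                    cols[6], cols[7] = "0", "root"
                line = "\t".join(cols) + "\n"
            fout.write(line)

fill_dummy_deps("my_corpus.conllu", "my_corpus_dummy.conllu")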

Thank you for the research, it's super helpful, especially for underdocumented languages.

Here is my error message :
[screenshot of the error message]

UDify for lemmatization

To my knowledge, the pretrained models don't currently cover the SIGMORPHON 2019 shared task; I was wondering if it would be possible to release the pretrained models for that? I'm currently looking for a good multilingual lemmatizer and this seemed like a great choice.

RuntimeError: unexpected EOF

I'm trying to run udify on some data and have followed the instructions, e.g.:

$ git clone https://github.com/Hyperparticle/udify
$ pip install -r ./requirements.txt
$ curl --remote-name-all https://lindat.mff.cuni.cz/repository/xmlui/bitstream/handle/11234/1-3042{/udify-model.tar.gz,/udify-bert.tar.gz}

I get the following output:

fran@ipek:~/source/udify$ python3.8 predict.py --device -1 udify-model.tar.gz test.0.conllu.input logs/pred.0.conllu --eval_file logs/pred.0.json
2021-01-15 16:27:42,512 - INFO - allennlp.models.archival - loading archive file /home/fran/source/udify from cache at /home/fran/source/udify
2021-01-15 16:27:42,548 - INFO - allennlp.common.registrable - instantiating registered subclass udify_model of <class 'allennlp.models.model.Model'>
2021-01-15 16:27:42,548 - INFO - allennlp.common.params - vocabulary.type = default
2021-01-15 16:27:42,548 - INFO - allennlp.common.registrable - instantiating registered subclass default of <class 'allennlp.data.vocabulary.Vocabulary'>
2021-01-15 16:27:42,548 - INFO - allennlp.data.vocabulary - Loading token dictionary from /home/fran/source/udify/vocabulary.
2021-01-15 16:27:44,391 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'decoders': {'deps': {'arc_representation_dim': 768, 'dropout': 0.5, 'encoder': {'input_dim': 768, 'type': 'pass_through'}, 'pos_embed_dim': None, 'tag_representation_dim': 256, 'type': 'udify_dependency_decoder'}, 'feats': {'adaptive': True, 'dropout': 0.5, 'encoder': {'input_dim': 768, 'type': 'pass_through'}, 'label_smoothing': 0.03, 'task': 'feats', 'type': 'udify_tag_decoder'}, 'lemmas': {'adaptive': True, 'dropout': 0.5, 'encoder': {'input_dim': 768, 'type': 'pass_through'}, 'label_smoothing': 0.03, 'task': 'lemmas', 'type': 'udify_tag_decoder'}, 'upos': {'dropout': 0.5, 'encoder': {'input_dim': 768, 'type': 'pass_through'}, 'label_smoothing': 0.03, 'task': 'upos', 'type': 'udify_tag_decoder'}}, 'dropout': 0.5, 'encoder': {'input_dim': 768, 'type': 'pass_through'}, 'layer_dropout': 0.08, 'mix_embedding': 12, 'tasks': ['upos', 'feats', 'lemmas', 'deps'], 'text_field_embedder': {'allow_unmatched_keys': True, 'dropout': 0.4, 'embedder_to_indexer_map': {'bert': ['bert', 'bert-offsets']}, 'token_embedders': {'bert': {'bert_config': 'config/archive/bert-base-multilingual-cased/bert_config.json', 'combine_layers': 'all', 'dropout': 0.1, 'layer_dropout': 0.08, 'requires_grad': True, 'type': 'udify-bert-predictor'}}, 'type': 'udify_embedder'}, 'type': 'udify_model', 'word_dropout': 0.1} and extras {'vocab'}
2021-01-15 16:27:44,391 - INFO - allennlp.common.params - model.type = udify_model
2021-01-15 16:27:44,392 - INFO - allennlp.common.from_params - instantiating class <class 'udify.models.udify_model.UdifyModel'> from params {'decoders': {'deps': {'arc_representation_dim': 768, 'dropout': 0.5, 'encoder': {'input_dim': 768, 'type': 'pass_through'}, 'pos_embed_dim': None, 'tag_representation_dim': 256, 'type': 'udify_dependency_decoder'}, 'feats': {'adaptive': True, 'dropout': 0.5, 'encoder': {'input_dim': 768, 'type': 'pass_through'}, 'label_smoothing': 0.03, 'task': 'feats', 'type': 'udify_tag_decoder'}, 'lemmas': {'adaptive': True, 'dropout': 0.5, 'encoder': {'input_dim': 768, 'type': 'pass_through'}, 'label_smoothing': 0.03, 'task': 'lemmas', 'type': 'udify_tag_decoder'}, 'upos': {'dropout': 0.5, 'encoder': {'input_dim': 768, 'type': 'pass_through'}, 'label_smoothing': 0.03, 'task': 'upos', 'type': 'udify_tag_decoder'}}, 'dropout': 0.5, 'encoder': {'input_dim': 768, 'type': 'pass_through'}, 'layer_dropout': 0.08, 'mix_embedding': 12, 'tasks': ['upos', 'feats', 'lemmas', 'deps'], 'text_field_embedder': {'allow_unmatched_keys': True, 'dropout': 0.4, 'embedder_to_indexer_map': {'bert': ['bert', 'bert-offsets']}, 'token_embedders': {'bert': {'bert_config': 'config/archive/bert-base-multilingual-cased/bert_config.json', 'combine_layers': 'all', 'dropout': 0.1, 'layer_dropout': 0.08, 'requires_grad': True, 'type': 'udify-bert-predictor'}}, 'type': 'udify_embedder'}, 'word_dropout': 0.1} and extras {'vocab'}
2021-01-15 16:27:44,392 - INFO - allennlp.common.params - model.tasks = ['upos', 'feats', 'lemmas', 'deps']
2021-01-15 16:27:44,392 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.text_field_embedders.text_field_embedder.TextFieldEmbedder'> from params {'allow_unmatched_keys': True, 'dropout': 0.4, 'embedder_to_indexer_map': {'bert': ['bert', 'bert-offsets']}, 'token_embedders': {'bert': {'bert_config': 'config/archive/bert-base-multilingual-cased/bert_config.json', 'combine_layers': 'all', 'dropout': 0.1, 'layer_dropout': 0.08, 'requires_grad': True, 'type': 'udify-bert-predictor'}}, 'type': 'udify_embedder'} and extras {'vocab'}
2021-01-15 16:27:44,392 - INFO - allennlp.common.params - model.text_field_embedder.type = udify_embedder
2021-01-15 16:27:44,392 - INFO - allennlp.common.params - model.text_field_embedder.allow_unmatched_keys = True
2021-01-15 16:27:44,392 - INFO - allennlp.common.params - model.text_field_embedder.dropout = 0.4
2021-01-15 16:27:44,392 - INFO - allennlp.common.params - model.text_field_embedder.output_dim = None
2021-01-15 16:27:44,392 - INFO - allennlp.common.params - model.text_field_embedder.sum_embeddings = None
2021-01-15 16:27:44,392 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.token_embedders.token_embedder.TokenEmbedder'> from params {'bert_config': 'config/archive/bert-base-multilingual-cased/bert_config.json', 'combine_layers': 'all', 'dropout': 0.1, 'layer_dropout': 0.08, 'requires_grad': True, 'type': 'udify-bert-predictor'} and extras {'vocab'}
2021-01-15 16:27:44,393 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.bert.type = udify-bert-predictor
2021-01-15 16:27:44,393 - INFO - allennlp.common.from_params - instantiating class <class 'udify.modules.bert_pretrained.UdifyPredictionBertEmbedder'> from params {'bert_config': 'config/archive/bert-base-multilingual-cased/bert_config.json', 'combine_layers': 'all', 'dropout': 0.1, 'layer_dropout': 0.08, 'requires_grad': True} and extras {'vocab'}
2021-01-15 16:27:44,393 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.bert.bert_config = config/archive/bert-base-multilingual-cased/bert_config.json
2021-01-15 16:27:44,393 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.bert.requires_grad = True
2021-01-15 16:27:44,393 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.bert.dropout = 0.1
2021-01-15 16:27:44,393 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.bert.layer_dropout = 0.08
2021-01-15 16:27:44,393 - INFO - allennlp.common.params - model.text_field_embedder.token_embedders.bert.combine_layers = all
2021-01-15 16:27:46,710 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'input_dim': 768, 'type': 'pass_through'} and extras {'vocab'}
2021-01-15 16:27:46,710 - INFO - allennlp.common.params - model.encoder.type = pass_through
2021-01-15 16:27:46,711 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.pass_through_encoder.PassThroughEncoder'> from params {'input_dim': 768} and extras {'vocab'}
2021-01-15 16:27:46,711 - INFO - allennlp.common.params - model.encoder.input_dim = 768
2021-01-15 16:27:46,711 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'arc_representation_dim': 768, 'dropout': 0.5, 'encoder': {'input_dim': 768, 'type': 'pass_through'}, 'pos_embed_dim': None, 'tag_representation_dim': 256, 'type': 'udify_dependency_decoder'} and extras {'vocab'}
2021-01-15 16:27:46,711 - INFO - allennlp.common.params - model.decoders.deps.type = udify_dependency_decoder
2021-01-15 16:27:46,711 - INFO - allennlp.common.from_params - instantiating class <class 'udify.models.dependency_decoder.DependencyDecoder'> from params {'arc_representation_dim': 768, 'dropout': 0.5, 'encoder': {'input_dim': 768, 'type': 'pass_through'}, 'pos_embed_dim': None, 'tag_representation_dim': 256} and extras {'vocab'}
2021-01-15 16:27:46,711 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'input_dim': 768, 'type': 'pass_through'} and extras {'vocab'}
2021-01-15 16:27:46,711 - INFO - allennlp.common.params - model.decoders.deps.encoder.type = pass_through
2021-01-15 16:27:46,711 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.pass_through_encoder.PassThroughEncoder'> from params {'input_dim': 768} and extras {'vocab'}
2021-01-15 16:27:46,711 - INFO - allennlp.common.params - model.decoders.deps.encoder.input_dim = 768
2021-01-15 16:27:46,712 - INFO - allennlp.common.params - model.decoders.deps.tag_representation_dim = 256
2021-01-15 16:27:46,712 - INFO - allennlp.common.params - model.decoders.deps.arc_representation_dim = 768
2021-01-15 16:27:46,712 - INFO - allennlp.common.params - model.decoders.deps.pos_embed_dim = None
2021-01-15 16:27:46,712 - INFO - allennlp.common.params - model.decoders.deps.use_mst_decoding_for_validation = True
2021-01-15 16:27:46,712 - INFO - allennlp.common.params - model.decoders.deps.dropout = 0.5
2021-01-15 16:27:46,712 - INFO - allennlp.common.registrable - instantiating registered subclass elu of <class 'allennlp.nn.activations.Activation'>
2021-01-15 16:27:46,718 - INFO - allennlp.common.registrable - instantiating registered subclass linear of <class 'allennlp.nn.activations.Activation'>
2021-01-15 16:27:46,722 - INFO - allennlp.common.registrable - instantiating registered subclass elu of <class 'allennlp.nn.activations.Activation'>
2021-01-15 16:27:46,867 - INFO - udify.models.dependency_decoder - Found POS tags corresponding to the following punctuation : {}. Ignoring words with these POS tags for evaluation.
2021-01-15 16:27:46,867 - INFO - allennlp.nn.initializers - Initializing parameters
2021-01-15 16:27:46,867 - INFO - allennlp.nn.initializers - Done initializing parameters; the following parameters are using their default initialization from their code
2021-01-15 16:27:46,868 - INFO - allennlp.nn.initializers -    _head_sentinel
2021-01-15 16:27:46,868 - INFO - allennlp.nn.initializers -    arc_attention._bias
2021-01-15 16:27:46,868 - INFO - allennlp.nn.initializers -    arc_attention._weight_matrix
2021-01-15 16:27:46,868 - INFO - allennlp.nn.initializers -    child_arc_feedforward._linear_layers.0.bias
2021-01-15 16:27:46,868 - INFO - allennlp.nn.initializers -    child_arc_feedforward._linear_layers.0.weight
2021-01-15 16:27:46,868 - INFO - allennlp.nn.initializers -    child_tag_feedforward._linear_layers.0.bias
2021-01-15 16:27:46,868 - INFO - allennlp.nn.initializers -    child_tag_feedforward._linear_layers.0.weight
2021-01-15 16:27:46,868 - INFO - allennlp.nn.initializers -    head_arc_feedforward._linear_layers.0.bias
2021-01-15 16:27:46,868 - INFO - allennlp.nn.initializers -    head_arc_feedforward._linear_layers.0.weight
2021-01-15 16:27:46,868 - INFO - allennlp.nn.initializers -    head_tag_feedforward._linear_layers.0.bias
2021-01-15 16:27:46,868 - INFO - allennlp.nn.initializers -    head_tag_feedforward._linear_layers.0.weight
2021-01-15 16:27:46,868 - INFO - allennlp.nn.initializers -    tag_bilinear.bias
2021-01-15 16:27:46,868 - INFO - allennlp.nn.initializers -    tag_bilinear.weight
2021-01-15 16:27:46,869 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'adaptive': True, 'dropout': 0.5, 'encoder': {'input_dim': 768, 'type': 'pass_through'}, 'label_smoothing': 0.03, 'task': 'feats', 'type': 'udify_tag_decoder'} and extras {'vocab'}
2021-01-15 16:27:46,869 - INFO - allennlp.common.params - model.decoders.feats.type = udify_tag_decoder
2021-01-15 16:27:46,869 - INFO - allennlp.common.from_params - instantiating class <class 'udify.models.tag_decoder.TagDecoder'> from params {'adaptive': True, 'dropout': 0.5, 'encoder': {'input_dim': 768, 'type': 'pass_through'}, 'label_smoothing': 0.03, 'task': 'feats'} and extras {'vocab'}
2021-01-15 16:27:46,869 - INFO - allennlp.common.params - model.decoders.feats.task = feats
2021-01-15 16:27:46,869 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'input_dim': 768, 'type': 'pass_through'} and extras {'vocab'}
2021-01-15 16:27:46,870 - INFO - allennlp.common.params - model.decoders.feats.encoder.type = pass_through
2021-01-15 16:27:46,870 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.pass_through_encoder.PassThroughEncoder'> from params {'input_dim': 768} and extras {'vocab'}
2021-01-15 16:27:46,870 - INFO - allennlp.common.params - model.decoders.feats.encoder.input_dim = 768
2021-01-15 16:27:46,870 - INFO - allennlp.common.params - model.decoders.feats.label_smoothing = 0.03
2021-01-15 16:27:46,870 - INFO - allennlp.common.params - model.decoders.feats.dropout = 0.5
2021-01-15 16:27:46,871 - INFO - allennlp.common.params - model.decoders.feats.adaptive = True
2021-01-15 16:27:46,871 - INFO - allennlp.common.params - model.decoders.feats.features = None
2021-01-15 16:27:46,895 - INFO - allennlp.nn.initializers - Initializing parameters
2021-01-15 16:27:46,895 - INFO - allennlp.nn.initializers - Done initializing parameters; the following parameters are using their default initialization from their code
2021-01-15 16:27:46,895 - INFO - allennlp.nn.initializers -    task_output.head.weight
2021-01-15 16:27:46,895 - INFO - allennlp.nn.initializers -    task_output.tail.0.0.weight
2021-01-15 16:27:46,895 - INFO - allennlp.nn.initializers -    task_output.tail.0.1.weight
2021-01-15 16:27:46,895 - INFO - allennlp.nn.initializers -    task_output.tail.1.0.weight
2021-01-15 16:27:46,896 - INFO - allennlp.nn.initializers -    task_output.tail.1.1.weight
2021-01-15 16:27:46,896 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'adaptive': True, 'dropout': 0.5, 'encoder': {'input_dim': 768, 'type': 'pass_through'}, 'label_smoothing': 0.03, 'task': 'lemmas', 'type': 'udify_tag_decoder'} and extras {'vocab'}
2021-01-15 16:27:46,896 - INFO - allennlp.common.params - model.decoders.lemmas.type = udify_tag_decoder
2021-01-15 16:27:46,896 - INFO - allennlp.common.from_params - instantiating class <class 'udify.models.tag_decoder.TagDecoder'> from params {'adaptive': True, 'dropout': 0.5, 'encoder': {'input_dim': 768, 'type': 'pass_through'}, 'label_smoothing': 0.03, 'task': 'lemmas'} and extras {'vocab'}
2021-01-15 16:27:46,897 - INFO - allennlp.common.params - model.decoders.lemmas.task = lemmas
2021-01-15 16:27:46,898 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'input_dim': 768, 'type': 'pass_through'} and extras {'vocab'}
2021-01-15 16:27:46,898 - INFO - allennlp.common.params - model.decoders.lemmas.encoder.type = pass_through
2021-01-15 16:27:46,898 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.pass_through_encoder.PassThroughEncoder'> from params {'input_dim': 768} and extras {'vocab'}
2021-01-15 16:27:46,899 - INFO - allennlp.common.params - model.decoders.lemmas.encoder.input_dim = 768
2021-01-15 16:27:46,899 - INFO - allennlp.common.params - model.decoders.lemmas.label_smoothing = 0.03
2021-01-15 16:27:46,900 - INFO - allennlp.common.params - model.decoders.lemmas.dropout = 0.5
2021-01-15 16:27:46,900 - INFO - allennlp.common.params - model.decoders.lemmas.adaptive = True
2021-01-15 16:27:46,900 - INFO - allennlp.common.params - model.decoders.lemmas.features = None
2021-01-15 16:27:47,014 - INFO - allennlp.nn.initializers - Initializing parameters
2021-01-15 16:27:47,014 - INFO - allennlp.nn.initializers - Done initializing parameters; the following parameters are using their default initialization from their code
2021-01-15 16:27:47,014 - INFO - allennlp.nn.initializers -    task_output.head.weight
2021-01-15 16:27:47,015 - INFO - allennlp.nn.initializers -    task_output.tail.0.0.weight
2021-01-15 16:27:47,015 - INFO - allennlp.nn.initializers -    task_output.tail.0.1.weight
2021-01-15 16:27:47,015 - INFO - allennlp.nn.initializers -    task_output.tail.1.0.weight
2021-01-15 16:27:47,015 - INFO - allennlp.nn.initializers -    task_output.tail.1.1.weight
2021-01-15 16:27:47,015 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.models.model.Model'> from params {'dropout': 0.5, 'encoder': {'input_dim': 768, 'type': 'pass_through'}, 'label_smoothing': 0.03, 'task': 'upos', 'type': 'udify_tag_decoder'} and extras {'vocab'}
2021-01-15 16:27:47,015 - INFO - allennlp.common.params - model.decoders.upos.type = udify_tag_decoder
2021-01-15 16:27:47,015 - INFO - allennlp.common.from_params - instantiating class <class 'udify.models.tag_decoder.TagDecoder'> from params {'dropout': 0.5, 'encoder': {'input_dim': 768, 'type': 'pass_through'}, 'label_smoothing': 0.03, 'task': 'upos'} and extras {'vocab'}
2021-01-15 16:27:47,015 - INFO - allennlp.common.params - model.decoders.upos.task = upos
2021-01-15 16:27:47,015 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.seq2seq_encoder.Seq2SeqEncoder'> from params {'input_dim': 768, 'type': 'pass_through'} and extras {'vocab'}
2021-01-15 16:27:47,015 - INFO - allennlp.common.params - model.decoders.upos.encoder.type = pass_through
2021-01-15 16:27:47,015 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.modules.seq2seq_encoders.pass_through_encoder.PassThroughEncoder'> from params {'input_dim': 768} and extras {'vocab'}
2021-01-15 16:27:47,015 - INFO - allennlp.common.params - model.decoders.upos.encoder.input_dim = 768
2021-01-15 16:27:47,016 - INFO - allennlp.common.params - model.decoders.upos.label_smoothing = 0.03
2021-01-15 16:27:47,016 - INFO - allennlp.common.params - model.decoders.upos.dropout = 0.5
2021-01-15 16:27:47,016 - INFO - allennlp.common.params - model.decoders.upos.adaptive = False
2021-01-15 16:27:47,016 - INFO - allennlp.common.params - model.decoders.upos.features = None
2021-01-15 16:27:47,016 - INFO - allennlp.nn.initializers - Initializing parameters
2021-01-15 16:27:47,016 - INFO - allennlp.nn.initializers - Done initializing parameters; the following parameters are using their default initialization from their code
2021-01-15 16:27:47,016 - INFO - allennlp.nn.initializers -    task_output._module.bias
2021-01-15 16:27:47,016 - INFO - allennlp.nn.initializers -    task_output._module.weight
2021-01-15 16:27:47,017 - INFO - allennlp.common.params - model.dropout = 0.5
2021-01-15 16:27:47,017 - INFO - allennlp.common.params - model.word_dropout = 0.1
2021-01-15 16:27:47,017 - INFO - allennlp.common.params - model.mix_embedding = 12
2021-01-15 16:27:47,017 - INFO - allennlp.common.params - model.layer_dropout = 0.08
2021-01-15 16:27:47,017 - INFO - pytorch_pretrained_bert.tokenization - loading vocabulary file config/archive/bert-base-multilingual-cased/vocab.txt
2021-01-15 16:27:47,258 - INFO - allennlp.nn.initializers - Initializing parameters
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers - Done initializing parameters; the following parameters are using their default initialization from their code
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.deps._head_sentinel
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.deps.arc_attention._bias
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.deps.arc_attention._weight_matrix
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.deps.child_arc_feedforward._linear_layers.0.bias
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.deps.child_arc_feedforward._linear_layers.0.weight
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.deps.child_tag_feedforward._linear_layers.0.bias
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.deps.child_tag_feedforward._linear_layers.0.weight
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.deps.head_arc_feedforward._linear_layers.0.bias
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.deps.head_arc_feedforward._linear_layers.0.weight
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.deps.head_tag_feedforward._linear_layers.0.bias
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.deps.head_tag_feedforward._linear_layers.0.weight
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.deps.tag_bilinear.bias
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.deps.tag_bilinear.weight
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.feats.task_output.head.weight
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.feats.task_output.tail.0.0.weight
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.feats.task_output.tail.0.1.weight
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.feats.task_output.tail.1.0.weight
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.feats.task_output.tail.1.1.weight
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.lemmas.task_output.head.weight
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.lemmas.task_output.tail.0.0.weight
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.lemmas.task_output.tail.0.1.weight
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.lemmas.task_output.tail.1.0.weight
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.lemmas.task_output.tail.1.1.weight
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.upos.task_output._module.bias
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    decoders.upos.task_output._module.weight
2021-01-15 16:27:47,259 - INFO - allennlp.nn.initializers -    scalar_mix.deps.gamma
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.deps.scalar_parameters.0
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.deps.scalar_parameters.1
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.deps.scalar_parameters.10
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.deps.scalar_parameters.11
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.deps.scalar_parameters.2
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.deps.scalar_parameters.3
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.deps.scalar_parameters.4
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.deps.scalar_parameters.5
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.deps.scalar_parameters.6
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.deps.scalar_parameters.7
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.deps.scalar_parameters.8
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.deps.scalar_parameters.9
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.feats.gamma
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.feats.scalar_parameters.0
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.feats.scalar_parameters.1
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.feats.scalar_parameters.10
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.feats.scalar_parameters.11
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.feats.scalar_parameters.2
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.feats.scalar_parameters.3
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.feats.scalar_parameters.4
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.feats.scalar_parameters.5
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.feats.scalar_parameters.6
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.feats.scalar_parameters.7
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.feats.scalar_parameters.8
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.feats.scalar_parameters.9
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.lemmas.gamma
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.lemmas.scalar_parameters.0
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.lemmas.scalar_parameters.1
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.lemmas.scalar_parameters.10
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.lemmas.scalar_parameters.11
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.lemmas.scalar_parameters.2
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.lemmas.scalar_parameters.3
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.lemmas.scalar_parameters.4
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.lemmas.scalar_parameters.5
2021-01-15 16:27:47,260 - INFO - allennlp.nn.initializers -    scalar_mix.lemmas.scalar_parameters.6
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    scalar_mix.lemmas.scalar_parameters.7
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    scalar_mix.lemmas.scalar_parameters.8
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    scalar_mix.lemmas.scalar_parameters.9
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    scalar_mix.upos.gamma
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    scalar_mix.upos.scalar_parameters.0
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    scalar_mix.upos.scalar_parameters.1
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    scalar_mix.upos.scalar_parameters.10
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    scalar_mix.upos.scalar_parameters.11
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    scalar_mix.upos.scalar_parameters.2
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    scalar_mix.upos.scalar_parameters.3
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    scalar_mix.upos.scalar_parameters.4
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    scalar_mix.upos.scalar_parameters.5
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    scalar_mix.upos.scalar_parameters.6
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    scalar_mix.upos.scalar_parameters.7
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    scalar_mix.upos.scalar_parameters.8
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    scalar_mix.upos.scalar_parameters.9
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.embeddings.LayerNorm.bias
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.embeddings.LayerNorm.weight
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.embeddings.position_embeddings.weight
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.embeddings.token_type_embeddings.weight
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.embeddings.word_embeddings.weight
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.output.LayerNorm.bias
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.output.LayerNorm.weight
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.output.dense.bias
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.output.dense.weight
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.self.key.bias
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.self.key.weight
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.self.query.bias
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.self.query.weight
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.self.value.bias
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.attention.self.value.weight
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.intermediate.dense.bias
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.intermediate.dense.weight
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.output.LayerNorm.bias
2021-01-15 16:27:47,261 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.output.LayerNorm.weight
2021-01-15 16:27:47,262 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.output.dense.bias
2021-01-15 16:27:47,262 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.encoder.layer.0.output.dense.weight
[... 176 analogous initializer lines omitted: encoder layers 1–11 each log the same 16 attention/intermediate/output parameters shown for layer 0 above ...]
2021-01-15 16:27:47,267 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.pooler.dense.bias
2021-01-15 16:27:47,267 - INFO - allennlp.nn.initializers -    text_field_embedder.token_embedder_bert.bert_model.pooler.dense.weight
2021-01-15 16:27:47,268 - INFO - udify.models.udify_model - Total number of parameters: 212246786
2021-01-15 16:27:47,268 - INFO - udify.models.udify_model - Total number of trainable parameters: 212246786
Traceback (most recent call last):
  File "predict.py", line 59, in <module>
    util.predict_and_evaluate_model_with_archive(predictor, params, archive_dir, args.input_file,
  File "/home/fran/source/udify/udify/util.py", line 163, in predict_and_evaluate_model_with_archive
    predict_model_with_archive(predictor, params, archive, segment_file, pred_file, batch_size)
  File "/home/fran/source/udify/udify/util.py", line 142, in predict_model_with_archive
    archive = load_archive(archive,
  File "/home/fran/.local/lib/python3.8/site-packages/allennlp/models/archival.py", line 227, in load_archive
    model = Model.load(config.duplicate(),
  File "/home/fran/.local/lib/python3.8/site-packages/allennlp/models/model.py", line 327, in load
    return cls.by_name(model_type)._load(config, serialization_dir, weights_file, cuda_device)
  File "/home/fran/.local/lib/python3.8/site-packages/allennlp/models/model.py", line 275, in _load
    model_state = torch.load(weights_file, map_location=util.device_mapping(cuda_device))
  File "/home/fran/.local/lib/python3.8/site-packages/torch/serialization.py", line 529, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/fran/.local/lib/python3.8/site-packages/torch/serialization.py", line 709, in _legacy_load
    deserialized_objects[key]._set_from_file(f, offset, f_should_read_directly)
RuntimeError: unexpected EOF, expected 316407350 more bytes. The file might be corrupted.
corrupted double-linked list
Aborted
fran@ipek:~/source/udify$ 

The MD5 sums of the two tarballs are:

$ md5sum *.tar.gz
facd2798e9786636ced131804ac67398  udify-bert.tar.gz
42aacc00e0ed6272b31ca7329055c108  udify-model.tar.gz
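
If a fresh download does not reproduce the sums above, the archive was truncated in transit, which would also explain the "unexpected EOF" from torch.load: the weights file inside the tarball is shorter than its pickled header claims. A minimal way to recompute the sums in Python for comparison (file names assumed to match the listing above):

import hashlib

# Recompute the MD5 of each downloaded archive in 1 MiB chunks,
# for comparison against the md5sum output above.
for name in ("udify-bert.tar.gz", "udify-model.tar.gz"):
    digest = hashlib.md5()
    with open(name, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    print(digest.hexdigest(), name)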

Issue with AllenNLP integration causes predict.py to fail (ArrayField.empty_field)

When I try to run a clean checkout of UDify, I get the following error:

(udify-venv) fran@tlazolteotl /var/lib/home/fran/source/udify $ python predict.py udify-model.tar.gz  data/UD_Kiche-IU/quc_iu-ud-test.conllu logs/pred.conllu --eval_file logs/pred.json
Traceback (most recent call last):
  File "predict.py", line 14, in <module>
    from allennlp.models.archival import archive_model
  File "/mnt/partuuid-46caa556-c2c4-eb47-907a-5d2092050724/var/lib/home/fran/source/udify-venv/lib/python3.7/site-packages/allennlp/models/__init__.py", line 6, in <module>
    from allennlp.models.model import Model
  File "/mnt/partuuid-46caa556-c2c4-eb47-907a-5d2092050724/var/lib/home/fran/source/udify-venv/lib/python3.7/site-packages/allennlp/models/model.py", line 16, in <module>
    from allennlp.data import Instance, Vocabulary
  File "/mnt/partuuid-46caa556-c2c4-eb47-907a-5d2092050724/var/lib/home/fran/source/udify-venv/lib/python3.7/site-packages/allennlp/data/__init__.py", line 1, in <module>
    from allennlp.data.dataset_readers.dataset_reader import DatasetReader
  File "/mnt/partuuid-46caa556-c2c4-eb47-907a-5d2092050724/var/lib/home/fran/source/udify-venv/lib/python3.7/site-packages/allennlp/data/dataset_readers/__init__.py", line 10, in <module>
    from allennlp.data.dataset_readers.ccgbank import CcgBankDatasetReader
  File "/mnt/partuuid-46caa556-c2c4-eb47-907a-5d2092050724/var/lib/home/fran/source/udify-venv/lib/python3.7/site-packages/allennlp/data/dataset_readers/ccgbank.py", line 9, in <module>
    from allennlp.data.dataset_readers.dataset_reader import DatasetReader
  File "/mnt/partuuid-46caa556-c2c4-eb47-907a-5d2092050724/var/lib/home/fran/source/udify-venv/lib/python3.7/site-packages/allennlp/data/dataset_readers/dataset_reader.py", line 8, in <module>
    from allennlp.data.instance import Instance
  File "/mnt/partuuid-46caa556-c2c4-eb47-907a-5d2092050724/var/lib/home/fran/source/udify-venv/lib/python3.7/site-packages/allennlp/data/instance.py", line 3, in <module>
    from allennlp.data.fields.field import DataArray, Field
  File "/mnt/partuuid-46caa556-c2c4-eb47-907a-5d2092050724/var/lib/home/fran/source/udify-venv/lib/python3.7/site-packages/allennlp/data/fields/__init__.py", line 7, in <module>
    from allennlp.data.fields.array_field import ArrayField
  File "/mnt/partuuid-46caa556-c2c4-eb47-907a-5d2092050724/var/lib/home/fran/source/udify-venv/lib/python3.7/site-packages/allennlp/data/fields/array_field.py", line 10, in <module>
    class ArrayField(Field[numpy.ndarray]):
  File "/mnt/partuuid-46caa556-c2c4-eb47-907a-5d2092050724/var/lib/home/fran/source/udify-venv/lib/python3.7/site-packages/allennlp/data/fields/array_field.py", line 50, in ArrayField
    @overrides
  File "/mnt/partuuid-46caa556-c2c4-eb47-907a-5d2092050724/var/lib/home/fran/source/udify-venv/lib/python3.7/site-packages/overrides/overrides.py", line 88, in overrides
    return _overrides(method, check_signature, check_at_runtime)
  File "/mnt/partuuid-46caa556-c2c4-eb47-907a-5d2092050724/var/lib/home/fran/source/udify-venv/lib/python3.7/site-packages/overrides/overrides.py", line 114, in _overrides
    _validate_method(method, super_class, check_signature)
  File "/mnt/partuuid-46caa556-c2c4-eb47-907a-5d2092050724/var/lib/home/fran/source/udify-venv/lib/python3.7/site-packages/overrides/overrides.py", line 135, in _validate_method
    ensure_signature_is_compatible(super_method, method, is_static)
  File "/mnt/partuuid-46caa556-c2c4-eb47-907a-5d2092050724/var/lib/home/fran/source/udify-venv/lib/python3.7/site-packages/overrides/signature.py", line 93, in ensure_signature_is_compatible
    ensure_return_type_compatibility(super_type_hints, sub_type_hints, method_name)
  File "/mnt/partuuid-46caa556-c2c4-eb47-907a-5d2092050724/var/lib/home/fran/source/udify-venv/lib/python3.7/site-packages/overrides/signature.py", line 288, in ensure_return_type_compatibility
    f"{method_name}: return type `{sub_return}` is not a `{super_return}`."
TypeError: ArrayField.empty_field: return type `None` is not a `<class 'allennlp.data.fields.field.Field'>`.
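
The failing check lives in the overrides package (see the overrides/signature.py frames), not in UDify itself: newer overrides releases (4.x and later, judging by the ensure_signature_is_compatible frames) validate that an overriding method's return annotation is compatible with the base class, and allennlp 0.x's ArrayField.empty_field trips that check. A workaround reported for similar allennlp 0.x setups is to pin an older overrides release, e.g. pip install 'overrides<4' (a suggested fix, untested here). To see which version the virtualenv actually resolved:

import pkg_resources

# Print the installed version of the package whose decorator raises above.
print(pkg_resources.get_distribution("overrides").version)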

Multi-GPU training?

Hi, cool work!
I was wondering whether the toolkit supports multi-GPU training as-is and, if not, whether there are any plans to support it.

Thanks!
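
For the allennlp 0.8.x line that UDify was built against, multi-GPU training was typically enabled by giving the trainer a list of device ids (cuda_device set to [0, 1] in the training config), which wraps the model in torch.nn.DataParallel; whether that interacts cleanly with UDify's custom trainer is untested here. A toy sketch of what the list form amounts to under the hood, using a dummy module:

import torch
import torch.nn as nn

# Hedged illustration with a stand-in model; requires at least two GPUs.
model = nn.Linear(768, 2)
if torch.cuda.device_count() >= 2:
    device_ids = [0, 1]
    # DataParallel replicates the module and splits each batch across devices.
    model = nn.DataParallel(model.cuda(device_ids[0]), device_ids=device_ids)
    out = model(torch.randn(8, 768).cuda(device_ids[0]))
    print(out.shape)  # torch.Size([8, 2])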
