Comments (3)
I noticed my torch and transformers versions were wrong, but after changing them to torch==1.4.0 and transformers==3.0.2 there is a new error, and training does not succeed:
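(Not from the thread: those two pins can be kept in a requirements file so the environment is reproducible; the filename `requirements.txt` is just the usual convention.)

```
torch==1.4.0
transformers==3.0.2
```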
03/16/2022 22:38:05 - INFO - transformers.modeling_utils - loading weights file https://cdn.huggingface.co/bert-base-cased-pytorch_model.bin from cache at /root/.cache/torch/transformers/d8f11f061e407be64c4d5d7867ee61d1465263e24085cfa26abf183fdc830569.3fadbea36527ae472139fe84cddaa65454d7429f12d543d80bfc3ad70de55ac2
03/16/2022 22:38:08 - WARNING - transformers.modeling_utils - Some weights of the model checkpoint at bert-base-cased were not used when initializing BertForMaskedLM: ['cls.seq_relationship.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
03/16/2022 22:38:08 - WARNING - transformers.modeling_utils - Some weights of BertForMaskedLM were not initialized from the model checkpoint at bert-base-cased and are newly initialized: ['cls.predictions.decoder.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
03/16/2022 22:38:08 - INFO - models - Vocab size: 28996
03/16/2022 22:38:08 - ERROR - transformers.tokenization_utils_base - Using eos_token, but it is not set yet.
03/16/2022 22:38:08 - INFO - main - Original vocab size: 28996
03/16/2022 22:38:08 - INFO - transformers.tokenization_utils - Adding [V1] to the vocabulary
03/16/2022 22:38:08 - INFO - transformers.tokenization_utils - Adding [V2] to the vocabulary
03/16/2022 22:38:08 - INFO - transformers.tokenization_utils - Adding [V3] to the vocabulary
03/16/2022 22:38:08 - INFO - transformers.tokenization_utils - Adding [V4] to the vocabulary
03/16/2022 22:38:08 - INFO - transformers.tokenization_utils - Adding [V5] to the vocabulary
03/16/2022 22:38:08 - INFO - transformers.tokenization_utils - Adding [V6] to the vocabulary
03/16/2022 22:38:08 - INFO - transformers.tokenization_utils - Adding [V7] to the vocabulary
03/16/2022 22:38:08 - INFO - transformers.tokenization_utils - Adding [V8] to the vocabulary
03/16/2022 22:38:08 - INFO - transformers.tokenization_utils - Adding [V9] to the vocabulary
03/16/2022 22:38:08 - INFO - transformers.tokenization_utils - Adding [V10] to the vocabulary
03/16/2022 22:38:09 - INFO - main - # vocab after adding new tokens: 29006
03/16/2022 22:38:09 - INFO - main - Common vocab: common_vocabs/common_vocab_cased.txt, size: 21018
03/16/2022 22:38:10 - INFO - main - Tie embeddings of tokens: ([V1], is)
03/16/2022 22:38:10 - INFO - main - Tie embeddings of tokens: ([V2], a)
03/16/2022 22:38:10 - INFO - main - Tie embeddings of tokens: ([V3], by)
03/16/2022 22:38:10 - INFO - main - Tie embeddings of tokens: ([V4], profession)
03/16/2022 22:38:10 - INFO - main - Tie embeddings of tokens: ([V5], .)
03/16/2022 22:38:10 - INFO - main - Template: [X] [V1] [V2] [Y] [V3] [V4] [V5]
03/16/2022 22:38:10 - INFO - main - Train batches: 7
03/16/2022 22:38:10 - INFO - main - Valid batches: 4
  0%|          | 0/4 [00:00<?, ?it/s]
03/16/2022 22:38:10 - INFO - models - Moving model to CUDA
/home/XuYifan/anaconda3/envs/opti/lib/python3.7/site-packages/torch/nn/parallel/_functions.py:61: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
warnings.warn('Was asked to gather along dimension 0, but all '
0%| | 0/4 [00:19<?, ?it/s]
Traceback (most recent call last):
  File "code/run_optiprompt.py", line 174, in <module>
    best_result, result_rel = evaluate(model, valid_samples_batches, valid_sentences_batches, filter_indices, index_list)
  File "/home/XuYifan/optiprompt/OptiPrompt/code/utils.py", line 133, in evaluate
    eval_loss += loss.item() * tot_b
ValueError: only one element tensors can be converted to Python scalars
03/16/2022 22:38:31 - INFO - main - Namespace(check_step=-1, common_vocab_filename='common_vocabs/common_vocab_cased.txt', dev_data='data/autoprompt_data/P108/dev.jsonl', do_eval=True, do_shuffle=False, do_train=True, eval_batch_size=8, eval_per_epoch=3, init_manual_template=True, k=5, learning_rate=0.003, model_dir=None, model_name='bert-base-cased', num_epoch=10, num_vectors=5, output_dir='optiprompt-outputs/P108', output_predictions=True, random_init='none', relation='P108', relation_profile='relation_metainfo/LAMA_relations.jsonl', seed=6, test_data='data/LAMA-TREx/P108.jsonl', train_batch_size=16, train_data='data/autoprompt_data/P108/train.jsonl', warmup_proportion=0.1)
03/16/2022 22:38:31 - INFO - main - # GPUs: 8
03/16/2022 22:38:31 - INFO - main - Model: bert-base-cased
03/16/2022 22:38:33 - INFO - transformers.configuration_utils - loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-config.json from cache at /root/.cache/torch/transformers/b945b69218e98b3e2c95acf911789741307dec43c698d35fad11c1ae28bda352.9da767be51e1327499df13488672789394e2ca38b877837e52618a67d7002391
03/16/2022 22:38:33 - INFO - transformers.configuration_utils - Model config BertConfig {
"architectures": [
"BertForMaskedLM"
],
"attention_probs_dropout_prob": 0.1,
"gradient_checkpointing": false,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"layer_norm_eps": 1e-12,
"max_position_embeddings": 512,
"model_type": "bert",
"num_attention_heads": 12,
"num_hidden_layers": 12,
"pad_token_id": 0,
"type_vocab_size": 2,
"vocab_size": 28996
}
from optiprompt.
Hi, could you try to run with only one GPU? You can do that by setting CUDA_VISIBLE_DEVICES=0.
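Concretely, that looks like the following; the launch command itself is an assumption (script path and flags taken from the Namespace log above), so it is shown commented out:

```shell
# Make only the first GPU visible, so nn.DataParallel is not engaged:
export CUDA_VISIBLE_DEVICES=0
# Then relaunch, e.g. (flags as in the Namespace log above):
# python code/run_optiprompt.py --relation P108 --model_name bert-base-cased --do_train --do_eval
```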
I ran it on a single GPU and found that the error is at code/utils.py, line 133. I changed
eval_loss += loss.item() * tot_b
to
eval_loss += loss.mean().item() * tot_b
and then it succeeded.
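For context (not stated in the thread): with torch.nn.DataParallel across several GPUs, the per-GPU losses are gathered into a vector (the UserWarning in the log mentions exactly this unsqueeze-and-gather), and `Tensor.item()` only accepts a single-element tensor, so line 133 raises; reducing with `.mean()` first collapses the vector to a scalar. A dependency-free sketch of that contract, where `ToyTensor` is a hypothetical stand-in for the small part of `torch.Tensor` involved, not real PyTorch:

```python
class ToyTensor:
    """Hypothetical stand-in for the slice of torch.Tensor behavior used here."""

    def __init__(self, values):
        self.values = list(values)

    def item(self):
        # Like torch.Tensor.item(): only single-element tensors convert to scalars.
        if len(self.values) != 1:
            raise ValueError(
                "only one element tensors can be converted to Python scalars")
        return float(self.values[0])

    def mean(self):
        # Like loss.mean(): reduce to a single-element tensor.
        return ToyTensor([sum(self.values) / len(self.values)])


# DataParallel gathers one loss per GPU, so `loss` arrives as a vector:
loss = ToyTensor([0.9, 1.1])

try:
    loss.item()  # reproduces the ValueError from the traceback
except ValueError as exc:
    print(exc)

fixed = loss.mean().item()  # the one-line fix from the comment above
print(fixed)
```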