
Comments (3)

xuyifan-0731 commented on June 27, 2024

I noticed my torch and transformers versions were wrong. But after I changed them to torch==1.4.0 and transformers==3.0.2, there is a new error and training does not succeed:

03/16/2022 22:38:05 - INFO - transformers.modeling_utils - loading weights file https://cdn.huggingface.co/bert-base-cased-pytorch_model.bin from cache at /root/.cache/torch/transformers/d8f11f061e407be64c4d5d7867ee61d1465263e24085cfa26abf183fdc830569.3fadbea36527ae472139fe84cddaa65454d7429f12d543d80bfc3ad70de55ac2
03/16/2022 22:38:08 - WARNING - transformers.modeling_utils - Some weights of the model checkpoint at bert-base-cased were not used when initializing BertForMaskedLM: ['cls.seq_relationship.weight', 'cls.seq_relationship.bias']

  • This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
  • This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
    03/16/2022 22:38:08 - WARNING - transformers.modeling_utils - Some weights of BertForMaskedLM were not initialized from the model checkpoint at bert-base-cased and are newly initialized: ['cls.predictions.decoder.bias']
    You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
    03/16/2022 22:38:08 - INFO - models - Vocab size: 28996
    03/16/2022 22:38:08 - ERROR - transformers.tokenization_utils_base - Using eos_token, but it is not set yet.
    03/16/2022 22:38:08 - INFO - main - Original vocab size: 28996
    03/16/2022 22:38:08 - INFO - transformers.tokenization_utils - Adding [V1] to the vocabulary
    03/16/2022 22:38:08 - INFO - transformers.tokenization_utils - Adding [V2] to the vocabulary
    03/16/2022 22:38:08 - INFO - transformers.tokenization_utils - Adding [V3] to the vocabulary
    03/16/2022 22:38:08 - INFO - transformers.tokenization_utils - Adding [V4] to the vocabulary
    03/16/2022 22:38:08 - INFO - transformers.tokenization_utils - Adding [V5] to the vocabulary
    03/16/2022 22:38:08 - INFO - transformers.tokenization_utils - Adding [V6] to the vocabulary
    03/16/2022 22:38:08 - INFO - transformers.tokenization_utils - Adding [V7] to the vocabulary
    03/16/2022 22:38:08 - INFO - transformers.tokenization_utils - Adding [V8] to the vocabulary
    03/16/2022 22:38:08 - INFO - transformers.tokenization_utils - Adding [V9] to the vocabulary
    03/16/2022 22:38:08 - INFO - transformers.tokenization_utils - Adding [V10] to the vocabulary
    03/16/2022 22:38:09 - INFO - main - # vocab after adding new tokens: 29006
    03/16/2022 22:38:09 - INFO - main - Common vocab: common_vocabs/common_vocab_cased.txt, size: 21018
    03/16/2022 22:38:10 - INFO - main - Tie embeddings of tokens: ([V1], is)
    03/16/2022 22:38:10 - INFO - main - Tie embeddings of tokens: ([V2], a)
    03/16/2022 22:38:10 - INFO - main - Tie embeddings of tokens: ([V3], by)
    03/16/2022 22:38:10 - INFO - main - Tie embeddings of tokens: ([V4], profession)
    03/16/2022 22:38:10 - INFO - main - Tie embeddings of tokens: ([V5], .)
    03/16/2022 22:38:10 - INFO - main - Template: [X] [V1] [V2] [Y] [V3] [V4] [V5]
    03/16/2022 22:38:10 - INFO - main - Train batches: 7
    03/16/2022 22:38:10 - INFO - main - Valid batches: 4
    0%| | 0/4 [00:00<?, ?it/s]03/16/2022 22:38:10 - INFO - models - Moving model to CUDA
    /home/XuYifan/anaconda3/envs/opti/lib/python3.7/site-packages/torch/nn/parallel/_functions.py:61: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
    warnings.warn('Was asked to gather along dimension 0, but all '
    0%| | 0/4 [00:19<?, ?it/s]
    Traceback (most recent call last):
    File "code/run_optiprompt.py", line 174, in
    best_result, result_rel = evaluate(model, valid_samples_batches, valid_sentences_batches, filter_indices, index_list)
    File "/home/XuYifan/optiprompt/OptiPrompt/code/utils.py", line 133, in evaluate
    eval_loss += loss.item() * tot_b
    ValueError: only one element tensors can be converted to Python scalars

    03/16/2022 22:38:31 - INFO - main - Namespace(check_step=-1, common_vocab_filename='common_vocabs/common_vocab_cased.txt', dev_data='data/autoprompt_data/P108/dev.jsonl', do_eval=True, do_shuffle=False, do_train=True, eval_batch_size=8, eval_per_epoch=3, init_manual_template=True, k=5, learning_rate=0.003, model_dir=None, model_name='bert-base-cased', num_epoch=10, num_vectors=5, output_dir='optiprompt-outputs/P108', output_predictions=True, random_init='none', relation='P108', relation_profile='relation_metainfo/LAMA_relations.jsonl', seed=6, test_data='data/LAMA-TREx/P108.jsonl', train_batch_size=16, train_data='data/autoprompt_data/P108/train.jsonl', warmup_proportion=0.1)
    03/16/2022 22:38:31 - INFO - main - # GPUs: 8
    03/16/2022 22:38:31 - INFO - main - Model: bert-base-cased
    03/16/2022 22:38:33 - INFO - transformers.configuration_utils - loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-config.json from cache at /root/.cache/torch/transformers/b945b69218e98b3e2c95acf911789741307dec43c698d35fad11c1ae28bda352.9da767be51e1327499df13488672789394e2ca38b877837e52618a67d7002391
    03/16/2022 22:38:33 - INFO - transformers.configuration_utils - Model config BertConfig {
    "architectures": [
    "BertForMaskedLM"
    ],
    "attention_probs_dropout_prob": 0.1,
    "gradient_checkpointing": false,
    "hidden_act": "gelu",
    "hidden_dropout_prob": 0.1,
    "hidden_size": 768,
    "initializer_range": 0.02,
    "intermediate_size": 3072,
    "layer_norm_eps": 1e-12,
    "max_position_embeddings": 512,
    "model_type": "bert",
    "num_attention_heads": 12,
    "num_hidden_layers": 12,
    "pad_token_id": 0,
    "type_vocab_size": 2,
    "vocab_size": 28996
    }


a3616001 commented on June 27, 2024

Hi, could you try to run with only one GPU? You can do that by setting CUDA_VISIBLE_DEVICES=0.
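
For example, prefix your existing launch command with it (the script path below is taken from the traceback you posted; keep your other arguments unchanged):

CUDA_VISIBLE_DEVICES=0 python code/run_optiprompt.py [your existing arguments]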


xuyifan-0731 commented on June 27, 2024

I ran it on a single GPU and found that there is an error at code/utils.py line 133:
I changed
eval_loss += loss.item() * tot_b
to
eval_loss += loss.mean().item() * tot_b
and then it succeeded.
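
For context, a minimal sketch of why the one-line change appears to work (hypothetical tensors, not the OptiPrompt code itself): under torch.nn.DataParallel the per-GPU losses are gathered into a 1-D tensor, which matches the UserWarning about gathering scalars in the log above, so loss.item() raises the ValueError, while loss.mean() first reduces the tensor to a scalar and leaves a 0-dim single-GPU loss unchanged.

# Hypothetical illustration of the failure mode and the one-line fix (not the OptiPrompt code itself)
import torch

single_gpu_loss = torch.tensor(0.42)          # 0-dim loss from a single GPU
multi_gpu_loss = torch.tensor([0.40, 0.44])   # 1-D vector of losses gathered from 2 GPUs by nn.DataParallel

# multi_gpu_loss.item() raises:
#   ValueError: only one element tensors can be converted to Python scalars

tot_b = 8  # placeholder for the batch-size factor used in utils.py
eval_loss = 0.0
for loss in (single_gpu_loss, multi_gpu_loss):
    eval_loss += loss.mean().item() * tot_b   # .mean() leaves the 0-dim loss unchanged and averages the 1-D one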

