
disentangled-retriever's Issues

Design choice w.r.t. the DAM MLM training

Hi Jingtao:

I wonder what max_seq_len hyperparameter you use when doing MLM on the target-domain dataset?

In the Hugging Face example they use block_size=128; I am just curious why you did not use their method directly.

I am also interested in the effect of leaving out [CLS] and [SEP] when preparing the MLM dataloader. Does it make a big difference if you do not remove these two tokens? A sketch of the preparation I have in mind follows.
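
To make the question concrete, here is a minimal sketch following the group_texts recipe from the Hugging Face run_mlm.py example (the checkpoint name and the block size are placeholders, not values taken from your setup):

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # placeholder checkpoint

    def make_mlm_blocks(texts, block_size=128, keep_special_tokens=True):
        # add_special_tokens controls whether [CLS]/[SEP] enter the token stream
        ids = tokenizer(texts, add_special_tokens=keep_special_tokens)["input_ids"]
        # concatenate all sequences and chunk into fixed-size blocks, as in the
        # run_mlm.py group_texts function; the incomplete tail block is dropped
        flat = [tok for seq in ids for tok in seq]
        usable = (len(flat) // block_size) * block_size
        return [flat[i:i + block_size] for i in range(0, usable, block_size)]

With keep_special_tokens=False the blocks contain only content tokens, which is the variant I am asking about.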

run_contrast.py: AttributeError: 'BertModel' object has no attribute 'add_adapter'

Hi,

I was trying to train my own REM by following the instructions.

output_dir="./data/dense-mlm/english-marco/train_rem/rem-with-hf-dam/contrast"

python -m torch.distributed.launch --nproc_per_node 4 \
    -m disentangled_retriever.dense.finetune.run_contrast \
    --lora_rank 192 --parallel_reduction_factor 4 --new_adapter_name msmarco \
    --pooling average \
    --similarity_metric ip \
    --qrel_path ./data/datasets/msmarco-passage/qrels.train \
    --query_path ./data/datasets/msmarco-passage/query.train \
    --corpus_path ./data/datasets/msmarco-passage/corpus.tsv \
    --negative ./data/datasets/msmarco-passage/msmarco-hard-negatives.tsv \
    --output_dir $output_dir \
    --model_name_or_path jingtao/DAM-bert_base-mlm-msmarco \
    --logging_steps 100 \
    --max_query_len 24 \
    --max_doc_len 128 \
    --per_device_train_batch_size 32 \
    --inv_temperature 1 \
    --gradient_accumulation_steps 1 \
    --fp16 \
    --neg_per_query 3 \
    --learning_rate 2e-5 \
    --num_train_epochs 5 \
    --dataloader_drop_last \
    --overwrite_output_dir \
    --dataloader_num_workers 0 \
    --weight_decay 0 \
    --lr_scheduler_type "constant" \
    --save_strategy "epoch" \
    --optim adamw_torch

However, I then get AttributeError: 'BertModel' object has no attribute 'add_adapter'. The failing method, from transformers/adapters/model_mixin.py in the traceback, is:

    def add_adapter(self, adapter_name: str, config=None, overwrite_ok: bool = False, set_active: bool = False):
        """
        Adds a new adapter module of the specified type to the model.

        Args:
            adapter_name (str): The name of the adapter module to be added.
            config (str or dict, optional): The adapter configuration, can be either:

                - the string identifier of a pre-defined configuration dictionary
                - a configuration dictionary specifying the full config
                - if not given, the default configuration for this adapter type will be used
            overwrite_ok (bool, optional):
                Overwrite an adapter with the same name if it exists. By default (False), an exception is thrown.
            set_active (bool, optional):
                Set the adapter to be the active one. By default (False), the adapter is added but not activated.

        If self.base_model is self, must inherit from a class that implements this method, to preclude infinite
        recursion
        """
        if self.base_model is self:
            super().add_adapter(adapter_name, config, overwrite_ok=overwrite_ok, set_active=set_active)
        else:
            # the AttributeError is raised here: self.base_model resolves to a
            # stock transformers BertModel without the adapter mixin, so
            # nn.Module.__getattr__ fails to find add_adapter
            self.base_model.add_adapter(adapter_name, config, overwrite_ok=overwrite_ok, set_active=set_active)

Error Stack

[WARNING|modeling_utils.py:3180] 2023-11-04 16:40:20,978 >> Some weights of the model checkpoint at jingtao/DAM-bert_base-mlm-msmarco were not used when initializing BertDense: ['cls.predictions.transform.LayerNorm.weight', 'cls.predictions.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.bias', 'cls.predictions.transform.dense.weight']
- This IS expected if you are initializing BertDense from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertDense from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
[WARNING|modeling_utils.py:3192] 2023-11-04 16:40:20,978 >> Some weights of BertDense were not initialized from the model checkpoint at jingtao/DAM-bert_base-mlm-msmarco and are newly initialized: ['bert.pooler.dense.weight', 'bert.pooler.dense.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[INFO|modeling_utils.py:2839] 2023-11-04 16:40:21,154 >> Generation config file not found, using a generation config created from the model config.
11/04/2023 16:40:21-INFO-adapter_arg- Add a lora adapter and only train the adapter
11/04/2023 16:40:21-INFO-adapter_arg- Add a parallel adapter and only train the adapter
Traceback (most recent call last):
  File "C:\Users\ymurong\Documents\Github\Domain-Adapation-French-Legal-Retrieval\scripts\disentangled-retriever\run_contrast.py", line 203, in <module>
    main()
  File "C:\Users\ymurong\Documents\Github\Domain-Adapation-French-Legal-Retrieval\scripts\disentangled-retriever\run_contrast.py", line 145, in main
    model.add_adapter(model_args.new_adapter_name, config=adapter_config)
  File "C:\Users\ymurong\Documents\Github\Domain-Adapation-French-Legal-Retrieval\venv\lib\site-packages\transformers\adapters\model_mixin.py", line 1077, in add_adapter
    self.base_model.add_adapter(adapter_name, config, overwrite_ok=overwrite_ok, set_active=set_active)
  File "C:\Users\ymurong\Documents\Github\Domain-Adapation-French-Legal-Retrieval\venv\lib\site-packages\torch\nn\modules\module.py", line 1269, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'BertModel' object has no attribute 'add_adapter'

Is there anything I could be doing wrong?

One small modification I made was to change the import statement, since there is no BertAdapterModel in transformers in my case. Maybe that could be the reason? I am currently using transformers==4.33.3 with adapter-transformers==3.2.1, on Python 3.10.

from transformers import BertAdapterModel

to

from transformers.adapters import BertAdapterModel
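
For reference, here is the quick environment check I would run (a sketch; my understanding, which may be wrong, is that adapter-transformers installs itself into the same transformers namespace, so the two pinned versions may be clobbering each other):

    import importlib.metadata as md
    import transformers

    # Both distributions ship a "transformers" package, so whichever was
    # installed last wins. Stock transformers' BertModel has no add_adapter,
    # which would explain the AttributeError above.
    for dist in ("transformers", "adapter-transformers"):
        try:
            print(dist, md.version(dist))
        except md.PackageNotFoundError:
            print(dist, "not installed")
    print("add_adapter on BertModel:", hasattr(transformers.BertModel, "add_adapter"))

If the last line prints False, reinstalling only the adapter-transformers version pinned by this repository might restore the method.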

Thank you for your help!

Error when running the official example

When running the official example, the model.merge_lora step raises an error saying this attribute does not exist. Has the relevant package version been updated?
AttributeError: 'BertDense' object has no attribute 'merge_lora'
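
If this is the same kind of dependency mismatch as the add_adapter issue above, the attribute would only exist when the adapters fork the repository pins is the active transformers installation. A minimal diagnostic sketch, reusing the checkpoint name from the earlier error log:

    from transformers import AutoModel

    # list the LoRA/adapter-related attributes actually present on the loaded
    # class; an empty or short list suggests the stock library is active
    model = AutoModel.from_pretrained("jingtao/DAM-bert_base-mlm-msmarco")
    print(type(model).__name__)
    print(sorted(name for name in dir(model)
                 if "lora" in name.lower() or "adapter" in name.lower()))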
