
Comments (11)

sushantakpani avatar sushantakpani commented on September 28, 2024 1

In Experiment Section of the paper:

Note that BERT and SpanBERT completely rely on only local decisions without any HOI. Particularly, +AA is equivalent to Joshi et al. (2020).

Please let me know what the configuration should be to replicate the Joshi et al. (2020) work.

Is this configuration fine:
higher_order = attended_antecedent

train_spanbert_base_ml0_d1 = ${train_spanbert_base}{
  mention_loss_coef = 0
  coref_depth = 2
}
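Written as a single override (a sketch only, assuming higher_order can be set alongside the other keys in the same block, following the config style used elsewhere in this thread):

train_spanbert_base_ml0_d1 = ${train_spanbert_base}{
  mention_loss_coef = 0
  coref_depth = 2
  higher_order = attended_antecedent  # assumption: merged here rather than set globally
}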


L-hongbin avatar L-hongbin commented on September 28, 2024 1

In Experiment Section of the paper:

Note that BERT and SpanBERT completely rely on only local decisions without any HOI. Particularly, +AA is equivalent to Joshi et al. (2020).

Please let me know what the configuration should be to replicate the Joshi et al. (2020) work.

Is this configuration fine:
higher_order = attended_antecedent

train_spanbert_base_ml0_d1 = ${train_spanbert_base}{
  mention_loss_coef = 0
  coref_depth = 2
}

Hey, how about your training result for bert_base? I have trained the model on bert_base with c2f, but I only get about 67 F1, while the TensorFlow version gets about 73 F1.


yangjingyi avatar yangjingyi commented on September 28, 2024 1

In Experiment Section of the paper:

Note that BERT and SpanBERT completely rely on only local decisions without any HOI. Particularly, +AA is equivalent to Joshi et al. (2020).

Please let me know what the configuration should be to replicate the Joshi et al. (2020) work.

Is this configuration fine:
higher_order = attended_antecedent

train_spanbert_base_ml0_d1 = ${train_spanbert_base}{
  mention_loss_coef = 0
  coref_depth = 2
}

Hi,
Have you replicated the Joshi et al. SpanBERT-large results?


sushantakpani avatar sushantakpani commented on September 28, 2024

python run.py train_bert_base_ml0_d1 0 also gave the same result

whereas

python run.py train_spanbert_base_ml0_d1 0 progressed through training but halted due to a CUDA out-of-memory issue:
RuntimeError: CUDA out of memory. Tried to allocate 630.00 MiB (GPU 5; 10.76 GiB total capacity; 8.44 GiB already allocated; 315.12 MiB free; 9.60 GiB reserved in total by PyTorch)


AradAshrafi avatar AradAshrafi commented on September 28, 2024

python run.py train_bert_base_ml0_d1 0 also gave the same result

whereas

python run.py train_spanbert_base_ml0_d1 0 progressed through training but halted due to a CUDA out-of-memory issue:
RuntimeError: CUDA out of memory. Tried to allocate 630.00 MiB (GPU 5; 10.76 GiB total capacity; 8.44 GiB already allocated; 315.12 MiB free; 9.60 GiB reserved in total by PyTorch)

Hi,

I solve these kinds of issues by changing some parameters in the experiments.conf file to decrease the size of the model. For example, you can decrease ffnn_size and max_segment_len.
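As a rough sketch of such overrides, using the reduced values that a later comment in this thread reports working (the exact numbers are illustrative, not the only valid choice):

max_segment_len = 128     # reduced from 384
ffnn_size = 1000          # reduced from 3000
cluster_ffnn_size = 1000  # reduced from 3000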


sushantakpani avatar sushantakpani commented on September 28, 2024

python run.py train_bert_base_ml0_d1 0 also gave the same result
whereas
python run.py train_spanbert_base_ml0_d1 0 progressed through training but halted due to a CUDA out-of-memory issue:
RuntimeError: CUDA out of memory. Tried to allocate 630.00 MiB (GPU 5; 10.76 GiB total capacity; 8.44 GiB already allocated; 315.12 MiB free; 9.60 GiB reserved in total by PyTorch)

Hi,

I solve these kinds of issues by changing some parameters in the experiments.conf file to decrease the size of the model. For example, you can decrease ffnn_size and max_segment_len.

Hi @AradAshrafi,
Thanks for your response and the tips to solve the CUDA memory issue. I will try that.
Are you able to train bert_base like spanbert_base?

Sushanta


sushantakpani avatar sushantakpani commented on September 28, 2024

python run.py train_bert_base_ml0_d1 0 also gave the same result
whereas
python run.py train_spanbert_base_ml0_d1 0 progressed through training but halted due to a CUDA out-of-memory issue:
RuntimeError: CUDA out of memory. Tried to allocate 630.00 MiB (GPU 5; 10.76 GiB total capacity; 8.44 GiB already allocated; 315.12 MiB free; 9.60 GiB reserved in total by PyTorch)

Hi,
I solve these kinds of issues by changing some parameters in the experiments.conf file to decrease the size of the model. For example, you can decrease ffnn_size and max_segment_len.

Hi @AradAshrafi,
Thanks for your response and the tips to solve the CUDA memory issue. I will try that.
Are you able to train bert_base like spanbert_base?

Sushanta

It seems to be working now.
I tried python run.py train_spanbert_base_ml0_d1 0 with the following values in experiments.conf for spanbert_base.

spanbert_base = ${best}{
  num_docs = 2802
  bert_learning_rate = 2e-05
  task_learning_rate = 0.0001
  max_segment_len = 128  # was 384
  ffnn_size = 1000  # was 3000
  cluster_ffnn_size = 1000  # was 3000
  max_training_sentences = 3
  bert_tokenizer_name = bert-base-cased
  bert_pretrained_name_or_path = ${best.data_dir}/spanbert_base
}


sushantakpani avatar sushantakpani commented on September 28, 2024

@lxucs
Please let me know how I can train bert_base.


lxucs avatar lxucs commented on September 28, 2024

Hi @sushantakpani, you can have a config like this (similar to training spanbert_base):

train_bert_base_ml0_d1 = ${train_bert_base}{
  mention_loss_coef = 0
  coref_depth = 1
}
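For reference, training with this config is then launched the same way as the other runs in this thread:

python run.py train_bert_base_ml0_d1 0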


sushantakpani avatar sushantakpani commented on September 28, 2024

Hi @lxucs

It seems this error was due to a GPU memory issue. I have shifted to a higher-memory GPU server and am able to run the training.
python run.py train_bert_base_ml0_d1 0

My configuration is as follows:

bert_base = ${best}{
  num_docs = 2802
  bert_learning_rate = 1e-05
  task_learning_rate = 2e-4
  max_segment_len = 128
  ffnn_size = 1000  # was 3000
  cluster_ffnn_size = 1000  # was 3000
  max_training_sentences = 11
  bert_tokenizer_name = bert-base-cased
  bert_pretrained_name_or_path = bert-base-cased
}

train_bert_base = ${bert_base}{
}

train_bert_base_ml0_d1 = ${train_bert_base}{
  mention_loss_coef = 0
  coref_depth = 1
}

Hi @sushantakpani, you can have a config like this (similar to training spanbert_base):

train_bert_base_ml0_d1 = ${train_bert_base}{
  mention_loss_coef = 0
  coref_depth = 1
}


sushantakpani avatar sushantakpani commented on September 28, 2024

In Experiment Section of the paper:
Note that BERT and SpanBERT completely rely on only local decisions without any HOI. Particularly, +AA is equivalent to Joshi et al. (2020).
Please let me know what the configuration should be to replicate the Joshi et al. (2020) work.
Is this configuration fine:
higher_order = attended_antecedent

train_spanbert_base_ml0_d1 = ${train_spanbert_base}{
  mention_loss_coef = 0
  coref_depth = 2
}

Hey, how about your training result for bert_base? I have trained the model on bert_base with c2f, but I only get about 67 F1, while the TensorFlow version gets about 73 F1.

For BERT-base, I could achieve 73.3 F1.

