Comments (4)
Hi WQi777,
Thanks for your interest. We just added a line to https://github.com/microsoft/KEAR/blob/main/bash/task_train.sh that reproduces the DeBERTa v3 results.
Hope that helps!
from kear.
Thanks a lot for your reply!
But when I run the code, I get an error:
batch size: 4, total_batch_size: 20
[1528]: world_size = 2, rank = 1, backend=nccl
batch size: 4, total_batch_size: 20
restarting from checkpoint.
used_name: last2
restarting from checkpoint.
used_name: last2
loading result from dir test/last2
args.fp16 is 0
loading result from dir test/last2
args.fp16 is 0
load_vocab microsoft/deberta-v3-large
load_vocab microsoft/deberta-v3-large
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
load_data data/csqa_ret_3datasets/train_data.json
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
load_data data/csqa_ret_3datasets/train_data.json
data: 9741, world_size: 2
load_data data/csqa_ret_3datasets/dev_data.json
data: 1222, world_size: 2
get dir test/
make dataloader ...
data: 9741, world_size: 2
load_data data/csqa_ret_3datasets/dev_data.json
data: 1222, world_size: 2
get dir test/
make dataloader ...
max len: 968
95 percent len: 490
train_data 9741
total length: 1218
max len: 968
95 percent len: 490
train_data 9741
total length: 1218
max len: 851
95 percent len: 514
devlp_data 1222
init_model test/last2
set config, model_type= debertav2
deepspeed: True
resume_training: True
config_path:test/last2
model_type= debertav2
Traceback (most recent call last):
  File "task.py", line 409, in <module>
    srt.init(Model)
  File "task.py", line 46, in init
    model = ModelClass(lm_config, opt=vars(self.config))
  File "/home/aipf/work/wq/KEAR/model/model.py", line 51, in __init__
    self.deberta = MyDebertaV2Model(config)
NameError: name 'MyDebertaV2Model' is not defined
Looking forward to your reply.
Hi WQi777,
We had a typo in our code. Can you try again?
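For anyone who hits a similar NameError: the thread does not say what the typo actually was, so the following is only a minimal sketch of how this kind of error can arise, not KEAR's real code. It assumes a class was defined under a misspelled name (the hypothetical `MyDebertaV2Mdoel`) while the constructor refers to the intended name. Python resolves the name only when the line runs, so the module loads fine and the failure shows up at model construction, as in the traceback above.

```python
# Hypothetical stand-ins that only mirror the names in the traceback;
# this is not the actual KEAR source.
class MyDebertaV2Mdoel:  # deliberately misspelled class name (the assumed typo)
    def __init__(self, config):
        self.config = config


class Model:
    def __init__(self, config):
        # The name below is looked up when this line executes, not when the
        # class is defined, so the error surfaces only on instantiation.
        self.deberta = MyDebertaV2Model(config)


try:
    Model(config={})
    error_message = None
except NameError as exc:
    error_message = str(exc)

print(error_message)
```

If the name in the error message differs from the one defined or imported in the module, aligning the two spellings should resolve it.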