luyug / gc-dpr
Train Dense Passage Retriever (DPR) with a single GPU
License: Other
Is 8 just a parameter, or does it have some exact meaning, e.g. 8 GPUs?
Thanks for posting a really nice repo!
While studying the code, I found the following in `train_dense_encoder.py` at lines 669 and 691:
```python
surrogate = surrogate * (trainer.distributed_factor / 8.)
```
I don't fully understand the reason for the multiplication. Could you explain it? Thank you 👍
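For reference, here is a minimal sketch of one plausible reading of that line, assuming `trainer.distributed_factor` is the world size and that 8 corresponds to the 8-GPU setup of the original DPR recipe (an assumption, not a confirmed explanation):

```python
# Sketch of one plausible reading (assumption, not the author's confirmed intent):
# if trainer.distributed_factor is the number of GPUs (world size), the surrogate
# loss is rescaled relative to an 8-GPU baseline, i.e. the factor is 1.0 on 8 GPUs
# and 1/8 on a single GPU.
def scale_surrogate(surrogate: float, world_size: int, baseline_gpus: float = 8.0) -> float:
    return surrogate * (world_size / baseline_gpus)

print(scale_surrogate(1.0, 1))  # 0.125  (single-GPU run)
print(scale_surrogate(1.0, 8))  # 1.0    (matches an 8-GPU run)
```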
Hi,
I found a weird thing: when using a multilingual BERT, e.g. bert-base-multilingual-uncased, it seems that grad_cache doesn't work. I know it sounds strange, since changing BERT models shouldn't affect it, but I tried normal BERT, German BERT, and mBERT, and only the latter needs a very small batch_size (like 4) to run successfully. Other models like German BERT run with batch_size=128 without problems. Do you happen to know the reason for this? Btw, great paper and code, extremely helpful! Thanks in advance!
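One thing that may be worth checking (a diagnostic sketch only, not a confirmed cause): multilingual BERT has a much larger vocabulary than the monolingual BERTs, so its embedding matrix is substantially bigger. The snippet below compares the configured vocabulary sizes; `bert-base-german-cased` is used here as a stand-in for the German model mentioned above.

```python
# Diagnostic sketch only (not a confirmed cause): compare the configured vocabulary
# sizes, since the embedding matrix grows linearly with vocab_size.
from transformers import AutoConfig

for name in ["bert-base-uncased",
             "bert-base-german-cased",          # stand-in for the German BERT above
             "bert-base-multilingual-uncased"]:
    cfg = AutoConfig.from_pretrained(name)
    emb_params = cfg.vocab_size * cfg.hidden_size
    print(f"{name}: vocab_size={cfg.vocab_size}, "
          f"embedding parameters ≈ {emb_params / 1e6:.1f}M")
```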
Hello.
Thank you for your great work!
I have a few questions regarding CoCondenser fine tuning.
First,
for the first-stage and second-stage fine-tuning on the MS MARCO dataset, could you share the number of negative samples and hard negative samples used?
Second,
what are the criteria for selecting hard negative samples? For example, if the positive document is ranked 5th, are the documents ranked 1st to 4th taken as hard negatives?
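For illustration, here is a toy sketch of the selection rule described in the example above (one common convention for hard-negative mining; whether CoCondenser fine-tuning used exactly this rule is the question being asked):

```python
# Toy sketch of the rule described above (an illustration of one common convention,
# not necessarily what was used for CoCondenser fine-tuning).
def pick_hard_negatives(ranked_passage_ids, positive_id, k):
    """ranked_passage_ids: passage ids sorted by retriever score, best first."""
    return [pid for pid in ranked_passage_ids if pid != positive_id][:k]

# If the positive passage is ranked 5th, the passages ranked 1st-4th become hard negatives.
ranked = ["p9", "p3", "p7", "p2", "p_gold", "p5"]
print(pick_hard_negatives(ranked, positive_id="p_gold", k=4))  # ['p9', 'p3', 'p7', 'p2']
```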
I only see one branch in it.
I'm leaving this comment for no particular reason other than that this work is fantastic. It feels like it completely solves the problems of the memory bank and MoCo. Is there similar work in CV? How did this only get submitted to a workshop? It's amazing; I was excited about it all night.
I'm also working on a contrastive learning problem where each input sample is very large, so the number of samples per batch is very small (medical data).
When I saw your paper, I found that your method is exactly what I wanted!!
I'd like to ask: on Windows, after running `pip install .`, I see the following:
```
Processing c:\users\user\dropbox\book\bnl\research\granger\fmri\code\gc-dpr
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [14 lines of output]
      error: Multiple top-level packages discovered in a flat-layout: ['dpr', 'data'].

      To avoid accidental inclusion of unwanted files or directories,
      setuptools will not proceed with this build.

      If you are trying to create a single distribution with multiple packages
      on purpose, you should not rely on automatic discovery.
      Instead, consider the following options:

      1. set up custom discovery (`find` directive with `include` or `exclude`)
      2. use a `src-layout`
      3. explicitly set `py_modules` or `packages` with a list of names

      To find more information, look for "package discovery" on setuptools docs.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
```
Did you often run into this error when setting up the environment? Thanks!
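For what it's worth, here is an untested workaround sketch based on option 3 from the setuptools message, assuming the intent is to install only the `dpr` package and keep the top-level `data/` directory out of the build (an assumption, not an official fix from this repository):

```python
# setup.py -- untested workaround sketch (assumption, not the repository's fix):
# list the packages explicitly so the top-level data/ directory is not picked up.
from setuptools import setup, find_packages

setup(
    name="dpr",
    packages=find_packages(include=["dpr", "dpr.*"]),
)
```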
Will this work on a TPU?
If so, can I train it on a colab notebook?
Hi, thanks for posting a nice repo!
I see that we can train a DPR model with GC-DPR, but I guess we need to train it from scratch by loading base models (bert-base-uncased | roberta-base).
How can we use this repo to fine-tune a pretrained DPR model? For example, we already have the DPR encoder models provided by Facebook:
question_model = "facebook/dpr-question_encoder-single-nq-base"
context_model = "facebook/dpr-ctx_encoder-single-nq-base"
To make these models domain-specific, my idea is to fine-tune them with in-domain data.
It would be helpful if you could let me know how to load the question and context models with the `train_dense_encoder` function.
Any other suggestion would be appreciated.
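As a possible starting point, here is a sketch under the assumption that the Facebook DPR encoders can be exported to local weight files and later converted for the trainer; this is not a documented gc-dpr workflow, only the extraction step:

```python
# Sketch (assumption, not a supported gc-dpr workflow): export the Facebook DPR
# encoders to local weight files as a starting point for further conversion.
import torch
from transformers import DPRContextEncoder, DPRQuestionEncoder

q_enc = DPRQuestionEncoder.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
ctx_enc = DPRContextEncoder.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")

# The encoders are BERT-shaped underneath; save their weights for later loading.
torch.save(q_enc.state_dict(), "dpr_question_encoder.pt")
torch.save(ctx_enc.state_dict(), "dpr_ctx_encoder.pt")
```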
Hi,
Does the current version of encoder training with GC support multiple GPUs?
I tried to run the training with NQ dataset by following the instructions in README.md but on a machine with 2 GPUs.
It seems to run slower than on a single GPU: on a single GPU one step takes about 4 seconds, but with two GPUs one step takes about 24 seconds.
GC-DPR has two steps.
However, during the computation, there might be one issue:
Can your GC technique be used to reduce the memory cost of Reader training? In your example it still seems to use 8×32G GPUs. Thanks.