Comments (8)
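The R@1 I get when reproducing on MSRVTT is only 44.9, and loss_ita stays high throughout training, so I suspect something went wrong at some step. I would appreciate any pointers!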
from aurora.
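If loss_ita does not converge or barely changes, some hyperparameters may need tuning. Below is a training log we recorded earlier, in which it does converge: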
{"train_lr": "0.000", "train_loss_itm": "0.270", "train_loss_ita": "6.418", "epoch": 0}
{"train_lr": "0.000", "train_loss_itm": "0.133", "train_loss_ita": "6.077", "epoch": 1}
{"train_lr": "0.000", "train_loss_itm": "0.061", "train_loss_ita": "5.849", "epoch": 2}
{"train_lr": "0.000", "train_loss_itm": "0.036", "train_loss_ita": "5.684", "epoch": 3}
{"train_lr": "0.000", "train_loss_itm": "0.028", "train_loss_ita": "5.590", "epoch": 4}
{"train_lr": "0.000", "train_loss_itm": "0.026", "train_loss_ita": "5.546", "epoch": 5}
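The performance gap may come from not selecting the correct .pth file. By "base" model we mean the model architecture, as distinct from architectures such as blip_capfilt, but the pretrained weights need to be chosen according to the task. Changing the backbone weights does not affect our comparison against other baseline methods (UniAdapter, LoRA, etc.). The available pretrained checkpoints can be found at: https://storage.googleapis.com/sfr-vision-language-research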
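The convergence claim for the log in this reply can be checked mechanically. The following is a minimal sketch, not part of the Aurora code base: it assumes the log is stored as one JSON object per line with string-valued losses, exactly as quoted above, and the helper names `loss_curve` and `is_decreasing` are made up for illustration.

```python
import json

# The six log records quoted in the reply above, copied verbatim.
LOG_LINES = """\
{"train_lr": "0.000", "train_loss_itm": "0.270", "train_loss_ita": "6.418", "epoch": 0}
{"train_lr": "0.000", "train_loss_itm": "0.133", "train_loss_ita": "6.077", "epoch": 1}
{"train_lr": "0.000", "train_loss_itm": "0.061", "train_loss_ita": "5.849", "epoch": 2}
{"train_lr": "0.000", "train_loss_itm": "0.036", "train_loss_ita": "5.684", "epoch": 3}
{"train_lr": "0.000", "train_loss_itm": "0.028", "train_loss_ita": "5.590", "epoch": 4}
{"train_lr": "0.000", "train_loss_itm": "0.026", "train_loss_ita": "5.546", "epoch": 5}
"""


def loss_curve(log_text, key="train_loss_ita"):
    """Return [(epoch, loss), ...] parsed from JSON-lines text, sorted by epoch."""
    records = [json.loads(line) for line in log_text.splitlines() if line.strip()]
    return sorted((int(r["epoch"]), float(r[key])) for r in records)


def is_decreasing(curve):
    """True if the loss drops strictly from each epoch to the next."""
    losses = [loss for _, loss in curve]
    return all(later < earlier for earlier, later in zip(losses, losses[1:]))


ita_curve = loss_curve(LOG_LINES)
itm_curve = loss_curve(LOG_LINES, key="train_loss_itm")

# Total reduction in loss_ita between the first and last logged epoch.
ita_drop = round(ita_curve[0][1] - ita_curve[-1][1], 3)

print("loss_ita decreasing:", is_decreasing(ita_curve), "total drop:", ita_drop)
print("loss_itm decreasing:", is_decreasing(itm_curve))
```

Running the same two functions on your own training log gives a quick side-by-side comparison with the reference curve; a loss_ita that stays flat from epoch 0 would point toward the hyperparameter or checkpoint issues mentioned above.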
from aurora.
What environment did you use to reproduce this MSRVTT experiment? Specifically, how many GPUs, and how much memory does each one have?
from aurora.
4 × A100
from aurora.
Did you shrink the dataset, then? I have eight RTX 4090s and run the command from the README as-is, but the process keeps getting killed.
from aurora.
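No, I did not. I can run the code end to end; the results are just not very good for some reason. Could you answer my question above: where are those two modules implemented?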
from aurora.
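Try looking at line 261 of med.py for the informative context enhancement, and line 435 for the gating. Also, could you show me your pip list? I cannot tell whether my process crash is a memory problem or a problem with the versions I installed. (I have eight RTX 4090s with 24 GB of VRAM each and cannot work out why it will not run.)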
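One possibility worth ruling out when a training process is killed outright (rather than raising a CUDA out-of-memory error) is that the host ran out of RAM and the Linux out-of-memory killer stopped it. The sketch below is an assumption-laden diagnostic, not something from the Aurora repository: it assumes a Linux host with `/proc/meminfo`, assumes one training process per GPU, and the helper names are hypothetical.

```python
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Memory fields are reported by the kernel in kB (1024-byte units).
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def available_gib(info):
    """Host memory currently available to new allocations, in GiB."""
    return info.get("MemAvailable", 0) / (1024 ** 2)


def per_process_budget_gib(info, num_processes):
    """Rough host-RAM share per training process (assumes an even split)."""
    return available_gib(info) / num_processes


MEMINFO_PATH = Path("/proc/meminfo")  # Linux only; skipped elsewhere
if MEMINFO_PATH.exists():
    host = parse_meminfo(MEMINFO_PATH.read_text())
    # 8 is the GPU count mentioned in the comment above.
    print(f"available host RAM: {available_gib(host):.1f} GiB")
    print(f"per process (8 GPUs): {per_process_budget_gib(host, 8):.1f} GiB")
```

If the per-process figure is small compared with what each process needs for video decoding and data-loader workers, reducing the number of data-loader workers or the batch size is a reasonable first experiment. The kernel log on the host will normally also record when the out-of-memory killer has fired.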
from aurora.
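Thanks. Does the gating at line 435 need to be modified further, as described in https://github.com/WillDreamer/Aurora/issues/4?
Below is my pip list. It was copied from a shared environment, so it is somewhat cluttered; sorry about that. If you manage to reproduce the results, please let me know.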
Package Version
antlr4-python3-runtime 4.8
appdirs 1.4.4
argon2-cffi 20.1.0
astor 0.8.1
async-generator 1.10
attrs 19.3.0
autocommand 2.2.1
av 8.0.2
backcall 0.1.0
bidict 0.21.4
bleach 3.1.0
boto3 1.23.10
botocore 1.26.10
cairocffi 1.2.0
CairoSVG 2.5.2
certifi 2021.5.30
cffi 1.14.5
chardet 3.0.4
click 7.1.2
click-help-colors 0.9
cloudpickle 2.2.1
colorama 0.4.4
coloredlogs 15.0
commonmark 0.9.1
configparser 5.2.0
coverage 5.0.3
cssselect 1.1.0
cssselect2 0.4.1
cycler 0.10.0
Cython 0.29.14
dataclasses 0.8
decorator 4.4.1
decord 0.6.0
defusedxml 0.6.0
docker-pycreds 0.4.0
docutils 0.15.2
easydict 1.9
einops 0.4.1
elasticsearch 7.12.1
entrypoints 0.3
environs 9.5.0
fairscale 0.4.6
faiss 1.7.1
filelock 3.0.12
flake8 3.7.9
Flask 1.1.2
ftfy 6.0.3
future 0.18.2
gitdb 4.0.9
GitPython 3.1.18
grad-cam 1.4.8
graphviz 0.13
huggingface-hub 0.4.0
humanfriendly 9.1
humanize 3.5.0
idna 2.8
imageio 2.6.1
importlib-metadata 0.23
importlib-resources 5.4.0
iniconfig 1.1.1
ipykernel 5.1.3
ipython 7.9.0
ipython-genutils 0.2.0
ipywidgets 7.5.1
itsdangerous 1.1.0
jedi 0.15.1
Jinja2 2.10.3
jmespath 0.10.0
joblib 0.14.0
jsonschema 3.1.1
jupyter 1.0.0
jupyter-client 5.3.4
jupyter-console 6.0.0
jupyter-core 4.6.1
jupyterlab-pygments 0.1.2
kiwisolver 1.1.0
lmdb 1.4.1
lxml 4.6.3
MarkupSafe 1.1.1
marshmallow 3.14.1
matplotlib 3.1.1
mccabe 0.6.1
mistune 0.8.4
mkl-fft 1.3.0
mkl-random 1.1.1
mkl-service 2.3.0
more-itertools 7.2.0
multiprocessing-logging 0.3.4
nbclient 0.5.3
nbconvert 5.6.1
nbformat 4.4.0
nest-asyncio 1.5.1
networkx 2.4
ninja 1.10.0.post2
nltk 3.3
notebook 6.0.1
numpy 1.19.5
nvidia-dali 0.22.0
oathtool 2.3.1
olefile 0.46
omegaconf 2.1.0
onnx 1.7.0
opencv-python 4.1.1.26
packaging 21.0
pandas 1.1.5
pandocfilters 1.4.2
parso 0.5.1
path 16.2.0
pathtools 0.1.2
petrel-oss-sdk v2.2.1-2-g1505ef3-master
pexpect 4.7.0
pickleshare 0.7.5
Pillow 8.2.0
Pillow-SIMD 6.0.0.post0
pip 19.3.1
pkginfo 1.5.0.1
pluggy 0.13.1
prettytable 0.7.2
prometheus-client 0.7.1
prompt-toolkit 2.0.10
protobuf 3.19.6
psutil 5.9.6
ptyprocess 0.6.0
py 1.10.0
pybind11 2.6.2
pycocoevalcap 1.2
pycocotools 2.0.7
pycocotools-fix 2.0.0.9
pycodestyle 2.5.0
pycparser 2.20
pyDes 2.0.1
pyflakes 2.1.1
pygal 2.4.0
Pygments 2.9.0
pyparsing 3.0.9
pyrsistent 0.15.5
pytest 6.2.4
python-dateutil 2.8.1
python-dotenv 0.19.2
python-engineio 4.3.0
python-socketio 5.5.0
pytz 2021.1
PyWavelets 1.1.1
PyYAML 5.3.1
pyzmq 18.1.0
qtconsole 4.5.5
QtPy 1.9.0
readme-renderer 24.0
regex 2023.8.8
requests 2.25.1
requests-toolbelt 0.9.1
retrying 1.3.3
rich 10.1.0
ruamel.yaml 0.17.4
ruamel.yaml.clib 0.2.2
s3transfer 0.5.2
sacremoses 0.0.53
scikit-image 0.16.2
scikit-learn 0.21.3
scipy 1.3.1
seaborn 0.9.0
selenium 3.141.0
Send2Trash 1.5.0
sentry-sdk 1.34.0
setproctitle 1.2.3
setuptools 41.6.0.post20191030
six 1.15.0
smmap 5.0.0
spring 0.6.1+cu112.torch181.mvapich2.pmi2.nartgpu
spring-aux 0.6.7.develop.2021-12-29t02-21.8592b9f9
terminado 0.8.2
testpath 0.4.2
timm 0.6.12
tinycss 0.4
tinycss2 1.1.0
tokenizers 0.12.1
toml 0.10.2
torch 1.8.1+cuda112.cudnn8.1.0
torchvision 0.9.0a0+8fb5838
tornado 6.0.3
tqdm 4.37.0
traitlets 4.3.3
transformers 4.18.0
ttach 0.0.3
twine 2.0.0
typing-extensions 3.10.0.2
ujson 4.3.0
urllib3 1.26.18
wandb 0.15.11
wcwidth 0.1.7
webencodings 0.5.1
websocket-client 1.2.3
Werkzeug 1.0.1
wheel 0.33.6
widgetsnbextension 3.5.1
zipp 3.6.0
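With a list this long, it can help to pull out only the packages most likely to matter when comparing two environments. The sketch below assumes the plain two-column `pip list` layout shown above; the choice of `KEY_PACKAGES` is a guess at what is relevant for this kind of training code, not a requirement stated by the Aurora authors, and the helper names are hypothetical.

```python
def parse_pip_list(text):
    """Map lower-cased package name -> version from `pip list`-style text."""
    versions = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the "Package Version" header, and "---- ----" rules.
        if len(parts) != 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        versions[parts[0].lower()] = parts[1]
    return versions


# Assumed shortlist for illustration only.
KEY_PACKAGES = ["torch", "torchvision", "transformers", "timm", "fairscale", "decord"]


def summarize(versions, names=KEY_PACKAGES):
    """Return {name: version or 'not installed'} for the shortlisted packages."""
    return {name: versions.get(name, "not installed") for name in names}


def differences(mine, theirs, names=KEY_PACKAGES):
    """Return {name: (my version, their version)} where the two environments differ."""
    a, b = summarize(mine, names), summarize(theirs, names)
    return {name: (a[name], b[name]) for name in names if a[name] != b[name]}


# Entries copied from the pip list above.
SAMPLE = """\
Package Version
decord 0.6.0
fairscale 0.4.6
timm 0.6.12
torch 1.8.1+cuda112.cudnn8.1.0
torchvision 0.9.0a0+8fb5838
transformers 4.18.0
"""

summary = summarize(parse_pip_list(SAMPLE))
for name, version in summary.items():
    print(f"{name:14s} {version}")
```

Passing the parsed output of `pip list` from a second machine to `differences` shows only the shortlisted packages whose versions disagree, which is usually easier to discuss in an issue than two full listings.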
from aurora.