mynlp / cst_captioning
PyTorch Implementation of Consensus-based Sequence Training for Video Captioning
Hi
Thanks for sharing this amazing git repo for your paper.
Regarding the symbolic link in val_videodatainfo.json
There seems to be an issue when viewing or downloading this particular file from your Google Drive.
This is preventing me from generating the metadata correctly with the command below.
make pre_process
Could you kindly double-check the file? Thanks!
Hello, I found that each video has only one feature vector (C3D, ResNet), not features for all the frames we chose. Could you tell me how to combine them?
I trained a WXE model and got a CIDEr score of about 50, then trained CST_MS_Greedy according to your options, but the CIDEr score does not improve with reinforcement learning. The model you provided also cannot be loaded for testing. Can you give a hint about how to use your model, or how to reproduce the CIDEr score of 54.2?
Hi, I was wondering whether you used mean pooling to blend the ResNet features of every frame into a single 2048-D vector (representing the ResNet features for that video chunk)? If not, can you describe how you merged features across the frames of each clip?
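For readers unfamiliar with the mean pooling mentioned in the question above, here is a minimal sketch of what it would look like. It is not confirmed to be what the repository does. The function name mean_pool_frames, the frame count of 28, and the random stand-in features are all assumptions made for illustration.

```python
import numpy as np


def mean_pool_frames(frame_feats):
    """Average per-frame features of shape (num_frames, feat_dim) into one (feat_dim,) vector."""
    frame_feats = np.asarray(frame_feats, dtype=np.float32)
    if frame_feats.ndim != 2 or frame_feats.shape[0] == 0:
        raise ValueError("expected a non-empty (num_frames, feat_dim) array")
    # Averaging over axis 0 collapses the frame dimension and keeps the feature dimension.
    return frame_feats.mean(axis=0)


# Hypothetical clip: 28 frames, each with a 2048-D feature (random stand-in values,
# not real ResNet activations).
rng = np.random.default_rng(0)
clip_feats = rng.standard_normal((28, 2048)).astype(np.float32)
video_feat = mean_pool_frames(clip_feats)
print(video_feat.shape)  # (2048,)
```

Mean pooling discards the temporal order of the frames, so it only applies if a single vector per clip is wanted.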
We are not able to run the code. Can you help us with it?
Can you give a hint about how to extract these features?
Hi,
I am trying to use multiple GPUs on my workstation for your code, so I use GID=0,1,2,3
in the command to start a training session. However, it still seems to use only 1 GPU.
Going through your code, I was unable to find DataParallel
anywhere. I am wondering whether your code supports multi-GPU training at all.
If not, I might be able to take a look at it.
Can you share the average metrics (BLEU, ROUGE, METEOR, CIDEr) of the annotated captions?
Thanks very much!
Hi,
I was just trying to run with your default setting:
Train CST_GT_None/WXE model
make train GID=0 EXP_NAME=WXE FEATS="resnet c3d mfcc category" USE_RL=1 USE_CST=1 USE_MIXER=0 SCB_CAPTIONS=0 LOGLEVEL=DEBUG MAX_EPOCHS=50
Here is the error I got:
Traceback (most recent call last):
  File "train.py", line 529, in <module>
    rl_criterion=rl_criterion)
  File "train.py", line 118, in train
    'CIDEr': CiderD(df=opt.train_cached_tokens),
  File "cider/pyciderevalcap/ciderD/ciderD.py", line 25, in __init__
    self.cider_scorer = CiderScorer(n=self._n, df_mode=self._df)
  File "cider/pyciderevalcap/ciderD/ciderD_scorer.py", line 69, in __init__
    pkl_file = pickle.load(open(os.path.join('data', df_mode + '.p'),'r'))
IOError: [Errno 2] No such file or directory: 'data/output/metadata/msrvtt_train_ciderdf.pkl.p'
There seems to be an issue with how the pickle file is loaded within CIDEr. Making the following change solved the problem.
In "cider/pyciderevalcap/ciderD/ciderD_scorer.py":
# Line #69 is wrong:
#   pkl_file = pickle.load(open(os.path.join('data', df_mode + '.p'),'r'))
# It should be changed to the following, which uses df_mode as the path as given
# (no 'data' prefix, no '.p' suffix) and opens the pickle in binary mode:
with open(df_mode, 'rb') as f:
    pkl_file = pickle.load(f)
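The idea behind the fix above is to treat the configured value as the complete path to the pickle instead of wrapping it in a hard-coded directory and suffix. The standalone sketch below shows that idea. The helper name load_df_pickle and the toy dictionary are hypothetical, and the dictionary does not reflect the real contents of the CIDEr document-frequency file.

```python
import os
import pickle
import tempfile


def load_df_pickle(df_path):
    """Load a pickled object from df_path, using the path exactly as given."""
    if not os.path.isfile(df_path):
        # Fail early with the path that was actually tried.
        raise IOError("pickle file not found: %s" % df_path)
    # Binary mode works on both Python 2 and Python 3.
    with open(df_path, 'rb') as f:
        return pickle.load(f)


# Round trip with a toy stand-in object (not the real CIDEr data).
tmp_dir = tempfile.mkdtemp()
df_path = os.path.join(tmp_dir, 'msrvtt_train_ciderdf.pkl')
toy_table = {'example_ngram': 3.0}
with open(df_path, 'wb') as f:
    pickle.dump(toy_table, f)

loaded = load_df_pickle(df_path)
```

Checking for the file before opening it makes the error show the exact path that was looked up, which is how the extra 'data/' prefix and '.p' suffix were spotted in the traceback.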
Hello, the results confuse me and I cannot figure them out. CST_MS_SCB and the other full RL training methods couldn't improve on the WXE result. Has anyone else met the same problem?
Thanks a lot!
First of all, thanks for your great work.
I ran your code and found that my GPU utilization is always 0% while the scores of the val captions are being calculated. I tried to use multiprocessing, which requires adding "num_workers" to torch.utils.data.DataLoader, but I found that you wrote the data loading class yourself.
So is there a way to use multiprocessing?
Can you give an example of how to use make test for your model? I want to see the final score.
Thanks.