mynlp / cst_captioning Goto Github PK


59.0 14.0 17.0 919 KB

PyTorch Implementation of Consensus-based Sequence Training for Video Captioning

Makefile 7.25% Python 92.75%

captioning-videos policy-gradient


cst_captioning's Issues

symbolic link in val_videodatainfo.json

Thanks for sharing this amazing git repo for your paper.

Regarding the symbolic link in val_videodatainfo.json

It seems that there is an issue when viewing/downloading from your google drive for this particular file.

This issue is preventing me from correctly generating metadata using the command below.

make pre_process

Could you kindly double-check the file? Thanks!

feature fusion?

Hello, I found that each video has only one feature vector (C3D, ResNet), not features for all of the frames we chose. Could you tell me how to combine them?

I trained a WXE model and got a CIDEr score of about 50, then trained CST_MS_Greedy according to your options, but the CIDEr score does not improve with reinforcement learning. The model you provided also cannot be loaded for testing. Can you give a hint about how to use your model, or how to reproduce the CIDEr score of 54.2?

Mean pooling for ResNet features?

Hi, I was wondering whether you used mean pooling to blend the per-frame ResNet features into a single 2048-D vector (representing the ResNet features for that video chunk). If not, can you describe how you merged features across the frames of each clip?
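
For readers unfamiliar with the term: mean pooling here means averaging the per-frame feature vectors element-wise, so that a clip with any number of frames is represented by one vector of the same dimensionality (2048 for ResNet). Whether this repository pools its features this way is exactly what the question asks and is not confirmed here. The sketch below uses plain Python lists; `mean_pool` is a hypothetical helper name, and the 3-D toy vectors stand in for 2048-D features.

```python
from typing import List, Sequence


def mean_pool(frame_features: Sequence[Sequence[float]]) -> List[float]:
    """Average per-frame feature vectors element-wise into one clip-level vector."""
    if not frame_features:
        raise ValueError("need at least one frame")
    dim = len(frame_features[0])
    if any(len(f) != dim for f in frame_features):
        raise ValueError("all frames must have the same feature dimension")
    n = len(frame_features)
    # zip(*...) walks the features one dimension at a time across all frames.
    return [sum(column) / n for column in zip(*frame_features)]


# Toy example: 3 frames with 3-D features (a real ResNet feature would be 2048-D).
frames = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [5.0, 6.0, 7.0]]
clip_vector = mean_pool(frames)
print(clip_vector)  # [3.0, 4.0, 5.0]
```

The output length depends only on the feature dimension, not on the number of frames, which is what makes the pooled vector usable as a fixed-size input.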

options

We are not able to run the code. Can you help us with it?

How do you extract C3D, MFCC, and category features?

Can you give a hint about how to extract these features?

Multi-GPU training support

Hi,

I am trying to use multiple GPUs on my workstation with your code, so I pass GID=0,1,2,3 in the command that starts a training session. However, it still seems to use only 1 GPU.

Going through your code, I was unable to find DataParallel anywhere. I am wondering whether your code originally supports multi-GPU training.

If not, I might be able to take a look at it.

About the average baseline metrics used for RL

Can you share the average metrics (BLEU, ROUGE, METEOR, CIDEr) of the annotated captions?
Thanks very much!
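
One way to read "average metric of annotated captions" is as a leave-one-out average: each human caption is scored against the remaining captions of the same video, and the scores are averaged. The sketch below illustrates only that averaging scheme. It is an assumption about what is being asked, not the repository's code. `unigram_f1` is a hypothetical toy stand-in for a real metric such as CIDEr or BLEU, and `leave_one_out_average` is a hypothetical helper.

```python
from collections import Counter
from typing import Callable, List


def unigram_f1(candidate: str, references: List[str]) -> float:
    """Toy metric (NOT CIDEr/BLEU): best unigram F1 against any single reference."""
    cand = Counter(candidate.split())
    best = 0.0
    for ref in references:
        ref_counts = Counter(ref.split())
        overlap = sum((cand & ref_counts).values())  # clipped word matches
        if overlap == 0:
            continue
        precision = overlap / sum(cand.values())
        recall = overlap / sum(ref_counts.values())
        best = max(best, 2 * precision * recall / (precision + recall))
    return best


def leave_one_out_average(
    captions: List[str], metric: Callable[[str, List[str]], float]
) -> float:
    """Score each caption against all the others, then average the scores."""
    if len(captions) < 2:
        raise ValueError("need at least two captions")
    scores = [
        metric(caption, captions[:i] + captions[i + 1:])
        for i, caption in enumerate(captions)
    ]
    return sum(scores) / len(scores)


# Made-up captions for one video.
captions = ["a man sings", "a man sings", "the dog runs"]
print(leave_one_out_average(captions, unigram_f1))
```

Passing the metric in as a function keeps the averaging logic separate from the scorer, so a real metric implementation could be swapped in without changing `leave_one_out_average`.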

No such file or directory: 'data/output/metadata/msrvtt_train_ciderdf.pkl.p'

Hi,

I was just trying to run with your default setting:

Train CST_GT_None/WXE model

make train GID=0 EXP_NAME=WXE FEATS="resnet c3d mfcc category" USE_RL=1 USE_CST=1 USE_MIXER=0 SCB_CAPTIONS=0 LOGLEVEL=DEBUG MAX_EPOCHS=50

Here is the error I got:

Traceback (most recent call last):
  File "train.py", line 529, in <module>
    rl_criterion=rl_criterion)
  File "train.py", line 118, in train
    'CIDEr': CiderD(df=opt.train_cached_tokens),
  File "cider/pyciderevalcap/ciderD/ciderD.py", line 25, in __init__
    self.cider_scorer = CiderScorer(n=self._n, df_mode=self._df)
  File "cider/pyciderevalcap/ciderD/ciderD_scorer.py", line 69, in __init__
    pkl_file = pickle.load(open(os.path.join('data', df_mode + '.p'),'r'))
IOError: [Errno 2] No such file or directory: 'data/output/metadata/msrvtt_train_ciderdf.pkl.p'

There seems to be an issue with how the pickle file is loaded within CIDEr. Making the following change solved the problem.

In "cider/pyciderevalcap/ciderD/ciderD_scorer.py"

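# Line 69 as originally written (adds a 'data' prefix and a '.p' suffix):
# pkl_file = pickle.load(open(os.path.join('data', df_mode + '.p'),'r'))

# Changed to load directly from the path in df_mode
# (os.path.join with a single argument is a no-op; 'rb' is the safe mode for pickle):
pkl_file = pickle.load(open(df_mode, 'rb'))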
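
The change above amounts to treating `df_mode` as the full path to the pickle file. A minimal sketch of that idea, independent of the repository's code, is shown below. `load_pickle_from_path` is a hypothetical helper, and the temporary file and its contents are made up so that the round trip can be run.

```python
import os
import pickle
import tempfile


def load_pickle_from_path(path):
    """Load a pickled object from `path` exactly as given (no prefix or suffix added)."""
    if not os.path.isfile(path):
        raise IOError("pickle file not found: %s" % path)
    with open(path, "rb") as f:  # binary mode; the file is closed on exit
        return pickle.load(f)


# Round trip with made-up data to show the helper in use.
tmp_dir = tempfile.mkdtemp()
pkl_path = os.path.join(tmp_dir, "toy_stats.pkl")
with open(pkl_path, "wb") as f:
    pickle.dump({"a": 3, "b": 1}, f)

loaded = load_pickle_from_path(pkl_path)
print(loaded)
```

Checking for the file first gives an error message that names the exact path that was tried, which makes a doubled prefix or suffix easy to spot.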

WXE gets the best result

Hello, this result confuses me and I cannot figure it out. CST_MS_SCB and the other full RL training methods couldn't improve on the result from WXE. Has anyone else met the same problem?

The given val JSON file has some problems. Can you share the val JSON file?

Thanks a lot!

Problem about multi processing

First of all, thanks for your great work.
I ran your code and found that my GPU utilization is always 0% while the scores of the val captions are being calculated. I tried to use multiprocessing, which needs "num_workers" in torch.utils.data.DataLoader, but I see that you wrote the data-loading class yourself.
So is there a way to use multiprocessing?
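
Standard caption metrics such as BLEU and CIDEr are typically computed on the CPU, so low GPU utilization during scoring is not surprising in itself. One generic way to parallelize CPU-bound scoring, independent of `torch.utils.data.DataLoader`, is the standard-library `multiprocessing.Pool`. The sketch below is an illustration under stated assumptions rather than a patch for this repository: `score_pair` is a hypothetical toy scorer standing in for the real metric code, and `score_all` is a hypothetical helper.

```python
from multiprocessing import Pool
from typing import List, Tuple


def score_pair(pair: Tuple[str, str]) -> float:
    """Toy scorer (hypothetical): fraction of candidate words found in the reference."""
    candidate, reference = pair
    cand_words = candidate.split()
    if not cand_words:
        return 0.0
    ref_words = set(reference.split())
    return sum(w in ref_words for w in cand_words) / len(cand_words)


def score_all(pairs: List[Tuple[str, str]], workers: int = 1) -> List[float]:
    """Score (candidate, reference) pairs, optionally across worker processes."""
    if workers <= 1:
        return [score_pair(p) for p in pairs]
    # Pool.map preserves input order. score_pair must be a module-level
    # function so that it can be pickled and sent to the workers.
    with Pool(processes=workers) as pool:
        return pool.map(score_pair, pairs)


# Made-up captions; this call runs serially in the current process.
pairs = [("a man sings", "a man is singing"), ("a dog", "a cat sleeps")]
print(score_all(pairs))
```

To use several processes, call `score_all(pairs, workers=4)`; on platforms that start workers with the spawn method, that call should sit under an `if __name__ == "__main__":` guard.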

Can you give an example of using make test with your model?

Can you give an example of how to use make test with your model? I want to see the final score.
Thanks.
