
cmhungsteve / ta3n

257 stars · 41 forks · 1.71 MB

[ICCV 2019 (Oral)] Temporal Attentive Alignment for Large-Scale Video Domain Adaptation (PyTorch)

Home Page: https://arxiv.org/abs/1907.12743

License: MIT License

Python 93.19% Shell 6.81%
action-recognition cvpr2019 domain-adaptation domain-discrepancy iccv2019 pytorch temporal-dynamics video video-classification video-da-datasets

ta3n's Introduction

Hi there 👋

My name is Min-Hung (Steve) Chen (陳敏弘 in Chinese). I am a Senior Research Scientist at NVIDIA Research Taiwan, working on Vision+X Multi-Modal AI. I received my Ph.D. from Georgia Tech, advised by Prof. Ghassan AlRegib and in collaboration with Prof. Zsolt Kira. Before joining NVIDIA, I worked on biometric research for Cognitive Services as a Research Engineer II at Microsoft Azure AI, and before that on Edge-AI research as a Senior AI Engineer at MediaTek.

My research interests are mainly in Multi-Modal AI, including Vision-Language, Video Understanding, Cross-Modal Learning, Efficient Tuning, and Transformers. I am also interested in Learning without Full Supervision, including domain adaptation, transfer learning, continual learning, X-supervised learning, etc.

[Update] I released a comprehensive paper list for Vision Transformer & Attention to facilitate related research. Feel free to check it out (I would appreciate it if you ★star it)!

[Personal Website][LinkedIn][Twitter][Google Scholar][Resume]

Min-Hung (Steve)'s GitHub stats

ta3n's People

Contributors

cmhungsteve · dependabot[bot]


ta3n's Issues

About the shell script

Hi @cmhungsteve ,

Thanks for sharing the code. I wonder whether the script_train_val file is correct. When I try to run it, I hit errors such as: error: argument --pred_normalize: expected one argument. It seems that this version of the script is missing some arguments.

Best.
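
For context, this argparse error means a flag that was declared to take a value was passed without one. A minimal sketch of the failure mode (only the option name is taken from the error message; the type and default are assumptions):

  import argparse

  parser = argparse.ArgumentParser(prog='main.py')
  # an option declared to consume exactly one value
  parser.add_argument('--pred_normalize', type=str, default='N')

  parser.parse_args(['--pred_normalize', 'Y'])   # OK: value supplied
  parser.parse_args(['--pred_normalize'])        # exits with:
  # main.py: error: argument --pred_normalize: expected one argument

So the shell script is likely invoking main.py with --pred_normalize but no value after it, or the value was lost during editing.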

Question about the training details?

Nice work!
Thanks for sharing your code!

I am interested in this task and would like to know some training details:

  • Which GPUs are you using, and how many?
  • Using your settings, I am curious about the details of your UCF -> HMDB and HMDB -> UCF experiments.

How long do these experiments take to train (hours/days)?

Question about attention

Hello, I would like to ask about the difference between use_attn and use_attn_frame. Is use_attn for the aggregated features and use_attn_frame for single frames? Thanks in advance.
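
For readers hitting the same question, here is a conceptual sketch of the distinction (the shapes and helper below are illustrative assumptions, not the repo's actual API): frame-level attention re-weights individual frame features before temporal aggregation, while the other variant re-weights the aggregated temporal (relation) features.

  import torch

  def attend(features, scores):
      # re-weight items by softmax-normalized attention scores
      weights = torch.softmax(scores, dim=1)       # (batch, num_items)
      return features * weights.unsqueeze(-1)      # broadcast over feature dim

  batch, num_frames, feat_dim = 8, 5, 256
  frame_feats = torch.randn(batch, num_frames, feat_dim)

  # use_attn_frame (assumed meaning): attention on each frame, before aggregation
  frame_scores = torch.randn(batch, num_frames)
  frame_feats = attend(frame_feats, frame_scores)

  # use_attn (assumed meaning): attention on aggregated temporal features,
  # e.g. the n-frame relation features, after temporal modeling
  relation_feats = torch.randn(batch, num_frames - 1, feat_dim)
  relation_scores = torch.randn(batch, num_frames - 1)
  video_feat = attend(relation_feats, relation_scores).sum(dim=1)  # (batch, feat_dim)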

argument_modality: invalid choice

Hi, I used the provided pre-extracted features and manually edited the paths in the data list files. I am sure my paths are correct, but I get the following error when running script_train_val.sh:

main.py: error: argument modality: invalid choice: 'data/classInd_ucf_olympic.txt' (choose from 'RGB', 'Flow', 'RGBDiff', 'RGBDiff2', 'RGBDiffplus') .

Do you have any insight into where I went wrong? Thank you in advance.
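
One likely cause, for context: main.py takes several positional arguments, and if the edited script supplies them in the wrong order (or adds/omits one), the class-list path slides into the slot argparse reserves for modality. A minimal illustration (the exact positional layout of main.py is an assumption; the choices are copied from the error):

  import argparse

  parser = argparse.ArgumentParser(prog='main.py')
  parser.add_argument('num_class')   # assumed first positional
  parser.add_argument('modality',
                      choices=['RGB', 'Flow', 'RGBDiff', 'RGBDiff2', 'RGBDiffplus'])

  # simulating a call where an extra positional shifts everything by one slot:
  parser.parse_args(['12', 'data/classInd_ucf_olympic.txt', 'RGB'])
  # main.py: error: argument modality: invalid choice: 'data/classInd_ucf_olympic.txt'

Checking the order of positional arguments in script_train_val.sh against the parser definition in opts.py usually resolves this.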

T-SNE visualization

I want to visualize the data after training the network. Could you give me an idea of how to visualize the source and target data after training, to compare how the domains are aligned? I read in your paper that you use t-SNE for the visualization, but I have no idea how to apply t-SNE once training has finished. I managed to run t-SNE on the source and target data before training, but after training I have trouble producing the visualization.
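
A common recipe for this, as a sketch: extract one video-level feature vector per clip from the trained model, stack source and target features into a single matrix, fit scikit-learn's TSNE on the combined matrix once, and color the 2-D points by domain. The random features below are placeholders for the real extracted ones:

  import numpy as np
  from sklearn.manifold import TSNE
  import matplotlib.pyplot as plt

  # placeholders: (N, D) video-level features from the *trained* model
  feats_source = np.random.randn(200, 256)
  feats_target = np.random.randn(150, 256)

  feats = np.concatenate([feats_source, feats_target], axis=0)
  emb = TSNE(n_components=2, perplexity=30, init='pca').fit_transform(feats)

  n_src = len(feats_source)
  plt.scatter(emb[:n_src, 0], emb[:n_src, 1], s=5, label='source')
  plt.scatter(emb[n_src:, 0], emb[n_src:, 1], s=5, label='target')
  plt.legend()
  plt.savefig('tsne_after_training.png')

The key point is fitting a single t-SNE on the concatenated source+target features so both domains share one embedding; fitting two separate t-SNEs would make the plots incomparable.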

output feature

Hello, thanks for your great work. I want to ask whether the code can output the processed features, and how to visualize the processed and unprocessed features like the figure in your paper.
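
One generic way to dump intermediate ("processed") features from a trained PyTorch model without modifying its code is a forward hook. A self-contained sketch with a toy model standing in for TA3N (with the real model, pick the layer you want from model.named_modules()):

  import torch
  import torch.nn as nn

  # toy stand-in; replace with the trained TA3N model
  model = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 12))

  captured = {}
  def hook(module, inputs, output):
      captured['feat'] = output.detach().cpu()

  handle = model[1].register_forward_hook(hook)  # hook the layer producing the feature
  _ = model(torch.randn(8, 256))                 # one forward pass fills `captured`
  handle.remove()

  torch.save(captured['feat'], 'processed_feats.pt')  # e.g. for the t-SNE recipe above
  print(captured['feat'].shape)                       # torch.Size([8, 128])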

Some questions about the paper

Why does formula 7 use a minus sign instead of a plus sign? Why not give more weight to the samples that are hard to distinguish?

I am looking forward to your answer. Thank you very much!
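
For readers without the paper open: the formula in question appears to be the domain-attention weight, which (in paraphrased notation, so treat this as my reading rather than a quote) is

  w^i = 1 - H(\hat{y}^i_d), \qquad H(p) = -\sum_k p_k \log p_k

where \hat{y}^i_d is the domain prediction for clip i. With the minus sign, clips with low domain entropy (where the domain discriminator is confident, i.e. the feature exhibits a clear domain shift) receive large weights, so alignment focuses on the clips that contribute most to the domain discrepancy; a plus sign would instead emphasize the clips the discriminator already finds ambiguous.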

question about Target Entropy

Hello Sir,

I have a question about the add_loss_DA configuration. I have read the TA3N paper, but the target entropy is not mentioned; may I ask where I can find more information about it? Thank you in advance.
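
For context: "target entropy" here most likely refers to the standard entropy-minimization regularizer on unlabeled target predictions (see Grandvalet & Bengio, "Semi-supervised Learning by Entropy Minimization", 2005). A sketch of the generic technique (not necessarily the exact form behind add_loss_DA):

  import torch
  import torch.nn.functional as F

  def entropy_loss(logits):
      # mean Shannon entropy of the softmax predictions; minimizing it
      # pushes the classifier toward confident predictions on target data
      p = F.softmax(logits, dim=1)
      log_p = F.log_softmax(logits, dim=1)
      return -(p * log_p).sum(dim=1).mean()

  target_logits = torch.randn(8, 12)   # class predictions on unlabeled target clips
  loss = entropy_loss(target_logits)   # added to the total loss with a weight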

Cannot run code using default save_attention setting

Thank you for the wonderful code.

However, when trying to reproduce the results with the default settings in opts.py, I am getting errors. Running the code produces the following:

Traceback (most recent call last):
  File "main.py", line 836, in <module>
    main()
  File "main.py", line 240, in main
    loss_c, attn_epoch_source, attn_epoch_target = train(num_class, source_loader, target_loader, model, criterion, criterion_domain, optimizer, epoch, train_file, train_short_file, alpha, beta, gamma, mu)
  File "main.py", line 668, in train
    return losses_c.avg, attn_epoch_source.mean(0), attn_epoch_target.mean(0)
RuntimeError: invalid argument 2: invalid dimension 0 at /opt/conda/conda-bld/pytorch_1524584710464/work/aten/src/TH/generic/THTensorMath.c:3897

After checking, I suspected the save_attention argument was the cause, and I changed the default from -1 to 0. Although the code now runs, I am only getting 73.89% accuracy. Could you explain why save_attention was set to -1, and what setting it to different values means? Thank you!
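
For context, the crash pattern is consistent with calling .mean(0) on an accumulator tensor that was never filled; presumably attention is only accumulated when save_attention >= 0 (an inference from the fix described above, not a confirmed reading of the code). A defensive sketch:

  import torch

  attn_epoch_source = torch.Tensor()   # accumulator left empty when save_attention == -1

  # only reduce when something was accumulated
  if attn_epoch_source.numel() > 0:
      avg_attn = attn_epoch_source.mean(0)
  else:
      avg_attn = None                  # nothing to average this epoch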

Code for obtaining the baseline results.

Hi @cmhungsteve

Thanks for the great work. I wanted to run the code to obtain the baseline results mentioned in the paper (DANN + [TempPooling/TempRelation]). Can you please let me know whether you can make that code public?

Thanks
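
While the baseline code is unreleased, the DANN component of those baselines is the standard gradient-reversal layer; a minimal PyTorch sketch of the technique (generic, not the authors' implementation):

  import torch
  from torch.autograd import Function

  class GradReverse(Function):
      # identity in the forward pass; flips and scales the gradient in backward,
      # so the feature extractor learns to fool the domain classifier (DANN)
      @staticmethod
      def forward(ctx, x, beta):
          ctx.beta = beta
          return x.view_as(x)

      @staticmethod
      def backward(ctx, grad_output):
          return grad_output.neg() * ctx.beta, None

  def grad_reverse(x, beta=1.0):
      return GradReverse.apply(x, beta)

  # usage: pooled video features -> gradient reversal -> domain classifier
  feat = torch.randn(8, 256, requires_grad=True)   # e.g. after temporal pooling
  domain_logits = torch.nn.Linear(256, 2)(grad_reverse(feat, beta=1.0))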

How to use the C3D

Hi, I want to know how to get /models/c3d.pickle. Could this code extract features using C3D networks?

Reproducing the results in the paper

Hi @cmhungsteve,

Thank you for releasing the wonderful code. I am trying to reproduce the UCF-HMDB_{full} results from the ICCV19 paper, i.e., source only, target only, TA^2N, and TA^3N in Table 3 and Table 4. But I cannot reproduce the TA^2N and TA^3N results. Can you share the hyperparameter settings for TA^2N and TA^3N? If you could share the shell script for those runs, it would be really helpful.

Thanks!
