
Comments (2)

TalalWasim commented on July 17, 2024

Hi,

Thank you for reaching out.

Please note that we report results on the Kinetics-400 validation set. This follows the common practice of other popular works in the literature, which train on the training set and report on the validation set.

We also provide the log file below to show the results. Additionally, if required, we can make available our resized version of the Kinetics-400 train and validation sets. Please let us know if you would like us to do that.

log_rank0.txt.

Kind regards,

from video-focalnets.

innat commented on July 17, 2024

@TalalWasim
Thanks for your detailed response.

I used the K400 test set (link), which was collected from open-datalab. Hopefully it is much the same as yours. If possible, please share the version that you mentioned.

Thanks for sharing the log file, it's helpful. The main points for inference are probably as follows:

INPUT_SIZE: 224
NUM_FRAMES: 8
TEST:
  CROP: true
  NUM_CLIP: 4
  NUM_CROP: 3

For now, I didn't use multiple crops and used only a single spatial view. I don't think skipping the 3 crops will cause a major score gap.

In the paper, page 7, it is mentioned as below:

During inference, we report results as an average across Nclip × Ncrops
where a total of Nclip clips are uniformly sampled from the video, and for each video, Ncrops spatial crops are taken during inference.
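As a rough illustration of what "uniformly sampled" clips could look like, here is a minimal sketch. It is not the repository's sampler: the function name `uniform_clip_indices`, its arguments, and the choice to take the centre frame of each sub-interval are all assumptions made for this example.

```python
def uniform_clip_indices(total_frames, num_clips, frames_per_clip):
    """Hypothetical helper: split a video into `num_clips` equal segments
    and pick `frames_per_clip` evenly spaced frame indices in each one."""
    if total_frames <= 0 or num_clips <= 0 or frames_per_clip <= 0:
        raise ValueError("all arguments must be positive")
    clips = []
    segment = total_frames / num_clips      # length of one temporal segment
    step = segment / frames_per_clip        # spacing between sampled frames
    for c in range(num_clips):
        start = c * segment
        # Take the centre of each sub-interval, clamped to a valid index.
        indices = [min(total_frames - 1, int(start + (i + 0.5) * step))
                   for i in range(frames_per_clip)]
        clips.append(indices)
    return clips


# Example matching the config above: 4 clips of 8 frames from a 64-frame video.
clips = uniform_clip_indices(total_frames=64, num_clips=4, frames_per_clip=8)
```

With NUM_CLIP: 4 and NUM_CROP: 3, each video would then be evaluated on 4 × 3 = 12 views in total.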

To follow up, I also used n-clip=3, running inference on each clip and then averaging the results for each video. By doing so, the score improves from 65% to 71%.
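The multi-clip averaging can be sketched roughly as below. This is only an illustration under assumptions: the helper names are hypothetical, the per-view logits are made-up numbers, and averaging softmax probabilities (rather than raw logits) is my choice here, not something confirmed by the paper or the repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def average_views(per_view_logits):
    """Average class probabilities over all views (clips x crops) of one video."""
    probs = [softmax(v) for v in per_view_logits]
    num_views = len(probs)
    num_classes = len(probs[0])
    return [sum(p[k] for p in probs) / num_views for k in range(num_classes)]


def predict(per_view_logits):
    """Return the class index with the highest averaged probability."""
    avg = average_views(per_view_logits)
    return max(range(len(avg)), key=avg.__getitem__)


# Illustrative logits for 3 clips of one video over 3 classes (made-up values).
views = [[2.0, 1.0, 0.0], [0.0, 3.0, 0.0], [0.0, 2.5, 0.1]]
label = predict(views)
```

In this toy example the first clip alone favours class 0, while the average over all three clips favours class 1, which is the kind of effect that can change the score between single-clip and multi-clip inference.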


update

Reproduced, great. For me, it was about sampling.

