var's People

Contributors

leonnnop

var's Issues

Some data in training dataset seems weird.

First, thank you for sharing the code and models with the community. I think this is an exciting task that deserves deeper research.

However, while looking through the dataset structure, I noticed some samples that look weird, especially in the last part of the training set. Here is an example:

{'events': [{'video_id': 'UY0nYr-dXEI',
   'timestamp': [0, 157],
   'clip_idx': 0,
   'clip_tot': 10,
   'sentence': 'After a meeting with his pyschiatrist, the doctor, the man participates in an office conga line and talks to his pets about the woman he has fallen for.',
   'duration': 157},
  {'video_id': 'pYmo3PXF_T4',
   'timestamp': [0, 155],
   'clip_idx': 1,
   'clip_tot': 10,
   'sentence': 'After he hits a deer with his car, the woman runs away from the man, causing him to lose control.',
   'duration': 155},
  {'video_id': 'H5I1DyJ3w1g',
   'timestamp': [0, 151],
   'clip_idx': 2,
   'clip_tot': 10,
   'sentence': 'the man cleans up the mess he made from killing the woman, but her severed head begins speaking to him.',
   'duration': 151},
  {'video_id': 'bJSDrRcwwKQ',
   'timestamp': [0, 149],
   'clip_idx': 3,
   'clip_tot': 10,
   'sentence': 'the man recalls his involvement in the gruesome death of his mother.',
   'duration': 149},
  {'video_id': 'xllpnvAmnHE',
   'timestamp': [0, 157],
   'clip_idx': 4,
   'clip_tot': 10,
   'sentence': "the man talks to his cat, dog, and the woman's severed head to determine if he is truly a serial killer.",
   'duration': 157},
  {'video_id': 'ml_zSw6yWOE',
   'timestamp': [0, 170],
   'clip_idx': 5,
   'clip_tot': 10,
   'sentence': 'the man tries to get the woman to calm down after she discovers that he murdered a woman.',
   'duration': 170},
  {'video_id': 'SaaTUj7m8e4',
   'timestamp': [0, 150],
   'clip_idx': 6,
   'clip_tot': 10,
   'sentence': 'After the man kills the woman, all the voices in his head offer an opinion.',
   'duration': 150},
  {'video_id': 'ZMplRnotp8M',
   'timestamp': [0, 148],
   'clip_idx': 7,
   'clip_tot': 10,
   'sentence': 'the man kidnaps the doctor and takes her to a remote field, where he forces her to give him the therapy he needs.',
   'duration': 148},
  {'video_id': 'Ax-iwIoIxjY',
   'timestamp': [0, 132],
   'clip_idx': 8,
   'clip_tot': 10,
   'sentence': 'When the police arrive at his apartment, the man escapes through the vents, causing a gas leak and explosion.',
   'duration': 132},
  {'video_id': 'Gk_2euKF9MY',
   'timestamp': [0, 160],
   'clip_idx': 9,
   'clip_tot': 10,
   'sentence': 'the man passes on and succumbs to the voices in his head, who perform a cheery song and dance number.',
   'duration': 160}],
 'hypothesis': 2,
 'split': 'train'}

As shown, all clips start at 0 and end at different timestamps. Is this expected, or are these samples erroneous?
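A minimal sketch of how such samples could be flagged is given below. It assumes the annotation layout shown in the example above (an `events` list whose entries carry `video_id`, `timestamp`, and `duration`); the helper names `spans_whole_clip` and `is_multi_video_sample` are hypothetical and not part of the repository.

```python
def spans_whole_clip(event, tol=1e-6):
    """Return True if the event's timestamp covers [0, duration]."""
    start, end = event["timestamp"]
    return abs(start) <= tol and abs(end - event["duration"]) <= tol


def is_multi_video_sample(sample):
    """Return True if the events come from more than one video and
    every event spans its whole clip, as in the example above."""
    events = sample["events"]
    video_ids = {event["video_id"] for event in events}
    return len(video_ids) > 1 and all(spans_whole_clip(e) for e in events)


# Two events copied from the example above (other fields omitted).
multi_video = {
    "events": [
        {"video_id": "UY0nYr-dXEI", "timestamp": [0, 157], "duration": 157},
        {"video_id": "pYmo3PXF_T4", "timestamp": [0, 155], "duration": 155},
    ],
}

# A sample whose events all come from one video, for contrast.
single_video = {
    "events": [
        {"video_id": "ehGHCYKzyZ8", "timestamp": [0, 2.78], "duration": 61.72},
        {"video_id": "ehGHCYKzyZ8", "timestamp": [3.09, 61.72], "duration": 61.72},
    ],
}

print(is_multi_video_sample(multi_video))
print(is_multi_video_sample(single_video))
```

Running this check over the whole training file would show how many samples follow the multi-video pattern and whether they are concentrated at the end of the split.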

Thanks!

A question about temporal overlapping between events.

Thanks for sharing the code and data for this interesting work.

I have a question after checking some samples in var_*_v1.0.json. The goal of VAR is to infer the explanation from incomplete observations. That is to say, the video content of the "explanation" event should not be exposed to the model, am I right? However, there seems to be temporal overlap between the "explanation" and "observation" events. For example, in the following sample, the index of the "explanation" event is 1 and its timestamp is [3.09, 61.72], while the timestamp of the next video clip is [15.43, 55.24]. Is this acceptable?

{
  "events": [
    {
      "video_id": "ehGHCYKzyZ8",
      "timestamp": [
        0,
        2.78
      ],
      "clip_idx": 0,
      "clip_tot": 6,
      "sentence": "The video starts with a title logo sequence.",
      "duration": 61.72
    },
    {
      "video_id": "ehGHCYKzyZ8",
      "timestamp": [
        3.09,
        61.72
      ],
      "clip_idx": 1,
      "clip_tot": 6,
      "sentence": "A man and woman are in a living room demonstrating exercises.",
      "duration": 61.72
    },
    {
      "video_id": "ehGHCYKzyZ8",
      "timestamp": [
        15.43,
        55.24
      ],
      "clip_idx": 2,
      "clip_tot": 6,
      "sentence": "The woman lays on the ground.",
      "duration": 61.72
    },
    {
      "video_id": "ehGHCYKzyZ8",
      "timestamp": [
        17.59,
        54
      ],
      "clip_idx": 3,
      "clip_tot": 6,
      "sentence": "The man starts pointing to different areas of the woman's body as she does an exercise.",
      "duration": 61.72
    },
    {
      "video_id": "ehGHCYKzyZ8",
      "timestamp": [
        39.81,
        54.62
      ],
      "clip_idx": 4,
      "clip_tot": 6,
      "sentence": "The woman begins to do small sit ups.",
      "duration": 61.72
    },
    {
      "video_id": "ehGHCYKzyZ8",
      "timestamp": [
        56.47,
        61.72
      ],
      "clip_idx": 5,
      "clip_tot": 6,
      "sentence": "The woman ends with a final title logo sequence.",
      "duration": 61.72
    }
  ],
  "hypothesis": 1,
  "split": "train"
}
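The overlap in question can be measured with a short script. This is only a sketch: it assumes, as in the reading above, that `hypothesis` holds the index of the "explanation" event, and the function names `overlap_seconds` and `explanation_overlaps` are hypothetical rather than taken from the repository.

```python
def overlap_seconds(a, b):
    """Length in seconds of the intersection of intervals a and b."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def explanation_overlaps(sample):
    """Map clip_idx -> seconds shared with the explanation event.

    Only events from the same video as the explanation are compared,
    and events with no overlap are left out of the result.
    """
    events = sample["events"]
    explanation = events[sample["hypothesis"]]  # assumed to be an index
    overlaps = {}
    for event in events:
        if event["clip_idx"] == explanation["clip_idx"]:
            continue
        if event["video_id"] != explanation["video_id"]:
            continue
        shared = overlap_seconds(explanation["timestamp"], event["timestamp"])
        if shared > 0:
            overlaps[event["clip_idx"]] = shared
    return overlaps


# Timestamps copied from the sample above (sentences omitted).
timestamps = [
    [0, 2.78], [3.09, 61.72], [15.43, 55.24],
    [17.59, 54], [39.81, 54.62], [56.47, 61.72],
]
sample = {
    "events": [
        {"video_id": "ehGHCYKzyZ8", "timestamp": ts, "clip_idx": i}
        for i, ts in enumerate(timestamps)
    ],
    "hypothesis": 1,
}

for clip_idx, seconds in explanation_overlaps(sample).items():
    print(f"clip {clip_idx} shares {seconds:.2f}s with the explanation event")
```

For this sample the script reports an overlap for clips 2 through 5 and none for clip 0, which ends before the explanation event begins.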

Some data are weird.

Some examples come from the same video, yet their events overlap heavily. For example, the start time of the second event falls within the time span of the first event.

Problems reproducing the results in the paper

Hi! How can I reproduce the results reported in the paper? I get a CIDEr of 34.97 for observed events and 37.68 for explanation events. I have run the code several times, and the results are similar.

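Cannot find where to run testing or where to load the trained model

Hello, I am reproducing your code for my graduation project. I have already trained a model with run.py, but I cannot find where testing is done or where the trained model is loaded. Am I supposed to simply change the evaluate_mode parameter in run.py to test? That does not feel right to me either, because the training set is also loaded in that case.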

The raw videos

I have downloaded the raw videos and concatenated them, but I could not obtain all of the videos. Could you provide them as a zip archive?