
Comments (5)

mmaaz60 commented on August 10, 2024

Hi @msra-jqxu,

Thank you for your interest in our work. Some of the video files were corrupted in our case, which could be why the CLIP feature files are missing and why you see the mismatch. You can try skipping these videos, as we did in our experiments.

The filtering script we use is attached below for your reference. Note that it takes an additional command-line argument, --clip_feature_path. Let me know if it solves the issue or if you have any further questions. Thank you.

import os
import json
import argparse


def parse_args():
    parser = argparse.ArgumentParser(description="Training")

    parser.add_argument("--input_json_file", required=True,
                        help="Path to input json file (i.e. VideoInstruct_Dataset.json)")
    parser.add_argument("--output_json_file", required=True,
                        help="Path to output json file (i.e. VideoInstruct_Dataset_Train.json)")
    parser.add_argument("--clip_feature_path", required=False, default="",
                        help="Path to generated CLIP feature paths to filter any missing video ids (optional).")

    args = parser.parse_args()

    return args


def main():
    args = parse_args()
    input_json_file = args.input_json_file
    output_json_file = args.output_json_file
    clip_feature_path = args.clip_feature_path

    # Collect the ids of all videos that have a CLIP feature file
    # (the file name without its extension).
    clip_feature_ids = set()
    if clip_feature_path:
        for file_name in os.listdir(clip_feature_path):
            clip_feature_ids.add(os.path.splitext(file_name)[0])

    with open(input_json_file, 'r') as f:
        input_json_contents = json.load(f)
    output_json_contents = []
    for i, content in enumerate(input_json_contents):
        # Keep every annotation when no feature path is given; otherwise keep
        # only annotations whose video has a feature file.
        valid = not clip_feature_path or content['video_id'] in clip_feature_ids

        if valid:
            output_content = {'id': content['video_id'], 'video': f"{content['video_id']}.pkl", 'conversations': []}
            # This is critical: alternate whether the <video> token comes after or before the question
            if i % 2 == 0:
                output_content['conversations'].append({'from': 'human', 'value': f"{content['q']}\n<video>"})
            else:
                output_content['conversations'].append({'from': 'human', 'value': f"<video>\n{content['q']}"})
            output_content['conversations'].append({'from': 'gpt', 'value': content['a']})
            output_json_contents.append(output_content)

    print(f"Total annotations retained: {len(output_json_contents)}")
    with open(output_json_file, 'w') as f:
        json.dump(output_json_contents, f)


if __name__ == "__main__":
    main()
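
To see which videos the filter above would drop before running it, a small check along the following lines may help. This is only a sketch: the function name find_missing_video_ids and the sample ids are hypothetical and not part of the repository, and it assumes, as the script does, that each feature file is named after its video id plus an extension.

```python
import os
import tempfile


def find_missing_video_ids(annotations, clip_feature_dir):
    """Return the sorted video ids in `annotations` that have no feature file."""
    # File names without their extension, e.g. "v_demo1.pkl" -> "v_demo1".
    available = {os.path.splitext(name)[0] for name in os.listdir(clip_feature_dir)}
    annotated = {item["video_id"] for item in annotations}
    return sorted(annotated - available)


if __name__ == "__main__":
    # Made-up example: two annotated videos, but only one feature file on disk.
    with tempfile.TemporaryDirectory() as feature_dir:
        open(os.path.join(feature_dir, "v_demo1.pkl"), "wb").close()
        annotations = [
            {"video_id": "v_demo1", "q": "What happens?", "a": "A demo."},
            {"video_id": "v_demo2", "q": "What happens?", "a": "Another demo."},
        ]
        print(find_missing_video_ids(annotations, feature_dir))
```

With real data, the annotations would come from json.load on the input JSON file and clip_feature_dir would be the directory passed as --clip_feature_path.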

from video-chatgpt.

msra-jqxu commented on August 10, 2024

Hi @mmaaz60,
The filtering script works for me, and I can now train the model successfully. Thanks very much!

By the way, I specify the parameter --model_name_or_path while training, and this message appears during training: "You are using a model of type llava to instantiate a model of type VideoChatGPT. This is not supported for all configurations of models and can yield errors." Is this normal?


mmaaz60 commented on August 10, 2024

Hi @msra-jqxu,

This is normal. Thank you.


msra-jqxu commented on August 10, 2024

Thanks again! I will close this issue as completed.


msra-jqxu commented on August 10, 2024

Hi @mmaaz60 @hanoonaR,
I noticed that the official website lists 10024 videos in the ActivityNet200 training set, but I found 13329 videos in the training set I downloaded from here. Could you explain where those 3,000+ extra videos came from? Thanks!
