Comments (5)
Hi @msra-jqxu,
Thank you for your interest in our work. Some of the video files were corrupted in our case, which could be why some CLIP feature files are missing and why the counts do not match. You can try skipping these videos, as we did in our experiments.
The filtering script we use is attached below for your reference. Note that it takes an additional command-line argument, --clip_feature_path. Let me know whether it solves the issue or if you have any further questions. Thank you.
import os
import json
import argparse


def parse_args():
    parser = argparse.ArgumentParser(description="Training")
    parser.add_argument("--input_json_file", required=True,
                        help="Path to input json file (i.e. VideoInstruct_Dataset.json)")
    parser.add_argument("--output_json_file", required=True,
                        help="Path to output json file (i.e. VideoInstruct_Dataset_Train.json)")
    parser.add_argument("--clip_feature_path", required=False, default="",
                        help="Path to generated CLIP feature paths to filter any missing video ids (optional).")
    args = parser.parse_args()
    return args


def main():
    args = parse_args()
    input_json_file = args.input_json_file
    output_json_file = args.output_json_file
    clip_feature_path = args.clip_feature_path

    # Collect the video ids (file names without extension) that have CLIP features.
    clip_features_files_without_extension = set()
    if clip_feature_path:
        clip_features_files = os.listdir(clip_feature_path)
        for file in clip_features_files:
            clip_features_files_without_extension.add(file.split('.')[0])

    with open(input_json_file, 'r') as f:
        input_json_contents = json.load(f)

    output_json_contents = []
    for i, content in enumerate(input_json_contents):
        # Without --clip_feature_path every annotation is kept; otherwise only
        # annotations whose video has a feature file are kept.
        valid = False
        if not clip_feature_path:
            valid = True
        elif content['video_id'] in clip_features_files_without_extension:
            valid = True

        if valid:
            output_content = {'id': content['video_id'], 'video': f"{content['video_id']}.pkl", 'conversations': []}
            # This is critical: alternate the position of the <video> token,
            # after the question for even indices and before it for odd ones.
            if i % 2 == 0:
                output_content['conversations'].append({'from': 'human', 'value': f"{content['q']}\n<video>"})
            else:
                output_content['conversations'].append({'from': 'human', 'value': f"<video>\n{content['q']}"})
            output_content['conversations'].append({'from': 'gpt', 'value': content['a']})
            output_json_contents.append(output_content)

    print(f"Total annotations retained: {len(output_json_contents)}")
    with open(output_json_file, 'w') as f:
        json.dump(output_json_contents, f)


if __name__ == "__main__":
    main()
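Before filtering, it may help to see which annotated videos have no feature file at all. The following is only a minimal sketch, not part of the repository: the helper name `find_missing_video_ids` and the sample ids are hypothetical, and it assumes each annotation carries a `video_id` key and that feature files are named `<video_id>.pkl`, as in the script above.

```python
import os
import tempfile


def find_missing_video_ids(annotations, feature_dir):
    """Return the sorted video ids that are annotated but have no feature file.

    Hypothetical helper: strips extensions the same way as the filtering
    script above (everything after the first dot is dropped).
    """
    available = {name.split('.')[0] for name in os.listdir(feature_dir)}
    annotated = {item['video_id'] for item in annotations}
    return sorted(annotated - available)


if __name__ == "__main__":
    # Self-contained demo with made-up ids; no real dataset is touched.
    with tempfile.TemporaryDirectory() as feature_dir:
        for name in ("v_a.pkl", "v_b.pkl"):
            open(os.path.join(feature_dir, name), "wb").close()
        annotations = [
            {"video_id": "v_a", "q": "What happens?", "a": "Something."},
            {"video_id": "v_c", "q": "What happens?", "a": "Something else."},
        ]
        print(find_missing_video_ids(annotations, feature_dir))
```

In practice `annotations` would be the list loaded from the input JSON file and `feature_dir` the directory passed as --clip_feature_path; the ids returned are the ones the filtering script would drop.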
from video-chatgpt.
Hi @mmaaz60,
The filtering script works for me, and I can now train the model successfully. Thanks very much!
By the way, I specify the parameter --model_name_or_path while training, and this warning appears during training: "You are using a model of type llava to instantiate a model of type VideoChatGPT. This is not supported for all configurations of models and can yield errors." Is this normal?
Hi @msra-jqxu,
This is normal. Thank you.
Thanks again! I will close this issue as completed.
Hi @mmaaz60 @hanoonaR,
I noticed that the official website lists 10024 videos in the training set of ActivityNet200, but I found 13329 videos in the training set I downloaded from here. Could you explain where those 3,000+ extra videos came from? Thanks!