
Scene Graph Benchmark in PyTorch 1.7

This project is based on maskrcnn-benchmark


Highlights

  • Upgraded to PyTorch 1.7
  • Multi-GPU training and inference
  • Batched inference: can perform inference using multiple images per batch per GPU
  • Fast and flexible TSV dataset format
  • Removed the Faster R-CNN detector dependency: during relation head training, bounding boxes from any detector can be plugged in
  • Provides pre-trained models for different scene graph detection algorithms (IMP, MSDN, GRCNN, Neural Motif, RelDN)
  • Provides bounding-box-level and relation-level feature extraction functionality
  • Provides large detector backbones (ResNeXt152)

Installation

Check INSTALL.md for installation instructions.

Model Zoo and Baselines

Pre-trained models can be found in SCENE_GRAPH_MODEL_ZOO.md

Visualization and Demo

We provide a helper class to simplify writing inference pipelines with pre-trained models (currently only objects and attributes are supported). Run the following commands:

# visualize VinVL object detection
# pretrained models at https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/vinvl_vg_x152c4.pth
# the associated labelmap at https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/VG-SGG-dicts-vgoi6-clipped.json
python tools/demo/demo_image.py --config_file sgg_configs/vgattr/vinvl_x152c4.yaml --img_file demo/woman_fish.jpg --save_file output/woman_fish_x152c4.obj.jpg MODEL.WEIGHT pretrained_model/vinvl_vg_x152c4.pth MODEL.ROI_HEADS.NMS_FILTER 1 MODEL.ROI_HEADS.SCORE_THRESH 0.2 TEST.IGNORE_BOX_REGRESSION False

# visualize VinVL object-attribute detection
# pretrained models at https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/vinvl_vg_x152c4.pth
# the associated labelmap at https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/VG-SGG-dicts-vgoi6-clipped.json
python tools/demo/demo_image.py --config_file sgg_configs/vgattr/vinvl_x152c4.yaml --img_file demo/woman_fish.jpg --save_file output/woman_fish_x152c4.attr.jpg --visualize_attr MODEL.WEIGHT pretrained_model/vinvl_vg_x152c4.pth MODEL.ROI_HEADS.NMS_FILTER 1 MODEL.ROI_HEADS.SCORE_THRESH 0.2 TEST.IGNORE_BOX_REGRESSION False

# visualize OpenImage scene graph generation by RelDN
# pretrained models at https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/sgg_model_zoo/sgg_oi_vrd_model_zoo/RX152FPN_reldn_oi_best.pth
python tools/demo/demo_image.py --config_file sgg_configs/vrd/R152FPN_vrd_reldn.yaml --img_file demo/1024px-Gen_Robert_E_Lee_on_Traveler_at_Gettysburg_Pa.jpg --save_file output/1024px-Gen_Robert_E_Lee_on_Traveler_at_Gettysburg_Pa.reldn_relation.jpg --visualize_relation MODEL.ROI_RELATION_HEAD.DETECTOR_PRE_CALCULATED False

# visualize Visual Genome scene graph generation by neural motif
python tools/demo/demo_image.py --config_file sgg_configs/vg_vrd/rel_danfeiX_FPN50_nm.yaml --img_file demo/1024px-Gen_Robert_E_Lee_on_Traveler_at_Gettysburg_Pa.jpg --save_file demo/1024px-Gen_Robert_E_Lee_on_Traveler_at_Gettysburg_Pa_vgnm.jpg --visualize_relation MODEL.ROI_RELATION_HEAD.DETECTOR_PRE_CALCULATED False DATASETS.LABELMAP_FILE "visualgenome/VG-SGG-dicts-danfeiX-clipped.json" DATA_DIR /home/penzhan/GitHub/maskrcnn-benchmark-1/datasets1 MODEL.ROI_RELATION_HEAD.USE_BIAS True MODEL.ROI_RELATION_HEAD.FILTER_NON_OVERLAP True MODEL.ROI_HEADS.DETECTIONS_PER_IMG 64 MODEL.ROI_RELATION_HEAD.SHARE_BOX_FEATURE_EXTRACTOR False MODEL.ROI_RELATION_HEAD.NEURAL_MOTIF.OBJ_LSTM_NUM_LAYERS 0 MODEL.ROI_RELATION_HEAD.NEURAL_MOTIF.EDGE_LSTM_NUM_LAYERS 2 TEST.IMS_PER_BATCH 2

Perform training

For the following examples to work, you need to first install this repo.

You will also need to download the datasets. They can be downloaded with azcopy using the following command:

path/to/azcopy copy 'https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/datasets/TASK_NAME' <target folder> --recursive

TASK_NAME can be visualgenome or openimages_v5c.
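For example, substituting visualgenome into the template above downloads the Visual Genome data into datasets/:

path/to/azcopy copy 'https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/datasets/visualgenome' datasets/ --recursive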

We recommend symlinking the dataset path to datasets/ as follows:

# symlink the dataset
cd ~/github/maskrcnn-benchmark
mkdir -p datasets/openimages_v5c/
ln -s /vrd datasets/openimages_v5c/vrd

You can also prepare your own datasets.

Follow the TSV dataset creation instructions in tools/mini_tsv/README.md.
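For orientation, here is a minimal sketch of what that script does, assuming the usual VinVL TSV conventions (the folder layout, file names, and the dummy label below are placeholder assumptions; treat tools/mini_tsv/tsv_demo.py as the authoritative reference):

import base64
import json
import os

import cv2
from maskrcnn_benchmark.structures.tsv_file_ops import tsv_writer

img_dir = "datasets/my_dataset/imgs"  # hypothetical input folder of jpg images
rows, rows_label, rows_hw = [], [], []
for fname in sorted(os.listdir(img_dir)):
    key = os.path.splitext(fname)[0]
    img = cv2.imread(os.path.join(img_dir, fname), cv2.IMREAD_COLOR)
    # images are stored as base64-encoded jpg bytes, one row per image
    encoded = base64.b64encode(cv2.imencode(".jpg", img)[1]).decode("utf-8")
    rows.append([key, encoded])
    # dummy whole-image box; replace with real annotations if you have them
    labels = [{"rect": [0, 0, img.shape[1], img.shape[0]], "class": "dummy"}]
    rows_label.append([key, json.dumps(labels)])
    rows_hw.append([key, json.dumps([{"height": img.shape[0], "width": img.shape[1]}])])

tsv_writer(rows, "datasets/my_dataset/train.img.tsv")
tsv_writer(rows_label, "datasets/my_dataset/train.label.tsv")
tsv_writer(rows_hw, "datasets/my_dataset/train.hw.tsv")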

Single GPU training

python tools/train_sg_net.py --config-file "/path/to/config/file.yaml"

This should work out of the box and is very similar to what we do for multi-GPU training. The drawback is that it will use much more GPU memory: the configuration files set a global batch size that is divided over the number of GPUs, so with only a single GPU the per-GPU batch size will be 4x larger, which might lead to out-of-memory errors.
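If you do train on a single GPU, you can override the global batch size (and, per the linear scaling rule, the learning rate) on the command line. A hedged example; the SOLVER keys follow the maskrcnn-benchmark convention, and the exact values depend on the config you start from:

# illustrative single-GPU override: quarter the 4-GPU global batch size and base LR
python tools/train_sg_net.py --config-file "/path/to/config/file.yaml" SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0025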

Multi-GPU training

Internally, we use torch.distributed.launch to launch multi-GPU training. This PyTorch utility spawns as many Python processes as the number of GPUs we want to use, and each Python process uses a single GPU.

export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/train_sg_net.py --config-file "path/to/config/file.yaml" 

Evaluation

You can test your model directly on a single GPU or on multiple GPUs. To evaluate relations, you need to output "relation_scores_all" in TSV_SAVE_SUBSET. Here are a few example command lines for evaluating on 4 GPUs:

export NGPUS=4

python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/test_sg_net.py --config-file CONFIG_FILE_PATH 

# vg IMP evaluation
python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/test_sg_net.py --config-file sgg_configs/vg_vrd/rel_danfeiX_FPN50_imp.yaml

# vg MSDN evaluation
python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/test_sg_net.py --config-file sgg_configs/vg_vrd/rel_danfeiX_FPN50_msdn.yaml

# vg neural motif evaluation
python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/test_sg_net.py --config-file sgg_configs/vg_vrd/rel_danfeiX_FPN50_nm.yaml

# vg GRCNN evaluation
python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/test_sg_net.py --config-file sgg_configs/vg_vrd/rel_danfeiX_FPN50_grcnn.yaml

# vg RelDN evaluation
python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/test_sg_net.py --config-file sgg_configs/vg_vrd/rel_danfeiX_FPN50_reldn.yaml

# oi IMP evaluation
python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/test_sg_net.py --config-file sgg_configs/oi_vrd/R152FPN_imp_bias_oi.yaml

# oi MSDN evaluation
python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/test_sg_net.py --config-file sgg_configs/oi_vrd/R152FPN_msdn_bias_oi.yaml

# oi neural motif evaluation
python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/test_sg_net.py --config-file sgg_configs/oi_vrd/R152FPN_motif_oi.yaml

# oi GRCNN evaluation
python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/test_sg_net.py --config-file sgg_configs/oi_vrd/R152FPN_grcnn_oi.yaml

# oi RelDN evaluation
python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/test_sg_net.py --config-file sgg_configs/vrd/R152FPN_vrd_reldn.yaml

To evaluate in sgcls mode:

export NGPUS=4

python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/test_sg_net.py --config-file CONFIG_FILE_PATH MODEL.ROI_BOX_HEAD.FORCE_BOXES True MODEL.ROI_RELATION_HEAD.MODE "sgcls"

To evaluate in predcls mode:

export NGPUS=4

python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/test_sg_net.py --config-file CONFIG_FILE_PATH MODEL.ROI_RELATION_HEAD.MODE "predcls"

To evaluate with ground truth bbox and ground truth pairs:

export NGPUS=4

python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/test_sg_net.py --config-file CONFIG_FILE_PATH MODEL.ROI_RELATION_HEAD.FORCE_RELATIONS True

Abstractions

For more information on some of the main abstractions in our implementation, see ABSTRACTIONS.md.

Adding your own dataset

This implementation adds support for TSV-style datasets. Adding support for training on a new dataset can be done as follows:

from maskrcnn_benchmark.data.datasets.relation_tsv import RelationTSVDataset

class MyDataset(RelationTSVDataset):
    def __init__(self, yaml_file, extra_fields=(), transforms=None,
            is_load_label=True, **kwargs):

        super(MyDataset, self).__init__(yaml_file, extra_fields, transforms, is_load_label, **kwargs)
    
    def your_own_function(self, idx, call=False):
        # you can overwrite function or add your own functions this way
        pass

That's it. You can also add extra fields to the boxlist, such as segmentation masks (using structures.segmentation_mask.SegmentationMask), or even your own instance type.
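As a hedged illustration of attaching an extra field (BoxList is the maskrcnn_benchmark structure; the field name and values here are just examples):

import torch
from maskrcnn_benchmark.structures.bounding_box import BoxList

# one box in (x1, y1, x2, y2) order on a 640x480 image
boxes = torch.tensor([[10.0, 10.0, 50.0, 80.0]])
target = BoxList(boxes, (640, 480), mode="xyxy")
# attach any per-box field; it is retrieved later with get_field("labels")
target.add_field("labels", torch.tensor([3]))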

For a full example of how the VGTSVDataset is implemented, check maskrcnn_benchmark/data/datasets/vg_tsv.py.

Once you have created your dataset, it needs to be added in a couple of places, such as the evaluation dispatch described next.

Adding your own evaluation

To enable your dataset for testing, add a corresponding if statement in maskrcnn_benchmark/data/datasets/evaluation/__init__.py:

if isinstance(dataset, datasets.MyDataset):
    return your_evaluation(**args)
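For reference, a hedged sketch of where that branch lives: the evaluate() dispatcher in maskrcnn_benchmark/data/datasets/evaluation/__init__.py roughly follows this shape in upstream maskrcnn-benchmark (the argument names here are assumptions):

from maskrcnn_benchmark.data import datasets

def evaluate(dataset, predictions, output_folder, **kwargs):
    # bundle everything the dataset-specific evaluator might need
    args = dict(dataset=dataset, predictions=predictions,
                output_folder=output_folder, **kwargs)
    if isinstance(dataset, datasets.MyDataset):
        return your_evaluation(**args)
    raise NotImplementedError("Unsupported dataset type {}.".format(type(dataset)))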

VinVL Feature extraction

The output features will be encoded as base64.

# extract vision features with VinVL object-attribute detection model
# pretrained models at https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/vinvl_vg_x152c4.pth
# the associated labelmap at https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/VG-SGG-dicts-vgoi6-clipped.json
python tools/test_sg_net.py --config-file sgg_configs/vgattr/vinvl_x152c4.yaml TEST.IMS_PER_BATCH 2 MODEL.WEIGHT models/vinvl/vinvl_vg_x152c4.pth MODEL.ROI_HEADS.NMS_FILTER 1 MODEL.ROI_HEADS.SCORE_THRESH 0.2 DATA_DIR "../maskrcnn-benchmark-1/datasets1" TEST.IGNORE_BOX_REGRESSION True MODEL.ATTRIBUTE_ON True
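Once the predictions TSV is written, the base64-encoded features can be decoded back into float32 vectors. A minimal sketch, assuming the usual VinVL convention of one JSON payload per image row with a base64 "feature" string per object (the file name and exact column layout are assumptions; inspect your output TSV to confirm):

import base64
import json

import numpy as np
from maskrcnn_benchmark.structures.tsv_file_ops import tsv_reader

# "output/predictions.tsv" is a placeholder for the generated prediction file
for image_key, payload in tsv_reader("output/predictions.tsv"):
    for obj in json.loads(payload)["objects"]:
        # each feature is a base64-encoded float32 buffer
        feature = np.frombuffer(base64.b64decode(obj["feature"]), np.float32)
        print(image_key, obj["class"], feature.shape)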

To extract relation features (the union bounding box's features), set TEST.OUTPUT_RELATION_FEATURE to True in the yaml file and add relation_feature to TEST.TSV_SAVE_SUBSET.

To extract bounding box features, set TEST.OUTPUT_FEATURE to True in the yaml file and add feature to TEST.TSV_SAVE_SUBSET.
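For reference, a hedged sketch of the relevant TEST block in the yaml file (these keys all appear in this repo's configs; the exact subset you save is up to you):

TEST:
  OUTPUT_FEATURE: True              # per-box features
  OUTPUT_RELATION_FEATURE: True     # union-box relation features
  SAVE_RESULTS_TO_TSV: True
  TSV_SAVE_SUBSET: ['rect', 'class', 'conf', 'feature', 'relation_feature']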

Troubleshooting

If you have issues running or compiling this code, we have compiled a list of common issues in TROUBLESHOOTING.md. If your issue is not present there, please feel free to open a new issue.

Citations

Please consider citing this project in your publications if it helps your research. The following is a BibTeX reference. The BibTeX entry requires the url LaTeX package.

@misc{han2021image,
      title={Image Scene Graph Generation (SGG) Benchmark}, 
      author={Xiaotian Han and Jianwei Yang and Houdong Hu and Lei Zhang and Jianfeng Gao and Pengchuan Zhang},
      year={2021},
      eprint={2107.12604},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

License

This project, like maskrcnn-benchmark, is released under the MIT license. See LICENSE for additional details.

Acknowledgement

scene_graph_benchmark's People

Contributors

hanxiaotian, microsoftopensource, pzzhang


scene_graph_benchmark's Issues

Extracting features for a directory of images

I'm slightly confused by the instructions on how to extract features from the test_sg_net.py script - more specifically:

  • what format the data directory has to be in (i.e. something more than just a directory containing images) and
  • what variables need to be set in the sgg_configs/vgattr/vinvl_x152c4.yaml config in order to point to my data directory (I'm not sure what the variables DATASETS.TEST and DATA_DIR need to be - I assumed the latter was the directory with all the images I need features for, but that does not seem to be the case)

How to extract relation features

I use the model offered in VinVL Feature extraction. But after using the same settings as advised, I can only obtain bounding box features. So how do I extract relation features? Thanks

Cannot run VinVL feature extraction command

I ran the following command (from the README):

python tools/test_sg_net.py --config-file sgg_configs/vgattr/vinvl_x152c4.yaml TEST.IMS_PER_BATCH 2 MODEL.WEIGHT models/vinvl/vinvl_vg_x152c4.pth MODEL.ROI_HEADS.NMS_FILTER 1 MODEL.ROI_HEADS.SCORE_THRESH 0.2 DATA_DIR "../maskrcnn-benchmark-1/datasets1" TEST.IGNORE_BOX_REGRESSION True MODEL.ATTRIBUTE_ON True

I think DATA_DIR is misconfigured, because I get the error below. Where does ../maskrcnn-benchmark-1/datasets1 come from? Or the file visualgenome/test_vgoi6_clipped.yaml, which I think it's looking for?

This is the AssertionError:

Traceback (most recent call last):
  File "tools/test_sg_net.py", line 197, in <module>
    main()
  File "tools/test_sg_net.py", line 193, in main
    run_test(cfg, model, args.distributed, model_name)
  File "tools/test_sg_net.py", line 55, in run_test
    data_loaders_val = make_data_loader(cfg, is_train=False, is_distributed=distributed)
  File "/scene_graph_benchmark/maskrcnn_benchmark/data/build.py", line 170, in make_data_loader
    datasets = build_dataset(cfg, transforms, DatasetCatalog, is_train or is_for_period)
  File "/scene_graph_benchmark/maskrcnn_benchmark/data/build.py", line 45, in build_dataset
    cfg, dataset_name, factory_name, is_train
  File "/scene_graph_benchmark/maskrcnn_benchmark/data/datasets/utils/config_args.py", line 7, in config_tsv_dataset_args
    assert op.isfile(full_yaml_file)
AssertionError

Thanks in advance!

Slow feature extraction compared to bottom-up-attention

Hi, thanks for the great work and open-sourcing this project.

I'm excited to try VinVL since, as written in the paper, it promises faster feature extraction than bottom-up-attention.

I created my own TSV file using tsv_demo.py and ran tools/test_sg_net.py to do feature extraction.
Sadly, the feature extraction runs quite slowly.
Right now I'm using PyTorch 1.7 on Debian 10 with 1 Nvidia T4.
The feature extraction process took 9 seconds / 4 images.

I used bottom-up-attention from https://github.com/airsplay/py-bottom-up-attention and https://github.com/peteanderson80/bottom-up-attention while using OSCAR on the same dataset. These repos give much faster feature extraction (the first needs 2.7 seconds / 8 images, while the original Caffe bottom-up took less than 1 second per image) on a similar machine. This contradicts what is written in your paper.

Here are some key config values that I'm using while running tools/test_sg_net.py:

TEST:
    IMS_PER_BATCH: 4
    IGNORE_BOX_REGRESSION: True
    SKIP_PERFORMANCE_EVAL: True
    SAVE_PREDICTIONS: True
    SAVE_RESULTS_TO_TSV: True
    TSV_SAVE_SUBSET: ['rect', 'class', 'conf', 'feature']
    GATHER_ON_CPU: True
    OUTPUT_FEATURE : True

I checked nvidia-smi and it shows my GPU is working.

Does anyone else have this issue?

setup.py fails with link.exe error LNK1181 on the Windows platform

Dear author:
Thanks for your great work. When I use your "setup.py" on the Win10 platform, I get the error from link.exe: "fatal error LNK1181: cannot open ROIAlign.obj". I tried VS2015 and VS2019 (VC++ 14.0 and 16.0), and both give the same problem. I checked the file path and ROIAlign.obj is indeed missing. Can you tell me how to fix this?
It works on the Ubuntu platform, though.
Thanks very much; looking forward to your reply.


Why does the program freeze when running with multiple GPUs?

python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/train_sg_net.py --config-file "sgg_configs/vg_vrd/rel_danfeiX_FPN50_reldn.yaml"


Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.


Several issues when extracting VinVL features

After installing your environment step by step via option 1 and running the command below,

# extract vision features with VinVL object-attribute detection model
# pretrained models at https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/vinvl_vg_x152c4.pth
# the associated labelmap at https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/VG-SGG-dicts-vgoi6-clipped.json
python tools/test_sg_net.py --config-file sgg_configs/vgattr/vinvl_x152c4.yaml TEST.IMS_PER_BATCH 2 MODEL.WEIGHT models/vinvl/vinvl_vg_x152c4.pth MODEL.ROI_HEADS.NMS_FILTER 1 MODEL.ROI_HEADS.SCORE_THRESH 0.2 DATA_DIR "../maskrcnn-benchmark-1/datasets1" TEST.IGNORE_BOX_REGRESSION True MODEL.ATTRIBUTE_ON True

There are several issues:

  1. I cannot find the code that loads the pre-trained model parameters for "AttrRCNN". Although the command includes a pre-trained model path, I cannot find a concrete torch.load() call while debugging. So I wonder whether I need to add torch.load() myself when running the above command.

  2. The "self.training" flag in "AttrRCNN" is inherited from "torch.nn.modules.module.py" and is set to True by default. But when running the feature-extraction command above, it seems it should be False, and I had to overwrite the init functions of AttrRCNN, its "self.rpn", and "self.roi_heads" as below:

 proposals, proposal_losses = self.rpn(images, features, targets, is_training = self.training)
  x, predictions, detector_losses = self.roi_heads(features,  proposals, targets, is_training = self.training) 

  3. Instead of PyTorch 1.4, I use PyTorch 1.7, but it always gives runtime errors for several in-place operations, such as the following code in "bounding_box.py":

    def clip_to_image(self, remove_empty=True):
        TO_REMOVE = 1
        self.bbox[:, 0].clamp_(min=0, max=self.size[0] - TO_REMOVE)
        self.bbox[:, 1].clamp_(min=0, max=self.size[1] - TO_REMOVE)
        self.bbox[:, 2].clamp_(min=0, max=self.size[0] - TO_REMOVE)
        self.bbox[:, 3].clamp_(min=0, max=self.size[1] - TO_REMOVE)

The error is as below:

  File "/home/jfhe/Documents/MountHe/jfhe/mm_dialogue/MM_Dialogue/scene_graph_benchmark/maskrcnn_benchmark/modeling/rpn/rpn.py", line 188, in _forward_test
    boxes = self.box_selector_test(anchors, objectness, rpn_box_regression)
  File "/home/jfhe/anaconda3/envs/JD2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/jfhe/Documents/MountHe/jfhe/mm_dialogue/MM_Dialogue/scene_graph_benchmark/maskrcnn_benchmark/modeling/rpn/inference.py", line 140, in forward
    sampled_boxes.append(self.forward_for_single_feature_map(a, o, b))
  File "/home/jfhe/Documents/MountHe/jfhe/mm_dialogue/MM_Dialogue/scene_graph_benchmark/maskrcnn_benchmark/modeling/rpn/inference.py", line 114, in forward_for_single_feature_map
    boxlist = boxlist.clip_to_image(remove_empty=False)
  File "/home/jfhe/Documents/MountHe/jfhe/mm_dialogue/MM_Dialogue/scene_graph_benchmark/maskrcnn_benchmark/structures/bounding_box.py", line 217, in clip_to_image
    self.bbox[:, 1].clamp_(min=0, max=self.size[1] - TO_REMOVE)
RuntimeError: Output 0 of UnbindBackward is a view and its base or another view of its base has been modified inplace. This view is the output of a function that returns multiple views. Such functions do not allow the output views to be modified inplace. You should replace the inplace operation by an out-of-place one.

I addressed these by wrapping the calls in "with torch.no_grad()", but that feels like a workaround. If I have to fine-tune the model, these bugs will come back once "with torch.no_grad()" is removed.

  4. Also, I agree with the question in #25:
    Could you please provide a simpler way to extract the VinVL features directly? It would bring much help to the community, and we will definitely cite your work.

Can you add some GCN methods from recent CVPR?

I find that the five methods in the comparisons all predate 2020, but the mainstream methods in 2020 and 2021 are graph neural networks. So I hope your team can add some of the latest methods to this project. Thanks.

Hi, I found a way to circumvent using tsv files by modifying `scene_graph_benchmark/tools/demo/demo_image.py`, and now I only need a `jpg` image dataset, the VinVL yaml configuration file, and the model weight file. The predictions are saved in a dictionary and stored in `pth` format. I ran it on Google Colab and it generates predictions at a rate of about 2 s/image. I hope this helps.

# pretrained models at https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/vinvl_vg_x152c4.pth
# the associated labelmap at https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/VG-SGG-dicts-vgoi6-clipped.json

import cv2
import os
import os.path as op
import argparse
import json
import torch  # needed for torch.no_grad() and torch.save() below
from PIL import Image


from scene_graph_benchmark.scene_parser import SceneParser
from scene_graph_benchmark.AttrRCNN import AttrRCNN
from maskrcnn_benchmark.data.transforms import build_transforms
from maskrcnn_benchmark.utils.checkpoint import DetectronCheckpointer
from maskrcnn_benchmark.config import cfg
from scene_graph_benchmark.config import sg_cfg
from maskrcnn_benchmark.data.datasets.utils.load_files import \
    config_dataset_file
from maskrcnn_benchmark.data.datasets.utils.load_files import load_labelmap_file
from maskrcnn_benchmark.utils.miscellaneous import mkdir

def cv2Img_to_Image(input_img):
    cv2_img = input_img.copy()
    img = cv2.cvtColor(cv2_img, cv2.COLOR_BGR2RGB)
    img = Image.fromarray(img)
    return img


def detect_objects_on_single_image(model, transforms, cv2_img):
    # cv2_img is the original input, so we can get the height and 
    # width information to scale the output boxes.
    img_input = cv2Img_to_Image(cv2_img)
    img_input, _ = transforms(img_input, target=None)
    img_input = img_input.to(model.device)

    with torch.no_grad():
        prediction = model(img_input)[0].to('cpu')
    #     prediction = prediction[0].to(torch.device("cpu"))

    img_height = cv2_img.shape[0]
    img_width = cv2_img.shape[1]

    prediction = prediction.resize((img_width, img_height))
    
    return prediction

#Setting configuration
cfg.set_new_allowed(True)
cfg.merge_from_other_cfg(sg_cfg)
cfg.set_new_allowed(False)
#Configuring VinVl
cfg.merge_from_file('/scene_graph_benchmark/sgg_configs/vgattr/vinvl_x152c4.yaml')

#This list specifies values for additional config options, as ordered key/value pairs
#MODEL.WEIGHT specifies the full path of the VinVL weight pth file
#DATA_DIR specifies the directory that contains the VinVL input tsv configuration yaml file
argument_list = [
                 'MODEL.WEIGHT', 'vinvl_vg_x152c4.pth',
                 'MODEL.ROI_HEADS.NMS_FILTER', 1,
                 'MODEL.ROI_HEADS.SCORE_THRESH', 0.2, 
                 'TEST.IGNORE_BOX_REGRESSION', False,
                 'MODEL.ATTRIBUTE_ON', True
                 ]
cfg.merge_from_list(argument_list)
cfg.freeze()

#     assert op.isfile(args.img_file), \
#         "Image: {} does not exist".format(args.img_file)

output_dir = cfg.OUTPUT_DIR
#     mkdir(output_dir)

model = AttrRCNN(cfg)
model.to(cfg.MODEL.DEVICE)
model.eval()

checkpointer = DetectronCheckpointer(cfg, model, save_dir=output_dir)
checkpointer.load(cfg.MODEL.WEIGHT)

transforms = build_transforms(cfg, is_train=False)

input_img_directory = 'insert your images directory path here'
#need to be pth
output_prediction_file = 'insert your output pth file path here'
dets = {}
for img_name in os.listdir(input_img_directory):
  #Convert png format to jpg format
  if img_name.split('.')[1]=='png' or img_name.split('.')[1]=='PNG':
    im = Image.open(os.path.join(input_img_directory, img_name))
    rgb_im = im.convert('RGB')
    new_name = img_name.split('.')[0]+'.jpg'
    rgb_im.save(os.path.join(input_img_directory, new_name))
    print(new_name)

  img_file_path = os.path.join(input_img_directory,img_name.split('.')[0]+'.jpg')
  print(img_file_path)

  cv2_img = cv2.imread(img_file_path)

  det = detect_objects_on_single_image(model, transforms, cv2_img)
  
#   prediction contains ['labels',
#  'scores',
#  'box_features',
#  'scores_all',
#  'boxes_all',
#  'attr_labels',
#  'attr_scores']
# box_features are used for oscar

  det_dict = {key: det.get_field(key) for key in det.fields()}

  dets[img_name.split('.')[0]] = det_dict


torch.save(dets, output_prediction_file)

Originally posted by @SPQRXVIII001 in #7 (comment)

demo yaml

Can you provide flickr30k/tsv/flickr30k.yaml as specified in the vgattr/vinvl_x152c4.yaml?

python tools/test_sg_net.py --config-file sgg_configs/vgattr/vinvl_x152c4.yaml TEST.IMS_PER_BATCH 2 MODEL.WEIGHT models/vinvl/vinvl_vg_x152c4.pth MODEL.ROI_HEADS.NMS_FILTER 1 MODEL.ROI_HEADS.SCORE_THRESH 0.2 DATA_DIR "../maskrcnn-benchmark-1/datasets1" TEST.IGNORE_BOX_REGRESSION True MODEL.ATTRIBUTE_ON True

Issue when attempting to generate image features

Hi,

Thank you for providing the code for this project. I was trying to generate image features. I attempted to follow both examples, from #25 and #7,
as follows:

With a directory of 18 images (stored in datasets/test_imgs), I used tools/mini_tsv/demo_tsv.py to generate the TSV files (label, hw, linelist) for the corresponding dataset and stored them in datasets/test/. Since I didn't have a particular labelmap in mind, and I had downloaded the checkpoint for the RelDN model and its corresponding config file, I used the labelmap VG-SGG-dicts-vgoi6-clipped.json (copied into the same directory), so my yaml file is as follows:

datasets/test/test_imgs.yaml

img: test_imgs.tsv
label: test_imgs_label.tsv
hw: test_imgs_hw.tsv
label_map: VG-SGG-dicts-vgoi6-clipped.json
linelist: test_imgs_linelist.tsv

Then I made a new yaml file, datasets/test/testing.yaml, which was the same as rel_danfeiX_FPN50_reldn.yaml but with DATASETS.TRAIN = ("test/test_imgs.yaml",) and DATASETS.TEST = ("test/test_imgs.yaml",), and ran the command:

python -m torch.distributed.launch --nproc_per_node=2 tools/test_sg_net.py --config-file datasets/test/testing.yaml

This ran into the error:

2021-07-16 04:08:02,996 maskrcnn_benchmark.inference INFO: Start evaluation on test/test_imgs.yaml dataset(18 images).
INFO:maskrcnn_benchmark.inference:Start evaluation on test/test_imgs.yaml dataset(18 images).
0%| | 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "tools/test_sg_net.py", line 198, in <module>
    main()
  File "tools/test_sg_net.py", line 194, in main
    run_test(cfg, model, args.distributed, model_name)
  File "tools/test_sg_net.py", line 73, in run_test
    save_predictions=cfg.TEST.SAVE_PREDICTIONS,
  File "/home/f-run/PyCharmProjects/scene_graph_benchmark/maskrcnn_benchmark/engine/inference.py", line 265, in inference
    predictions = compute_on_dataset(model, data_loader, device, bbox_aug, inference_timer)
  File "/home/f-run/PyCharmProjects/scene_graph_benchmark/maskrcnn_benchmark/engine/inference.py", line 32, in compute_on_dataset
    for _, batch in enumerate(tqdm(data_loader)):
  File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/tqdm/std.py", line 1185, in __iter__
    for obj in iterable:
  File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()
  File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
    return self._process_data(data)
  File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
    data.reraise()
  File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/f-run/PyCharmProjects/scene_graph_benchmark/maskrcnn_benchmark/data/datasets/relation_tsv.py", line 146, in __getitem__
    target = self.get_target_from_annotations(annotations, img_size)
  File "/home/f-run/PyCharmProjects/scene_graph_benchmark/maskrcnn_benchmark/data/datasets/relation_tsv.py", line 78, in get_target_from_annotations
    target = self.label_loader(annotations['objects'], img_size, remove_empty=False)
TypeError: list indices must be integers or slices, not str

This is very strange to me, since when I run the command without using my own dataset I don't run into this issue at all. Is there anything I did incorrectly that could cause this error, and if so, how can I fix it?

This is my first issue, so forgive me if this is too much or too little info, or if it is better suited to Stack Overflow.

missing model weight when training on visualgenome

I attempted to train the RelDN model on Visual Genome by executing this command:

python tools/train_sg_net.py --config-file sgg_configs/vg_vrd/rel_danfeiX_FPN50_reldn.yaml

and get the following error:

FileNotFoundError: [Errno 2] No such file or directory: 'models/vgvrd/vgnm_usefpTrue_objctx0_edgectx2/model_final.pth'

I searched the repository but found no information about the file vgnm_usefpTrue_objctx0_edgectx2/model_final.pth. Could you please provide more details about where we can download this model file? Thanks in advance.

config_file for train_sg_net.py

Hi,

I am trying to train a model following "python tools/train_sg_net.py --config-file "/path/to/config/file.yaml"". But I am confused about which config file I need to use; can you share more details about the training config files?

Thank you very much!

Question about image ids

I thought the key in the tsv file ( visualgenome/label_danfeiX_overlap.new.tsv ) represented the image ID, but I found out that it does not.

from maskrcnn_benchmark.structures.tsv_file_ops import tsv_reader
import glob

# get image ids from Visual Genome image files
img_files = glob.glob('/VisualGenome/VG_100K/*.jpg') + glob.glob('/VisualGenome/VG_100K_2/*.jpg')
image_ids_from_files = [img_file.split('/')[-1].split('.')[0] for img_file in img_files]

# get image ids from the scene graph annotation files
tsv = tsv_reader('datasets/visualgenome/label_danfeiX_overlap.new.tsv')
image_ids_from_annos = [row[0] for row in tsv]

# extract the overlap between the two sets of image ids
overlapped_iid = set(image_ids_from_annos) & set(image_ids_from_files)

print(f'size of iid from files: {len(image_ids_from_files)}')
print(f'size of iid from annotations: {len(image_ids_from_annos)}')
print(f'size of overlapped iid: {len(overlapped_iid)}')

The output of this code looks like this:

size of iid from files: 108249
size of iid from annotations: 108073
size of overlapped iid: 5196

I think this shows that about 95% of the image ids are mismatched.
How can I get the correct image id mapping?

Guide to run train_net.py

Hi, I want to train the detector with train_net.py; could you please give me some guidance? How should I organize the data, and how do I pass the parameters? Thanks. @pzzhang

Index error when accessing box features for RelDN model

Hi there,
I'm trying to use the model sgg_configs/vg_vrd/rel_danfeiX_FPN50_reldn.yaml to generate scene graphs for a custom dataset. The bounding box proposal step works fine; however, there seems to be a bug in the way the proposal_pairs are computed. In particular, I get the following exception:

Traceback (most recent call last):
  File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/pytorch_lightning/profiler/profilers.py", line 103, in profile
    yield action_name
  File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1088, in run_predict
    self.predict_loop.predict_step(batch, batch_idx, dataloader_idx)
  File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/pytorch_lightning/trainer/predict_loop.py", line 111, in predict_step
    predictions = self.trainer.accelerator.predict_step(args)
  File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 265, in predict_step
    return self.training_type_plugin.predict_step(*args)
  File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 167, in predict_step
    return self.lightning_module.predict_step(*args, **kwargs)
  File "/Users/asuglia/workspace/scene_graph_benchmark/tools/video/extract_features.py", line 311, in predict_step
    predictions = self.model(images)
  File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/asuglia/workspace/scene_graph_benchmark/scene_graph_benchmark/scene_parser.py", line 319, in forward
    x_pairs, prediction_pairs, relation_losses = self.relation_head(features, predictions, targets)
  File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/asuglia/workspace/scene_graph_benchmark/scene_graph_benchmark/relation_head/relation_head.py", line 211, in forward
    = self.rel_predictor(features, proposals, proposal_pairs)
  File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/asuglia/workspace/scene_graph_benchmark/scene_graph_benchmark/relation_head/reldn/reldn.py", line 103, in forward
    sub_vert_per_image = proposal_per_image.get_field("subj_box_features")[rel_ind_i[:, 0]]
IndexError: index 49 is out of bounds for dimension 0 with size 38

This seems to be due to the fact that the bounding box indexes in rel_ind_i go out of bounds because there are fewer bounding boxes in proposal_per_image.get_field("subj_box_features"). In this specific case, proposal_per_image.get_field("subj_box_features") has shape torch.Size([38, 1024]), while rel_ind_i[:, 0] contains the following indexes:

tensor([49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 50, 50, 50, 50, 50, 50,
        50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 51, 51, 51,
        51, 51, 51, 51, 51, 51, 51, 51, 51, 51, 51, 51, 51, 51, 52, 52, 52, 52,
        52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52,
        52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 53, 53, 53,
        53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53,
        53, 53, 53, 53, 53, 53, 53, 53, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54,
        54, 54, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55,
        55, 55, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56,
        57, 57, 57, 57, 57, 57, 57, 57, 57, 57, 57, 58, 58, 58, 58, 58, 58, 58,
        58, 58, 58, 58, 58, 58, 58, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59,
        59, 59, 59, 60, 60, 60, 60, 60, 60, 60, 60, 60, 60, 60, 60, 60, 61, 61,
        61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61,
        61, 61, 61, 62, 62, 62, 62, 62, 62, 62, 62, 62, 62, 62, 62, 62, 62, 63,
        63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 64, 64, 64, 64, 64,
        64, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65,
        65, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 67, 67, 67, 67,
        67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67,
        67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 68, 68, 68, 68,
        68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68,
        68, 68, 68, 68, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69,
        69, 69, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 71, 71,
        71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 72, 72, 72, 72,
        72, 72, 72, 72, 72, 72, 72, 72, 72, 73, 73, 73, 73, 73, 73, 73, 73, 73,
        73, 73, 73, 73, 73, 73, 73, 73, 73, 73, 73, 73, 73, 73, 74, 74, 74, 74,
        74, 74, 74, 74, 74, 74, 74, 74, 74, 74, 74, 75, 75, 75, 75, 75, 75, 75,
        75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
        75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 76, 76, 76, 76, 76, 76, 76,
        76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76,
        76, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 78, 78, 78, 78, 78, 78, 78,
        78, 79, 79, 79, 79, 79, 79, 79, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
        80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
        80, 80, 80, 80, 80, 80, 80, 80, 80, 81, 81, 81, 81, 81, 81, 81, 81, 81,
        81, 81, 81, 81, 81, 81, 81, 81, 81, 81, 81, 82, 82, 82, 82, 82, 82, 82,
        82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82,
        82, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83,
        83, 83, 83, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 85, 85, 85, 85,
        85, 85, 85, 85, 85, 85, 85, 85, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86,
        86, 86, 86, 86, 86, 86, 86, 86])

I'm running this code with batch size 2. I thought the error might be in the way the proposal_pairs are generated. However, the exception happens when either of these lines is executed to generate the proposal_pairs:

  1. https://github.com/microsoft/scene_graph_benchmark/blob/main/scene_graph_benchmark/relation_head/relation_head.py#L187
  2. https://github.com/microsoft/scene_graph_benchmark/blob/main/scene_graph_benchmark/relation_head/relation_head.py#L185

Do you think an offset is required here: https://github.com/microsoft/scene_graph_benchmark/blob/main/scene_graph_benchmark/relation_head/reldn/reldn.py#L98?

@hanxiaotian Could you please advise?

RuntimeError: CUDA error: invalid device function

When I try to run

python tools/test_sg_net.py --config-file sgg_configs/vgattr/vinvl_x152c4.yaml TEST.IMS_PER_BATCH 1 MODEL.WEIGHT models/vinvl/vinvl_vg_x152c4.pth MODEL.ROI_HEADS.NMS_FILTER 1 \
MODEL.ROI_HEADS.SCORE_THRESH 0.2 DATA_DIR /my_path_to_prepard_tsv/dataset/tsv TEST.IGNORE_BOX_REGRESSION True MODEL.ATTRIBUTE_ON True TEST.OUTPUT_FEATURE True

with environment

PyTorch version: 1.4.0
Is debug build: No
CUDA used to build PyTorch: 10.1

OS: Ubuntu 16.04.7 LTS
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
CMake version: version 3.5.1

Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: 10.1.243
GPU models and configuration: 
GPU 0: Tesla V100-SXM2-32GB
GPU 1: Tesla V100-SXM2-32GB
GPU 2: Tesla V100-SXM2-32GB
GPU 3: Tesla V100-SXM2-32GB
GPU 4: Tesla V100-SXM2-32GB
GPU 5: Tesla V100-SXM2-32GB
GPU 6: Tesla V100-SXM2-32GB
GPU 7: Tesla V100-SXM2-32GB

Nvidia driver version: 418.67
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5

Versions of relevant libraries:
[pip3] numpy==1.19.2
[pip3] torch==1.4.0
[pip3] torchvision==0.5.0
[conda] blas                      1.0                         mkl  
[conda] mkl                       2020.2                      256  
[conda] mkl-service               2.3.0            py37he8ac12f_0  
[conda] mkl_fft                   1.3.0            py37h54f3939_0  
[conda] mkl_random                1.1.1            py37h0573a6f_0  
[conda] pytorch                   1.4.0           py3.7_cuda10.1.243_cudnn7.6.3_0    pytorch
[conda] torchvision               0.5.0                py37_cu101    pytorch
        Pillow (8.2.0)

I encountered the following error:

RuntimeError: CUDA error: invalid device function (launch_kernel at /opt/conda/conda-bld/pytorch_1579022060824/work/aten/src/ATen/native/cuda/Loops.cuh:103)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x47 (0x7f3f8ee64627 in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: void at::native::gpu_index_kernel<__nv_dl_wrapper_t<__nv_dl_tag<void (*)(at::TensorIterator&, c10::ArrayRef<long>, c10::ArrayRef<long>), &(void at::native::index_kernel_impl<at::native::OpaqueType<8> >(at::TensorIterator&, c10::ArrayRef<long>, c10::ArrayRef<long>)), 1u>> >(at::TensorIterator&, c10::ArrayRef<long>, c10::ArrayRef<long>, __nv_dl_wrapper_t<__nv_dl_tag<void (*)(at::TensorIterator&, c10::ArrayRef<long>, c10::ArrayRef<long>), &(void at::native::index_kernel_impl<at::native::OpaqueType<8> >(at::TensorIterator&, c10::ArrayRef<long>, c10::ArrayRef<long>)), 1u>> const&) + 0x78d (0x7f3f9670368d in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #2: <unknown function> + 0x571bf32 (0x7f3f966fcf32 in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x571c298 (0x7f3f966fd298 in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #4: <unknown function> + 0x16957eb (0x7f3f926767eb in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #5: at::native::index(at::Tensor const&, c10::ArrayRef<at::Tensor>) + 0x47e (0x7f3f926725ae in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #6: <unknown function> + 0x1c0155a (0x7f3f92be255a in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #7: <unknown function> + 0x1c06023 (0x7f3f92be7023 in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #8: <unknown function> + 0x3820d1a (0x7f3f94801d1a in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #9: <unknown function> + 0x1c06023 (0x7f3f92be7023 in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #10: at::Tensor::index(c10::ArrayRef<at::Tensor>) const + 0x191 (0x7f3fc1465931 in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #11: nms_cuda(at::Tensor, float) + 0x7e8 (0x7f3f6982407b in ./maskrcnn_benchmark/_C.cpython-37m-x86_64-linux-gnu.so)
frame #12: nms(at::Tensor const&, at::Tensor const&, float) + 0x790 (0x7f3f697eabb0 in ./maskrcnn_benchmark/_C.cpython-37m-x86_64-linux-gnu.so)
frame #13: <unknown function> + 0x53b97 (0x7f3f697fbb97 in ./maskrcnn_benchmark/_C.cpython-37m-x86_64-linux-gnu.so)
frame #14: <unknown function> + 0x5004d (0x7f3f697f804d in ./maskrcnn_benchmark/_C.cpython-37m-x86_64-linux-gnu.so)
<omitting python frames>

I tried setting up my environment both via option 1 and via the docker image; both environments give me the same error.
If anyone else has the same issue, please share any guidance.

ModelZoo contains broken links

demo_image.py not working

Hi,

I am running the following code:

python tools/demo/demo_image.py --config_file sgg_configs/vgattr/vinvl_x152c4.yaml --img_file women_fish.jpg --save_file output/woman_fish_x152c4.obj.jpg MODEL.WEIGHT models/vinvl/vinvl_vg_x152c4.pth MODEL.ROI_HEADS.NMS_FILTER 1 MODEL.ROI_HEADS.SCORE_THRESH 0.2 DATA_DIR "." TEST.IGNORE_BOX_REGRESSION False

Here is the error:

    rel_subj_centers = [r['subj_center'] for r in rel_dets]
UnboundLocalError: local variable 'rel_dets' referenced before assignment

I believe the bug is in the line:

if isinstance(model, SceneParser):

KeyError: 'box_features' --- extracting features from own image tsv files

Hi - I am getting a KeyError 'box_features' message when trying to extract features from my own images. I've played around with the code but can't figure it out. If I set TEST.OUTPUT_FEATURE to False, the code runs fine, but it only outputs the detected objects. Can someone please help out?

For reference, the demo extraction on a single image works fine, for both object detection and box features.

Traceback (most recent call last):
  File "tools/test_sg_net.py", line 197, in <module>
    main()
  File "tools/test_sg_net.py", line 193, in main
    run_test(cfg, model, args.distributed, model_name)
  File "tools/test_sg_net.py", line 72, in run_test
    save_predictions=cfg.TEST.SAVE_PREDICTIONS,
  File "/content/drive/MyDrive/scene_graph_benchmark/maskrcnn_benchmark/engine/inference.py", line 297, in inference
    relation_on=cfg.MODEL.RELATION_ON,
  File "/content/drive/MyDrive/scene_graph_benchmark/maskrcnn_benchmark/engine/inference.py", line 211, in convert_predictions_to_tsv
    tsv_writer(gen_rows(), os.path.join(output_folder, output_tsv_name))
  File "/content/drive/MyDrive/scene_graph_benchmark/maskrcnn_benchmark/structures/tsv_file_ops.py", line 42, in tsv_writer
    for value in values:
  File "/content/drive/MyDrive/scene_graph_benchmark/maskrcnn_benchmark/engine/inference.py", line 139, in gen_rows
    features = prediction.get_field('box_features').numpy()
  File "/content/drive/MyDrive/scene_graph_benchmark/maskrcnn_benchmark/structures/bounding_box.py", line 43, in get_field
    return self.extra_fields[field]
KeyError: 'box_features'

Broken links to VinVL model and associated labelmaps

Hi!
These links are broken (resource not found error):
https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/vinvl_vg_x152c4.pth
https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/VG-SGG-dicts-vgoi6-clipped.json

When I extract image features with VinVL, an AssertionError occurs

Hello! When I extract image features with VinVL using the command:

python tools/test_sg_net.py --config-file sgg_configs/vgattr/vinvl_x152c4.yaml TEST.IMS_PER_BATCH 2 MODEL.WEIGHT models/vinvl/vinvl_vg_x152c4.pth MODEL.ROI_HEADS.NMS_FILTER 1 MODEL.ROI_HEADS.SCORE_THRESH 0.2 DATA_DIR "../maskrcnn-benchmark-1/datasets1" TEST.IGNORE_BOX_REGRESSION True MODEL.ATTRIBUTE_ON True TEST.OUTPUT_FEATURE True

I get the traceback:

Traceback (most recent call last):
  File "tools/test_sg_net.py", line 129, in <module>
    main()
  File "tools/test_sg_net.py", line 106, in main
    data_loaders_val = make_data_loader(cfg, is_train=False, is_distributed=distributed)
  File "/content/drive/MyDrive/VinVL/scene_graph_benchmark/scene_graph_benchmark/maskrcnn_benchmark/data/build.py", line 170, in make_data_loader
    datasets = build_dataset(cfg, transforms, DatasetCatalog, is_train or is_for_period)
  File "/content/drive/MyDrive/VinVL/scene_graph_benchmark/scene_graph_benchmark/maskrcnn_benchmark/data/build.py", line 45, in build_dataset
    cfg, dataset_name, factory_name, is_train
  File "/content/drive/MyDrive/VinVL/scene_graph_benchmark/scene_graph_benchmark/maskrcnn_benchmark/data/datasets/utils/config_args.py", line 7, in config_tsv_dataset_args
    assert op.isfile(full_yaml_file)
AssertionError

I guess the reason is that the file flickr30k/tsv/flickr30k.yaml does not exist, so where can I find this yaml file?

I also tried deleting the line
TEST: ("flickr30k/tsv/flickr30k.yaml",)
in the file vinvl_x152c4.yaml,
but after running the code there is no output.

Is there a bug in extracting visual features?

Hi,
I ran demo_image.py but found something inconsistent.
In the code, the image is converted to RGB format before being fed into the detection model.

But I found that in the config file the PIXEL_MEAN is [103.530, 116.280, 123.675], which is actually in BGR order, and in tsv_demo.py the image is also read in BGR format by cv2.
So I am confused: which is right?

KeyError: 'gt_labels' during training

When attempting to perform training using tools/train_sg_net.py and a config file like sgg_configs/vg_vrd/rel_danfeiX_FPN50_nm.yaml, I receive the following error:

Traceback (most recent call last):
  File "tools/train_sg_net.py", line 225, in <module>
    main()
  File "tools/train_sg_net.py", line 218, in main
    model = train(cfg, args.local_rank, args.distributed)
  File "tools/train_sg_net.py", line 110, in train
    meters
  File "/home/scene_graph_benchmark/maskrcnn_benchmark/engine/trainer.py", line 94, in do_train
    loss_dict = model(images, targets)
  File "/home/miniconda3/envs/sg_benchmark/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/miniconda3/envs/sg_benchmark/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 619, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/miniconda3/envs/sg_benchmark/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/scene_graph_benchmark/scene_graph_benchmark/scene_parser.py", line 319, in forward
    x_pairs, prediction_pairs, relation_losses = self.relation_head(features, predictions, targets)
  File "/home/miniconda3/envs/sg_benchmark/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/scene_graph_benchmark/scene_graph_benchmark/relation_head/relation_head.py", line 211, in forward
    = self.rel_predictor(features, proposals, proposal_pairs)
  File "/home/miniconda3/envs/sg_benchmark/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/scene_graph_benchmark/scene_graph_benchmark/relation_head/neural_motif/neuralmotif.py", line 135, in forward
    = _get_tensor_from_boxlist(proposals, 'gt_labels')
  File "/home/scene_graph_benchmark/scene_graph_benchmark/relation_head/sparse_targets.py", line 61, in _get_tensor_from_boxlist
    assert proposals[0].extra_fields[field] is not None
KeyError: 'gt_labels'

Has anyone else encountered the same problem?

Res152c4 on 4 datasets seems not right

VinVL's DOWNLOAD.md says: We also provide the X152-C4 object detection config file and pretrained model on the merged four datasets (COCO with stuff, Visual Genome, Objects365 and Open Images). The labelmap to decode the 1848 classes can be found here. The first 1594 classes are exactly the VG classes, in the same order. The map from the COCO vocabulary to this merged vocabulary can be found here. The map from the Objects365 vocabulary to this merged vocabulary can be found here. The map from the OpenImages V5 vocabulary to this merged vocabulary can be found here.

But I am wondering how to run this pretrained model.
Obviously Scene Graph Benchmark can't run it as-is, since the configuration file is not compatible with this package. I forced the config file to load (deleting options one by one until yacs accepted it) and managed to run the pre-trained model, but the results are not right: the number of boxes is too small compared to other detectors (which should not be the case) ...

Any help please?

Pre-extracted Image Features: what OD model is used?

Hi,
Here, we can easily use the pre-extracted image features.

And I thought these features were from the VinVL OD model trained on the merged four datasets: COCO with stuff, Visual Genome, Objects365 and Open Images.

However, I found that the features and corresponding labels (object tags) are only from the Visual Genome dataset, which gives inferior performance compared to the merged four datasets (according to the VinVL paper).

So I want to clarify whether the given image features are from the pretrained X152-C4 object-attribute detection model (based only on the Visual Genome dataset) or from the pretrained model on the merged four datasets.

Thanks

predcls ValueError: object labelmap is required, but was not provided

Hi, I want some suggestions about using more object labels. The project provides a labelmap file, but it only contains 50 objects, which is quite small. When I run on COCO 2014, the object label count (1370) is far bigger than that. So should we just add some ids and names to the labelmap file, or should we train the project's object detector from scratch? I have the COCO 2014 36-box labels, but I don't know how to get a bigger object labelmap file.

If you have a way, please help me. Thanks.

KeyError: 'broccoli'
Killing subprocess 4356

About attribute and object labels for a designated bounding box

Dear scholar,
I want to ask whether your elegant code includes functionality for producing the attribute and object labels for a designated bounding box.
Your tools/demo_image.py can produce 36 bounding boxes with attribute and object labels using the model with the pretrained weights file.

Could your code accept a boxlist as input, so that the model produces the designated bounding boxes' attribute and object labels?

How to generate the predicted object attributes and relation labels together?

@hanxiaotian Hi, thanks a lot for releasing the great SGG benchmark! I want to extract the predicted scene graph (with predicted boxes, attributes and relations) from scratch for new images using the pre-trained models in the model zoo. However, when I try to use the demo script, I notice the model cannot predict attributes and relations together (the VinVL pre-trained model only predicts the attributes and the RelDN pre-trained model only generates relations, where I need to do further ad-hoc alignment with the two outputs to predict a full scene graph). Is there any way to achieve this using a single provided pre-trained model? (Sorry I'm not very familiar with this task.) Thank you very much!

About VinVL R50-C4 Model

Could you please share your VinVL R50-C4 detection model, which is referred to in Oscar+? I want to do some further experiments with this model.
Thank you!

Image file in VinVL example script

I failed to find the image source when trying to execute the VinVL example script. Where can I get the image file in the script "--img_file ../maskrcnn-benchmark-1/datasets1/imgs/woman_fish.jpg"?

Can VinVL be used for relation prediction?

I found that VinVL's object and attribute label sets are much bigger. So how can VinVL be used for predicate classification? Right now this project only provides 150 objects, but Visual Genome + Faster R-CNN can detect 1370 object classes, which is a big difference.

fail to setup

The error "#error "You're running a too old version of GCC. We need GCC 5 or later."" occurs when I run the setup build.
