
paco's Introduction

PACO: Parts and Attributes of Common Objects


PACO is a detection dataset that goes beyond traditional object boxes and masks and provides richer annotations such as part masks and attributes. It spans 75 object categories, 456 object-part categories and 55 attributes across image (LVIS) and video (Ego4D) datasets. We provide 641K part masks annotated across 260K object boxes, with roughly half of them exhaustively annotated with attributes as well. We design evaluation metrics and provide benchmark results for three tasks on the datasets: part mask segmentation, object and part attribute prediction and zero-shot instance detection. See the paper and dataset doc for more details.

This repository contains data loaders, training and evaluation scripts for joint object-part-attribute detection models, query evaluation scripts, and visualization notebooks for the PACO dataset.

[Paper][Website]

Setup

Library Setup

To set up the library, run:

git clone git@github.com:facebookresearch/paco.git
cd paco/
conda create --name paco python=3.9
conda activate paco
pip install torch==1.10.2+cu113 torchvision==0.11.3+cu113 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt
pip install -e .

Test whether the setup was successful:

(paco) ~/paco$ python
Python 3.9.15 (main, Nov 24 2022, 14:31:59)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import paco
>>>

Dataset setup

To train and evaluate on the PACO dataset using this repo, please download the train2017 and val2017 COCO images (PACO-LVIS uses them), the PACO annotations, and the PACO-EGO4D frames. Then set up the following environment variables:

conda env config vars set PACO_ANNOTATION_ROOT=/path/to/paco/annotations
conda env config vars set PACO_IMAGES_ROOT=/path/to/paco/images
conda env config vars set COCO_IMAGES_ROOT=/path/to/coco/images
conda activate paco
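
As an optional sanity check (not part of the repo's tooling), you can confirm from Python that each variable resolves to an existing directory:

import os

# Hypothetical sanity check: confirm each dataset variable points to a real directory.
for var in ("PACO_ANNOTATION_ROOT", "PACO_IMAGES_ROOT", "COCO_IMAGES_ROOT"):
    path = os.environ.get(var)
    if not path or not os.path.isdir(path):
        raise RuntimeError(f"{var} is unset or not a directory: {path}")
    print(f"{var} -> {path}")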

To download PACO annotations:

cd /path/to/paco/annotations
wget https://dl.fbaipublicfiles.com/paco/annotations/paco_lvis_v1.zip
wget https://dl.fbaipublicfiles.com/paco/annotations/paco_ego4d_v1.zip
diff <(sha256sum paco_lvis_v1.zip) <(echo "02ac4edb22c251e07853e6231d69aec3fad0a180f03de2f8c880650322debc80  paco_lvis_v1.zip")
diff <(sha256sum paco_ego4d_v1.zip) <(echo "5be2b35ce32f1b76ffafa54a075dd660baa1816d446736ad5a23897848719515  paco_ego4d_v1.zip")
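
If sha256sum is not available on your system, the same verification can be done with a short Python snippet (expected digests copied from above):

import hashlib

def sha256(path, chunk_size=1 << 20):
    """Stream a file and return its SHA-256 hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

expected = {
    "paco_lvis_v1.zip": "02ac4edb22c251e07853e6231d69aec3fad0a180f03de2f8c880650322debc80",
    "paco_ego4d_v1.zip": "5be2b35ce32f1b76ffafa54a075dd660baa1816d446736ad5a23897848719515",
}
for name, digest in expected.items():
    assert sha256(name) == digest, f"checksum mismatch for {name}"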

If the SHA-256 checksums match, unzip the archives:

unzip paco_lvis_v1.zip
unzip paco_ego4d_v1.zip
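
To get a quick look at what was unpacked, a rough sketch along these lines can be used; the file name and the LVIS-style top-level keys are assumptions based on the dataset names used elsewhere in this README:

import json
import os

# File name assumed to follow the dataset naming (paco_lvis_v1_train / _val / _test).
ann_path = os.path.join(os.environ["PACO_ANNOTATION_ROOT"], "paco_lvis_v1_train.json")
with open(ann_path) as f:
    data = json.load(f)

print("top-level keys:", sorted(data))
# LVIS-style datasets typically carry these lists; adjust if the keys differ.
for key in ("images", "annotations", "categories"):
    if key in data:
        print(f"{key}: {len(data[key])} entries")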

To download PACO-EGO4D frames:

  1. Review and accept the terms of the Ego4D usage agreement (Ego4D Dataset). It takes 48 hours to obtain the credentials needed for frame download.
  2. Download the frames:
ego4d --output_directory /temp/folder --version v1 --datasets paco_frames
mv /temp/folder/v1/paco_frames /path/to/paco/images

Usage

Pre-trained models

For links to pre-trained models and their corresponding performance, see our model zoo. The evaluation commands in the next section assume that the models are downloaded into the ./models folder in the paco repo root.

Example commands

Training

To train an R101 FPN model on the joint PACO dataset on a single node with 8 GPUs, run

./tools/lazyconfig_train_net.py --config-file ./configs/mask_rcnn_configs/r101_attr_fpn_100_ep.py --num-gpus 8

To train the same model on PACO-LVIS only, run

./tools/lazyconfig_train_net.py --config-file ./configs/mask_rcnn_configs/r101_attr_fpn_100_ep.py --num-gpus 8 dataloader.train.dataset.names=paco_lvis_v1_train

To train the same model under SLURM on 8 nodes, each with 8 GPUs (32 GB), run

python3 ./tools/multi_node_training.py \
    --config-file ./configs/mask_rcnn_configs/r101_attr_fpn_100_ep.py \
    --num-gpus 8 \
    --num-machines 8 \
    --use-volta32 \
    --name "r101_100_ep" \
    --target "./tools/lazyconfig_train_net.py" \
    --job-dir "./output/r101_100_ep/"

Eval

To evaluate a pre-trained ViT-L FPN model on PACO-EGO4D on a single node with 8 GPUs, run

./tools/lazyconfig_train_net.py --config-file ./configs/mask_rcnn_configs/vit_l_fpn_100_ep.py --eval-only --num-gpus 8 train.init_checkpoint=./models/vit_l_fpn_joint.pth dataloader.test.dataset.names=paco_ego4d_v1_test

Query eval

To evaluate a pre-trained ViT-L FPN model on PACO-EGO4D query dataset on a single node with 8 GPUs, run

./tools/lazyconfig_train_net.py --config-file ./configs/query_eval_configs/vit_l_fpn_query_eval.py --eval-only --num-gpus 8 train.init_checkpoint=./models/vit_l_fpn_joint.pth dataloader.test.dataset.names=paco_ego4d_v1_test

Citation

@inproceedings{ramanathan2023paco,
  title={{PACO}: Parts and Attributes of Common Objects},
  author={Ramanathan, Vignesh and Kalia, Anmol and Petrovic, Vladan and Wen, Yi and Zheng, Baixue and Guo, Baishan and Wang, Rui and Marquez, Aaron and Kovvuri, Rama and Kadian, Abhishek and Mousavi, Amir and Song, Yiwen and Dubey, Abhimanyu and Mahajan, Dhruv},
  booktitle={arXiv preprint arXiv:2301.01795},
  year={2023}
}

License

Copyright (c) Meta Platforms, Inc. and affiliates.

This source code is licensed under the license found in the LICENSE file in the root directory of this source tree.


paco's Issues

Apply part segmentation on our own data

Hi! Thanks a lot for your amazing benchmark and code!

I wonder if you could provide some instructions on how to get part segmentations on our own data using the models you provide. It seems that right now the repo only offers instructions for inference on LVIS and Ego4D.

Thanks!
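
There are no official instructions for arbitrary images yet, but one possible (untested) approach is to build the model directly with detectron2's LazyConfig API. The config path below is taken from the training examples above, the checkpoint file name is a placeholder, and the minimal pre-processing here skips the augmentations defined in the config:

import cv2
import torch
from detectron2.checkpoint import DetectionCheckpointer
from detectron2.config import LazyConfig, instantiate

# Build the model from a lazy config and load a model-zoo checkpoint.
# r101_fpn_joint.pth is a placeholder name; use the checkpoint you downloaded.
cfg = LazyConfig.load("./configs/mask_rcnn_configs/r101_attr_fpn_100_ep.py")
model = instantiate(cfg.model).eval().cuda()
DetectionCheckpointer(model).load("./models/r101_fpn_joint.pth")

# Read an image as BGR HWC uint8; check the config for the expected channel order.
img = cv2.imread("your_image.jpg")
height, width = img.shape[:2]
tensor = torch.as_tensor(img.astype("float32").transpose(2, 0, 1))
with torch.no_grad():
    outputs = model([{"image": tensor, "height": height, "width": width}])
instances = outputs[0]["instances"]
print(instances.pred_classes, instances.scores, instances.pred_masks.shape)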

Typo in paper

Great work! I think in Section 3 you meant to say "belong to small and medium" size in this sentence (instead of "low").


Score Calculation

Hi, I am interested in analyzing open-vocabulary detectors with the zero-shot instance detection task. I read in the paper (Appendix H) how Detic and MDETR were used, but I wanted to clarify if the query scores for the open-vocabulary detectors were computed in the same way as the PACO-trained models (Appendix G). If so, this repo supports such evaluation of open-vocabulary detectors, correct?

Thank you for your time and consideration.

test json

The paper reports results on test splits, but where can I get paco_lvis_v1_test.json to reproduce the results of the paper?

How to visualize the PACO dataset?

Thanks for your great work! Is there a function to visualize the PACO-format dataset? I tried using the COCO-related functions, but they don't work.
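
There is no dedicated visualization helper in the repo at the moment; as a rough sketch, the LVIS-style annotations can be drawn directly with matplotlib (the annotation file name, the "coco_url"-based image lookup, and polygon-format segmentations are all assumptions):

import json
import os
import random

import matplotlib.pyplot as plt
from matplotlib.patches import Polygon

with open(os.path.join(os.environ["PACO_ANNOTATION_ROOT"], "paco_lvis_v1_val.json")) as f:
    data = json.load(f)

images = {im["id"]: im for im in data["images"]}
ann = random.choice(data["annotations"])      # assumes polygon segmentation, not RLE
im = images[ann["image_id"]]

# LVIS-style image records reference COCO images via a URL; derive "val2017/xxx.jpg".
rel_path = im.get("file_name") or "/".join(im["coco_url"].split("/")[-2:])
plt.imshow(plt.imread(os.path.join(os.environ["COCO_IMAGES_ROOT"], rel_path)))
ax = plt.gca()
for poly in ann["segmentation"]:              # flat [x1, y1, x2, y2, ...] lists
    xy = [(poly[i], poly[i + 1]) for i in range(0, len(poly), 2)]
    ax.add_patch(Polygon(xy, fill=False, edgecolor="red", linewidth=2))
plt.axis("off")
plt.show()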

Resume training from the last checkpoint

Hi,

Is there any way to resume training from the latest checkpoint, with the iteration number parsed from that checkpoint? Detectron2 has this functionality with the --resume switch; however, it does not work with this PACO baseline.

Thanks,
