
coco-rem's Introduction

COCO-ReM (COCO with Refined Masks)

Framework: PyTorch HuggingFace Datasets

Shweta Singh, Aayan Yadav, Jitesh Jain, Humphrey Shi, Justin Johnson, Karan Desai

Equal Contribution

[arxiv] [Dataset Website]

Random examples from COCO-ReM

Introducing COCO-ReM, a set of high-quality instance annotations for COCO images. COCO-ReM fixes imperfections prevalent in COCO-2017, such as coarse mask boundaries, non-exhaustive annotations, inconsistent handling of occlusions, and duplicate masks. Masks in COCO-ReM are of visibly better quality than those in COCO-2017, as shown below.

COCO and COCO-ReM

Contents

  1. News
  2. Setup Instructions
  3. Download COCO-ReM
  4. Mask Visualization
  5. Evaluation using COCO-ReM
  6. Training with COCO-ReM
  7. Annotation Pipeline
  8. Citation

News

Setup Instructions

Clone the repository, create a conda environment, and install all dependencies as follows:

git clone https://github.com/kdexd/coco-rem.git && cd coco-rem
conda create -n coco_rem python=3.10
conda activate coco_rem

Install PyTorch and torchvision following the instructions on pytorch.org, then install Detectron2 following its official installation instructions. Finally, install the remaining dependencies:

pip install -r requirements.txt
pip install git+https://github.com/facebookresearch/segment-anything.git
pip install git+https://github.com/bowenc0221/boundary-iou-api.git

python setup.py develop

Download COCO-ReM

COCO-ReM is hosted on Hugging Face Datasets at @kdexd/coco-rem. Download the annotation files:

for name in trainrem valrem; do
    wget https://huggingface.co/datasets/kdexd/coco-rem/resolve/main/instances_$name.json.zip
    unzip instances_$name.json.zip
done
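To sanity-check a downloaded file, the following minimal sketch counts images, annotations, and instances per category. It assumes the files follow the standard COCO instance annotation layout (top-level `images`, `annotations`, and `categories` keys); the function names `summarize_instances` and `summarize_file` are hypothetical helpers, not part of this repository.

```python
import json
from collections import Counter


def summarize_instances(coco_dict):
    """Count images, annotations, and instances per category name.

    Assumes the standard COCO instance format: each annotation has a
    "category_id" that refers to an "id" in the "categories" list.
    """
    id_to_name = {cat["id"]: cat["name"] for cat in coco_dict["categories"]}
    per_category = Counter(
        id_to_name[ann["category_id"]] for ann in coco_dict["annotations"]
    )
    return {
        "num_images": len(coco_dict["images"]),
        "num_annotations": len(coco_dict["annotations"]),
        "per_category": dict(per_category),
    }


def summarize_file(json_path):
    """Load a COCO-format JSON file from disk and summarize it."""
    with open(json_path) as f:
        return summarize_instances(json.load(f))
```

For example, `summarize_file("instances_valrem.json")` would summarize the unzipped validation annotations in the current directory.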

Dataset organization: COCO, COCO-ReM, and LVIS must be organized inside the datasets directory as follows.

$PROJECT_ROOT/datasets
    - coco/
        - train2017/         # Contains 118287 train images (.jpg files).
        - val2017/           # Contains 5000 val images (.jpg files).
        - annotations/
            - instances_train2017.json
            - instances_val2017.json
    - coco_rem/
        - instances_trainrem.json
        - instances_valrem.json
    - lvis/
        - lvis_v1_val.json
        - lvis_v1_train.json
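Before running any script, it can help to confirm that this layout is in place. The sketch below is a hypothetical helper (not part of the repository) that checks for the entries listed in the tree above and reports any that are missing. It assumes the script is run from `$PROJECT_ROOT`.

```python
from pathlib import Path

# Entries taken from the directory tree above, relative to the datasets directory.
EXPECTED = [
    "coco/train2017",
    "coco/val2017",
    "coco/annotations/instances_train2017.json",
    "coco/annotations/instances_val2017.json",
    "coco_rem/instances_trainrem.json",
    "coco_rem/instances_valrem.json",
    "lvis/lvis_v1_train.json",
    "lvis/lvis_v1_val.json",
]


def missing_paths(datasets_root):
    """Return the expected entries that do not exist under datasets_root."""
    root = Path(datasets_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths("datasets")
    if missing:
        print("Missing entries:")
        for rel in missing:
            print("  " + rel)
    else:
        print("Dataset layout looks complete.")
```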

Mask Visualization

We include a lightweight script to quickly visualize masks of COCO-ReM and COCO-2017, for both the validation and training sets. For example, run the following command to visualize the masks of the COCO-ReM validation set:

python scripts/visualize_coco.py \
    --input-json datasets/coco_rem/instances_valrem.json \
    --image-dir datasets/coco/val2017 \
    --output visualization_output

Read the documentation (python scripts/visualize_coco.py --help) for details about other arguments.


Evaluation using COCO-ReM

We support evaluation of all fifty object detectors available in the paper. First, run python checkpoints/download.py to download all the pre-trained models from their official repositories and save them in checkpoints/pretrained_weights.

For example, to evaluate a Mask R-CNN ViTDet-B model using 8 GPUs and calculate average precision (AP) metrics, run the following command:

python scripts/train_net.py --num-gpus 8 --eval-only \
    --config coco_rem/configs/vitdet/mask_rcnn_vitdet_b_100ep.py \
    train.init_checkpoint=checkpoints/pretrained_weights/vitdet/mask_rcnn_vitdet_b_100ep.pkl \
    dataloader.test.dataset.names=coco_rem_val \
    train.output_dir=evaluation_results
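COCO-style mask AP decides whether a predicted mask matches a ground-truth mask by thresholding their intersection-over-union (IoU). The sketch below only illustrates that underlying quantity on toy binary masks in pure Python. It is not the repository's evaluator, and `mask_iou` is a hypothetical name used here for illustration.

```python
def mask_iou(mask_a, mask_b):
    """IoU of two binary masks given as equally sized nested lists of 0/1."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            # A pixel is in the intersection if both masks cover it,
            # and in the union if at least one mask covers it.
            intersection += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    # Two empty masks have no union; report 0.0 rather than dividing by zero.
    return intersection / union if union else 0.0


predicted = [[1, 1, 0],
             [1, 1, 0]]
ground_truth = [[0, 1, 1],
                [0, 1, 1]]
print(mask_iou(predicted, ground_truth))  # 2 shared pixels out of 6 covered
```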

Training with COCO-ReM

We also support training ViTDet baselines on COCO-ReM using the Detectron2 library. Run the following command to train using 8 GPUs (with at least 32GB memory):

python scripts/train_net.py --num-gpus 8 \
    --config coco_rem/configs/vitdet/mask_rcnn_vitdet_b_100ep.py \
    dataloader.train.dataset.names=coco_rem_train \
    dataloader.test.dataset.names=coco_rem_val \
    train.output_dir=training_output \
    dataloader.train.total_batch_size=16 train.grad_accum_steps=4

For GPUs with less memory, update the parameters in the last line above: the batch size can be halved and the gradient accumulation steps doubled to obtain the same results.
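The arithmetic behind this trade-off can be sketched as follows, assuming that the effective batch size per optimizer step is the product of `total_batch_size` and `grad_accum_steps`. That is an assumption inferred from the statement above, not something verified against the training script, and both helper functions are hypothetical.

```python
def effective_batch_size(total_batch_size, grad_accum_steps):
    """Images contributing to one optimizer step (assumed to be the product)."""
    return total_batch_size * grad_accum_steps


def halve_batch(total_batch_size, grad_accum_steps):
    """Halve the per-iteration batch and double the accumulation steps."""
    if total_batch_size % 2 != 0:
        raise ValueError("total_batch_size must be even to be halved")
    return total_batch_size // 2, grad_accum_steps * 2


# Settings from the training command above.
batch, accum = 16, 4
smaller_batch, more_accum = halve_batch(batch, accum)

# The product is unchanged, which is why the results should match.
assert effective_batch_size(batch, accum) == effective_batch_size(
    smaller_batch, more_accum
)
print(smaller_batch, more_accum)
```

Under this assumption, the last line of the training command would become `dataloader.train.total_batch_size=8 train.grad_accum_steps=8`.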

Annotation Pipeline

Stage 1: Mask Boundary Refinement (automatic step)

Download the SAM checkpoint from the segment-anything repository and place it in the checkpoint folder.

Run the following command to refine the boundaries of validation set masks using 8 GPUs:

python scripts/refine_boundaries.py \
    --input-json datasets/coco/annotations/instances_val2017.json \
    --image-dir datasets/coco/val2017 \
    --num-gpus 8 \
    --output datasets/intermediate/cocoval_boundary_refined.json

Read the documentation (python scripts/refine_boundaries.py --help) for details about other arguments.

Use the default values for the other optional arguments to follow the strategy used in the paper.

Run this stage for both the COCO and LVIS datasets before the merging stage.

Stage 2: Exhaustive Instance Annotation (automatic step)

Run the following command to merge LVIS annotations into the COCO validation set using the strategy described in the paper:

python scripts/merge_instances.py \
    --coco-json datasets/intermediate/cocoval_boundary_refined.json \
    --lvis-json datasets/intermediate/lvistrain_boundary_refined.json datasets/intermediate/lvisval_boundary_refined.json \
    --split val \
    --output datasets/intermediate/cocoval_lvis_merged.json

Read the documentation (python scripts/merge_instances.py --help) for details about the above arguments.

The merging of handpicked (image, category) non-exhaustive instances from LVIS for the validation set is done by the script of the next stage.

Stage 3: Correction of Labeling Errors

This stage is performed only for the validation set.

python scripts/correct_labeling_errors.py \
    --input datasets/intermediate/cocoval_lvis_merged.json \
    --output datasets/cocoval_refined.json

Note: For the above JSON to match COCO-ReM, the manual parts of Stage 1 and Stage 2 must also be performed.

Citation

If you found COCO-ReM useful in your research, please consider starring ⭐ us on GitHub and citing 📚 us in your research!

@inproceedings{cocorem,
  title={Benchmarking Object Detectors with COCO: A New Path Forward},
  author={Singh, Shweta and Yadav, Aayan and Jain, Jitesh and Shi, Humphrey and Johnson, Justin and Desai, Karan},
  booktitle={ECCV},
  year={2024}
}

coco-rem's Issues

About Box AP

Thank you for your excellent work. I was wondering whether there are any box AP results on the COCO-ReM dataset? I have only seen mask AP for the different detectors in the paper.

About boundary-iou

Hey @kdexd,

great effort and congrats on the ECCV paper! 🥳 I'm already using the annotations in my work :)

I was wondering if you will also report boundary-iou in the upcoming ECCV paper? I've seen that you are calculating it in the evaluator, but the arXiv version 'only' reports mask AP, if I'm not mistaken?

Either way, great work!
