
detectron-self-train's Introduction

PyTorch-Detectron for domain adaptation by self-training on hard examples


This codebase replicates results for pedestrian detection with domain shifts on the BDD100k dataset, following the CVPR 2019 paper Automatic adaptation of object detectors to new domains using self-training. We provide trained models, train and eval scripts as well as splits of the dataset for download. More details are available on the project page.

This repository is based heavily on A Pytorch Implementation of Detectron. We modify it for experiments on domain adaptation of face and pedestrian detectors.

If you find this codebase useful, please consider citing:

@inproceedings{roychowdhury2019selftrain,
    Author = {Aruni RoyChowdhury and Prithvijit Chakrabarty  and Ashish Singh and SouYoung Jin and Huaizu Jiang and Liangliang Cao and Erik Learned-Miller},
    Title = {Automatic adaptation of object detectors to new domains using self-training},
    Booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    Year = {2019}
}

Getting Started

Clone the repo:

git clone git@github.com:AruniRC/detectron-self-train.git

Requirements

Tested under Python 3.

  • python packages
    • pytorch>=0.3.1
    • torchvision>=0.2.0
    • cython
    • matplotlib
    • numpy
    • scipy
    • opencv
    • pyyaml
    • packaging
    • pycocotools: for the COCO dataset, also available from pip.
    • tensorboardX: for logging the losses in Tensorboard
  • An NVIDIA GPU and CUDA 8.0 or higher. Some operations only have a GPU implementation.
  • NOTICE: different versions of the PyTorch package have different memory usage.

Installation

This walkthrough describes setting up this Detectron repo. The detailed instructions are in INSTALL.md.

Dataset

Create a data folder under the repo root:

cd {repo_root}
mkdir data

BDD-100k

Our pedestrian detection task uses both labeled and unlabeled data from the Berkeley Deep Drive BDD-100k dataset. Please register and download the dataset from their website. We use a symlink at data/bdd100k, under the project root, that points to the location of the downloaded dataset. The folder structure should look like this:

data/bdd100k/
    images/
        test/
        train/
        val/
    labels/
        train/
        val/
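
As a quick sanity check on the layout above, the following minimal Python sketch reports which of the listed sub-folders are missing. The helper name `missing_bdd_dirs` is hypothetical and not part of this repository; it only assumes the folder names shown in the tree. Since `Path.is_dir()` follows symlinks, it also works when data/bdd100k is a symlink.

```python
from pathlib import Path

# Sub-folders from the layout above, relative to data/bdd100k.
EXPECTED_SUBDIRS = [
    "images/test",
    "images/train",
    "images/val",
    "labels/train",
    "labels/val",
]


def missing_bdd_dirs(bdd_root):
    """Return the expected sub-folders that do not exist under bdd_root.

    Hypothetical helper for illustration; not part of the repository.
    """
    root = Path(bdd_root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


if __name__ == "__main__":
    missing = missing_bdd_dirs("data/bdd100k")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("data/bdd100k matches the expected layout.")
```

Run it from the project root after creating the symlink.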

BDD-100k takes about 6.5 GB of disk space. The 100k unlabeled videos take 234 GB, but you do not need to download them: we have already run hard example mining on them, and the extracted frames (with pseudo-labels) are available for download.

BDD Hard Examples

Mining the hard positives ("HPs") involves detecting pedestrians and forming tracklets on 100K videos. This was done on the UMass GPU Cluster and took about a week. We do not include this pipeline here (yet); the mined video frames and annotations are available for download as a gzipped tarball from here. NOTE: this is a large download (23 GB). The data retains the permissions and licensing associated with the BDD-100K dataset (we make the video frames available here for ease of research).

Now we create a symlink to the untarred BDD HPs from the project data folder, which should have the following structure: data/bdd_peds_HP18k/*.jpg. The image naming convention is <video-name>_<frame-number>.jpg.
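
To illustrate the naming convention, here is a minimal sketch that splits a frame file name back into its video name and frame number. The function `split_hp_frame_name` is hypothetical, and it assumes the frame number is a run of decimal digits after the last underscore; the file name in the usage line is made up for the example.

```python
from pathlib import Path


def split_hp_frame_name(filename):
    """Split '<video-name>_<frame-number>.jpg' into (video_name, frame_number).

    Hypothetical helper; assumes the frame number is made of decimal digits.
    """
    stem = Path(filename).stem
    # Split on the LAST underscore, in case the video name contains underscores.
    video_name, sep, frame = stem.rpartition("_")
    if not sep or not video_name or not frame.isdigit():
        raise ValueError(f"unexpected file name: {filename!r}")
    return video_name, int(frame)


# Made-up file name, for illustration only.
print(split_hp_frame_name("data/bdd_peds_HP18k/some-video_00045.jpg"))
```

Grouping frames by the returned video name is one way to recover which frames came from the same video.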

Annotation JSONs

All the annotations are assumed to be downloaded inside a folder data/bdd_jsons relative to the project root: data/bdd_jsons/*.json. We use symlinks here as well, in case the JSONs are kept in some other location.

| Data Split | JSON | Dataset name | Image Dir. |
|---|---|---|---|
| BDD_Source_Train | bdd_peds_train.json | bdd_peds_train | data/bdd100k |
| BDD_Source_Val | bdd_peds_val.json | bdd_peds_val | data/bdd100k |
| BDD_Target_Train | bdd_peds_not_clear_any_daytime_train.json | bdd_peds_not_clear_any_daytime_train | data/bdd100k |
| BDD_Target_Val | bdd_peds_not_clear_any_daytime_val.json | bdd_peds_not_clear_any_daytime_val | data/bdd100k |
| BDD_dets | bdd_dets18k.json | DETS18k | data/bdd_peds_HP18k |
| BDD_HP | bdd_HP18k.json | HP18k | data/bdd_peds_HP18k |
| BDD_score_remap | bdd_HP18k_remap_hist.json | HP18k_remap_hist | data/bdd_peds_HP18k |
| BDD_target_GT | bdd_target_labeled.json | bdd_peds_not_clear_any_daytime_train_100 | data/bdd100k |
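
Once the JSONs are in place, it can be useful to check how many images in a split actually carry annotations. The sketch below is an illustration only: it assumes the files follow the COCO layout (a top-level "images" list with "id" fields and an "annotations" list with "image_id" fields), which is not stated explicitly above, and both function names are hypothetical.

```python
import json


def count_annotated_images(coco_dict):
    """Return (images with at least one annotation, total images).

    Assumes a COCO-style dict with "images" and "annotations" lists.
    """
    image_ids = {img["id"] for img in coco_dict["images"]}
    annotated_ids = {ann["image_id"] for ann in coco_dict["annotations"]}
    return len(image_ids & annotated_ids), len(image_ids)


def count_annotated_images_in_file(json_path):
    """Load an annotation JSON from disk and count its annotated images."""
    with open(json_path) as f:
        return count_annotated_images(json.load(f))


# Toy example: three images, two of which have at least one box.
toy = {
    "images": [{"id": 1}, {"id": 2}, {"id": 3}],
    "annotations": [{"image_id": 1}, {"image_id": 1}, {"image_id": 3}],
}
print(count_annotated_images(toy))
```

For a real split, call `count_annotated_images_in_file("data/bdd_jsons/bdd_peds_train.json")` from the project root.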

Models

Use the environment variable CUDA_VISIBLE_DEVICES to control which GPUs to use. All the training scripts are run with 4 GPUs. The trained model checkpoints can be downloaded from the links under the column Model weights. The eval scripts need to be modified to point to where the corresponding model checkpoints have been downloaded locally. To be consistent, we suggest creating a folder under the project root like data/bdd_pre_trained_models and saving all the models under it.
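
When launching from Python rather than a shell script, the same variable can be set programmatically. This is a minimal sketch, not the repository's own launcher: `select_gpus` is a hypothetical helper, and the GPU indices are an example. The variable must be set before CUDA is initialized, so the safest place is before importing torch.

```python
import os


def select_gpus(gpu_ids):
    """Expose only the given GPU indices to CUDA (hypothetical helper).

    Call this before CUDA is initialized, ideally before importing torch.
    """
    value = ",".join(str(i) for i in gpu_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Example: expose four GPUs, matching the 4-GPU setup of the training scripts.
select_gpus([0, 1, 2, 3])
```

Inside the process, the visible devices are renumbered from 0, so GPUs 2 and 3 selected this way would appear as devices 0 and 1.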

The performance numbers shown are from single models (the same models available for download), while the tables in the paper show results averaged across 5 rounds of train/test.

| Method | Model weights | Config YAML | Train script | Eval script | AP, AR |
|---|---|---|---|---|---|
| Baseline | bdd_baseline | cfg | train | eval | 15.21, 33.09 |
| Dets | bdd_dets | cfg | train | eval | 27.55, 56.90 |
| HP | bdd_hp | cfg | train | eval | 28.34, 58.04 |
| HP-constrained | bdd_hp-cons | cfg | train | eval | 29.57, 56.48 |
| HP-score-remap | bdd_score-remap | cfg | train | eval | 28.11, 56.80 |
| DA-im | bdd_da-im | cfg | train | eval | 25.71, 56.29 |
| Src-Target-GT | bdd_target-gt | cfg | train | eval | 35.40, 66.26 |

Inference demo

[Figure: sample detections from the HP-constrained (HP-cons) and Baseline models.]

The folder gypsum/scripts/demo contains two shell scripts that run the pre-trained Baseline (BDD-Source trained) and HP-constrained (domain adapted to BDD Target) models on a sample image. Please change the MODEL_PATH variable in these scripts to where the appropriate models have been downloaded locally. Your results should resemble the example shown above. Note that the domain adapted model (HP-constrained) detects pedestrians with higher confidence (visualization threshold is 0.9 on the confidence score), while making one false positive in the background.

Acknowledgement

This material is based on research sponsored by the AFRL and DARPA under agreement number FA8750-18-2-0126. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the AFRL and DARPA or the U.S. Government. We acknowledge support from the MassTech Collaborative grant for funding the UMass GPU cluster. We thank Tsung-Yu Lin and Subhransu Maji for helpful discussions.

We appreciate the well-organized and accurate codebase for the Detectron implementation in PyTorch from the creators of A Pytorch Implementation of Detectron. Thanks also to the creators of BDD-100k, which has allowed us to share our pseudo-labeled video frames for the academic, non-commercial purpose of quickly reproducing results.

detectron-self-train's People

Contributors

arunirc, jiasenlu, jwyang, pcjohn, roytseng-tw, yuliang-zou

detectron-self-train's Issues

initial weight download

When I try to train the model, I cannot find the file "/mnt/nfs/scratch1/pchakrabarty/bdd_recs/ped_models/bdd_peds.pth" to initialize the model. Could you let me know what it is and where to download it?

ImportError: cannot import name numpy_type_map

Hi, Automatic adaptation of object detectors to new domains using self-training is nice work, but when I run gypsum/scripts/demo/hp_cons_demo.sh with bdd_HP-cons_model_step29999.pth.pth, the following error occurs:

Traceback (most recent call last):
  File "tools/infer_demo.py", line 35, in <module>
    import nn as mynn
  File "/content/detectron-self-train/lib/nn/__init__.py", line 2, in <module>
    from .parallel import DataParallel
  File "/content/detectron-self-train/lib/nn/parallel/__init__.py", line 3, in <module>
    from .data_parallel import DataParallel, data_parallel
  File "/content/detectron-self-train/lib/nn/parallel/data_parallel.py", line 4, in <module>
    from .scatter_gather import scatter_kwargs, gather
  File "/content/detectron-self-train/lib/nn/parallel/scatter_gather.py", line 8, in <module>
    from torch.utils.data.dataloader import numpy_type_map
ImportError: cannot import name numpy_type_map

System information

  • Operating system: Ubuntu 18.04.5 LTS
  • CUDA version: 10.1
  • python version: 3.6.9
  • pytorch version: 1.7.0+cu101
  • torchvision version: 0.8.1+cu101

It does not seem to support torch >= 1.1.0, and I also found some other issues.

detectron2 supports PyTorch 1.3.0 and above, and CUDA 10.1 only supports PyTorch 1.4 and above.
Is it possible to move to detectron2?

Thanks

Is it possible to convert to Caffe2?

Hello.

I have been impressed with your research.
I would like to test your output with caffe2, but I want to know if it is possible.

Thank you :)

how to generate pseudo labels from baseline detection model

Hi
I think bdd_peds+DETS18k means using bboxes from the source dataset (bdd_peds) and the pseudo-labels generated by the baseline detection model for the target dataset. Could you let me know how to generate the pseudo-labels for this part? Do you run the baseline model on all the training samples in the target dataset and filter ~100000 images?

What if I'd like to use own dataset

Thanks for the great source code.
I want to try this with my own dataset, so I'd like to ask which settings or configuration I should change.

What I have in mind are the config file and the dataset path (ideally annotated in Pascal VOC format).
Are there any other things I should take care of?

Thank you.

What if re-training on pseudo-labeled target images only?

Hi,

After pseudo-labels of unlabeled target images are generated, you re-train the baseline source model jointly on the combined set of source and target images. However, the source images might not always be available. Did you try re-training on pseudo-labeled target images only? What is the expected performance?

Width and Height of the images

Hi,

I am a little bit confused about your json data converting.

bdd100k
image['width'] = 720
image['height'] = 1280

Wider
image['width'] = im.height
image['height'] = im.width

It seems that you have swapped the height and width of the image.
What does it mean here?
Does it affect the training and testing of the model?

Can you provide a separate evaluation code?

Hi,
As far as I understand, the evaluation is done on the fly as you run the detection.
That means we cannot evaluate a model from another source (e.g. TensorFlow).
Can you provide an evaluation script that evaluates the box predictions only?

For example, I have a separate TensorFlow model that outputs 'bbox_bdd_peds_val_results.json',
and I want to evaluate this result file against the ground truth 'bdd_peds_val.json'.
That way I would not have to run your detection script.
It would be just like: evaluate.py --gt bdd_peds_val.json --pred bbox_bdd_peds_val_results.json

Thank you

sh make.sh problem in Window10

First of all, thank you for providing your code.

I followed https://github.com/AruniRC/detectron-self-train/blob/master/INSTALL.md for installation.

But I ran into a problem at the "Compile Detectron-pytorch" step:
cd lib # please change to this directory
sh make.sh

I am currently using Windows 10, so I cannot run sh make.sh.
Could you suggest a solution to this problem?

System information

  • Operating system: Windows 10
  • CUDA version: 10.0
  • cuDNN version: ?
  • GPU models (for all devices if they are not all the same): Titan Xp
  • python version: 3.7
  • pytorch version: 1.1.0

Images without pedestrian bbox

Hi,

I checked bdd_peds_train.json and bdd_peds_val.json, and a lot of images have no bounding box annotations. How do you train and evaluate your model in this case?
For example, in the training file only 4428/12477 images have bbox annotations, and in the validation file only 628/1764.

Thanks a lot
