VFP290K: A Large-Scale Benchmark Dataset for Vision-based Fallen Person Detection

This repository is the official documentation & implementation of VFP290K: A Large-Scale Benchmark Dataset for Vision-based Fallen Person Detection.

Requirements

Our pretrained models, except YOLO, are based on the MMDetection 2 detection framework. You can download COCO-pretrained models for transfer learning.

Download our VFP290K dataset here: VFP290K.

Environments

CUDA=11.1
python=3.7
CUDNN=7.6.5

MMDetection-based models

1. Install PyTorch

We use pytorch=1.8.0 from this link.
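
Before installing the remaining packages, it can help to confirm that the interpreter and PyTorch build match the versions listed above. The following is a minimal sketch, not part of the repository: the helper names `find_mismatches` and `collect_versions` are made up for illustration, and cuDNN is left out because PyTorch reports the cuDNN version bundled with its own build, which may differ from the system one.

```python
import sys

# Versions listed in this README.
EXPECTED = {"python": "3.7", "torch": "1.8.0", "cuda": "11.1"}


def find_mismatches(found, expected=EXPECTED):
    """Return {name: (found, expected)} for every version that does not
    start with the expected prefix (e.g. '1.8.0+cu111' matches '1.8.0')."""
    return {
        name: (found.get(name), want)
        for name, want in expected.items()
        if not str(found.get(name, "")).startswith(want)
    }


def collect_versions():
    """Read the Python version and, if PyTorch is installed, its versions."""
    found = {"python": "{}.{}".format(*sys.version_info[:2])}
    try:
        import torch
        found["torch"] = torch.__version__
        found["cuda"] = torch.version.cuda  # None on CPU-only builds
    except ImportError:
        pass  # torch is then reported as a mismatch
    return found


if __name__ == "__main__":
    print(find_mismatches(collect_versions()) or "environment matches")
```

An empty result means the three versions agree with the README; any other output names the component to reinstall.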

2. Install MMDetection & pkgs

pip install openmim
mim install mmdet
pip install future tensorboard
pip install -r requirements.txt

3. Prepare all preprocessed data for training and inference.

python make_label.py --data_root_dir <VFP directory>

4. Download the checkpoint files

You can find the checkpoint files in the official repository; put them into ./checkpoints

5. Set a config file.

We provide all config files used in our experiments in "configs/VFP290K".
In the config file you want to use, set the paths to your labels.txt and your VFP290K data directory.
(classes= "<YOUR labels.txt DIRECTORY>", data_root = "<YOUR DATA DIRECTORY>")
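
As a rough illustration of the two settings above, the fragment below shows how they might look inside a config file. It is a sketch under assumptions: both paths are placeholders, and the `data` dictionary follows the general MMDetection 2.x convention of passing `classes` to each split, which may not match the exact layout of the files in "configs/VFP290K".

```python
# Placeholder paths (hypothetical): replace with your own locations.
classes = "/path/to/VFP290K/labels.txt"  # text file with one class name per line
data_root = "/path/to/VFP290K/"          # root directory of the dataset

# Assumed layout: MMDetection 2.x datasets accept `classes` either as a
# sequence of names or as the path of a text file listing them.
data = dict(
    train=dict(classes=classes),
    val=dict(classes=classes),
    test=dict(classes=classes),
)
```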

6. Run the benchmark or a desired experiment

To train and evaluate the model(s) in the paper, run the commands below.

Single-GPU training
```
python tools/train.py <config> --gpu-ids <device> 
```
<config> and <device> are the path of the config file and the GPU id, respectively. The following example trains a Faster R-CNN model on GPU 0.
ex) python tools/train.py configs/VFP290K/faster_rcnn_r50_1x_benchmark.py --gpu-ids 0
Multi-GPU training
```
bash ./tools/dist_train.sh <config> <num_gpu> 
```
<num_gpu> is the number of GPUs to use. The following example trains a Faster R-CNN model with 4 GPUs.
ex) bash ./tools/dist_train.sh configs/VFP290K/faster_rcnn_r50_1x_benchmark.py 4
Testing
After training the model, you can evaluate the result.
```
python tools/test.py <config> <weight> --eval bbox --gpu-ids <device>
```
<weight> is the path of the trained model weights.
ex) python tools/test.py configs/VFP290K/faster_rcnn_r50_1x_benchmark.py work_dirs/faster_rcnn_r50_1x_benchmark/latest.pth --eval bbox --gpu-ids 1

YOLOv5

1. Generate .txt files for yolo.

cd yolov5
python configs/data_refactoring.py --data_root_dir <{YOUR DATA ROOT DIRECTORY}/yolov5>

2. Change the configuration.

Set the train and val image directories to your own paths, e.g., in configs/benchmark.yaml:
train: /media/data1/VFP290K/VFP290K/yolov5/benchmark/train/image
val: /media/data1/VFP290K/VFP290K/yolov5/benchmark/val/image
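
A wrong path in this file typically only surfaces once training starts, so a quick check beforehand can save time. The sketch below is one possible way to do it with the standard library only; it assumes the file contains plain `train:` and `val:` lines like the ones above (no inline comments or nested values), and the function names are hypothetical.

```python
import os


def read_split_dirs(yaml_path):
    """Collect the `train:` and `val:` entries from a flat YAML file."""
    dirs = {}
    with open(yaml_path) as f:
        for line in f:
            # Split at the first colon only, so the value may contain colons.
            key, sep, value = line.partition(":")
            if sep and key.strip() in ("train", "val"):
                dirs[key.strip()] = value.strip()
    return dirs


def missing_dirs(yaml_path):
    """Return the split names whose directory does not exist."""
    return [split for split, path in read_split_dirs(yaml_path).items()
            if not os.path.isdir(path)]


if __name__ == "__main__":
    print(missing_dirs("configs/benchmark.yaml") or "all image directories found")
```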

3. Training.

The training process is exactly the same as in the official YOLOv5 code. For example:

python train.py --img-size 640 --epochs 100 --data configs/benchmark.yaml --batch-size 48 --cfg ./models/yolov5x.yaml --device 0,1 --workers 8

You can train your model with this script.

4. Testing.

After training, run the test script to evaluate the model.

python test.py --weights runs/train/exp<your_exp_num>/weights/best.pt --data data/test.yaml --batch-size 48 --img-size 640 --conf-thres 0.5 --iou-thres 0.5 --device 0,1

Results

Our models achieve the following performance on the benchmark:

Method	Two-Stage			One-Stage			Transformer-based
Model	Faster R-CNN	Cascade R-CNN	DetectoRS	RetinaNet	YOLOv3	YOLOv5	DETR
mAP	0.732	0.751	0.746	0.750	0.590	0.741	0.605
AP_50	0.873	0.874	0.866	0.910	0.813	0.838	0.868
AP_75	0.799	0.811	0.797	0.811	0.670	0.784	0.687
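
To make the gap between AP_50 and AP_75 concrete: assuming these are COCO-style metrics (as produced by --eval bbox), a detection counts as correct for AP_50 when its IoU with a ground-truth box is at least 0.5, and for AP_75 when it is at least 0.75. The sketch below computes IoU for two made-up boxes; it is an illustration only, not the evaluation code used for the table.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) corner coordinates."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap, clamped to zero when boxes are disjoint.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical boxes: the prediction is shifted 20 px to the right.
gt = (0, 0, 100, 100)
pred = (20, 0, 120, 100)
iou = box_iou(gt, pred)  # 8000 / 12000, about 0.667
print(iou >= 0.5, iou >= 0.75)  # True False
```

This prediction would be a true positive at the 0.5 threshold but a miss at 0.75, which is why AP_75 is the stricter of the two numbers.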

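Our models achieve the following performance across backgrounds. Each cell lists mAP, AP_50, and AP_75 for a model trained on the background in the Training row and tested on the background in the Test row: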

Backbone	Training	Street	Park	Building	Street	Park	Building	Street	Park	Building
	Test	Street			Park			Building
Faster R-CNN	mAP AP_50 AP_75	0.742 0.910 0.829	0.732 0.860 0.809	0.616 0.828 0.723	0.620 0.786 0.690	0.706 0.857 0.768	0.517 0.705 0.588	0.748 0.876 0.813	0.847 0.957 0.920	0.702 0.821 0.791
RetinaNet	mAP AP_50 AP_75	0.770 0.922 0.843	0.743 0.861 0.804	0.654 0.811 0.730	0.664 0.830 0.720	0.737 0.888 0.791	0.587 0.752 0.647	0.828 0.932 0.901	0.851 0.960 0.918	0.804 0.915 0.875
YOLOv3	mAP AP_50 AP_75	0.610 0.817 0.689	0.510 0.664 0.600	0.284 0.400 0.336	0.416 0.578 0.468	0.537 0.759 0.632	0.282 0.421 0.315	0.610 0.817 0.689	0.664 0.824 0.784	0.671 0.831 0.790
YOLOv5	mAP AP_50 AP_75	0.669 0.783 0.729	0.671 0.745 0.719	0.226 0.335 0.266	0.398 0.465 0.428	0.692 0.776 0.727	0.209 0.335 0.266	0.675 0.743 0.727	0.802 0.848 0.836	0.606 0.707 0.679

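Our models achieve the following performance across light conditions and camera heights. Each cell lists mAP, AP_50, and AP_75 for a model trained on the condition in the Training row and tested on the condition in the Test row: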

Backbone	Training	Day	Night	Day	Night	Low	High	Low	High
	Test	Day		Night		Low		High
Faster R-CNN	mAP AP_50 AP_75	0.767 0.917 0.843	0.632 0.826 0.808	0.523 0.714 0.572	0.559 0.783 0.609	0.700 0.898 0.808	0.573 0.760 0.669	0.561 0.749 0.636	0.729 0.896 0.817
RetinaNet	mAP AP_50 AP_75	0.779 0.932 0.848	0.667 0.856 0.741	0.534 0.747 0.567	0.566 0.785 0.620	0.702 0.903 0.792	0.610 0.818 0.695	0.596 0.780 0.669	0.739 0.909 0.817
YOLOv3	mAP AP_50 AP_75	0.615 0.874 0.728	0.432 0.630 0.490	0.299 0.545 0.306	0.415 0.635 0.451	0.567 0.808 0.678	0.375 0.606 0.414	0.349 0.530 0.394	0.563 0.800 0.653
YOLOv5	mAP AP_50 AP_75	0.794 0.888 0.842	0.343 0.447 0.384	0.392 0.517 0.416	0.414 0.561 0.442	0.590 0.752 0.680	0.412 0.542 0.465	0.350 0.448 0.394	0.718 0.843 0.781
