rwightman / efficientdet-pytorch

1.6K 29.0 291.0 329 KB

A PyTorch impl of EfficientDet faithful to the original Google impl w/ ported weights

License: Apache License 2.0

Python 99.98% Shell 0.02%

efficientdet efficientnet object-detection semantic-segmentation pytorch


efficientdet-pytorch's Issues

Info request - best practice for config custom datasets?

Hi @rwightman - super excited that you have made this!
Can you describe the best method for training on custom datasets? (I can hack it up, but I'd prefer to have your advice up front.)
Thanks!

validation discrepancy in training.py and validate.py

Hi Ross,

As always thanks for the great codebase. I am trying to reproduce the efficientDet-D1. I have a question regarding the reported Average Precision during training vs the mAP reported in validate.py.

For D1, this is a snapshot of the training log:

Test (EMA): [   0/9]  Time: 14.371 (14.371)  Loss:  0.5902 (0.5902)  
Test (EMA): [   9/9]  Time: 7.463 (2.456)  Loss:  0.6043 (0.5914)  
Loading and preparing results...
DONE (t=0.20s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.85s).
Accumulating evaluation results...
DONE (t=0.41s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.470
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.655
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.520
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.256
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.519
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.677
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.395
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.537
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.550
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.308
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.604
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.718

which seems to reach 47 mAP. However, if I use validate.py (with EMA), the reported mAP is 37.4 instead. I am still going through the code but can't identify what I have done wrong. Have you encountered this problem before?

ONNX Export

Have you tried ONNX export for tf_efficientdet_lite0? I tried exporting only the backbone + BiFPN by defining a forward_dummy similar to the one used in mmdetect. I got a very complicated ONNX graph (Scatter/Gather/Div/Squeeze etc.), whereas the EfficientNet-Lite models you exported using gen-efficientnet-pytorch/blob/master/onnx_export.py are very clean.

Is there a way to get a clean ONNX graph for tf_efficientdet_lite0?

Thanks a lot for all your help.

Details regarding training speed

Can you comment on how long it takes you to train a tf_efficientdet_d0 from scratch on COCO using pretrained imagenet weights?

I'm training with the following command on 2 RTX 6000s and it's taken ~18.5 hours to reach 41 epochs.
./distributed_train.sh 2 /mscoco --model tf_efficientdet_d0 -b 48 --amp --lr .06 --warmup-epochs 5 --sync-bn --opt fusedmomentum --fill-color mean --model-ema
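
For a rough sense of scale, the numbers in this report work out to about 27 minutes per epoch. A small sketch of that arithmetic follows; the 300-epoch schedule is only an assumed example target, not a figure from this thread, and the function names are made up for illustration.

```python
def hours_per_epoch(total_hours: float, epochs_done: int) -> float:
    """Average wall-clock hours spent per completed epoch."""
    return total_hours / epochs_done


def projected_hours(total_hours: float, epochs_done: int, target_epochs: int) -> float:
    """Linear projection of total training time for a longer schedule."""
    return hours_per_epoch(total_hours, epochs_done) * target_epochs


# Figures quoted in the issue: ~18.5 hours to reach 41 epochs.
per_epoch = hours_per_epoch(18.5, 41)
print(f"{per_epoch * 60:.1f} min/epoch")

# 300 epochs is an assumed schedule length, used here only as an example.
print(f"{projected_hours(18.5, 41, 300):.1f} h for 300 epochs")
```

The projection is linear, so it ignores warmup, evaluation passes, and any data-loading stalls.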

format of target['bbox']

Hi. I read Alex's code on Kaggle and I want to reproduce his work. I noticed that something changed after he shared the code. Could you tell me what the format of target['bbox'] is in DetBenchTrain? I tried [x_min, y_min, x_max, y_max] first, but the result was definitely not what I expected. Then I noticed Alex used [x_min, y_min, width, height], but when I fed this to the model I got a very large cls loss and NaN in the box loss.
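
One thing worth checking is the coordinate order. The sketch below assumes the training bench wants boxes as [y_min, x_min, y_max, x_max]; that order is an assumption here and is not confirmed in this thread, so verify it against the effdet version you have installed. The helper names are hypothetical.

```python
def xywh_to_yxyx(box):
    """Convert [x_min, y_min, width, height] to [y_min, x_min, y_max, x_max]."""
    x_min, y_min, w, h = box
    return [y_min, x_min, y_min + h, x_min + w]


def xyxy_to_yxyx(box):
    """Convert [x_min, y_min, x_max, y_max] to [y_min, x_min, y_max, x_max]."""
    x_min, y_min, x_max, y_max = box
    return [y_min, x_min, y_max, x_max]


# Both inputs describe the same box, so both conversions should agree.
print(xywh_to_yxyx([10, 20, 30, 40]))
print(xyxy_to_yxyx([10, 20, 40, 60]))
```

Feeding width/height values where corner coordinates are expected can produce boxes with zero or negative extent, which is one plausible source of a NaN box loss.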

Request examples for data organization

Hi @rwightman

Thank you for your high quality work! Here I want to ask if you can provide examples for dataset organizations to run your pipeline.

For example, in coco detection, we have train_image/val_image and train_annotation.json/val_annotation.json (just ignore test set shortly), then how should I place them to make sure it is acceptable for your program to run?

Thank you for your help.

future training: area<=0 check for ann_file parsing, dataset.py will not load bbox only

I'm picking up the code from dataset.py and noticed all my custom bounding boxes were coming up empty.
This is because of the check on line 71 in dataset.py:
if ann['area'] <= 0 or w < 1 or h < 1: continue
area is apparently only computed from segmentation masks and not bounding boxes...thus if you have a custom dataset with no seg masks, your bboxes won't be loaded.

I think just having the w and h check is sufficient and avoids people hitting this when training on custom datasets.
cocodataset/cocoapi#36
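
A minimal sketch of the proposed check, assuming COCO-style annotation dicts with a 'bbox' field in [x, y, width, height] order. The function names are hypothetical and this is not the repository's code; it only illustrates filtering on width and height so that a missing or zero 'area' does not drop the box.

```python
def keep_annotation(ann: dict) -> bool:
    """Keep an annotation if its box is at least 1 px wide and 1 px tall.

    'area' is deliberately ignored, since it may be absent or zero when a
    dataset has bounding boxes but no segmentation masks.
    """
    _, _, w, h = ann["bbox"]
    return w >= 1 and h >= 1


def filter_annotations(anns):
    """Return only the annotations that pass keep_annotation."""
    return [a for a in anns if keep_annotation(a)]


anns = [
    {"bbox": [10, 10, 50, 40], "area": 0},     # valid box, area never computed
    {"bbox": [5, 5, 0.5, 20], "area": 10},     # too narrow
    {"bbox": [0, 0, 30, 30]},                  # valid box, no 'area' key at all
]
print(len(filter_annotations(anns)))
```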

Feature request - clustered NMS (in addition to CIoU)

Just wanted to put this on the radar for the project - the developers of Complete IoU have extended it with clustered NMS:
"...we propose Cluster-NMS, where NMS during inference is done by implicitly clustering detected boxes and usually requires less iterations. Cluster-NMS is very efficient due to its pure GPU implementation, and geometric factors can be incorporated to improve both AP and AR. In the experiments, CIoU loss and Cluster-NMS have been applied to state-of-the-art instance segmentation (e.g., YOLACT), and object detection (e.g., YOLO v3, SSD and Faster R-CNN) models. Taking YOLACT on MS COCO as an example, our method achieves performance gains as +1.7 AP and +6.2 AR100 for object detection..."

https://arxiv.org/abs/2005.03572
code:
https://github.com/Zzh-tju/CIoU

How to set img_size in the DetBenchPredict class?

Hi Ross,
I am still following the code of Alex Shonenkov, here https://www.kaggle.com/shonenkov/inference-efficientdet/comments?scriptVersionId=34956042
This time I get an error in the DetBenchPredict class. I know Alex used your code from the training branch (he used DetBenchEval), while I am using your current master branch. My weights come from training with the master branch.
So img_size is missing in the forward function of DetBenchPredict.
Do you have any suggestions for fixing the problem?
Linh

resizePad bbox scaling incorrectly clipping ymax (transforms.py)

I am running with the resizepad function in transforms.py and found that the additional clipping code (marked as fixme to be fair) was clipping excessively in terms of rescaling the bboxes:

if 'bbox' in annotations:
    # FIXME haven't tested this path since not currently using dataset annotations for train/eval
    bbox = annotations['bbox']
    bbox[:, :4] *= scale
    # bbox = clip_boxes(bbox, (scaled_height, scaled_width))
    # indices = np.where(np.sum(bbox, axis=1) != 0)[0]
    # if len(indices) < len(bbox):
    #     bbox = np.take(bbox, indices)
    #     annotations['cls'] = np.take(annotations['cls'], indices)
    annotations['bbox'] = bbox

So far I've found that bbox[:, :4] *= scale is sufficient. The additional code was clipping ymax incorrectly, so I've simply commented it out for now.
There might be cases where the clipping is needed, but I haven't seen one yet, and I did see that it was incorrectly adjusting the lowest box boundary (ymax).
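
If clipping is ever needed, one way to keep it from touching the wrong axis is to make the box order explicit. The sketch below assumes boxes in [x_min, y_min, x_max, y_max] order; a mismatch between the box order and a (height, width) clip argument would be one possible explanation for the ymax symptom, though that has not been verified against transforms.py. The function name is hypothetical.

```python
def scale_and_clip_xyxy(boxes, scale, width, height):
    """Scale xyxy boxes, clip x to [0, width] and y to [0, height].

    Returns the surviving boxes plus their original indices, so that class
    labels can be filtered with the same indices.
    """
    kept_boxes, kept_indices = [], []
    for i, (x_min, y_min, x_max, y_max) in enumerate(boxes):
        x_min = min(max(x_min * scale, 0.0), width)
        x_max = min(max(x_max * scale, 0.0), width)
        y_min = min(max(y_min * scale, 0.0), height)
        y_max = min(max(y_max * scale, 0.0), height)
        # Drop boxes that collapsed to zero width or height after clipping.
        if x_max > x_min and y_max > y_min:
            kept_boxes.append([x_min, y_min, x_max, y_max])
            kept_indices.append(i)
    return kept_boxes, kept_indices


boxes = [[10, 20, 100, 300], [200, 200, 300, 300]]
clipped, keep = scale_and_clip_xyxy(boxes, scale=0.5, width=40, height=200)
print(clipped, keep)
```

Returning the kept indices matters because the class array has to be filtered in step with the boxes, which is what the commented-out np.take lines were attempting.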

tpu problem with class values

Hi,
I am trying to get EfficientDet running on Kaggle TPUs, following Alex Shonenkov's kernel.

I am rather a beginner with Python and PyTorch, sorry...

The model runs OK on GPU. Is it possible that there is a problem with num_classes=1?

The code looks like:

def get_net(imgsize=IMG_SIZE, use_checkpoint=None):
    config = get_efficientdet_config('tf_efficientdet_d4')
    net = EfficientDet(config, pretrained_backbone=False)
    checkpoint = torch.load('../input/efficientdet/efficientdet_d4-5b370b7a.pth')
    net.load_state_dict(checkpoint)
    config.num_classes = 1
    config.image_size = IMG_SIZE
    net.class_net = HeadNet(config, num_outputs=config.num_classes,
                            norm_kwargs=dict(eps=.001, momentum=.01))

    return DetBenchTrain(net, config)

and I call

def _mp_fn(rank, flags):
    global acc_list
    torch.set_default_tensor_type('torch.FloatTensor')
    a = run_training()

FLAGS = {}
xmp.spawn(_mp_fn, args=(FLAGS,), nprocs=1, start_method='fork')  # 8

Error looks like:

Exception in device=TPU:0: Class values must be non-negative.

Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/torch_xla/distributed/xla_multiprocessing.py", line 231, in _start_fn
fn(gindex, *args)
File "", line 8, in _mp_fn
a = run_training()
File "", line 76, in run_training
fitter.fit(train_loader, val_loader)
File "", line 40, in fit
summary_loss = self.train_one_epoch(train_loader)
File "", line 106, in train_one_epoch
loss, _, _ = self.model(images, boxes, labels)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 577, in __call__
result = self.forward(*input, **kwargs)
File "../input/timm-efficientdet-pytorch/effdet/bench.py", line 93, in forward
gt_class_out, gt_box_out, num_positive = self.anchor_labeler.label_anchors(gt_boxes[i], gt_labels[i])
File "../input/timm-efficientdet-pytorch/effdet/anchors.py", line 343, in label_anchors
cls_targets, _, box_targets, _, matches = self.target_assigner.assign(anchor_box_list, gt_box_list, gt_labels)
File "../input/timm-efficientdet-pytorch/effdet/object_detection/target_assigner.py", line 140, in assign
match = self._matcher.match(match_quality_matrix, **params)
File "../input/timm-efficientdet-pytorch/effdet/object_detection/matcher.py", line 212, in match
return Match(self._match(similarity_matrix, **params))
File "../input/timm-efficientdet-pytorch/effdet/object_detection/argmax_matcher.py", line 155, in _match
return _match_when_rows_are_non_empty()
File "../input/timm-efficientdet-pytorch/effdet/object_detection/argmax_matcher.py", line 144, in _match_when_rows_are_non_empty
force_match_column_indicators = one_hot(force_match_column_ids, similarity_matrix.shape[1])
RuntimeError: Class values must be non-negative.

GPU memory issue

Thank you for your hard work.
I have a question. While training D5, I noticed a change in GPU memory usage.
It was using 31 GB per GPU (V100, DGX Station) while loading the images, and after training started it dropped to 27 GB.
I couldn't see in the code why GPU memory goes up while loading the data. Can you tell me where this is happening?

minor - default model name in config.py doesn't work as-is, missing 'tf_' prefix.

Minor issue but the default model param for config.py / get_efficientdet_config will err out (line 183).

def get_efficientdet_config(model_name='efficientdet_d1'):
    """Get the default config for EfficientDet based on model name."""
    h = default_detection_configs()
    h.update(efficientdet_model_param_dict[model_name])
    return h

The model names are all prefixed with tf_, so the default model_name should be "tf_efficientdet_d1".

This is minor, but it keeps anyone from hitting this error:

KeyError Traceback (most recent call last)
in
----> 1 z= get_efficientdet_config(model_name='efficientdet_d1')

~\pyeffdet\effdet\config\config.py in get_efficientdet_config(model_name)
184 """Get the default config for EfficientDet based on model name."""
185 h = default_detection_configs()
--> 186 h.update(efficientdet_model_param_dict[model_name])
187 return h
188

KeyError: 'efficientdet_d1'
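
Beyond fixing the default, a lookup that names the valid keys makes this kind of mistake easier to spot. The following is only a sketch: MODEL_PARAMS is a hypothetical stand-in for efficientdet_model_param_dict with two illustrative entries, and get_config is not the library's function.

```python
# Hypothetical stand-in for efficientdet_model_param_dict (illustrative values).
MODEL_PARAMS = {
    "tf_efficientdet_d0": {"image_size": 512},
    "tf_efficientdet_d1": {"image_size": 640},
}


def get_config(model_name: str = "tf_efficientdet_d1") -> dict:
    """Return a copy of the params for model_name, with a descriptive error."""
    if model_name not in MODEL_PARAMS:
        known = ", ".join(sorted(MODEL_PARAMS))
        raise KeyError(f"Unknown model '{model_name}'. Available: {known}")
    return dict(MODEL_PARAMS[model_name])


print(get_config()["image_size"])

try:
    get_config("efficientdet_d1")  # missing the 'tf_' prefix
except KeyError as err:
    print(err)
```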

AssertionError: APEX and CUDA required for fused optimizers

Hi,

I have four GPUs and have installed apex, but I still get this error when trying to train with the provided training script. Any suggestions for solving this problem?

Feature Request : Segmentation model

The paper lists an easy way to use the model for segmentation. I really hope there is enough flexibility in your code to allow for that alteration.

Following [16], we modify our EfficientDet model to keep feature level {P2,P3,...,P7} in BiFPN, but only use P2 for the final per-pixel classification. For simplicity, here we only evaluate a EfficientDet-D4 based model, which uses a ImageNet pretrained EfficientNet-B4 backbone (similar size to ResNet-50). We set the channel size to 128 for BiFPN and 256 for classification head. Both BiFPN and classification head are repeated by 3 times.

Does DetBenchTrain class support nn.DataParallel?

Thanks for the wonderful PyTorch version of EffiDet. I am trying to retrain EffiDet B5 using multiple GPUs, but I get a much worse loss (Epoch: 1, summary loss: 0.98135) compared to a single GPU (Epoch: 1, summary loss: 0.4458).

Does DetBenchTrain support nn.DataParallel?

net.class_net = HeadNet(config, num_outputs=config.num_classes, norm_kwargs=dict(eps=.001, momentum=.01))
return DetBenchTrain(net, config)

ImportError: cannot import name 'get_act_layer' from 'timm.models.layers' (/opt/conda/lib/python3.7/site-packages/timm/models/layers/__init__.py)

How can I solve this issue?

Pre-train mode not supporting?

Hi @rwightman,

Does your implementation currently support pretrained models? Not everyone needs to train from scratch, right? Is there a fix for this?

Thank you

Training COCO from Scratch

Thanks for the good work!

Just wanted to mention that I have tried the two currently most starred EfficientDet PyTorch repos on GitHub, and neither reproduces the paper results on COCO, not even close.

They mostly claim they train on custom data and port weights from official TF checkpoints, but fail to train from scratch on COCO.

I tried their implementations with 200+ kinds of hyper-parameter tuning sets & settings - yes 200+ jobs!

Very keen to see your completed training on COCO. Looking forward to that.

Cheers

Validation outputs all zeros on ms coco

Hi,

Using the default validation parameters given on the GitHub page gives all zeros on MS COCO, as shown below. Do you know what the problem could be?

anchor_labeler: batch_label_anchors issue with single bounding box

Hi @rwightman

Thank you so much for making this repo.

I am currently experiencing an issue when calling batch_label_anchors with a ground-truth bounding box list that contains only one bbox. I'm not sure what might have caused this, so I am wondering if you can take a look. Thanks in advance :)

Issue:
anchor_labeler.batch_label_anchors() raises an index-out-of-range error. Trace attached.

Setup anchor:

model_config = get_efficientdet_config('tf_efficientdet_d0')
model = EfficientDet(model_config,pretrained_backbone=True)
model_config.num_classes = 1
model_config.image_size = 512

anchors = Anchors(
    model_config.min_level,model_config.max_level,
    model_config.num_scales, model_config.aspect_ratios,
    model_config.anchor_scale, model_config.image_size
    )


anchor_labeler = AnchorLabeler(anchors,model_config.num_classes,match_threshold=0.5)

Reproduce:

tb = torch.tensor([[468.,353.,52.,386.5]])
tb = tb.int().float()
tlbl = torch.tensor([1.])
cls_targets, box_targets,num_positives = anchor_labeler.batch_label_anchors(1,[tb],[tlbl])

Trace:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-45-e8bceaf11fb2> in <module>
----> 1 cls_targets, box_targets,num_positives = anchor_labeler.batch_label_anchors(1,[tb],[tlbl])

/opt/conda/envs/fastai2/lib/python3.7/site-packages/effdet/anchors.py in batch_label_anchors(self, batch_size, gt_boxes, gt_classes)
    394             # cls_weights, box_weights are not used
    395             cls_targets, _, box_targets, _, matches = self.target_assigner.assign(
--> 396                 anchor_box_list, BoxList(gt_boxes[i]), gt_classes[i])
    397 
    398             # class labels start from 1 and the background class = -1

/opt/conda/envs/fastai2/lib/python3.7/site-packages/effdet/object_detection/target_assigner.py in assign(self, anchors, groundtruth_boxes, groundtruth_labels, groundtruth_weights)
    144         match_quality_matrix = self._similarity_calc.compare(groundtruth_boxes, anchors)
    145         match = self._matcher.match(match_quality_matrix)
--> 146         reg_targets = self._create_regression_targets(anchors, groundtruth_boxes, match)
    147         cls_targets = self._create_classification_targets(groundtruth_labels, match)
    148         reg_weights = self._create_regression_weights(match, groundtruth_weights)

/opt/conda/envs/fastai2/lib/python3.7/site-packages/effdet/object_detection/target_assigner.py in _create_regression_targets(self, anchors, groundtruth_boxes, match)
    167         zero_box = torch.zeros(4, device=device)
    168         matched_gt_boxes = match.gather_based_on_match(
--> 169             groundtruth_boxes.boxes(), unmatched_value=zero_box, ignored_value=zero_box)
    170         matched_gt_boxlist = box_list.BoxList(matched_gt_boxes)
    171         if groundtruth_boxes.has_field(self._keypoints_field_name):

/opt/conda/envs/fastai2/lib/python3.7/site-packages/effdet/object_detection/matcher.py in gather_based_on_match(self, input_tensor, unmatched_value, ignored_value)
    171         input_tensor = torch.cat([ss, input_tensor], dim=0)
    172         gather_indices = torch.clamp(self.match_results + 2, min=0)
--> 173         gathered_tensor = torch.index_select(input_tensor, 0, gather_indices)
    174         return gathered_tensor

IndexError: index out of range in self

Here are some values:
ipdb> p input_tensor
tensor([[  0.,   0.,   0.,   0.],
        [  0.,   0.,   0.,   0.],
        [468., 353.,  52., 386.]])

ipdb> p gather_indices.shape
torch.Size([49104])

ipdb> p gather_indices
tensor([    1,     1,     1,  ...,     1,     1, 24554])

ipdb> p self.match_results
tensor([   -1,    -1,    -1,  ...,    -1,    -1, 24552])
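
Independent of the matcher internals, the box in the reproduction is worth a second look: read as corner coordinates in either order, its third value (52) is smaller than its first (468), so it describes an empty box. It only makes sense as [x, y, width, height]. Whether this is what triggers the IndexError is not established here; the sketch below just shows a hypothetical sanity check that could be run before labeling anchors. The [y_min, x_min, y_max, x_max] target order used in the conversion is an assumption.

```python
def is_valid_corner_box(box) -> bool:
    """True if a corner-format box has positive extent on both axes.

    Works for [y_min, x_min, y_max, x_max] and [x_min, y_min, x_max, y_max]
    alike, since it only compares each max against its matching min.
    """
    a_min, b_min, a_max, b_max = box
    return a_max > a_min and b_max > b_min


def xywh_to_yxyx(box):
    """Convert [x, y, width, height] to [y_min, x_min, y_max, x_max]."""
    x, y, w, h = box
    return [y, x, y + h, x + w]


raw = [468.0, 353.0, 52.0, 386.0]  # the box from the reproduction above
print(is_valid_corner_box(raw))
print(is_valid_corner_box(xywh_to_yxyx(raw)))
```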

Number classes for COCO is 81, but it is 90 in the config?

Hello. Thanks for your implementation.
I read through config.py and see that the number of classes is 90, so the weight of cls_net.predict.conv_pw is [810 x fpn_feature x 1 x 1], where the number of anchors is 9 (90 x 9 = 810).

This is different from what I expected: since COCO has 80 foreground classes, it should be 80 x 9 = 720 instead.
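
The arithmetic in the question can be written out as below. The split of 9 anchors into 3 scales times 3 aspect ratios is an assumption about the default config, and the function name is illustrative. As background, COCO category ids run from 1 to 90 with only 80 of them in use, which is one common reason a 90-way head shows up.

```python
def class_head_channels(num_classes: int, num_scales: int = 3, num_aspect_ratios: int = 3) -> int:
    """Output channels of the class prediction conv: classes x anchors per location."""
    anchors_per_location = num_scales * num_aspect_ratios
    return num_classes * anchors_per_location


print(class_head_channels(90))  # head sized for category ids 1..90
print(class_head_channels(80))  # head sized for the 80 categories actually used
```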

Final release?

Are we getting closer to the final release of the weights that get equal accuracy to what the paper suggests? I think you have been working for over a month on this model now, could you also explain the difficulties you faced in reaching the same accuracy the paper claimed? Would be interested to learn from your experience on how to approach these models.

What is the difference between timm and gen-efficientnet?

I confirmed that there is an efficientnet in the gen-efficientnet you made and an efficientnet in the timm.
Does it matter whether you use timm's efficientnet or gen-efficientnet's efficientnet?

TypeError: forward() takes 3 positional arguments but 4 were given

Hi Ross,
As you suggested, I have tried to rerun Alex's code from https://www.kaggle.com/shonenkov/training-efficientdet.
The notebook runs well on a Kaggle instance, but when I run it on my local machine I get the error message "TypeError: forward() takes 3 positional arguments but 4 were given". My packages are effdet version 0.18.2, PyTorch version 1.5, and timm version 0.1.30.
I have also tried wrapping the class like this:

class ExtendDetBenchTrain(DetBenchTrain):
    def __init__(self, model, config):
        super(ExtendDetBenchTrain, self).__init__(model, config)

    def forward(self, x, target):
        class_out, box_out = self.model(x)
        cls_targets, box_targets, num_positives = self.anchor_labeler.batch_label_anchors(
            x.shape[0], target['bbox'], target['cls'])
        loss, class_loss, box_loss = self.loss_fn(class_out, box_out, cls_targets, box_targets, num_positives)
        output = dict(loss=loss, class_loss=class_loss, box_loss=box_loss)
        return output

but I still get the same error.
Can you give me some advice?
Many thanks,
Linh

Readme idea -add coco val download instructions?

Just ran coco eval but thought adding download instructions on the readme directly would be helpful for people?

#// from tf page - Download coco data.
!wget http://images.cocodataset.org/zips/val2017.zip
!wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
!unzip val2017.zip
!unzip annotations_trainval2017.zip

Can I fork your repo and change the license to the MIT license?

Kaggle GWD requires using the MIT license :(

Error loading saved model checkpoints file during validation on custom test dataset

Hi,
Thank you for the code repo.
I trained on my custom dataset (5 classes) with tf_efficientdet_d1, and training completed successfully. I am using a single GPU for training and testing.
While evaluating on the test set using validate.py, the following error is raised when loading the checkpoint's model weights.

bench = create_model(
    args.model,
    bench_task='predict',
    pretrained=args.pretrained,
    redundant_bias=args.redundant_bias,
    checkpoint_path=args.checkpoint,
    checkpoint_ema=args.use_ema,
)

RuntimeError: Error(s) in loading state_dict for EfficientDet: Missing key(s) in state_dict: "fpn.resample.3.conv.conv.bias",

Is it related to saving the model, using nn.DataParallel, or something else?

Many thanks,
Neel

Setup for parameters

Hi @rwightman

I just checked your code. I was quite surprised to see that your default training mode is to train without pretrained weights. How do I train with pretrained weights? Normally, when we refer to pretrained weights, we mean pretrained weights for the backbone, right? So what's the point of presenting both the "pretrain" and "pretrain backbone" options to the user?

And what are the drop_path rate and drop_block rate (I did not see them in the paper)? You set their default values to None, so do you want your users to use them? If yes, can you provide default values?

I also want to know where the YAML file is in this line: "args, args_text = _parse_args()".

And how do I set num_classes to the correct number?

Thanks for any help.

Not reuse batch-norm for different input feature levels?

Hello,

I appreciate your great work.

By the way, I'm double-checking your implementation against the original TF implementation.
From this perspective, I think your code differs in HeadNet, which does not reuse batch-norm layers for inputs from different levels.

efficientdet-pytorch/effdet/efficientdet.py

Line 328 in 20cd5f3

bn_levels.append(bn_seq)

It looks weird to me that the convolutions are reused but the batch norms are not.
Is it intended?

Tensor size mismatch, from data loader

@rwightman Any reasons? And did you consider the case where image sizes are different?

Error during inference using pretrained TF weights for D5

I tried running COCO eval using TF ported weights for D5 as a sanity check and got an error.
I don't get this error while running the same check on D0, D1 or D2. Don't know if there is a problem with the tensorflow weights or I'm supposed to call .contiguous() somewhere

This was my command:
python validate.py /mscoco --model tf_efficientdet_d5 --checkpoint tf_efficientdet_d5-ef44aea8.pth

I get the following error:

Traceback (most recent call last):
  File "validate.py", line 188, in <module>
    main()
  File "validate.py", line 184, in main
    validate(args)
  File "validate.py", line 139, in validate
    output = bench(input, target['img_scale'], target['img_size'])
  File "/miniconda/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kgupta/code/sixth_sense/efficientdet-pytorch/effdet/bench.py", line 73, in forward
    class_out, box_out = self.model(x)
  File "/miniconda/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kgupta/code/sixth_sense/efficientdet-pytorch/effdet/efficientdet.py", line 467, in forward
    x = self.backbone(x.contiguous())
  File "/miniconda/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/miniconda/envs/py37/lib/python3.7/site-packages/timm/models/efficientnet.py", line 478, in forward
    x = self.conv_stem(x)
  File "/miniconda/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/miniconda/envs/py37/lib/python3.7/site-packages/timm/models/layers/conv2d_same.py", line 30, in forward
    return conv2d_same(x, self.weight, self.bias, self.stride, self.padding, self.dilation, self.groups)
  File "/miniconda/envs/py37/lib/python3.7/site-packages/timm/models/layers/conv2d_same.py", line 17, in conv2d_same
    return F.conv2d(x, weight, bias, stride, (0, 0), dilation, groups)
  File "/miniconda/envs/py37/lib/python3.7/site-packages/apex/amp/wrap.py", line 28, in wrapper
    return orig_fn(*new_args, **kwargs)
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input.

EfficientDetLite0

I am curious to know your view on efficientnetLiteB0 as a backbone. Google has not officially released one, but do you think switching the backbone from efficientnetB0 to efficientnetLiteB0 will degrade COCO AP by less than 5% of the original, or by more?

In the current code base, if I call efficientnetLiteB0 from your timm model collection, will it work?

Integration with Mantisshrimp

Hello Ross!

First of all, thank you for all the amazing work put into this repo, your efforts into making sure your implementation could replicate the original results from the paper makes your code stand out.

The team I'm a part of is developing an object detection library called mantisshrimp and we're looking forward to add efficientdet to our arsenal.

Before I go to questions, let me give you a very brief background:

Differently than other object detection libraries, our main goal is not to implement everything ourselves, but instead to provide a framework that makes it really easy to integrate implementations made by the community.

As an example, the library does not contain any implementation of a training loop! Instead, we provide adapters to libraries like fastai and lightning that handle the training loop, if you're curious to learn more, take a look at our introduction guide.

The same can be said for models, we currently only have support for torchvision's rcnns, and we choose this implementation of efficientdet to add next 🥳

Now that you know a little bit about the background, let me get to the questions (sorry for the long list):

What is the recommended way of installing the library?
I'm currently doing pip install git+https://github.com/rwightman/efficientdet-pytorch.git but there is no mention on the README, is this the recommended way?
What class is used as background? 0 or -1?
I found this comment saying background should be -1, but I wanted to confirm.
What is the order of bounding box coordinates?
Again, I'm very sure you're using (xmin, ymin, xmax, ymax), but better safe than sorry 😅
How did you fix the pycocotools CocoEvaluator with transforms problem?
This is a problem we're facing as well, pycocotools is the most annoying thing ever and we're even thinking of reimplementing this entire metric and stop depending on it.
Just to be sure that we're facing the same problem: The problem is that CocoEvaluator requires you to pass all targets when you first instantiate it, any transforms applied after that will be disregarded and the computed metric will be incorrect.
I think my next question is related to how you solved this.
What is img_scale? How do I use it correctly?
If this is related to pycocotools, is there a way of disabling it? Because we'll do evaluation outside this library.
General advice
Any general advice? Any important detail I should pay extra attention? I'm currently following the Kaggle notebook mentioned in the readme as a guide, although some of the stuff is outdated it's still very helpful.

Feature Request : adding it in Detectron2

Thank you very much for your project, really looking into it. It would be awesome to integrate it within Detectron2 pipeline as current FPN implementation is super slow.

Inference with custom image shape

Hi, I tried to create predictions for a custom image shape other than that used for model training. I noticed that using img_size parameter in DetBenchPredict didn't output expected results (weird boxes).

Instead I got the desired results by relying on config parameter during model construction which is a bit clunky for me.

import gc

import torch
from effdet import get_efficientdet_config, EfficientDet, DetBenchPredict
from effdet.efficientdet import HeadNet

def load_net(checkpoint_path, image_size=512):
    config = get_efficientdet_config('tf_efficientdet_d6')
    net = EfficientDet(config, pretrained_backbone=False)
    config.num_classes = 1
    config.image_size = image_size
    # Replace the class head so it matches the single-class checkpoint.
    net.class_net = HeadNet(config, num_outputs=config.num_classes, norm_kwargs=dict(eps=.001, momentum=.01))
    checkpoint = torch.load(checkpoint_path)
    net.load_state_dict(checkpoint['model_state_dict'])
    del checkpoint
    gc.collect()
    net = DetBenchPredict(net, config)
    return net

It seems img_size is not used in the generate_predictions function; instead, the information is taken from the initial config at initialization.

    **boxes = decode_box_outputs(box_outputs.float(), anchor_boxes, output_xyxy=True)**
    boxes = clip_boxes_xyxy(boxes, img_size / img_scale)  # clip before NMS better?

Any clue about how to improve this feature? Maybe I can help.

Do you get 130 img/s with D0 for batch=1?

@rwightman Hi, nice work!

Do you get this FPS for batch=1?
by using TensorCores FP16/32?

2020-04-11
Cleanup post-processing. Less code and a five-fold throughput increase on the smaller models. D0 running > 130 img/s on a single 2080Ti, D1 > 130 img/s on dual 2080Ti up to D7 @ 8.5 img/s.

D2 trained on some custom dataset?

Since COCO really takes a long time to train, did you consider training D2 with a smaller dataset?

Also, I believe yolov5 repo can easily get AP up to 40+ on coco now.

Training code

Thank you very much for your project. When will the training code be released?

Compute loss on validation

Hi!
Thanks for this amazing work!
I'm not very experienced yet. Can you tell me how I can compute the loss (only) on validation?
Is setting torch.no_grad() enough for this? I ask because model.eval() changes the output of DetBenchTrain:

if not self.training:
            # if eval mode, output detections for evaluation

I'm a little confused.

Bug in model_config - tf_efficientdet_d7

Hi @rwightman,

Thank you for your amazing work.

When I tried using tf_efficientdet_d7, I found out it actually used tf_efficientdet_d6 backbone. Diving a little into your code, I found a small bug in efficientdet_model_param_dict in model_config.py.

    tf_efficientdet_d7=dict(
        name='tf_efficientdet_d7',
        backbone_name='tf_efficientnet_b6', # ~~> should be b7
        image_size=1536,
        fpn_channels=384,
        fpn_cell_repeats=8,
        box_class_repeats=5,
        anchor_scale=5.0,
        fpn_name='bifpn_sum',  # Use unweighted sum for training stability.
        backbone_args=dict(drop_rate=0.5, drop_path_rate=0.2),
        url='https://github.com/rwightman/efficientdet-pytorch/releases/download/v0.1/tf_efficientdet_d7-f05bf714.pth'
    ),

Hope you fix it soon.

Have a nice day.

About run time

Hi,

I'm very impressed by your wonderful work.

I wonder whether the models in your implementation are as fast as those in the original work.

I can't find the inference times for the models in your implementation.

Could you let us know the inference time, too?

I tried adding BiFPN to CenterMask and found that it was slower than the original FPN because of its many layers.

efficientdet_d3 has wrong configuration

It looks like 'efficientdet_d3' has the wrong configuration, probably due to a copy-and-paste error, because it is currently the same as 'efficientdet_d2'.

efficientdet-pytorch/effdet/config/model_config.py

Lines 102 to 113 in 6ff9140

    efficientdet_d3=dict(
        name='efficientdet_d3',
        backbone_name='efficientnet_b3',
        image_size=768,
        fpn_channels=112,
        fpn_cell_repeats=5,
        box_class_repeats=3,
        pad_type='',
        redundant_bias=False,
        backbone_args=dict(drop_path_rate=0.2),
        url='',  # no pretrained weights yet
    ),
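
One way to spot this kind of copy-and-paste slip is to compare the scaling fields of neighbouring configs. The sketch below is only illustrative: the `d3` values come from the snippet above, while the `d2_assumed` values are an assumption based on the report that the two entries are currently the same, and `shared_fields` is a hypothetical helper, not part of the repo.

```python
def shared_fields(cfg_a, cfg_b, fields):
    """Return the listed fields whose values are identical in both configs."""
    return [f for f in fields if cfg_a.get(f) == cfg_b.get(f)]


# Values taken from the efficientdet_d3 entry quoted in the issue.
d3 = dict(
    name='efficientdet_d3',
    backbone_name='efficientnet_b3',
    image_size=768,
    fpn_channels=112,
    fpn_cell_repeats=5,
    box_class_repeats=3,
)

# Assumed for illustration only: the d2 entry the issue says d3 duplicates.
d2_assumed = dict(
    name='efficientdet_d2',
    fpn_channels=112,
    fpn_cell_repeats=5,
    box_class_repeats=3,
)

scaling_fields = ('fpn_channels', 'fpn_cell_repeats', 'box_class_repeats')
suspicious = shared_fields(d3, d2_assumed, scaling_fields)
```

If every scaling field is shared between two model sizes that are supposed to differ, the larger entry is worth a second look.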

Is there a typo?

https://github.com/rwightman/pytorch-image-models/blob/master/timm/data/config.py#L82
in function:

def get_mean_by_model(model_name):
    model_name = model_name.lower()
    if 'dpn' in model_name:
        return IMAGENET_DPN_STD
    elif 'ception' in model_name or ('nasnet' in model_name and 'mnasnet' not in model_name):
        return IMAGENET_INCEPTION_MEAN
    else:
        return IMAGENET_DEFAULT_MEAN

expected IMAGENET_DPN_MEAN?
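
If the 'dpn' branch was indeed meant to return the mean, the corrected helper would presumably look like the sketch below. The constant values are written from memory of timm's data constants and should be treated as assumptions to verify against the library; only the branch logic is the point here.

```python
# Assumed values, written from memory of timm's data constants; verify before use.
IMAGENET_DEFAULT_MEAN = (0.485, 0.456, 0.406)
IMAGENET_INCEPTION_MEAN = (0.5, 0.5, 0.5)
IMAGENET_DPN_MEAN = (124 / 255, 117 / 255, 104 / 255)
IMAGENET_DPN_STD = tuple([1 / (0.0167 * 255)] * 3)


def get_mean_by_model(model_name):
    """Pick the normalization mean from the model name."""
    model_name = model_name.lower()
    if 'dpn' in model_name:
        return IMAGENET_DPN_MEAN  # the quoted code returns IMAGENET_DPN_STD here
    elif 'ception' in model_name or ('nasnet' in model_name and 'mnasnet' not in model_name):
        return IMAGENET_INCEPTION_MEAN
    else:
        return IMAGENET_DEFAULT_MEAN
```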

Finetuning

Hi,

I know that you have given a link about finetuning on a custom dataset, but as a naive programmer I would like to ask: is there a way to use your training script for finetuning? For example, loading your trained model and finetuning it on a custom dataset that has the same image and annotation format as MS COCO.

Finetuning Training d7

Hi rwightman,

No matter what batch size I use for d7, it doesn't fit in the GPU (16GB). I am finetuning, and the following is the runtime command.

./distributed_train.sh 2 /datasetlocation --model tf_efficientdet_d7 -b 1 --amp --lr .04 --sync-bn --opt adam --fill-color mean --sched plateau

Could you please let me know your settings for training d7, or tell me if I am doing something wrong?

evaluator not set for validate call in ema branch

Looks like a recent commit modified the interface to the validate function in train.py so that it takes an evaluator keyword argument. The evaluator is not set in the branch that calls validate for ema. This causes the default 'map' --eval-metric to fail.

Train: 0 [3664/3665 (100%)]  Loss:  3.457180 (3.6070)  Time: 3.425s,    5.25/s  (0.722s,   24.92/s)  LR: 1.000e-04  Data: 0.174 (0.011)
/pytorch/aten/src/ATen/native/BinaryOps.cpp:81: UserWarning: Integer division of tensors using div or / is deprecated, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.
Test: [   0/154]  Time: 1.064 (1.064)  Loss:  2.4959 (2.4959)  
Test: [  50/154]  Time: 0.412 (0.428)  Loss:  2.6239 (2.5469)  
Test: [ 100/154]  Time: 0.413 (0.421)  Loss:  2.3167 (2.5493)  
Test: [ 150/154]  Time: 0.410 (0.428)  Loss:  2.5362 (2.5629)  
Test: [ 154/154]  Time: 1.433 (0.434)  Loss:  2.5216 (2.5620)  
Loading and preparing results...
DONE (t=3.12s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=41.79s).
Accumulating evaluation results...
DONE (t=15.55s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.001
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.002
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.002
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.005
Test (EMA): [   0/154]  Time: 1.070 (1.070)  Loss:  2.5276 (2.5276)  
Test (EMA): [  50/154]  Time: 0.378 (0.391)  Loss:  2.6372 (2.5649)  
Test (EMA): [ 100/154]  Time: 0.376 (0.384)  Loss:  2.3414 (2.5664)  
Test (EMA): [ 150/154]  Time: 0.372 (0.382)  Loss:  2.5519 (2.5790)  
Test (EMA): [ 154/154]  Time: 0.376 (0.381)  Loss:  2.5492 (2.5783)  
Traceback (most recent call last):
  File "train.py", line 608, in <module>
    main()
  File "train.py", line 448, in main
    lr_scheduler.step(epoch + 1, eval_metrics[eval_metric])
KeyError: 'map'

Here is the code snippet:

efficientdet-pytorch/train.py

Lines 437 to 444 in 91f5172

    eval_metrics = validate(model, loader_eval, args, evaluator)

    if model_ema is not None and not args.model_ema_force_cpu:
        if args.distributed and args.dist_bn in ('broadcast', 'reduce'):
            distribute_bn(model_ema, args.world_size, args.dist_bn == 'reduce')

        ema_eval_metrics = validate(model_ema.ema, loader_eval, args, log_suffix=' (EMA)')
        eval_metrics = ema_eval_metrics

Looks like a simple fix, and the workaround is simply not to set --ema.
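
The failure mode can be reproduced with a small stand-in. In the sketch below, `validate_sketch` is a hypothetical simplification (the real `validate` also takes the model, loader, and args): it only adds the 'map' entry when an evaluator is supplied, which is assumed to be how the missing key arises.

```python
def validate_sketch(evaluator=None):
    """Hypothetical stand-in: 'map' is only reported when an evaluator is given."""
    metrics = {'loss': 2.5}
    if evaluator is not None:
        metrics['map'] = evaluator()
    return metrics


eval_metric = 'map'

# Mirrors the EMA call in the snippet: no evaluator is passed.
ema_metrics_without = validate_sketch()

# Presumed fix: pass the evaluator to the EMA call as well.
ema_metrics_with = validate_sketch(evaluator=lambda: 0.0)

try:
    ema_metrics_without[eval_metric]
    raised = False
except KeyError:
    raised = True  # same KeyError: 'map' as in the traceback
```

In terms of the quoted snippet, the presumed fix would be to pass `evaluator` to the second `validate` call alongside `log_suffix=' (EMA)'`.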

How to feed tensor to target["img_size"] & target["img_scale"] for DetBenchTrain()?

Hello Ross,
Thank you for this awesome project!
I've been playing with it to build a pipeline and got stuck on the target["img_scale"] and target["img_size"] inputs required by the DetBenchTrain class. I expect the model to output predicted boxes during validation, but I can't feed the right target to the model.

def forward(self, x, target):
        class_out, box_out = self.model(x)
        cls_targets, box_targets, num_positives = self.anchor_labeler.batch_label_anchors(
            x.shape[0], target['bbox'], target['cls'])
        loss, class_loss, box_loss = self.loss_fn(class_out, box_out, cls_targets, box_targets, num_positives)
        output = dict(loss=loss, class_loss=class_loss, box_loss=box_loss)
        if not self.training:
            # if eval mode, output detections for evaluation
            class_out, box_out, indices, classes = _post_process(self.config, class_out, box_out)
            output['detections'] = _batch_detection(
                x.shape[0], class_out, box_out, self.anchors.boxes, indices, classes,
                target['img_scale'], target['img_size'])
        return output

For example, with batch_size=2 and image_size=512, how should I feed in these 2 tensors? I get an error whatever I try to feed. Please help.

Great thanks!

Question for next time - Detectron2

Hi @rwightman ,

Thank you for your great work.

Do you have a plan for integration into detectron2?
My question is whether you think the performance would stay the same if you integrated it into detectron2.

Thank you so much.

np.rollaxis being deprecated, prefer np.moveaxis in ImageToTensor

Minor nit, but to make this as future-proof as possible:
line 34 in transforms.py, ImageToTensor:

    np_img = np.rollaxis(np_img, 2)  # HWC to CHW

NumPy wants people to move to 'moveaxis', so I updated my copy to:

    np_img = np.moveaxis(np_img, 2, 0)

From the docs: "This function continues to be supported for backward compatibility, but you should prefer moveaxis. The moveaxis function was added in NumPy 1.11."

https://numpy.org/doc/stable/reference/generated/numpy.rollaxis.html?highlight=rollaxis#numpy.rollaxis

and moveaxis doc:
https://numpy.org/doc/stable/reference/generated/numpy.moveaxis.html#numpy.moveaxis
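
A quick check that the two calls give the same result for the HWC to CHW case; the small array below is made up purely for illustration.

```python
import numpy as np

# Tiny stand-in image in HWC layout: height 2, width 3, 4 channels.
np_img = np.arange(2 * 3 * 4).reshape(2, 3, 4)

rolled = np.rollaxis(np_img, 2)    # older call: roll axis 2 to the front
moved = np.moveaxis(np_img, 2, 0)  # preferred call: move axis 2 to position 0

same = np.array_equal(rolled, moved)
```

Both calls move the channel axis to the front, so the swap should be a drop-in replacement here.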
