rwightman / efficientdet-pytorch
A PyTorch impl of EfficientDet faithful to the original Google impl w/ ported weights
License: Apache License 2.0
Hi @rwightman - super excited that you have made this!
Can you describe the best method for training on custom datasets? (I can hack it up but would prefer your advice up front.)
Thanks!
Hi Ross,
As always thanks for the great codebase. I am trying to reproduce the efficientDet-D1. I have a question regarding the reported Average Precision during training vs the mAP reported in validate.py.
For D1, this is a snapshot of the training log:
Test (EMA): [ 0/9] Time: 14.371 (14.371) Loss: 0.5902 (0.5902)
Test (EMA): [ 9/9] Time: 7.463 (2.456) Loss: 0.6043 (0.5914)
Loading and preparing results...
DONE (t=0.20s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.85s).
Accumulating evaluation results...
DONE (t=0.41s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.470
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.655
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.520
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.256
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.519
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.677
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.395
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.537
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.550
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.308
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.604
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.718
which seems to reach 47 mAP. However, if I use validate.py (with EMA), the reported mAP is 37.4 instead. I am still going through the code but can't identify what I have done wrong. Have you encountered this problem before?
Have you tried ONNX export for tf_efficientdet_lite0? I tried exporting only the backbone + BiFPN by defining a forward_dummy similar to the one used in mmdetection. I got a very complicated ONNX graph (Scatter/Gather/Div/Squeeze etc.), whereas the EfficientNet-Lite models you exported using gen-efficientnet-pytorch/blob/master/onnx_export.py are very clean.
Is there a way to get a clean ONNX graph for tf_efficientdet_lite0?
Thanks a lot for all your help.
Can you comment on how long it takes you to train a tf_efficientdet_d0 from scratch on COCO using pretrained ImageNet weights?
I'm training with the following command on 2 RTX 6000s and it's taken ~18.5 hours to reach 41 epochs.
./distributed_train.sh 2 /mscoco --model tf_efficientdet_d0 -b 48 --amp --lr .06 --warmup-epochs 5 --sync-bn --opt fusedmomentum --fill-color mean --model-ema
Hi. I read Alex's code on Kaggle and I want to reproduce his work. I noticed that something changed after he shared the code. Could you tell me the format of target['bbox'] in DetBenchTrain? I tried [x_min, y_min, x_max, y_max] first but got something that was definitely not what I expected. Then I noticed Alex used [x_min, y_min, width, height], but when I put this into the model I got a very large cls loss and NaN in the box loss.
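For anyone hitting the same confusion, here is a minimal sketch of converting COCO-style [x_min, y_min, width, height] boxes to corner format and then swapping to [y_min, x_min, y_max, x_max]. It assumes the training bench expects the yxyx ordering (the convention of the original TF code); that is an assumption to verify against the installed version, not something confirmed in this thread. The helper names are made up for the example.

```python
def xywh_to_xyxy(box):
    """Convert [x_min, y_min, width, height] to [x_min, y_min, x_max, y_max]."""
    x_min, y_min, w, h = box
    return [x_min, y_min, x_min + w, y_min + h]


def xyxy_to_yxyx(box):
    """Swap axis order: [x_min, y_min, x_max, y_max] -> [y_min, x_min, y_max, x_max]."""
    x_min, y_min, x_max, y_max = box
    return [y_min, x_min, y_max, x_max]


coco_box = [10.0, 20.0, 30.0, 40.0]      # x_min, y_min, width, height
corner_box = xywh_to_xyxy(coco_box)      # [10.0, 20.0, 40.0, 60.0]
bench_box = xyxy_to_yxyx(corner_box)     # [20.0, 10.0, 60.0, 40.0] (assumed bench order)
print(bench_box)
```

Feeding width/height where corners are expected produces boxes whose "max" corner can be smaller than the "min" corner, which could plausibly explain a huge cls loss and NaN box loss.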
Hi @rwightman
Thank you for your high quality work! I'd like to ask if you can provide examples of how to organize a dataset to run your pipeline.
For example, in COCO detection, we have train_image/val_image and train_annotation.json/val_annotation.json (ignoring the test set for now). How should I place them so that your program accepts them?
Thank you for your help.
I'm picking up the code from dataset.py and noticed all my custom bounding boxes were coming up empty.
This is b/c of the check on line 71 in dataset.py:
if ann['area'] <= 0 or w < 1 or h < 1: continue
area is apparently only computed from segmentation masks and not bounding boxes...thus if you have a custom dataset with no seg masks, your bboxes won't be loaded.
I think just having the w and h check is sufficient and avoids people hitting this when training on custom datasets.
cocodataset/cocoapi#36
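A minimal sketch of the suggested behaviour, assuming COCO-style annotation dicts: keep a box when its width and height are at least 1, and only consult 'area' when explicitly asked to. The function name and the require_area switch are hypothetical, not part of dataset.py.

```python
def keep_annotation(ann, require_area=False):
    """Return True if a COCO-style annotation has a usable bbox."""
    x, y, w, h = ann['bbox']
    if w < 1 or h < 1:
        return False
    # 'area' may be absent or zero when there are no segmentation masks,
    # so it is only checked when the caller opts in.
    if require_area and ann.get('area', 0) <= 0:
        return False
    return True


anns = [
    {'bbox': [5, 5, 50, 40]},                # no 'area' field at all
    {'bbox': [5, 5, 50, 40], 'area': 0},     # area not filled in
    {'bbox': [5, 5, 0.5, 40], 'area': 20},   # degenerate width
]
kept = [a for a in anns if keep_annotation(a)]
print(len(kept))
```

With the default setting the first two annotations survive and only the sub-pixel box is dropped.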
Just wanted to put this on the radar for the project - the developers of Complete IoU have extended it with clustered NMS:
"...we propose Cluster-NMS, where NMS during inference is done by implicitly clustering detected boxes and usually requires less iterations. Cluster-NMS is very efficient due to its pure GPU implementation, and geometric factors can be incorporated to improve both AP and AR. In the experiments, CIoU loss and Cluster-NMS have been applied to state-of-the-art instance segmentation (e.g., YOLACT), and object detection (e.g., YOLO v3, SSD and Faster R-CNN) models. Taking YOLACT on MS COCO as an example, our method achieves performance gains as +1.7 AP and +6.2 AR100 for object detection..."
https://arxiv.org/abs/2005.03572
code:
https://github.com/Zzh-tju/CIoU
Hi Ross,
I am still following the code of Alex Shonenkov, here https://www.kaggle.com/shonenkov/inference-efficientdet/comments?scriptVersionId=34956042
This time I get an error in the DetBenchPredict class. I know Alex used your code from the training branch (he used DetBenchEval), and I am using your current master branch. My weights come from training with the master branch.
It seems img_size is missing in the forward function of DetBenchPredict.
Do you have any suggestions for fixing the problem?
Linh
I am running with the resizepad function in transforms.py and found that the additional clipping code (marked as FIXME, to be fair) was clipping excessively when rescaling the bboxes:
if 'bbox' in annotations:
    # FIXME haven't tested this path since not currently using dataset annotations for train/eval
    bbox = annotations['bbox']
    bbox[:, :4] *= scale
    #bbox = clip_boxes(bbox, (scaled_height, scaled_width))
    #indices = np.where(np.sum(bbox, axis=1) != 0)[0]
    # if len(indices) < len(bbox):
    #bbox = np.take(bbox, indices)
    # annotations['cls'] = np.take(annotations['cls'], indices)
    annotations['bbox'] = bbox
So far I've found that bbox[:, :4] *= scale is sufficient. The additional code there was introducing incorrect clipping of ymax, and I've simply commented it out for now.
There might be cases the clipping is needed but I haven't seen it yet and did see that it was incorrectly adjusting the lowest box boundaries (ymax).
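If clipping does turn out to be needed, a sketch like the following keeps the coordinate order explicit, so x values are only ever clipped against the width and y values against the height. It assumes boxes in [x_min, y_min, x_max, y_max] order; whether an order mismatch is the actual cause of the ymax problem in transforms.py is not confirmed here, and the function names are hypothetical.

```python
def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return min(max(value, low), high)


def scale_and_clip_xyxy(boxes, scale, scaled_width, scaled_height):
    """Scale xyxy boxes, clip them to the image, and drop boxes left empty."""
    out = []
    for x_min, y_min, x_max, y_max in boxes:
        x_min, x_max = (clamp(v * scale, 0.0, scaled_width) for v in (x_min, x_max))
        y_min, y_max = (clamp(v * scale, 0.0, scaled_height) for v in (y_min, y_max))
        if x_max > x_min and y_max > y_min:
            out.append([x_min, y_min, x_max, y_max])
    return out


# One box that sticks out on the right, one that lies fully outside the image.
boxes = [[10, 20, 300, 400], [600, 20, 700, 40]]
print(scale_and_clip_xyxy(boxes, 0.5, 128.0, 256.0))
```

Here the first box is only trimmed on its right edge and the second one is removed, while ymax is untouched because it is already inside the scaled height.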
Hi,
I am trying to get EfficientDet running on Kaggle TPUs, following Alex Shonenkov's kernel.
I am rather a beginner with Python and PyTorch, sorry...
The model runs OK on GPU. Is it possible that there is a problem with num_classes=1?
The code looks like:
def get_net(imgsize=IMG_SIZE, use_checkpoint=None):
    config = get_efficientdet_config('tf_efficientdet_d4')
    net = EfficientDet(config, pretrained_backbone=False)
    checkpoint = torch.load('../input/efficientdet/efficientdet_d4-5b370b7a.pth')
    net.load_state_dict(checkpoint)
    config.num_classes = 1
    config.image_size = IMG_SIZE
    net.class_net = HeadNet(config, num_outputs=config.num_classes, norm_kwargs=dict(eps=.001,
                                                                                     momentum=.01))
    return DetBenchTrain(net, config)
and I call
def _mp_fn(rank, flags):
    global acc_list
    torch.set_default_tensor_type('torch.FloatTensor')
    a = run_training()

FLAGS={}
xmp.spawn(_mp_fn, args=(FLAGS,), nprocs=1, start_method='fork') #8
Error looks like:
Exception in device=TPU:0: Class values must be non-negative.
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/torch_xla/distributed/xla_multiprocessing.py", line 231, in _start_fn
fn(gindex, *args)
File "", line 8, in _mp_fn
a = run_training()
File "", line 76, in run_training
fitter.fit(train_loader, val_loader)
File "", line 40, in fit
summary_loss = self.train_one_epoch(train_loader)
File "", line 106, in train_one_epoch
loss, _, _ = self.model(images, boxes, labels)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 577, in __call__
result = self.forward(*input, **kwargs)
File "../input/timm-efficientdet-pytorch/effdet/bench.py", line 93, in forward
gt_class_out, gt_box_out, num_positive = self.anchor_labeler.label_anchors(gt_boxes[i], gt_labels[i])
File "../input/timm-efficientdet-pytorch/effdet/anchors.py", line 343, in label_anchors
cls_targets, _, box_targets, _, matches = self.target_assigner.assign(anchor_box_list, gt_box_list, gt_labels)
File "../input/timm-efficientdet-pytorch/effdet/object_detection/target_assigner.py", line 140, in assign
match = self._matcher.match(match_quality_matrix, **params)
File "../input/timm-efficientdet-pytorch/effdet/object_detection/matcher.py", line 212, in match
return Match(self._match(similarity_matrix, **params))
File "../input/timm-efficientdet-pytorch/effdet/object_detection/argmax_matcher.py", line 155, in _match
return _match_when_rows_are_non_empty()
File "../input/timm-efficientdet-pytorch/effdet/object_detection/argmax_matcher.py", line 144, in _match_when_rows_are_non_empty
force_match_column_indicators = one_hot(force_match_column_ids, similarity_matrix.shape[1])
RuntimeError: Class values must be non-negative.
Thank you for your hard work.
I have a question. When I was trying to train D5, I saw a change in memory usage.
I saw it using 31 GB of memory per GPU (V100, DGX Station) while loading the images, and after training started it dropped to 27 GB.
I couldn't see in the code why GPU memory goes up while loading the data. Can you tell me where this is happening?
Minor issue but the default model param for config.py / get_efficientdet_config will err out (line 183).
def get_efficientdet_config(model_name='efficientdet_d1'):
    """Get the default config for EfficientDet based on model name."""
    h = default_detection_configs()
    h.update(efficientdet_model_param_dict[model_name])
    return h
The model names are all prefixed with tf_, so the default model_name should be "tf_efficientdet_d1".
Minor, but it helps keep anyone from seeing this error:
KeyError Traceback (most recent call last)
in
----> 1 z= get_efficientdet_config(model_name='efficientdet_d1')
~\pyeffdet\effdet\config\config.py in get_efficientdet_config(model_name)
184 """Get the default config for EfficientDet based on model name."""
185 h = default_detection_configs()
--> 186 h.update(efficientdet_model_param_dict[model_name])
187 return h
188
KeyError: 'efficientdet_d1'
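As an illustration of the suggested change, here is a small sketch with a valid default and a friendlier error for unknown names. The dictionary below is a hypothetical two-entry stand-in for the real efficientdet_model_param_dict, and get_config_checked is not a function in the repo.

```python
# Hypothetical stand-in for the real parameter dict; only the keys matter here.
efficientdet_model_param_dict = {
    'tf_efficientdet_d0': dict(name='tf_efficientdet_d0'),
    'tf_efficientdet_d1': dict(name='tf_efficientdet_d1'),
}


def get_config_checked(model_name='tf_efficientdet_d1'):
    """Look up a model config, listing the valid names if the lookup fails."""
    if model_name not in efficientdet_model_param_dict:
        known = ', '.join(sorted(efficientdet_model_param_dict))
        raise ValueError(f"Unknown model '{model_name}'. Known models: {known}")
    return dict(efficientdet_model_param_dict[model_name])


print(get_config_checked()['name'])
```

Calling get_config_checked('efficientdet_d1') then raises a ValueError that names the valid keys instead of a bare KeyError.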
Hi,
I have four GPUs and have installed apex, but I am still getting this error when trying to train with the provided training script. Any suggestions for solving this problem?
The paper lists an easy way to use the model for segmentation. I really hope there is enough flexibility in your code to allow for that alteration.
Following [16], we modify our EfficientDet model to keep feature level {P2,P3,...,P7} in BiFPN, but only use P2 for the final per-pixel classification. For simplicity, here we only evaluate a EfficientDet-D4 based model, which uses a ImageNet pretrained EfficientNet-B4 backbone (similar size to ResNet-50). We set the channel size to 128 for BiFPN and 256 for classification head. Both BiFPN and classification head are repeated by 3 times.
Thanks for the wonderful PyTorch version of EffiDet. I am trying to retrain EffiDet B5 using multiple GPUs, but I get a much worse loss (Epoch: 1, summary loss: 0.98135) compared to a single GPU (Epoch: 1, summary loss: 0.4458).
Does DetBenchTrain support nn.DataParallel?
net.class_net = HeadNet(config, num_outputs=config.num_classes, norm_kwargs=dict(eps=.001, momentum=.01))
return DetBenchTrain(net, config)
How can I solve this issue?
Hi @rwightman,
Does your implementation currently support pretrained models? Not everyone needs to train from scratch, right? Any fix for this?
Thank you
Thanks for the good work!
Just wanted to mention that I have tried the two currently most starred EfficientDet PyTorch repos on GitHub, and neither reproduces the paper results on COCO, not even close.
They mostly claim they train on custom data and port weights from official TF checkpoints, but fail to train from scratch on COCO.
I tried their implementations with 200+ hyper-parameter sets & settings - yes, 200+ jobs!
Very keen to see your completed training on COCO. Looking forward to that.
Cheers
Hi,
Using the default validation parameters given on the GitHub page gives all zeros on MS COCO, as shown below. Do you know what the problem could be?
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.001
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.001
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.003
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.001
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.001
Hi @rwightman
Thank you so much for making this repo.
I am currently experiencing an issue calling batch_label_anchors when the ground truth bounding box list has only 1 bbox. I'm not sure what might have caused this, and I am wondering if you can take a look. Thanks in advance :)
Issue:
anchor_labeler.batch_label_anchors() raises an index out of range error. Trace attached.
Setup anchor:
model_config = get_efficientdet_config('tf_efficientdet_d0')
model = EfficientDet(model_config,pretrained_backbone=True)
model_config.num_classes = 1
model_config.image_size = 512
anchors = Anchors(
model_config.min_level,model_config.max_level,
model_config.num_scales, model_config.aspect_ratios,
model_config.anchor_scale, model_config.image_size
)
anchor_labeler = AnchorLabeler(anchors,model_config.num_classes,match_threshold=0.5)
Reproduce:
tb = torch.tensor([[468.,353.,52.,386.5]])
tb = tb.int().float()
tlbl = torch.tensor([1.])
cls_targets, box_targets,num_positives = anchor_labeler.batch_label_anchors(1,[tb],[tlbl])
Trace:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-45-e8bceaf11fb2> in <module>
----> 1 cls_targets, box_targets,num_positives = anchor_labeler.batch_label_anchors(1,[tb],[tlbl])
/opt/conda/envs/fastai2/lib/python3.7/site-packages/effdet/anchors.py in batch_label_anchors(self, batch_size, gt_boxes, gt_classes)
394 # cls_weights, box_weights are not used
395 cls_targets, _, box_targets, _, matches = self.target_assigner.assign(
--> 396 anchor_box_list, BoxList(gt_boxes[i]), gt_classes[i])
397
398 # class labels start from 1 and the background class = -1
/opt/conda/envs/fastai2/lib/python3.7/site-packages/effdet/object_detection/target_assigner.py in assign(self, anchors, groundtruth_boxes, groundtruth_labels, groundtruth_weights)
144 match_quality_matrix = self._similarity_calc.compare(groundtruth_boxes, anchors)
145 match = self._matcher.match(match_quality_matrix)
--> 146 reg_targets = self._create_regression_targets(anchors, groundtruth_boxes, match)
147 cls_targets = self._create_classification_targets(groundtruth_labels, match)
148 reg_weights = self._create_regression_weights(match, groundtruth_weights)
/opt/conda/envs/fastai2/lib/python3.7/site-packages/effdet/object_detection/target_assigner.py in _create_regression_targets(self, anchors, groundtruth_boxes, match)
167 zero_box = torch.zeros(4, device=device)
168 matched_gt_boxes = match.gather_based_on_match(
--> 169 groundtruth_boxes.boxes(), unmatched_value=zero_box, ignored_value=zero_box)
170 matched_gt_boxlist = box_list.BoxList(matched_gt_boxes)
171 if groundtruth_boxes.has_field(self._keypoints_field_name):
/opt/conda/envs/fastai2/lib/python3.7/site-packages/effdet/object_detection/matcher.py in gather_based_on_match(self, input_tensor, unmatched_value, ignored_value)
171 input_tensor = torch.cat([ss, input_tensor], dim=0)
172 gather_indices = torch.clamp(self.match_results + 2, min=0)
--> 173 gathered_tensor = torch.index_select(input_tensor, 0, gather_indices)
174 return gathered_tensor
IndexError: index out of range in self
Here are some values:
ipdb> p input_tensor
tensor([[ 0., 0., 0., 0.],
[ 0., 0., 0., 0.],
[468., 353., 52., 386.]])
ipdb> p gather_indices.shape
torch.Size([49104])
ipdb> p gather_indices
tensor([ 1, 1, 1, ..., 1, 1, 24554])
ipdb> p self.match_results
tensor([ -1, -1, -1, ..., -1, -1, 24552])
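One thing worth ruling out before digging into the matcher: the box in the reproduction has a third coordinate (52) smaller than its first (468), so it is degenerate in either corner order (xyxy or yxyx). The sketch below flags such boxes before they reach the anchor labeler. It is only a sanity check under that observation, not a confirmed diagnosis of the IndexError, and the function name is hypothetical.

```python
def find_degenerate_boxes(boxes):
    """Return indices of corner-format boxes whose max corner is not past the min corner.

    The check is symmetric, so it works for both [x_min, y_min, x_max, y_max]
    and [y_min, x_min, y_max, x_max] layouts.
    """
    bad = []
    for i, (a_min, b_min, a_max, b_max) in enumerate(boxes):
        if a_max <= a_min or b_max <= b_min:
            bad.append(i)
    return bad


boxes = [
    [468.0, 353.0, 52.0, 386.0],   # the box from the report above
    [10.0, 20.0, 110.0, 220.0],    # a well-formed box for comparison
]
print(find_degenerate_boxes(boxes))  # [0]
```

If the values were meant as [x, y, width, height], converting them to corners first would make the box valid.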
Hello. Thanks for your implementation.
I read through config.py and see that the number of classes is 90, so the weight of cls_net.predict.conv_pw is [810 x fpn_feature x 1 x 1], where the number of anchors is 9 (90 x 9 = 810).
This is different from what I expected: since COCO has 80 foreground classes, it should be 80 x 9 = 720 instead.
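The arithmetic in the question can be spelled out as below. The split of the 9 anchors into 3 scales times 3 aspect ratios is an assumption for illustration. One possible reading, not confirmed in this thread, is that the head is sized by the largest COCO category id (the ids run up to 90 with gaps, covering 80 categories) rather than by 80 contiguous labels.

```python
def class_head_out_channels(num_classes, num_scales, num_aspect_ratios):
    """Output channels of the class head: one score per class per anchor."""
    num_anchors = num_scales * num_aspect_ratios
    return num_classes * num_anchors


print(class_head_out_channels(90, 3, 3))  # 810, matches the ported weights
print(class_head_out_channels(80, 3, 3))  # 720, what 80 contiguous classes would give
```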
Are we getting closer to the final release of the weights that get equal accuracy to what the paper suggests? I think you have been working for over a month on this model now, could you also explain the difficulties you faced in reaching the same accuracy the paper claimed? Would be interested to learn from your experience on how to approach these models.
I confirmed that there is an efficientnet in the gen-efficientnet you made and an efficientnet in the timm.
Does it matter whether you use timm's efficientnet or gen-efficientnet's efficientnet?
Hi Ross,
As you suggested, I have tried to rerun Alex's code from https://www.kaggle.com/shonenkov/training-efficientdet.
The notebook runs well on a Kaggle instance, but when I try to run it on my local machine I get the error message: "TypeError: forward() takes 3 positional arguments but 4 were given". My packages are effdet version 0.18.2, PyTorch version 1.5, and timm version 0.1.30.
I have also tried wrapping the code like this:
class ExtendDetBenchTrain(DetBenchTrain):
    def __init__(self, model, config):
        super(ExtendDetBenchTrain, self).__init__(model, config)

    def forward(self, x, target):
        class_out, box_out = self.model(x)
        cls_targets, box_targets, num_positives = self.anchor_labeler.batch_label_anchors(
            x.shape[0], target['bbox'], target['cls'])
        loss, class_loss, box_loss = self.loss_fn(class_out, box_out, cls_targets, box_targets, num_positives)
        output = dict(loss=loss, class_loss=class_loss, box_loss=box_loss)
        return output
but I still get the same error.
Can you give me some advice?
Many thanks,
Linh
Just ran coco eval but thought adding download instructions on the readme directly would be helpful for people?
# From the TF page: download COCO data.
!wget http://images.cocodataset.org/zips/val2017.zip
!wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
!unzip val2017.zip
!unzip annotations_trainval2017.zip
Due to Kaggle GWD, need to use MIT license :(
Hi,
Thank you for the code repo.
I am training on my custom dataset (5 classes). Training completed successfully (tf_efficientdet_d1). I am using a single GPU for training/testing.
While evaluating on the test set with validate.py, it raises the error below when loading the checkpoint's model weights.
bench = create_model(
    args.model,
    bench_task='predict',
    pretrained=args.pretrained,
    redundant_bias=args.redundant_bias,
    checkpoint_path=args.checkpoint,
    checkpoint_ema=args.use_ema,
)
RuntimeError: Error(s) in loading state_dict for EfficientDet: Missing key(s) in state_dict: "fpn.resample.3.conv.conv.bias",
Is it related to saving the model, using nn.DataParallel, or something else?
Many thanks,
Neel
Hi @rwightman
Just checked your code. I was quite surprised to see that your default training mode is to train without pretrained weights. How do I train with pretrained weights? Normally, when we refer to pretrained weights, we mean pretrained weights for the backbone, right? So what's the point of presenting both the "pretrain" and "pretrain backbone" options to the user?
And what are the drop_path rate and drop_block rate (I did not see them in the paper)? You set the default value to None, so do you want your users to use them? If yes, can you provide a default value?
I also want to know where the yaml file is for this line: "args, args_text = _parse_args()".
And how do I set num_classes to the correct number?
Thanks for any help.
Hello,
I appreciate your great works.
By the way, I'm double-checking your implementation against the original TF implementation.
From this perspective, I think your code differs in HeadNet, which does not reuse batch-norm layers for inputs from different levels.
efficientdet-pytorch/effdet/efficientdet.py
Line 328 in 20cd5f3
It looks weird to me that convolutions are reused but batch-norms are not.
Is it intended?
@rwightman Any reasons? And did you consider the case where image sizes are different?
I tried running COCO eval using TF ported weights for D5 as a sanity check and got an error.
I don't get this error while running the same check on D0, D1 or D2. I don't know if there is a problem with the TensorFlow weights or if I'm supposed to call .contiguous() somewhere.
This was my command:
python validate.py /mscoco --model tf_efficientdet_d5 --checkpoint tf_efficientdet_d5-ef44aea8.pth
I get the following error:
Traceback (most recent call last):
File "validate.py", line 188, in <module>
main()
File "validate.py", line 184, in main
validate(args)
File "validate.py", line 139, in validate
output = bench(input, target['img_scale'], target['img_size'])
File "/miniconda/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/kgupta/code/sixth_sense/efficientdet-pytorch/effdet/bench.py", line 73, in forward
class_out, box_out = self.model(x)
File "/miniconda/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/kgupta/code/sixth_sense/efficientdet-pytorch/effdet/efficientdet.py", line 467, in forward
x = self.backbone(x.contiguous())
File "/miniconda/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/miniconda/envs/py37/lib/python3.7/site-packages/timm/models/efficientnet.py", line 478, in forward
x = self.conv_stem(x)
File "/miniconda/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/miniconda/envs/py37/lib/python3.7/site-packages/timm/models/layers/conv2d_same.py", line 30, in forward
return conv2d_same(x, self.weight, self.bias, self.stride, self.padding, self.dilation, self.groups)
File "/miniconda/envs/py37/lib/python3.7/site-packages/timm/models/layers/conv2d_same.py", line 17, in conv2d_same
return F.conv2d(x, weight, bias, stride, (0, 0), dilation, groups)
File "/miniconda/envs/py37/lib/python3.7/site-packages/apex/amp/wrap.py", line 28, in wrapper
return orig_fn(*new_args, **kwargs)
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input.
I am curious to know your view on efficientnetLiteB0 as a backbone. Google has not officially released it, but do you think switching the backbone from efficientnetB0 to efficientnetLiteB0 will degrade COCO AP by less than 5% of the original, or by more?
In the current code base, if I call efficientnetLiteB0 from your timm model collection, will it work?
Hello Ross!
First of all, thank you for all the amazing work put into this repo. Your effort to make sure your implementation replicates the original results from the paper makes your code stand out.
The team I'm a part of is developing an object detection library called mantisshrimp, and we're looking forward to adding EfficientDet to our arsenal.
Before I go to questions, let me give you a very brief background:
Differently than other object detection libraries, our main goal is not to implement everything ourselves, but instead to provide a framework that makes it really easy to integrate implementations made by the community.
As an example, the library does not contain any implementation of a training loop! Instead, we provide adapters to libraries like fastai and lightning that handle the training loop, if you're curious to learn more, take a look at our introduction guide.
The same can be said for models: we currently only have support for torchvision's rcnns, and we chose this implementation of efficientdet to add next 🥳
Now that you know a little bit about the background, let me get to the questions (sorry for the long list):
What is the recommended way of installing the library?
I'm currently doing pip install git+https://github.com/rwightman/efficientdet-pytorch.git but there is no mention of it in the README. Is this the recommended way?
What class is used as background? 0 or -1?
I found this comment saying background should be -1, but I wanted to confirm.
What is the order of bounding box coordinates?
Again, I'm very sure you're using (xmin, ymin, xmax, ymax), but better safe than sorry.
How did you fix the pycocotools CocoEvaluator with transforms problem?
This is a problem we're facing as well. pycocotools is the most annoying thing ever, and we're even thinking of reimplementing this entire metric to stop depending on it.
Just to be sure that we're facing the same problem: CocoEvaluator requires you to pass all targets when you first instantiate it, so any transforms applied after that are disregarded and the computed metric will be incorrect.
I think my next question is related to how you solved this.
What is img_scale? How do I use it correctly?
If this is related to pycocotools, is there a way of disabling it? We'll do evaluation outside this library.
General advice
Any general advice? Any important details I should pay extra attention to? I'm currently following the Kaggle notebook mentioned in the README as a guide; although some of the stuff is outdated, it's still very helpful.
Thank you very much for your project, really looking into it. It would be awesome to integrate it within Detectron2 pipeline as current FPN implementation is super slow.
Hi, I tried to create predictions for a custom image shape other than that used for model training. I noticed that using img_size parameter in DetBenchPredict didn't output expected results (weird boxes).
Instead I got the desired results by relying on config parameter during model construction which is a bit clunky for me.
def load_net(checkpoint_path, image_size=512):
    config = get_efficientdet_config('tf_efficientdet_d6')
    net = EfficientDet(config, pretrained_backbone=False)
    config.num_classes = 1
    config.image_size = image_size
    net.class_net = HeadNet(config, num_outputs=config.num_classes, norm_kwargs=dict(eps=.001, momentum=.01))
    checkpoint = torch.load(checkpoint_path)
    net.load_state_dict(checkpoint['model_state_dict'])
    del checkpoint
    gc.collect()
    net = DetBenchPredict(net, config)
It seems img_size is not used in the generate_predictions function; instead it takes this information from the config passed at initialization.
boxes = decode_box_outputs(box_outputs.float(), anchor_boxes, output_xyxy=True)
boxes = clip_boxes_xyxy(boxes, img_size / img_scale) # clip before NMS better?
Any clue about how to improve this feature? Maybe I can help.
@rwightman Hi, nice work!
2020-04-11
Cleanup post-processing. Less code and a five-fold throughput increase on the smaller models. D0 running > 130 img/s on a single 2080Ti, D1 > 130 img/s on dual 2080Ti up to D7 @ 8.5 img/s.
Since COCO really takes a long time to train, did you consider training D2 with a smaller dataset?
Also, I believe the yolov5 repo can easily get AP up to 40+ on COCO now.
Thank you very much for your project. When will the training code be released?
Hi!
Thx for this amazing work!
I'm not very experienced yet. Can you tell me how I can compute the loss (only) on validation?
Is setting torch.no_grad() enough for this? I ask because model.eval() changes the output of DetBenchTrain.
if not self.training:
    # if eval mode, output detections for evaluation
I'm a little confused.
Hi @rwightman,
Thank you for your amazing work.
When I tried using tf_efficientdet_d7, I found out it actually used tf_efficientdet_d6 backbone. Diving a little into your code, I found a small bug in efficientdet_model_param_dict in model_config.py.
tf_efficientdet_d7=dict(
name='tf_efficientdet_d7',
backbone_name='tf_efficientnet_b6', # ~~> should be b7
image_size=1536,
fpn_channels=384,
fpn_cell_repeats=8,
box_class_repeats=5,
anchor_scale=5.0,
fpn_name='bifpn_sum', # Use unweighted sum for training stability.
backbone_args=dict(drop_rate=0.5, drop_path_rate=0.2),
url='https://github.com/rwightman/efficientdet-pytorch/releases/download/v0.1/tf_efficientdet_d7-f05bf714.pth'
),
Hope you fix it soon.
Have a nice day.
Hi,
I'm very impressed by your wonderful work.
I wonder whether the models in your implementation are as fast as the original work.
I can't find the inference times for them.
Could you let us know the inference times, too?
I tried adding BiFPN to CenterMask and found that BiFPN was slower than the original FPN because of its many layers.
Looks like 'efficientdet_d3' has the wrong configuration. Looks like it was just a copy and paste error, because it is currently the same as 'efficientdet_d2'.
efficientdet-pytorch/effdet/config/model_config.py
Lines 102 to 113 in 6ff9140
https://github.com/rwightman/pytorch-image-models/blob/master/timm/data/config.py#L82
in function:
def get_mean_by_model(model_name):
    model_name = model_name.lower()
    if 'dpn' in model_name:
        return IMAGENET_DPN_STD
    elif 'ception' in model_name or ('nasnet' in model_name and 'mnasnet' not in model_name):
        return IMAGENET_INCEPTION_MEAN
    else:
        return IMAGENET_DEFAULT_MEAN
expected IMAGENET_DPN_MEAN?
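For clarity, this is what the function would look like with the suggested one-word change. The constants are defined inline so the snippet runs on its own; their values are what I recall from timm's constants module and should be treated as placeholders rather than authoritative.

```python
# Values recalled from timm's constants; treat them as illustrative placeholders.
IMAGENET_DEFAULT_MEAN = (0.485, 0.456, 0.406)
IMAGENET_INCEPTION_MEAN = (0.5, 0.5, 0.5)
IMAGENET_DPN_MEAN = (124 / 255, 117 / 255, 104 / 255)


def get_mean_by_model(model_name):
    """Pick the normalization mean from the model family in the name."""
    model_name = model_name.lower()
    if 'dpn' in model_name:
        return IMAGENET_DPN_MEAN  # the quoted code returned IMAGENET_DPN_STD here
    elif 'ception' in model_name or ('nasnet' in model_name and 'mnasnet' not in model_name):
        return IMAGENET_INCEPTION_MEAN
    else:
        return IMAGENET_DEFAULT_MEAN


print(get_mean_by_model('dpn92') == IMAGENET_DPN_MEAN)  # True
```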
Hi,
I know that you have given a link on fine-tuning with a custom dataset. But as a novice programmer, I'd like to ask: is there a way to use your training script for fine-tuning? For example, load your trained model and fine-tune it on a custom dataset that has the same image and annotation layout as MS COCO.
Hi rwightman,
No matter what batch size I use for D7, it doesn't fit on the GPU (16 GB). I am fine-tuning, and the following is the command I run.
./distributed_train.sh 2 /datasetlocation --model tf_efficientdet_d7 -b 1 --amp --lr .04 --sync-bn --opt adam --fill-color mean --sched plateau
Could you please let me know your settings for training D7, or whether I am doing something wrong?
Looks like a recent commit modified the interface to the validate function in train.py so that it takes an evaluator keyword argument. The evaluator is not set in the branch that calls validate for EMA. This causes the default 'map' --eval-metric to fail.
Train: 0 [3664/3665 (100%)] Loss: 3.457180 (3.6070) Time: 3.425s, 5.25/s (0.722s, 24.92/s) LR: 1.000e-04 Data: 0.174 (0.011)
/pytorch/aten/src/ATen/native/BinaryOps.cpp:81: UserWarning: Integer division of tensors using div or / is deprecated, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.
Test: [ 0/154] Time: 1.064 (1.064) Loss: 2.4959 (2.4959)
Test: [ 50/154] Time: 0.412 (0.428) Loss: 2.6239 (2.5469)
Test: [ 100/154] Time: 0.413 (0.421) Loss: 2.3167 (2.5493)
Test: [ 150/154] Time: 0.410 (0.428) Loss: 2.5362 (2.5629)
Test: [ 154/154] Time: 1.433 (0.434) Loss: 2.5216 (2.5620)
Loading and preparing results...
DONE (t=3.12s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=41.79s).
Accumulating evaluation results...
DONE (t=15.55s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.001
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.002
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.002
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.005
Test (EMA): [ 0/154] Time: 1.070 (1.070) Loss: 2.5276 (2.5276)
Test (EMA): [ 50/154] Time: 0.378 (0.391) Loss: 2.6372 (2.5649)
Test (EMA): [ 100/154] Time: 0.376 (0.384) Loss: 2.3414 (2.5664)
Test (EMA): [ 150/154] Time: 0.372 (0.382) Loss: 2.5519 (2.5790)
Test (EMA): [ 154/154] Time: 0.376 (0.381) Loss: 2.5492 (2.5783)
Traceback (most recent call last):
File "train.py", line 608, in <module>
main()
File "train.py", line 448, in main
lr_scheduler.step(epoch + 1, eval_metrics[eval_metric])
KeyError: 'map'
Here is the code snippet:
Lines 437 to 444 in 91f5172
Looks like a simple fix, and the workaround is simply not to set --ema.
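The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the code from train.py: `validate`, `fake_evaluator`, and the returned values are hypothetical stand-ins, and the only assumption carried over from the report is that the 'map' entry is produced only when an evaluator is passed in.

```python
from collections import OrderedDict


def validate(loss_value, evaluator=None):
    # Hypothetical stand-in for the validate function in train.py:
    # 'map' is only added to the metrics when an evaluator is supplied.
    metrics = OrderedDict(loss=loss_value)
    if evaluator is not None:
        metrics['map'] = evaluator()
    return metrics


def fake_evaluator():
    # Placeholder value for illustration only, not a real result.
    return 0.25


eval_metric = 'map'

# EMA branch as reported: no evaluator is passed, so 'map' is missing and
# eval_metrics[eval_metric] would raise KeyError: 'map'.
ema_metrics_broken = validate(2.5783)

# Sketch of the fix: pass the evaluator in the EMA branch as well.
ema_metrics_fixed = validate(2.5783, evaluator=fake_evaluator)
print(eval_metric in ema_metrics_broken, ema_metrics_fixed[eval_metric])
```

Under that assumption, passing the same evaluator in both the regular and the EMA validation calls keeps the scheduler lookup working.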
Hello Ross,
Thank you for this awesome project!
I've been playing with it to build a pipeline and got stuck on the target["img_scale"] and target["img_size"] inputs required by the DetBenchTrain class. I expect the model to output predicted boxes during validation, but I can't work out how to feed the right target to the model.
def forward(self, x, target):
    class_out, box_out = self.model(x)
    cls_targets, box_targets, num_positives = self.anchor_labeler.batch_label_anchors(
        x.shape[0], target['bbox'], target['cls'])
    loss, class_loss, box_loss = self.loss_fn(class_out, box_out, cls_targets, box_targets, num_positives)
    output = dict(loss=loss, class_loss=class_loss, box_loss=box_loss)
    if not self.training:
        # if eval mode, output detections for evaluation
        class_out, box_out, indices, classes = _post_process(self.config, class_out, box_out)
        output['detections'] = _batch_detection(
            x.shape[0], class_out, box_out, self.anchors.boxes, indices, classes,
            target['img_scale'], target['img_size'])
    return output
For example, with batch_size=2 and image_size=512, how should I feed in these two tensors? An error is raised whatever I try to feed. Please help.
Great thanks!
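One way the two tensors could be built for the case in the question is sketched below. This is a hedged example, not the repository's documented API: it assumes img_scale holds one float per image (shape [B]) and img_size holds one (width, height) pair per image (shape [B, 2]), and that the images were already 512x512 so no rescaling back to an original resolution is needed. The name `target_extras` is hypothetical.

```python
import torch

batch_size, image_size = 2, 512

# Assumption: one scale factor per image, mapping predicted boxes back to
# the original image resolution. 1.0 means the image was not resized.
img_scale = torch.ones(batch_size, dtype=torch.float32)  # shape [B]

# Assumption: one (width, height) pair per image, in original-image pixels.
img_size = torch.tensor(
    [[image_size, image_size]] * batch_size, dtype=torch.float32)  # shape [B, 2]

# Hypothetical container; these entries would sit alongside 'bbox' and 'cls'
# in the target dict passed to forward().
target_extras = {'img_scale': img_scale, 'img_size': img_size}
print(img_scale.shape, img_size.shape)
```

If the images were resized before being fed to the model, img_scale would presumably need to reflect that resize factor, and both tensors would need to be on the same device as the input batch.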
Hi @rwightman ,
Thank you for your great work.
Do you have a plan for integrating this into detectron2?
My question is whether you think the performance would stay the same if it were integrated into detectron2.
Thank you so much.
Minor nit, but to make this as future-proof as possible:
Line 34 of transforms.py, in ImageToTensor:
np_img = np.rollaxis(np_img, 2) # HWC to CHW
NumPy wants people to move to moveaxis, so I updated my copy to:
np_img = np.moveaxis(np_img, 2, 0)
From the rollaxis docs: This function continues to be supported for backward compatibility, but you should prefer moveaxis. The moveaxis function was added in NumPy 1.11.
And the moveaxis docs:
https://numpy.org/doc/stable/reference/generated/numpy.moveaxis.html#numpy.moveaxis
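A quick check that the two calls give the same HWC to CHW result, using a dummy array in place of a real image (the array contents and variable names are illustrative only):

```python
import numpy as np

# Dummy 4x5 "image" with 3 channels in HWC layout.
np_img = np.arange(4 * 5 * 3).reshape(4, 5, 3)

chw_roll = np.rollaxis(np_img, 2)     # older call: roll axis 2 to position 0
chw_move = np.moveaxis(np_img, 2, 0)  # preferred call: move axis 2 to position 0

same = np.array_equal(chw_roll, chw_move)
print(chw_move.shape, same)
```

Both calls return views with the channel axis first, so the swap should not change behavior in ImageToTensor.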