There are three possible representations for a batch of bounding boxes.

Representation for batch of bounding boxes (chainercv, closed)

chainer commented on July 27, 2024

Representation for batch of bounding boxes

from chainercv.

Comments (9)

Hakuyume commented on July 27, 2024

> When number of bounding boxes per image is varying, use (R', 4) and (R',).

Let me confirm. (R', 4) is the coordinates of the bounding boxes and (R',) is the indices of the corresponding images in the input batch. Is that right?

I think this is a trade-off between simplicity of the APIs and efficiency.
If we use the list of (R', 4) format for batched images, the APIs will be simpler, because the (R', 4) and (R',) format is very different from the other two. We can convert it to the (R', 4) and (R',) format internally if needed. As you pointed out, this conversion will cause some overhead. Actually, I don't know how large the overhead will be, but I guess it will not be large, so we can choose simplicity of the APIs.
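
To make the internal conversion concrete, here is a minimal NumPy sketch of turning a list of per-image (R_i, 4) arrays into the (R', 4) and (R',) pair. The helper name `list_to_rois` is hypothetical and is not a chainercv API; the dtypes are also assumptions.

```python
import numpy as np


def list_to_rois(bboxes):
    """Flatten a list of (R_i, 4) arrays into (R', 4) rois and (R',) indices.

    Hypothetical helper for illustration only; not part of chainercv.
    Assumes the list contains at least one array.
    """
    # One copy of all box coordinates into a single (R', 4) array.
    rois = np.concatenate(bboxes, axis=0)
    # For image i, repeat the index i once per bounding box.
    batch_indices = np.concatenate(
        [np.full(len(bbox), i, dtype=np.int32)
         for i, bbox in enumerate(bboxes)])
    return rois, batch_indices


# Two images with 2 and 3 boxes respectively, so R' = 5.
bboxes = [np.zeros((2, 4), dtype=np.float32),
          np.ones((3, 4), dtype=np.float32)]
rois, batch_indices = list_to_rois(bboxes)
print(rois.shape)      # (5, 4)
print(batch_indices)   # [0 0 1 1 1]
```

The overhead in question is the copy made by `np.concatenate`, which grows with the total number of boxes.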

yuyu2172 commented on July 27, 2024

> Let me confirm. (R', 4) is the coordinates of the bounding boxes and (R',) is the indices of the corresponding images in the input batch. Is that right?

Yes. Thanks for confirming.

yuyu2172 commented on July 27, 2024

As I pointed out in the initial post, the convention needs to carry over when bbox comes with other data types.
This means that a variable-sized batch of images needs to be represented by the same convention used for bbox.
Concatenating a list of batched images may introduce non-negligible overhead.

Hakuyume commented on July 27, 2024

> Concatenating a list of batched images may introduce non-negligible overhead.

Perhaps I am misunderstanding. Does concatenating a list of batched images mean a list of (B, C, H, W) -> (N, B, C, H, W)?

yuyu2172 commented on July 27, 2024

Concatenating a list of batched images means [(R_1, C, H, W), ..., (R_B, C, H, W)] -> (R_1 + ... + R_B, C, H, W).
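
As a small illustration of the difference between the two readings, here is a NumPy sketch. The sizes (R_1 = 2, R_2 = 3, C = 3, H = W = 7) are made up for the example.

```python
import numpy as np

# Per-image stacks of regions: R_1 = 2 and R_2 = 3, with C, H, W = 3, 7, 7.
regions = [np.zeros((2, 3, 7, 7), dtype=np.float32),
           np.zeros((3, 3, 7, 7), dtype=np.float32)]

# Concatenation joins along the existing first axis, so R_i may differ.
# It allocates a new array and copies every element into it.
flat = np.concatenate(regions, axis=0)
print(flat.shape)  # (5, 3, 7, 7)

# Stacking adds a new leading axis instead, and needs equal shapes.
stacked = np.stack([regions[0], regions[0]], axis=0)
print(stacked.shape)  # (2, 2, 3, 7, 7)
```

The copy made by `np.concatenate` is where the overhead mentioned earlier would come from when the per-image arrays are large.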

yuyu2172 commented on July 27, 2024

In terms of simplicity, I am not sure that a list of bbox is better.
Since you need to convert the batch array representation back into the list representation whenever values are exposed publicly, extra conversion code becomes necessary, which makes the internal code less obvious.

For example, I tried this convention in a building block of Faster R-CNN, and it required some extra conversion code that does not look nice to me.
In the code, a function takes (B, C, H, W) and a list of bboxes as inputs and returns a list of (R, C).
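
To give a rough idea of the kind of extra conversion involved, here is a minimal NumPy sketch that splits a flat (R', C) result back into a list of (R_i, C) arrays. It is not the code from the linked file; `split_by_image` is a hypothetical name, and it assumes the rows of the flat array are ordered image by image.

```python
import numpy as np


def split_by_image(flat_out, bboxes):
    """Split a (R', C) array into a list of (R_i, C) arrays, one per image.

    Hypothetical helper for illustration; `bboxes` is the list of (R_i, 4)
    arrays that was given as input, used here only for its lengths.
    """
    sizes = [len(bbox) for bbox in bboxes]
    # np.split expects the row indices at which to cut, so the last
    # cumulative sum (which equals R') is dropped.
    split_points = np.cumsum(sizes)[:-1]
    return np.split(flat_out, split_points, axis=0)


# Two images with 2 and 3 boxes, and C = 21 values per box.
bboxes = [np.zeros((2, 4), dtype=np.float32),
          np.zeros((3, 4), dtype=np.float32)]
flat_out = np.arange(5 * 21, dtype=np.float32).reshape(5, 21)
outs = split_by_image(flat_out, bboxes)
print([out.shape for out in outs])  # [(2, 21), (3, 21)]
```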

https://github.com/yuyu2172/chainercv/blob/4e25beb116336c4c7c1462b752f38937cae1a2db/chainercv/links/model/faster_rcnn/faster_rcnn_vgg.py#L94

yuyu2172 commented on July 27, 2024

@Hakuyume

What do you think?

Hakuyume commented on July 27, 2024

> Since you need to convert the batch array representation back into the list representation whenever values are exposed publicly, extra conversion code becomes necessary, which makes the internal code less obvious.

What I was talking about is the simplicity of the APIs. I thought fewer representations are better. As you pointed out, internal conversion code makes the code a little more complicated, but I think that is acceptable.

yuyu2172 commented on July 27, 2024

Here are the rules we will be using.

bb is a (R,)
bbox is a (R, 4)
bboxes is a (B, R, 4) or a list of (R_i, 4), i = 1, ..., B.
rois is a (R', 4) that consists of bounding boxes for multiple images. Assuming that there are B images, each containing R_i bounding boxes, R' = \sum_i R_i. rois comes together with a (R',) array called batch_indices, which contains the batch index of the image to which each bounding box corresponds.
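
As a quick check of how `rois` and `batch_indices` relate to the list form of `bboxes`, here is a minimal NumPy sketch. The box values are arbitrary, and the coordinate order is not something this thread specifies.

```python
import numpy as np

# B = 2 images with R_1 = 2 and R_2 = 1 boxes, so R' = 3.
# Values are arbitrary placeholders for illustration.
rois = np.array([[0, 0, 10, 10],
                 [5, 5, 20, 20],
                 [1, 2, 30, 40]], dtype=np.float32)
batch_indices = np.array([0, 0, 1], dtype=np.int32)

# Recover the list-of-(R_i, 4) form of `bboxes` with one boolean mask per image.
n_images = 2
bboxes = [rois[batch_indices == i] for i in range(n_images)]
print([bbox.shape for bbox in bboxes])  # [(2, 4), (1, 4)]
```

The (B, R, 4) form of `bboxes` only applies when every image has the same number of boxes; the list form covers the varying case.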
