Comments (9)
When number of bounding boxes per image is varying, use (R', 4) and (R',).
Let me confirm. (R', 4) is the coordinates of the bounding boxes and (R',) is the indices of the corresponding images in the input batch. Is that right?
I think this is a trade-off between simplicity of the APIs and efficiency.
If we use a list of (R', 4) format for batched images, the APIs will be simpler, because the (R', 4) and (R',) format is very different from the other two. We can convert to the (R', 4) and (R',) format internally if needed. As you pointed out, this conversion will cause some overhead. I don't know how large the overhead will be, but I guess it will not be large, so we can choose simplicity of the APIs.
from chainercv.
Let me confirm. (R', 4) is the coordinates of bounding boxes and (R') is the indices of corresponding image in input batch. Is it right?
Yes. Thanks for confirming.
from chainercv.
As I pointed out in the initial post, the convention needs to carry over when bbox comes with other data types.
This means that a variable-sized batch of images needs to be represented by the same convention used for bbox.
Concatenating list of batched images may introduce non-negligible overheads.
from chainercv.
Concatenating list of batched images may introduce non-negligible overheads.
Perhaps I am misunderstanding. Does "Concatenating list of batched images" mean converting a list of (B, C, H, W) arrays to (N, B, C, H, W)?
from chainercv.
Concatenating list of batched images means [(R_1, C, H, W), ..., (R_B, C, H, W)] -> (R_1 + ... + R_B, C, H, W).
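To make the shapes concrete, here is a minimal NumPy sketch of that concatenation. The sizes (B = 3, per-image counts of 2, 0 and 4, and C, H, W) are made up for illustration and are not taken from chainercv.

```python
import numpy as np

# Hypothetical sizes: B = 3 images with R_i = 2, 0 and 4 regions each.
C, H, W = 8, 7, 7
region_counts = [2, 0, 4]
per_image = [np.zeros((r, C, H, W), dtype=np.float32) for r in region_counts]

# [(R_1, C, H, W), ..., (R_B, C, H, W)] -> (R_1 + ... + R_B, C, H, W)
concatenated = np.concatenate(per_image, axis=0)
print(concatenated.shape)  # (6, 8, 7, 7)
```

np.concatenate allocates a new array and copies every input into it, which is where the overhead discussed above would come from.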
from chainercv.
In terms of simplicity, I am not sure if list of bbox is better.
Since you need to convert batch array representation back into list representation whenever exposing values publicly, extra conversion code becomes necessary making internal code less obvious.
For example, I tried this convention in a building block of Faster RCNN, and it required some extra conversion code that does not look nice to me.
In the code, a function takes (B, C, H, W) and a list of bboxes as inputs and returns a list of (R, C).
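A rough sketch of what such a function might look like, to show where the conversion code ends up. This is not the actual Faster RCNN code: pool_regions is a hypothetical name, the averaging is a stand-in for the real computation, and the (y_min, x_min, y_max, x_max) integer coordinate order is an assumption.

```python
import numpy as np


def pool_regions(features, bboxes):
    """Hypothetical building block: list-of-bbox API, batch-array internals.

    features: (B, C, H, W) array.
    bboxes: list of B arrays of shape (R_i, 4), assumed to hold integer
        (y_min, x_min, y_max, x_max) coordinates of non-empty boxes.
    Returns a list of B arrays of shape (R_i, C).
    """
    # Public list representation -> internal batch representation.
    counts = [len(bbox) for bbox in bboxes]
    rois = np.concatenate(bboxes, axis=0).astype(np.int64)     # (R', 4)
    batch_indices = np.repeat(np.arange(len(bboxes)), counts)  # (R',)

    # Stand-in computation: average each region over its spatial extent.
    pooled = np.empty((len(rois), features.shape[1]), dtype=features.dtype)
    for i, (b, (y0, x0, y1, x1)) in enumerate(zip(batch_indices, rois)):
        pooled[i] = features[b, :, y0:y1, x0:x1].mean(axis=(1, 2))

    # Internal batch representation -> public list representation.
    return np.split(pooled, np.cumsum(counts)[:-1], axis=0)


features = np.arange(2 * 3 * 4 * 4, dtype=np.float32).reshape(2, 3, 4, 4)
bboxes = [np.array([[0, 0, 2, 2], [1, 1, 4, 4]]), np.array([[0, 0, 4, 4]])]
out = pool_regions(features, bboxes)
print([o.shape for o in out])  # [(2, 3), (1, 3)]
```

The first and last steps of the function body exist only to move between the two representations, which is the extra conversion code mentioned above.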
from chainercv.
What do you think?
from chainercv.
Since you need to convert batch array representation back into list representation whenever exposing values publicly, extra conversion code becomes necessary making internal code less obvious.
What I was talking about is simplicity of the APIs. I thought fewer representations are better. As you pointed out, internal conversion code makes the code a little more complicated, but I think that is acceptable.
from chainercv.
Here are the rules we will be using.
- bb is a (R,) array.
- bbox is a (R, 4) array.
- bboxes is a (B, R, 4) array or a list of (R_i, 4) arrays, i = 1, ..., B.
- rois is a (R', 4) array which consists of bounding boxes for multiple images. Assuming that there are B images each containing R_i bounding boxes, R' = \sum R_i.
- rois comes together with a (R',) array called batch_indices, which contains the batch indices of the images to which the bounding boxes correspond.
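As an illustration of how the list form of bboxes relates to rois and batch_indices, here is a small NumPy sketch. The helper names bboxes_to_rois and rois_to_bboxes are hypothetical, not chainercv functions, and the box values are made up.

```python
import numpy as np


def bboxes_to_rois(bboxes):
    """list of (R_i, 4) arrays -> (rois (R', 4), batch_indices (R',))."""
    rois = np.concatenate(bboxes, axis=0)
    batch_indices = np.concatenate(
        [np.full(len(bbox), i, dtype=np.int32) for i, bbox in enumerate(bboxes)])
    return rois, batch_indices


def rois_to_bboxes(rois, batch_indices, n_images):
    """(rois, batch_indices) -> list of (R_i, 4) arrays, one per image."""
    return [rois[batch_indices == i] for i in range(n_images)]


bboxes = [
    np.array([[0, 0, 10, 10], [5, 5, 20, 20]], dtype=np.float32),  # R_1 = 2
    np.zeros((0, 4), dtype=np.float32),                            # R_2 = 0
    np.array([[2, 3, 8, 9]], dtype=np.float32),                    # R_3 = 1
]
rois, batch_indices = bboxes_to_rois(bboxes)
print(rois.shape)     # (3, 4)
print(batch_indices)  # [0 0 2]
restored = rois_to_bboxes(rois, batch_indices, n_images=len(bboxes))
```

Note that the number of images has to be passed back in explicitly, since an image with no boxes leaves no trace in batch_indices.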
from chainercv.