Comments (11)
`label_id` is always an integer, but coordinates sometimes take float values.
I understand that this is a problem.
Currently, this is handled by storing bounding boxes as float arrays (e.g. https://github.com/pfnet/chainercv/blob/master/chainercv/datasets/pascal_voc/voc_detection_dataset.py#L100) and asking users to convert them to integer type whenever they need integer data.
Personally, I do not like adding an additional type (`bbox_dtype = [('rect', 'f8', 4), ('label', 'u4')]`), as it would be cumbersome for users to import or copy this dtype into their own projects every time. I do not think this is really Pythonic. Also, I am not sure whether this dtype is a standard convention in the community.
How about changing the tuple returned by the dataset?
Currently, it returns `img (CHW)` and `bbox (R, 5)`.
Instead, it would return `img (CHW)`, `bbox_no_label (R, 4)`, and `label (R,)`.
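For illustration, the proposed split could be sketched as below. This is a minimal sketch, not the library's implementation: the helper name `split_bbox_and_label` is hypothetical, and it assumes the label id sits in the last column of the `(R, 5)` array and that `float32` / `int32` are acceptable dtypes.

```python
import numpy as np


def split_bbox_and_label(bbox_with_label):
    """Split an (R, 5) array into (R, 4) coordinates and (R,) label ids.

    Hypothetical helper. Assumes the label id is stored in the last column.
    """
    # reshape(-1, 5) also turns an empty input into a (0, 5) array.
    bbox_with_label = np.asarray(bbox_with_label).reshape(-1, 5)
    bbox_no_label = bbox_with_label[:, :4].astype(np.float32)
    label = bbox_with_label[:, 4].astype(np.int32)
    return bbox_no_label, label


# Two boxes in the current (R, 5) layout: four coordinates plus a label id.
old_bbox = np.array([[10.5, 20.0, 110.5, 220.0, 3],
                     [0.0, 0.0, 50.0, 60.0, 7]], dtype=np.float32)
bbox_no_label, label = split_bbox_and_label(old_bbox)
```

With this layout the coordinates keep their fractional values while the labels become plain integers, so no custom dtype has to be imported by users.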
from chainercv.
Also, it is important that DetectionVisReport is easy enough for users to use.
http://chainercv.readthedocs.io/en/latest/reference/extensions.html#chainercv.extensions.DetectionVisReport
from chainercv.
I understand why you do not want to use a structured array. I agree that an additional dtype is a little cumbersome.
As you suggested, `img (CHW)`, `bbox_no_label (R, 4)`, and `label (R,)` seems a better solution.
from chainercv.
By defining `bbox` as a set of four coordinates, we can replace the name `bbox_no_label` with `bbox`.
This definition is natural because the `bbox` of Pascal VOC does not contain class information.
from chainercv.
Does `img (CHW) float32, bbox (R, 4) float32, label (R,) int32` sound good to you?
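A small check along these lines may make the proposed layout concrete. This is only a sketch under stated assumptions: `check_detection_example` is a hypothetical helper, not part of chainercv, and it simply tests shapes and dtypes against the proposal above.

```python
import numpy as np


def check_detection_example(img, bbox, label):
    """Raise ValueError if an example does not match the proposed layout.

    Hypothetical helper: img (C, H, W) float32, bbox (R, 4) float32,
    label (R,) int32.
    """
    if img.ndim != 3 or img.dtype != np.float32:
        raise ValueError('img must be a (C, H, W) float32 array')
    if bbox.ndim != 2 or bbox.shape[1] != 4 or bbox.dtype != np.float32:
        raise ValueError('bbox must be an (R, 4) float32 array')
    if label.shape != (bbox.shape[0],) or label.dtype != np.int32:
        raise ValueError('label must be an (R,) int32 array')


img = np.zeros((3, 32, 48), dtype=np.float32)
bbox = np.array([[2.5, 4.0, 20.0, 30.5]], dtype=np.float32)
label = np.array([1], dtype=np.int32)
check_detection_example(img, bbox, label)  # passes silently
```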
from chainercv.
Yes. It looks good to me.
from chainercv.
OK.
I will change the specification of the bbox-related functionality.
from chainercv.
@furushchev
What do you think about this proposal?
from chainercv.
LGTM to change to `(R, 4)` because:
- For resizing bounding boxes according to image size, `(R, 4)` would be better.
- Labels are integers, while the values of bbox may be float.
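The resizing point can be illustrated with a short sketch. It assumes a coordinate order of `(x_min, y_min, x_max, y_max)` and sizes given as `(width, height)`; both are assumptions made for this example, and `resize_bbox_sketch` is a hypothetical name rather than the library's function.

```python
import numpy as np


def resize_bbox_sketch(bbox, in_size, out_size):
    """Scale (R, 4) boxes from an image of in_size to one of out_size.

    Hypothetical helper. Assumes boxes are (x_min, y_min, x_max, y_max)
    and sizes are (width, height).
    """
    bbox = np.asarray(bbox, dtype=np.float32).reshape(-1, 4).copy()
    x_scale = out_size[0] / in_size[0]
    y_scale = out_size[1] / in_size[1]
    bbox[:, [0, 2]] *= x_scale  # x coordinates
    bbox[:, [1, 3]] *= y_scale  # y coordinates
    return bbox


bbox = np.array([[10.0, 20.0, 30.0, 40.0]], dtype=np.float32)
label = np.array([5], dtype=np.int32)
resized = resize_bbox_sketch(bbox, in_size=(100, 200), out_size=(50, 100))
```

Because the labels live in a separate array, the scaling touches every column of `bbox` uniformly and `label` is left alone.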
from chainercv.
I find the same issue in `keypoint`.
https://github.com/pfnet/chainercv/blob/master/chainercv/transforms/keypoint/resize_keypoint.py#L4
`valid` is always bool, while `x` and `y` may be float.
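By analogy with the bbox change, the keypoint case might be handled by splitting the array in the same way. The sketch below assumes a `(K, 3)` array whose columns are `x`, `y`, and the validity flag; that column order and the helper name `split_keypoint` are assumptions for illustration only.

```python
import numpy as np


def split_keypoint(keypoint):
    """Split a (K, 3) array into (K, 2) float32 points and (K,) bool flags.

    Hypothetical helper. Assumes the columns are x, y, valid.
    """
    keypoint = np.asarray(keypoint).reshape(-1, 3)
    point = keypoint[:, :2].astype(np.float32)
    valid = keypoint[:, 2].astype(bool)
    return point, valid


keypoint = np.array([[1.5, 2.0, 1.0],
                     [0.0, 0.0, 0.0]], dtype=np.float32)
point, valid = split_keypoint(keypoint)
```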
from chainercv.
Resolved #47.
from chainercv.