The beesbook binary schema has now a more serious form. It is time to review and c


Capnp Schema Review about bb_binary HOT 6 CLOSED

berleon commented on September 26, 2024

Capnp Schema Review

from bb_binary.

Comments (6)

daign commented on September 26, 2024

The ground truth data for the tracking should be stored along with the tracking results, because the two are similarly structured and have to be compared with each other. And as I understand it, this binary schema is not meant for storing the tracking results.


walachey commented on September 26, 2024

This is to exist in parallel to the(/a) database, right? Especially for the later steps of the processing.

PS: I don't really know how the camera metadata is stored at the moment, but is the left/right/top/bottom thing enough? That would mean a camera could never be at an angle (with respect to the ground).


berleon commented on September 26, 2024

Thank you for your comments. @daign: I see your point. The binary format is not capable of storing tracks.
@walachey: The later steps, i.e. the tracking, will write their results into a database. You are right about the orientation. A float would be the better option.
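
To illustrate the point about the orientation, here is a minimal sketch of how a float field could cover both the four discrete values and a camera mounted at an angle. The mapping of left/right/top/bottom to degrees and the function name `orientation_to_radians` are assumptions made for this example; the thread does not say what the four values mean geometrically.

```python
import math

# Hypothetical mapping from the four discrete values to angles in degrees.
# The real meaning of left/right/top/bottom in the schema is not stated in
# this thread, so these numbers are placeholders.
LEGACY_ORIENTATION_DEG = {"right": 0.0, "top": 90.0, "left": 180.0, "bottom": 270.0}


def orientation_to_radians(value):
    """Accept a legacy label or an angle in degrees; return radians in [0, 2*pi)."""
    if isinstance(value, str):
        degrees = LEGACY_ORIENTATION_DEG[value]
    else:
        degrees = float(value)
    # Normalise so that e.g. 450 degrees and 90 degrees are the same orientation.
    return math.radians(degrees % 360.0)
```

With a float, a camera tilted by 12.5 degrees is simply `orientation_to_radians(12.5)`, which the four-value variant cannot express.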


fraboeni commented on September 26, 2024

In a parallel version of the tracking, global IDs could be useful, as data from many frames would be loaded and processed at the same time. By using global IDs we could avoid having detections with the same ID in one dataframe.


b2m commented on September 26, 2024

@berleon thank you for your effort in deducing the lean schema. As asked I am going to comment on this:

Questions/Answers

1. When should we use global ids?

As far as I understand, we could identify each detection via a combination of frame_id and detection_tagIdx. This joint key follows the structure in which the data is stored and accessed, so imho this is enough and we do not need an additional global id for each detection.

During processing we can generate this joint key on the fly and use it as a unique/global id.
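
A minimal sketch of how such a joint key could be turned into a single global id on the fly. The function names and the 16-bit width reserved for the detection index are assumptions for this example, not something the schema guarantees.

```python
from typing import NamedTuple


class DetectionKey(NamedTuple):
    """Joint key: the frame id plus the detection index within that frame."""
    frame_id: int
    tag_idx: int


def global_id(frame_id: int, tag_idx: int, bits: int = 16) -> int:
    """Pack the joint key into one integer.

    Assumes (hypothetically) that a frame never holds 2**bits or more detections.
    """
    if frame_id < 0 or not 0 <= tag_idx < (1 << bits):
        raise ValueError("frame_id or tag_idx out of range")
    return (frame_id << bits) | tag_idx


def split_global_id(gid: int, bits: int = 16) -> DetectionKey:
    """Recover the joint key from a packed global id."""
    return DetectionKey(gid >> bits, gid & ((1 << bits) - 1))
```

Because the packed id is derived from the key alone, detections loaded from many frames into one dataframe get distinct ids without storing an extra field per detection.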

2. Store Camera Metadata?

This is most likely only interesting for the processing pipeline producing the detection data and not for the tracking part. The current tracking approaches do not consider the images and therefore are not interested in the camera metadata. But maybe Lukas has something to add to this point?

3. How to save Ground Truth?

Actually there are two approaches currently in use to save ground truth: the approach of @timlandgraf is to generate or add detections, which also identifies false negatives. The other approach, used by @daign, is to add a truthID to each true positive detection from the pipeline. Currently we are working with the second approach.

Thinking in pipelines and Big Data, I would like to have a separate data structure for saving/loading ground truth that could easily be joined with this data model. Including ground truth in the binary schema would be a waste of space, as only a small percentage of the data actually has a ground truth.

As the ground truth data is not that big, CSV and in-memory data structures should be enough and can be handled by almost every data processing tool.
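
As a rough sketch of what such a separate ground truth file and the join could look like: the CSV column names, the sample values, and the dictionary layout of a detection are made up for this example. Only the idea of a truthID per true positive detection comes from the second approach described above.

```python
import csv
import io

# Hypothetical ground truth file: one row per detection that has a truth id.
GROUND_TRUTH_CSV = """frame_id,tag_idx,truth_id
100,0,2051
100,2,977
"""


def load_ground_truth(text):
    """Read the CSV into a dict keyed by the joint key (frame_id, tag_idx)."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        (int(row["frame_id"]), int(row["tag_idx"])): int(row["truth_id"])
        for row in reader
    }


def attach_truth(detections, truth):
    """Left join: every detection is kept; truth_id is None where none exists."""
    return [
        dict(d, truth_id=truth.get((d["frame_id"], d["tag_idx"])))
        for d in detections
    ]


# Made-up detections standing in for data read from the binary format.
detections = [
    {"frame_id": 100, "tag_idx": 0, "decoded_id": 2051},
    {"frame_id": 100, "tag_idx": 1, "decoded_id": 12},
    {"frame_id": 100, "tag_idx": 2, "decoded_id": 975},
]
joined = attach_truth(detections, load_ground_truth(GROUND_TRUTH_CSV))
```

The binary data stays untouched, and only the small fraction of detections with ground truth costs any extra space.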

As for what sort of ground truth data we should generate or save (see my comment on the two approaches at the beginning of this section), in my opinion this GitHub issue is the wrong place to discuss it.

4. Better names for `DetectionCVP` and `DetectionDP`

Not better but an alternative approach would be to describe what the data looks like instead of where it comes from. So DetectionMultiples and DetectionSingle would be an idea, as the CVP produces multiple id suggestions for each detection, whereas the new pipeline only has one.
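
To make the naming idea concrete, here is a small sketch of the two shapes as plain Python classes. The field names are illustrative assumptions and do not mirror the actual Cap'n Proto schema.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class DetectionSingle:
    """Shape produced by the new pipeline: exactly one id per detection."""
    idx: int
    decoded_id: int


@dataclass
class DetectionMultiples:
    """Shape produced by the CVP: several id suggestions per detection."""
    idx: int
    candidate_ids: List[int] = field(default_factory=list)


def id_suggestions(detection: Union[DetectionSingle, DetectionMultiples]) -> List[int]:
    """Return the id suggestions of a detection regardless of its shape."""
    if isinstance(detection, DetectionSingle):
        return [detection.decoded_id]
    return list(detection.candidate_ids)
```

Code that consumes detections then only needs to know how many suggestions to expect, not which pipeline produced them.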

Further comments on the schema

`tagIdx`

Maybe I got it wrong, but I am confused about the name tagIdx. We are using the term tag for the id tags we glued to the backs of the bees. In this case tagIdx does not refer to one tag with its distinct id, but to a unique detection that may or may not be a tag (false positive). To put it differently: we have one tag on multiple frames, and possibly in multiple detections (doubles or triples) in one frame. But we have unique detections (in one frame) that we are trying to identify via their tagIdx. This is confusing, and maybe the naming of this id should be adjusted accordingly (maybe only id?).

`candidateIdx`

I guess the candidateIdx is referring to the multiple suggestions per detection of the CVP? Again this is kind of confusing as we are using the term candidate in other contexts.

`gridIdx`

I guess the gridIdx is referring to the three grids the decoder is using to identify the id? So each detection has multiple candidates and each candidate has multiple grids that are used to identify them? At least this is what I get from the schema... but I guess we actually have multiple grids per detection and multiple candidates per grid.

`DetectionDP`

When trying to match detections and combine them into tracks, it would be helpful to know how certain the decoder is/was about the decodedId. So a score on each detection would be much appreciated.
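
A sketch of how such a score could be used when extending a track. The `score` field, its range of 0 to 1, and the threshold are assumptions for this example; the schema under review does not define them.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ScoredDetection:
    idx: int
    decoded_id: int
    score: float  # hypothetical decoder confidence in [0, 1]


def best_match(
    track_id: int,
    candidates: Iterable[ScoredDetection],
    min_score: float = 0.5,
) -> Optional[ScoredDetection]:
    """Pick the most confident detection in a frame whose id equals the track's id."""
    matching = [
        d for d in candidates
        if d.decoded_id == track_id and d.score >= min_score
    ]
    # max() with default=None returns None when nothing qualifies.
    return max(matching, key=lambda d: d.score, default=None)
```

With doubles or triples of the same id in one frame, the score gives the tracker a simple way to prefer one detection over the others.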

Further Processing

In further processing steps we would need more metadata fields like

- tracking candidate id (meaning the updated id determined by the tracking algorithm)
- some dancing information (who is dancing, who is following whom...)

But as already mentioned, this should be a problem of the next processing step where we might not have the constraints we have when using the cray pipeline (e.g. much less data...).


b2m commented on September 26, 2024

As the schema has been used in production for some time now, I am going to close this issue. Further changes should be discussed in a separate issue.

