
slytest03's Introduction

Overview

The repo contains a script that extracts point clouds from scenes in the NuScenes dataset.
Disclaimer: this script doesn't work correctly, because it doesn't read the orientation angles of objects. It's only an example of how the problem can be approached; the solution is unfinished and will generate incorrect data.
Note: the script requires the NuScenes mini dataset to be present in the directory from which the script is launched.
The script uses nuscenes and numpy to work with the dataset and open3d to write data to ply files. In addition, the repo contains a simple logger, which uses Python's built-in logging module and writes logs both to a file and to stdout. The log_handler module also contains a decorator for measuring the runtime of functions.
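
A logger and timing decorator of the kind described above could look roughly like the sketch below. This is not the repo's log_handler module: the names get_logger and timed, the default log file name, and the log format are all assumptions made for illustration.

```python
import functools
import logging
import sys
import time


def get_logger(name: str, log_file: str = "extract_boxes.log") -> logging.Logger:
    """Return a logger that writes to both a file and stdout (hypothetical helper)."""
    logger = logging.getLogger(name)
    if logger.handlers:  # already configured, so avoid adding duplicate handlers
        return logger
    logger.setLevel(logging.INFO)
    formatter = logging.Formatter("%(asctime)s | %(levelname)s | %(message)s")
    for handler in (logging.FileHandler(log_file), logging.StreamHandler(sys.stdout)):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


def timed(logger: logging.Logger):
    """Decorator factory: log how long the wrapped function took (hypothetical helper)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs even if func raises, so failed calls are timed too.
                elapsed = time.perf_counter() - start
                logger.info("%s finished in %.4f s", func.__name__, elapsed)
        return wrapper
    return decorator
```

Decorating a function with @timed(get_logger("extract")) would then log its runtime on every call while returning its result unchanged.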

Get started

The script is designed to be launched from the command line with arguments.

Example of extracting point clouds: python3 extract_boxes.py --scene_number <>
where scene_number is required and must be in the range from 0 to 9 (the mini dataset contains only 10 scenes).
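
The argument handling described above can be sketched with argparse as follows. Only the --scene_number flag and its 0 to 9 range come from this README; the function name parse_args and the help text are assumptions, and the actual script may parse its arguments differently.

```python
import argparse


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the command line; argv=None means 'use sys.argv' (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Extract point clouds from one scene of the NuScenes mini dataset."
    )
    parser.add_argument(
        "--scene_number",
        type=int,
        required=True,
        choices=range(10),  # the mini dataset has 10 scenes, indexed 0..9
        metavar="N",
        help="index of the scene to process, from 0 to 9",
    )
    return parser.parse_args(argv)
```

With this setup, a missing or out-of-range value makes argparse print a usage message and exit instead of starting the extraction.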

Asciinema examples

Extracting clouds of points:
asciicast

Report

I started solving this problem by getting familiar with each dataset provided in the test task (waymo, nuscenes, kitti3d). I checked how to download the datasets, how large they are, and which tools their developers provide. I decided that NuScenes offered convenient tools along with an easy data download, so I downloaded the mini version of the dataset.
However, I ran into problems installing nuscenes for Python, because errors occurred while installing the dependencies. I spent some time trying to solve the problem, but it ended with a simple answer from Google: "it just won't work on Windows". So I moved my development to Linux and successfully installed all the necessary libraries. After that, I started reading the NuScenes tutorial and found that the developers of NuScenes did a great job: it was clear how to start working and, most importantly, how data is stored in the dataset. I believe this part of the task took the most time, since I was new to this kind of data and researched it step by step. Before this task, I didn't quite understand how lidars work in general, how they store data, or how they calculate coordinates, so it was interesting to learn these details.
After I finally understood the structure of the data and how to get the parts I needed, I started with the simplest approach: we have a box with its sizes and center coordinates, and we have points, which have coordinates too. Getting all the points inside the box didn't sound too complex. But there was one thing I still didn't know about lidars and how they store object coordinates.
So, at first, it looked really easy: get all boxes from the dataset and check whether the name of the box is the one we need. Then get the coordinates of the center and the dimensions, and calculate the area of interest. But it just didn't work. Since I was new to this, I assumed that something was wrong with the mask I wrote, with how I unpacked the sizes, or with how I calculated the area. I didn't check the range of values in the pc array, where the coordinates were stored. So I spent some time repeatedly changing the mask, the way I used the coordinates, etc. After I ran out of ideas about what could be wrong with the algorithm, I finally decided to check the values in the points array, and found out that all this time I had been trying to catch points stored in global coordinates, while the boxes had relative coordinates.
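
The naive mask from this step, together with the range check that eventually exposed the problem, can be sketched with numpy as below. This is an illustration, not the repo's code: the function names and the (N, 3) point layout are assumptions, and the box size is assumed to be given as extents along x, y, z.

```python
import numpy as np


def axis_aligned_mask(points, center, size):
    """Boolean mask of points inside an axis-aligned box.

    points: (N, 3) array; center: (3,); size: (3,) extents along x, y, z.
    The box orientation is ignored, which is exactly the flaw described above.
    """
    points = np.asarray(points, dtype=float)
    half = np.asarray(size, dtype=float) / 2.0
    lower = np.asarray(center, dtype=float) - half
    upper = np.asarray(center, dtype=float) + half
    return np.all((points >= lower) & (points <= upper), axis=1)


def value_ranges(points):
    """Per-axis (min, max) of the points, a quick sanity check on the frame."""
    points = np.asarray(points, dtype=float)
    return points.min(axis=0), points.max(axis=0)
```

If the per-axis ranges of the points and the box centers differ by orders of magnitude, the two are most likely expressed in different coordinate frames, and the mask will select nothing.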
Understanding this helped a lot. I started to look for useful data in the dataset, because I remembered reading about global and relative coordinates somewhere while googling and reading tutorials, and I found the file containing translation data for each sample. I used this data to convert the pc array with points from global to relative coordinates (although I'm still not sure that I did it right) and finally got some results. The solution presented in this repo wasn't supposed to be final, since I used a mask just to understand what I was doing, what the coordinates look like, etc. I noticed that box objects also have data about angles, so it was obvious that my algorithm wouldn't work, because it calculates rectangles without any data about their orientation in space. I was planning to use some functions from nuscenes-devkit for the final solution.
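
For reference, a change of frame between global and relative coordinates generally needs a rotation as well as a translation. The sketch below shows the general idea with numpy only. It is not the conversion used in this repo and does not call nuscenes-devkit; the function names are hypothetical, and restricting the rotation to a yaw angle about the z axis is a simplifying assumption.

```python
import numpy as np


def yaw_rotation(yaw: float) -> np.ndarray:
    """3x3 rotation about the z axis by `yaw` radians (local axes -> global axes)."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])


def global_to_local(points, translation, rotation):
    """Express global points in a local frame.

    points: (N, 3) global coordinates.
    translation: (3,) origin of the local frame, in global coordinates.
    rotation: (3, 3) matrix mapping local axes to global axes.
    Computes R^T (p - t) for every point; with row vectors that is (p - t) @ R.
    """
    points = np.asarray(points, dtype=float)
    shifted = points - np.asarray(translation, dtype=float)
    return shifted @ np.asarray(rotation, dtype=float)
```

Subtracting the translation alone only moves the origin; if the local frame is also rotated relative to the global one, the axes still point the wrong way until the rotation is undone.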
I found that nuscenes-devkit contains the points_in_box() function, which takes a points array and a Box object and returns exactly what I needed to solve the problem. The Box object was also easy to create from the data I had: box = Box(box.center, box.wlh, box.orientation, name=box.name). I created an instance of this class and passed it into the function, but it turned out that the function expects the Box object to contain corner data. For some reason, I didn't manage to create a correct Box object with corner data, even though the box.corners() function was returning correct corner data. I tried to add corner data to the object manually, but it didn't help either. I checked the source code of the function in the NuScenes GitHub repo, but it didn't help me understand what was wrong. Since I had no other options and no understanding of what was going wrong or why, I decided to stop solving the problem and provide the current solution, which doesn't return correct data, but at least returns something.
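
To illustrate what an orientation-aware test involves, here is a minimal numpy-only sketch. It is not the devkit's points_in_box() and not part of this repo: the function name is hypothetical, the box is assumed to be rotated only by a yaw angle about the z axis, and size is assumed to hold the extents along the box's own x, y, z axes.

```python
import numpy as np


def points_in_oriented_box(points, center, size, yaw):
    """Boolean mask of points inside a box rotated by `yaw` radians about z.

    points: (N, 3) array in the same frame as `center`.
    size: (3,) extents along the box's own x, y, z axes (an assumption here).
    """
    points = np.asarray(points, dtype=float)
    c, s = np.cos(yaw), np.sin(yaw)
    rotation = np.array([[c, -s, 0.0],
                         [s, c, 0.0],
                         [0.0, 0.0, 1.0]])
    # Move points into the box frame: R^T (p - center), written for row vectors.
    local = (points - np.asarray(center, dtype=float)) @ rotation
    half = np.asarray(size, dtype=float) / 2.0
    # In the box frame the test reduces to a simple per-axis bound check.
    return np.all(np.abs(local) <= half, axis=1)
```

For a 4 x 1 x 1 box centered at the origin and rotated by 45 degrees, the point (1, 1, 0) lies along the long axis and is inside, while the same box with zero yaw would exclude it. This is the kind of difference the axis-aligned mask cannot capture.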
