
orienternet's Introduction

OrienterNet
Visual Localization in 2D Public Maps
with Neural Matching

Paul-Edouard Sarlin · Daniel DeTone · Tsun-Yi Yang · Armen Avetisyan · Julian Straub
Tomasz Malisiewicz · Samuel Rota Bulo · Richard Newcombe · Peter Kontschieder · Vasileios Balntas

CVPR 2023

OrienterNet is a deep neural network that can accurately localize an image
using the same 2D semantic maps that humans use to orient themselves.

This repository hosts the source code for OrienterNet, a research project by Meta Reality Labs. OrienterNet leverages the power of deep learning to provide accurate positioning of images using free and globally-available maps from OpenStreetMap. As opposed to complex existing algorithms that rely on 3D point clouds, OrienterNet estimates a position and orientation by matching a neural Bird's-Eye-View with 2D maps.

Installation

OrienterNet requires Python >= 3.8 and PyTorch. To run the demo, clone this repo and install the minimal requirements:

git clone https://github.com/facebookresearch/OrienterNet
python -m pip install -r requirements/demo.txt

To run the evaluation and training, install the full requirements:

python -m pip install -r requirements/full.txt

Demo (try it on Hugging Face or in Google Colab)

Try our minimal demo - take a picture with your phone in any city and find its exact location in a few seconds!

OrienterNet positions any image within a large area - try it with your own images!
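
If you prefer to drive the demo from Python rather than from the notebook, the sketch below assembles the pieces that appear in demo.ipynb and maploc/demo.py (Demo, read_input_image, TileManager). It is a minimal sketch, not a reference: the import locations, the image path, the address prior, and the exact argument list of demo.localize are assumptions, so check maploc/demo.py before relying on it.

# Minimal sketch of the demo flow; interfaces assumed from demo.ipynb and maploc/demo.py.
from maploc.demo import Demo, read_input_image   # import location assumed
from maploc.osm.tiling import TileManager

demo = Demo(num_rotations=256, device="cpu")     # switch to "cuda" if a GPU is available

# Read the query image and its location prior (EXIF GPS tag or a textual address).
image, camera, gravity, proj, bbox, prior_latlon = read_input_image(
    "query.jpg",                           # hypothetical image path
    prior_address="Zurich, Switzerland",   # hypothetical prior address
)

# Query OpenStreetMap around the prior and rasterize the tiles.
tiler = TileManager.from_bbox(proj, bbox + 10, demo.config.data.pixel_per_meter)
canvas = tiler.query(bbox)

# Run the localization; the argument order below is an assumption.
uv, yaw, prob, neural_map, image_rectified = demo.localize(image, camera, canvas)
print(uv, yaw)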

Evaluation

Mapillary Geo-Localization dataset


To obtain the dataset:

  1. Create a developer account at mapillary.com and obtain a free access token.
  2. Run the following script to download the data from Mapillary and prepare it:
python -m maploc.data.mapillary.prepare --token $YOUR_ACCESS_TOKEN

By default the data is written to the directory ./datasets/MGL/. Then run the evaluation with the pre-trained model:

python -m maploc.evaluation.mapillary --experiment OrienterNet_MGL model.num_rotations=256

This downloads the pre-trained models if necessary. The results should be close to the following:

Recall xy_max_error: [14.37, 48.69, 61.7] at (1, 3, 5) m/°
Recall yaw_max_error: [20.95, 54.96, 70.17] at (1, 3, 5) m/°

This requires a GPU with 11GB of memory. If you run into OOM issues, consider reducing the number of rotations (the default is 256):

python -m maploc.evaluation.mapillary [...] model.num_rotations=128

To export visualizations for the first 100 examples:

python -m maploc.evaluation.mapillary [...] --output_dir ./viz_MGL/ --num 100 

To run the evaluation in sequential mode:

python -m maploc.evaluation.mapillary --experiment OrienterNet_MGL --sequential model.num_rotations=256

The results should be close to the following:

Recall xy_seq_error: [29.73, 73.25, 91.17] at (1, 3, 5) m/°
Recall yaw_seq_error: [46.55, 88.3, 96.45] at (1, 3, 5) m/°

The sequential evaluation uses 10 frames by default. To increase this number, add:

python -m maploc.evaluation.mapillary [...] chunking.max_length=20

KITTI dataset

  1. Download and prepare the dataset to ./datasets/kitti/:
python -m maploc.data.kitti.prepare
  2. Run the evaluation with the model trained on MGL:
python -m maploc.evaluation.kitti --experiment OrienterNet_MGL model.num_rotations=256

You should expect the following results:

Recall directional_error: [[50.33, 85.18, 92.73], [24.38, 56.13, 67.98]] at (1, 3, 5) m/°
Recall yaw_max_error: [29.22, 68.2, 84.49] at (1, 3, 5) m/°

You can similarly export some visual examples:

python -m maploc.evaluation.kitti [...] --output_dir ./viz_KITTI/ --num 100

To run in sequential mode:

python -m maploc.evaluation.kitti --experiment OrienterNet_MGL --sequential model.num_rotations=256

with results:

Recall directional_seq_error: [[81.94, 97.35, 98.67], [52.57, 95.6, 97.35]] at (1, 3, 5) m/°
Recall yaw_seq_error: [82.7, 98.63, 99.06] at (1, 3, 5) m/°

Aria Detroit & Seattle

We are currently unable to release the dataset used to evaluate OrienterNet in the CVPR 2023 paper.

Training

MGL dataset

We trained the model on the MGL dataset using 3x 3090 GPUs (24GB VRAM each) and a total batch size of 12 for 340k iterations (about 3-4 days) with the following command:

python -m maploc.train experiment.name=OrienterNet_MGL_reproduce

Feel free to use any other experiment name. Configurations are managed by Hydra and OmegaConf, so any entry can be overridden from the command line. You may thus reduce the number of GPUs and the batch size via the command below (a toy illustration of how such overrides are merged follows it):

python -m maploc.train experiment.name=OrienterNet_MGL_reproduce \
  experiment.gpus=1 data.loading.train.batch_size=4
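
As a toy illustration of how such dot-list overrides are merged by OmegaConf (the base values below are invented for the example and are not the project's actual defaults):

from omegaconf import OmegaConf

# Invented base config; only the keys mirror the command above.
base = OmegaConf.create(
    {"experiment": {"gpus": 3}, "data": {"loading": {"train": {"batch_size": 12}}}}
)
# Same dot-list syntax as on the command line.
overrides = OmegaConf.from_dotlist(
    ["experiment.gpus=1", "data.loading.train.batch_size=4"]
)
cfg = OmegaConf.merge(base, overrides)
print(OmegaConf.to_yaml(cfg))
# experiment:
#   gpus: 1
# data:
#   loading:
#     train:
#       batch_size: 4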

Be aware that reducing these values can lower the overall performance. The checkpoints are written to ./experiments/experiment_name/. Then run the evaluation:

# the best checkpoint:
python -m maploc.evaluation.mapillary --experiment OrienterNet_MGL_reproduce
# a specific checkpoint:
python -m maploc.evaluation.mapillary \
    --experiment OrienterNet_MGL_reproduce/checkpoint-step=340000.ckpt

KITTI

To fine-tune a trained model on the KITTI dataset:

python -m maploc.train experiment.name=OrienterNet_MGL_kitti data=kitti \
    training.finetune_from_checkpoint='"experiments/OrienterNet_MGL_reproduce/checkpoint-step=340000.ckpt"'

Interactive development

We provide several visualization notebooks in the repository (e.g., demo.ipynb and visualize_predictions_sequences.ipynb).

OpenStreetMap data


To make sure that the results are consistent over time, we used OSM data downloaded from Geofabrik in November 2021. By default, the dataset scripts maploc.data.[mapillary,kitti].prepare download pre-generated raster tiles. If you wish to use different OSM classes, you can pass --generate_tiles, which will download and use our prepared raw .osm XML files.

You may alternatively download more recent files from Geofabrik. Download either compressed XML files as .osm.bz2 or binary files .osm.pbf, which need to be converted to XML files .osm, for example using Osmium: osmium cat xx.osm.pbf -o xx.osm.

License

The MGL dataset is made available under the CC-BY-SA license following the data available on the Mapillary platform. The model implementation and the pre-trained weights follow a CC-BY-NC license. OpenStreetMap data is licensed under the Open Data Commons Open Database License.

BibTex citation

Please consider citing our work if you use any code from this repo or ideas presented in the paper:

@inproceedings{sarlin2023orienternet,
  author    = {Paul-Edouard Sarlin and
               Daniel DeTone and
               Tsun-Yi Yang and
               Armen Avetisyan and
               Julian Straub and
               Tomasz Malisiewicz and
               Samuel Rota Bulo and
               Richard Newcombe and
               Peter Kontschieder and
               Vasileios Balntas},
  title     = {{OrienterNet: Visual Localization in 2D Public Maps with Neural Matching}},
  booktitle = {CVPR},
  year      = {2023},
}

orienternet's People

Contributors

sarlinpe, skydes


orienternet's Issues

Downloading of KITTI dataset for evaluation

Hi,

Thank you so much for sharing your amazing work. I've got a really simple problem that I can't seem to solve, hence submitting an issue.
I was able to replicate the evaluation for the MGL dataset, however when downloading and preparing the dataset for kitti, I would always encounter the following error:

(OrienterNetTest) tester@workstation3:~/ChingHui/OrienterNet$ python -m maploc.data.kitti.prepare
Traceback (most recent call last):
File "/home/tester/miniconda3/envs/OrienterNetTest/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/tester/miniconda3/envs/OrienterNetTest/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/tester/ChingHui/OrienterNet/maploc/data/kitti/prepare.py", line 100, in
"--data_dir", type=Path, default=Path(KittiDataModule.default_cfg["local_dir"])
KeyError: 'local_dir'

How can I resolve this issue? Thank you so much for your time!

Error in demo.localize output

Hi. Thanks for the great work!!!
I was trying to run orienternet/maploc/demo.py and there seemed to be something wrong about the returned yaw.

Looking at the output of the "localize" function, at line 211, it seems there is a small bug: it is returning the scaled longitude instead of the yaw:
return xyr[:2], xyr[1], prob, neural_map, data["image"]
while it should probably be:
return xyr[:2], xyr[2], prob, neural_map, data["image"].

Other than this, this project looks great =)

Datasets

Hi, sir, thanks for your excellent job. When I download the MGL and the kitti dataset, the process bar keeps stable as follows. Could you please offer a google link or other simple way to download the datasets?
[screenshots attached]

Running visualize_predictions_sequences.ipynb

Hello, I've got a problem while running sequence-mode predictions.

Traceback (most recent call last):
  in <module>:21
    18   del pred
    19
    20 uvt_p = torch.stack([p["uvr_max"] for p in preds])
  ❱ 21 uvt_seq = torch.stack([p["uvr_max_seq"] for p in preds])
    22 xy_p = torch.stack([c.to_xy(uv) for c, uv in zip(canvas, uvt_p[:,:2])])
    23 xy_seq = torch.stack([c.to_xy(uv) for c, uv in zip(canvas, uvt_seq[:,:2])])
    24 logprobs = torch.stack([p["log_probs"] for p in preds])
KeyError: 'uvr_max_seq'

It seems like the model is not providing the key "uvr_max_seq" in the predictions.
If I run preds[0].keys() I get dict_keys(['map', 'pixel_scales', 'bev', 'scores', 'log_probs', 'uvr_max', 'uv_max', 'yaw_max', 'uvr_expectation', 'uv_expectation', 'yaw_expectation', 'features_image', 'features_bev', 'valid_bev', 'log_probs_seq', 'xyr_max_seq'])

Where to find reconstruction statistics?

Thank you for the great work.

In the dataset preprocessing section, it says "We discard sequences with poor reconstruction statistics". Would there be more details on how this is done?

I was not able to find similar fields in the Mapillary API.

Thanks!

UnboundLocalError while preparing Mapillary dataset

When running python3 -m maploc.data.mapillary.prepare --token $MAPILLARY_TOKEN I get the following error:
Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/mohammed/Desktop/Maynooth/Dev/OrienterNet-main/maploc/data/mapillary/prepare.py", line 399, in
process_location(
File "/home/mohammed/Desktop/Maynooth/Dev/OrienterNet-main/maploc/data/mapillary/prepare.py", line 372, in process_location
plotter.bbox(projection.unproject(bbox_tiling), "black", "tiling bounding box")
UnboundLocalError: local variable 'bbox_tiling' referenced before assignment

I'm using Python 3.10.12 on Ubuntu 22.04 LTS

Error regarding ReadTimeout occured while preparing MGL dataset

Hi,

Thank you for the great work.
I just had a small question about downloading the MGL dataset. I keep getting the following error while running the code to download the Mapillary Geo-Localization dataset.
Errors such as asyncio.exceptions.CancelledError, TimeoutError, and httpcore.ReadTimeout keep occurring.

This is the command I executed: python -m maploc.data.mapillary.prepare --token "my_token"

In my case, I am using a clone of the most recent version of OrienterNet; specifically, the "Bugfixes when preparing the Mapillary dataset" (#29) commit.

Could you please let me know if there is anything I missed or need to fix while downloading the dataset?

[error screenshots attached]

size mismatch when fine-tuning model on KITTI dataset

Hi, thanks for sharing. When I tried to fine-tune the model on the KITTI dataset using the command:
python -m maploc.train experiment.name=OrienterNet_MGL_kitti data=kitti experiment.gpus=1 data.loading.train.batch_size=2 training.finetune_from_checkpoint='"experiments/OrienterNet_MGL_reproduce/orienternet_mgl.ckpt"'

It generates the following error:
size mismatch for model.map_encoder.encoder.adaptation.0.0.weight: copying a param with shape torch.Size([9, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([8, 64, 1, 1]).
size mismatch for model.map_encoder.encoder.adaptation.0.0.bias: copying a param with shape torch.Size([9]) from checkpoint, the shape in current model is torch.Size([8]).

I want to ask what the correct size should be. I tried changing the configuration in orienternet.yaml but it doesn't work.

Question about MGL Dataset

There is no issue when using prepare.py with the KITTI dataset, but the connection keeps getting interrupted when using prepare.py with the MGL dataset. How can I fix it?

/home/dyh/anaconda3/envs/dataset/bin/python /home/dyh/Project/OrienterNet/maploc/data/mapillary/prepare.py
[2024-01-22 21:14:55 maploc INFO] Starting processing for location sanfrancisco_soma.
[2024-01-22 21:14:55 maploc INFO] Fetching metadata for all images.
5%|▍ | 1884/41861 [00:24<08:32, 78.02it/s]
Traceback (most recent call last):
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/site-packages/httpcore/_exceptions.py", line 10, in map_exceptions
yield
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/site-packages/httpcore/_backends/anyio.py", line 78, in start_tls
raise exc
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/site-packages/httpcore/_backends/anyio.py", line 69, in start_tls
ssl_stream = await anyio.streams.tls.TLSStream.wrap(
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/site-packages/anyio/streams/tls.py", line 132, in wrap
await wrapper._call_sslobject_method(ssl_object.do_handshake)
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/site-packages/anyio/streams/tls.py", line 147, in _call_sslobject_method
data = await self.transport_stream.receive()
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 1132, in receive
raise self._protocol.exception from None
anyio.BrokenResourceError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/site-packages/httpx/_transports/default.py", line 67, in map_httpcore_exceptions
yield
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/site-packages/httpx/_transports/default.py", line 371, in handle_async_request
resp = await self._pool.handle_async_request(req)
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/site-packages/httpcore/_async/connection_pool.py", line 268, in handle_async_request
raise exc
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/site-packages/httpcore/_async/connection_pool.py", line 251, in handle_async_request
response = await connection.handle_async_request(request)
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/site-packages/httpcore/_async/http_proxy.py", line 317, in handle_async_request
stream = await stream.start_tls(**kwargs)
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/site-packages/httpcore/_backends/anyio.py", line 78, in start_tls
raise exc
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/contextlib.py", line 131, in exit
self.gen.throw(type, value, traceback)
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
raise to_exc(exc) from exc
httpcore.ConnectError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/home/dyh/Project/OrienterNet/maploc/data/mapillary/prepare.py", line 410, in
process_location(
File "/home/dyh/Project/OrienterNet/maploc/data/mapillary/prepare.py", line 305, in process_location
image_infos, num_fail = loop.run_until_complete(
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
return future.result()
File "/home/dyh/Project/OrienterNet/maploc/data/mapillary/download.py", line 139, in fetch_image_infos
i, info = await task
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/asyncio/tasks.py", line 619, in _wait_for_one
return f.result() # May raise f.exception().
File "/home/dyh/Project/OrienterNet/maploc/data/mapillary/download.py", line 130, in fetch_image_info
info = await downloader.get_image_info_cached(i, path)
File "/home/dyh/Project/OrienterNet/maploc/data/mapillary/download.py", line 95, in get_image_info_cached
info = await self.get_image_info(image_id)
File "/home/dyh/Project/OrienterNet/maploc/data/mapillary/download.py", line 74, in get_image_info
r = await self.call_api(url)
File "/home/dyh/Project/OrienterNet/maploc/data/mapillary/download.py", line 63, in call_api
r = await self.client.get(url)
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/site-packages/httpx/_client.py", line 1786, in get
return await self.request(
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/site-packages/httpx/_client.py", line 1559, in request
return await self.send(request, auth=auth, follow_redirects=follow_redirects)
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/site-packages/httpx/_client.py", line 1646, in send
response = await self._send_handling_auth(
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/site-packages/httpx/_client.py", line 1674, in _send_handling_auth
response = await self._send_handling_redirects(
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/site-packages/httpx/_client.py", line 1711, in _send_handling_redirects
response = await self._send_single_request(request)
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/site-packages/httpx/_client.py", line 1748, in _send_single_request
response = await transport.handle_async_request(request)
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/site-packages/httpx/_transports/default.py", line 371, in handle_async_request
resp = await self._pool.handle_async_request(req)
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/contextlib.py", line 131, in exit
self.gen.throw(type, value, traceback)
File "/home/dyh/anaconda3/envs/dataset/lib/python3.8/site-packages/httpx/_transports/default.py", line 84, in map_httpcore_exceptions
raise mapped_exc(message) from exc
httpx.ConnectError

fail to run the colab file and demo

  1. Why did I fail to run the colab file you provided directly?

UnpicklingError Traceback (most recent call last)
in <cell line: 22>()
20 # but num_rotations=64~128 is often sufficient.
21 # To reduce the memory usage, we can reduce the tile size in the next cell.
---> 22 demo = Demo(num_rotations=256, device='cpu') # change to "cuda" if you have a GPU.

2 frames
/usr/local/lib/python3.10/dist-packages/torch/serialization.py in _legacy_load(f, map_location, pickle_module, **pickle_load_args)
1031 "functionality.")
1032
-> 1033 magic_number = pickle_module.load(f, **pickle_load_args)
1034 if magic_number != MAGIC_NUMBER:
1035 raise RuntimeError("Invalid magic number; corrupt file?")

UnpicklingError: invalid load key, '<'.

  2. Unable to download the OSM data smoothly:

MaxRetryError: HTTPSConnectionPool(host='api.openstreetmap.org', port=443): Max retries exceeded with url: /api/0.6/map.json?bbox=8.548173412781503%2C47.37771879777533%2C8.55013322635351%2C47.37904999377059 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x00000293A557CEB0>, 'Connection to api.openstreetmap.org timed out. (connect timeout=10)'))

Potential label leakage issue due to tile stitching in SD map

Hello! I'm truly thankful for the insights presented in your paper.

While studying this outstanding work, I noticed that you implemented a tiling process in lines 108 to 125. However, when reassembling the tiled rasters back into a single image, there may be discrepancies at the seams compared to the original image. This could be due to the fact that, when a straight line is divided into segments, the end of the line might be prematurely rounded to the next pixel, resulting in a 1-pixel difference in the reassembled image.

The example image below illustrates the difference between the original 256x256 SD map and the reassembled image from four 128x128 sub-images that were initially split and then stitched back together.

[example images attached]

Of course, such discrepancies are usually negligible; however, there is an exception in the following scenario:
When I obtain the WGS84 ground truth for a 2D query image, I use this ground truth as the center to extract our SD map, setting the dimensions to 256x256, while keeping the tile_size at the default value of 128.

So the tile_manager splits the tile into four parts right along the coordinates of the ground truth. Later, when we randomly select a 128x128 bounding box on this 256x256 SD map and call this function to obtain the canvas.raster for training, the model, interestingly, accurately recognizes that the seams on these maps may reveal the true position of the GT.
Consequently, our model experiences significant label leakage🤣!

Below is the visualization. Observe the cross lines at the GT location on the neural map.

[visualization attached]

Therefore, my conclusion is:
The process of segmenting and then reassembling the SD map leaves scars on the map that are difficult to heal, and although they are minor, they still exhibit certain features that can be learned.
If these scars happen to coincide with the ground truth or original GPS coordinates when creating the dataset, it might enable the model to directly identify the leaked labels on the raster or interfere with the sensitivity to the GPS priors.
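
For context, the stand-alone snippet below illustrates the kind of rounding seam described in this issue; it is a generic illustration, not the repository's tiling code, and the coordinates and sizes are invented.

import numpy as np
import cv2

def to_px(p):
    # round a float (x, y) point to integer pixel coordinates
    return tuple(int(v) for v in np.round(p))

# A segment rasterized once on a full canvas...
p0, p1 = np.array([10.2, 40.7]), np.array([250.4, 200.3])
full = np.zeros((256, 256), np.uint8)
cv2.line(full, to_px(p0), to_px(p1), 255)

# ...and once split at an x = 128 tile border, with the split point rounded
# independently, as happens when tiles are rasterized separately and stitched.
t = (128 - p0[0]) / (p1[0] - p0[0])
mid = p0 + t * (p1 - p0)
tiled = np.zeros((256, 256), np.uint8)
cv2.line(tiled, to_px(p0), to_px(mid), 255)
cv2.line(tiled, to_px(mid), to_px(p1), 255)

print("differing pixels:", int((full != tiled).sum()))  # typically non-zero near the seam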

About roll angle rectify after image flipping

Hello,

I noticed that in line 159, you employed random flipping to augment the query and map images, and later, in line 164, the query image is rectified by setting the roll and pitch angles to zero. However, I'm wondering if a negative sign should be applied to the roll angle during the rectification process after the images have been flipped.

I'm quite new to this area, thank you for your attention to this detail!

How to calculate depth map alpha for pv image?

Hello, great job on OrienterNet, it is very impressive!

I have a question about how to calculate the depth map for a PV image.
Fig. 7 in your paper shows the depth planes alpha, and it looks like the depth is also learned well through the pose supervision. I would like to re-visualize the depth maps for PV images; however, I cannot find the related code in your repo.

So I guess the sample_depth_scores function outputs the related information, right?
[code screenshot attached]

Can you tell me the detailed depth calculation method? I would appreciate it!

Question about sequential evaluation

In maploc/evaluation/mapillary.py or maploc/evaluation/kitti.py, I notice that "max_init_error" in "default_cfg_sequential" is set to None or zero, which means that during sequential evaluation the map data is queried as a square area centered on the real GPS position instead of a noisy GPS position. This could be unfair when comparing the single-frame and sequential results. Have I understood the code correctly, or did I miss something?

actual scale for the corresponding BEV

I have a question: what is the actual scale of the corresponding BEV? For example, if I use the demo, the generated BEV is 64×129; what is the actual scale of each grid cell, in meters?

Questions about the "Good Semantics to localize" experiment (Fig. 10)

Hello, Sarlin. Your research is very interesting! Thank you for your good research.

However, I have a question about the experimental design of "Good Semantics to localize".

As described in the "Which map elements are most important?" section of the Appendix, how did you carry out the experiment of removing objects?

Specifically, I want to know how you removed each element from the map, and I also want to ask whether this is included in the code.

Thank you.

a question about the demo

Hi Sarlin:
I'm trying to use the demo on the Google colab.
When I try to run the "Localize!" part, an error occurs:
[screenshot of the error attached]

could you please help me with that?
Thanks.

Error when downloading OSM map tile info

Hi! Thanks a lot for your excellent work. I am trying to run demo.ipynb in the repository, but the following error occurred.

[2023-06-26 09:56:04 maploc INFO] Getting https://api.openstreetmap.org/api/0.6/map.json...

JSONDecodeError Traceback (most recent call last)
Cell In[2], line 28
26 # Query OpenStreetMap for this area
27 from maploc.osm.tiling import TileManager
---> 28 tiler = TileManager.from_bbox(proj, bbox + 10, demo.config.data.pixel_per_meter)
29 canvas = tiler.query(bbox)
31 # Show the inputs to the model: image and raster map

File /data/OrienterNet/maploc/osm/tiling.py:102, in TileManager.from_bbox(cls, projection, bbox, ppm, path, tile_size)
100 assert osm.box.contains(bbox_osm)
101 else:
--> 102 osm = OSMData.from_dict(get_osm(bbox_osm, path))
104 osm.add_xy_to_nodes(projection)
105 map_data = MapData.from_osm(osm)

File /data/OrienterNet/maploc/osm/download.py:35, in get_osm(boundary_box, cache_path, overwrite)
33 with cache_path.open("bw+") as fp:
34 fp.write(content)
---> 35 return json.loads(content_str)

File ~/anaconda3/envs/orienternet/lib/python3.8/json/__init__.py:357, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
352 del kw['encoding']
354 if (cls is None and object_hook is None and
355 parse_int is None and parse_float is None and
356 parse_constant is None and object_pairs_hook is None and not kw):
--> 357 return _default_decoder.decode(s)
358 if cls is None:
359 cls = JSONDecoder

File ~/anaconda3/envs/orienternet/lib/python3.8/json/decoder.py:337, in JSONDecoder.decode(self, s, _w)
332 def decode(self, s, _w=WHITESPACE.match):
333 """Return the Python representation of s (a str instance
334 containing a JSON document).
335
336 """
--> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
338 end = _w(s, end).end()
339 if end != len(s):

File ~/anaconda3/envs/orienternet/lib/python3.8/json/decoder.py:355, in JSONDecoder.raw_decode(self, s, idx)
353 obj, end = self.scan_once(s, idx)
354 except StopIteration as err:
--> 355 raise JSONDecodeError("Expecting value", s, err.value) from None
356 return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

This seems to occur when downloading the OSM map's JSON file. I think it was caused by not correctly passing the bbox information as a parameter in the query.
I checked the get_web_data() function in maploc/osm/download.py; opening the link https://api.openstreetmap.org/api/0.6/map.json shows the message "The parameter bbox is required, and must be of the form min_lon,min_lat,max_lon,max_lat."

I don't know whether this is due to a network issue on my side, a parameter issue with urllib's request, or a potential bug.

Inquiry regarding the yaw range of images in the MGL dataset

During my usage, I noticed that the yaw range of the images is defined as -360° to 360°, rather than the conventional 0° to 360° range. I am perplexed by this design decision and would like to understand the reasoning behind it.

Could you kindly explain why you chose to define the yaw angle within the range of -360° to 360°? Are there any specific technical or practical requirements that led to this setting? Furthermore, what implications does this variation have on data processing and analysis compared to the traditional 0° to 360° range?

About adapting the map crop size

Hello Sarlin,

The map content of the city I am working with is very sparse (few buildings, etc.).
I have to enlarge the map crop to include more content to assist localization.
So if I adapt the map crop size, e.g. from 64 to 96, which other parameters also need to be adapted?

Looking forward to your reply, thanks!

Generalize to other city is not good

Hi Sarlin,

Thanks for the fantastic work. I tried several images from different cities, such as Ann Arbor, MI; Columbus, Ohio; and Atlanta, GA. The locations are not what I expect. How can this work generalize better to other cities? Thank you!

Supporting localization for new land areas via Mapillary

Dear all,

I would like to extend the vanilla OrienterNet system to support localization in a new area of land which, as always, is bounded by a large rectangular patch. The patch is defined by two points A and B with latitude and longitude coordinates. I want to use the Mapillary service to provide such a functionality in the OrienterNet.

To do this, if I am correct (please correct me if I am wrong), I first need to update the file at ./maploc/data/mapillary/splits_MGL_13loc.json which is internally copied into datasets/MGL once you call "python3 -m maploc.data.mapillary.prepare --token MYTOKEN". There, I have to create two new key-value pairs for the region I need to support.

Here are some basic questions:

  1. How can we obtain the "image IDs" used in the definitions of areas such as in sanfrancisco_soma? What script is needed to obtain the image IDs corresponding to the rectangular patch of a new geographic area? Can someone point to a Mapillary snippet that does so (especially the one that was actually used by the developers of the OrienterNet to create entries for, e.g., sanfrancisco_soma)?

  2. In the file ./maploc/data/mapillary/splits_MGL_13loc.json, what is the difference between two key-value pairs that are put under "train" and under "val"? In particular, how should the image IDs be obtained (collected/selected) for either case?

Many thanks in advance.

Question about rastering the osm

Thx for producing such great work!
I have a question about the render_raster_mask function for OSM data. Inside render_raster_mask, I find that multi-point lines (more than 10 points) in xy space are plotted as only two or zero points after the cv2.polylines call.

Inquiring about the potential issues of score and map position offsets caused by the camera position not being at the center of the BEV.

Hello, this is a very practical piece of work. I have some questions regarding the source code that I would like to consult:

  1. In conv2d_fft_batchwise, the kernel_padded is padded in the bottom-right corner to match the signal size. I am somewhat unclear on why the padding is done in the bottom-right corner.

  2. My understanding is that conv2d_fft_batchwise is for acceleration, and it actually corresponds to convolution in the time domain. The convolution results obtain a score that represents the similarity between the center of the kernel and each position on the map. However, for the Bird's Eye View (BEV) used as the kernel, the camera position should be at the very bottom center of the BEV, not at the center of the kernel. This indicates that the score does not directly correspond to the map's position but is offset by a translation quantity. This translation becomes more complex when rotating the kernel. The mapping between score and map in the loss and inference does not consider this translation quantity. I wonder if the network is expected to cover this translation through learning, and if so, will this increase the difficulty of learning? Thank you very much.
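
For context on point 1, the snippet below is a generic, self-contained illustration of cross-correlation with a zero-padded kernel computed in the Fourier domain; it is not the repository's conv2d_fft_batchwise, and the tensor sizes are invented.

import torch
import torch.nn.functional as F

signal = torch.randn(1, 1, 32, 32)   # stands in for the map features
kernel = torch.randn(1, 1, 8, 8)     # stands in for the BEV template

# Zero-pad the kernel to the signal size: the kernel sits in the top-left corner,
# i.e. the zeros are added towards the bottom-right.
padded = torch.zeros_like(signal)
padded[..., :8, :8] = kernel

# Cross-correlation in the frequency domain: multiply by the conjugate spectrum.
score_fft = torch.fft.irfft2(
    torch.fft.rfft2(signal) * torch.fft.rfft2(padded).conj(),
    s=signal.shape[-2:],
)

# The same scores computed directly as a circular sliding-window correlation.
score_direct = F.conv2d(F.pad(signal, (0, 7, 0, 7), mode="circular"), kernel)
print(torch.allclose(score_fft, score_direct, atol=1e-4))  # True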

Tensor with 0 values after rectification when trying demo

Hello @ALL,

I have the problem that I cannot successfully run the demo. As I investigated it, the issue seems to be with rectify_image: the image tensor has reasonable values before this step, but after stepping over it the tensor is all zeros.

rectified = torch.nn.functional.grid_sample( image[None], grid_norm[None], align_corners=False, mode="bilinear" ).squeeze(0)

Rectified is then only returned as tensor([[[0., 0., 0., ..., 0., 0., 0.]

My system runs macOS 13 Ventura with an M1 chip.

Any ideas what could go wrong here?

Loss=-inf problem when fine-tuning on KITTI

Hello,
When I fine-tune my model on KITTI following the instructions, I encounter a loss=-inf problem. The fine-tuning command is:

python -m maploc.train experiment.name=OrienterNet_MGL_kitti data=kitti \
    training.finetune_from_checkpoint='"experiments/OrienterNet_MGL_reproduce/checkpoint-step=340000.ckpt"'

I believe this is an issue related to the yaw_prior mask. When running on KITTI, a yaw prior is created around the ground-truth yaw angle. At localization, any angle masked out by the yaw prior is set to log_prob=-inf. However, the range of the prior is too narrow, so at grid sampling the interpolation might use values outside the prior mask, resulting in loss=-inf.

I disabled the yaw prior by commenting out this line:

"max_init_error_rotation": 10,

After that, the losses are back to normal.
To fix this issue, I suggest setting prior_range_rotation larger than max_init_error_rotation, or disabling the yaw prior mask during fine-tuning, because I don't see the point of using it there.
Best regards

Question about BEV generation

How can I generate BEV features using panoramic data, such as downloaded Google panoramic images?
Additionally, if I have panoramic imagery, can I divide it into 4 single-view images whose geographical locations are consistent, and would that improve the positioning accuracy?

build error of sphinx

After I run this
python -m pip install -r requirements/full.txt

And I get this output:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'sphinx.setup_command'

But I have already installed sphinx; can you tell me how to fix the error?

Question about sample_xyz

Hi sarlinpe,
Thank you for your excellent work! I noticed that in the sample_xyz function, you concatenate the angle and xy_grid, resulting in grid_norm containing an index of (angle, x, y).

grid = torch.concat([angle_norm.unsqueeze(-1), xy_norm], -1)

However, the last three dimensions of volume_padded are (x, y, angle), since you move the channel from (B,N,H,W) to (B,H,W,N) in the line below, which seems inconsistent.

scores = scores.moveaxis(1, -1) # B,H,W,N

Could you please confirm if this is a mistake?

Thank you!
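
As general background on this question (not a statement about whether sample_xyz is correct), torch.nn.functional.grid_sample orders the last dimension of the grid as (x, y, z), where x indexes the innermost (last) dimension of the input volume; the toy example below demonstrates that convention.

import torch
import torch.nn.functional as F

# Volume of shape (N, C, D, H, W) = (1, 1, 2, 3, 4), filled with 0..23.
vol = torch.arange(2 * 3 * 4, dtype=torch.float32).reshape(1, 1, 2, 3, 4)

def norm(i, size):
    # map an integer index to the [-1, 1] range used by grid_sample (align_corners=True)
    return 2 * i / (size - 1) - 1

# Sample the voxel at (d=1, h=2, w=3): the grid must be ordered (x=w, y=h, z=d).
grid = torch.tensor([[[[[norm(3, 4), norm(2, 3), norm(1, 2)]]]]])  # shape (1, 1, 1, 1, 3)
print(F.grid_sample(vol, grid, align_corners=True).item())  # 23.0 == vol[0, 0, 1, 2, 3]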

Question about prepare MGL dataset

When attempting to download the MGL dataset using the command python -m maploc.data.mapillary.prepare --token $YOUR_ACCESS_TOKEN, the program does not respond. How can I fix it?

(py38) net@ubuntu:~/Desktop/catkin_ws/ori_ws/OrienterNet$ python -m maploc.data.mapillary.prepare --token .....
[2024-02-19 00:45:41 maploc INFO] Starting processing for location sanfrancisco_soma.
[2024-02-19 00:45:41 maploc INFO] Fetching metadata for all images.
0%| | 0/41861 [00:00<?, ?it/s]

But data can be successfully obtained using the following method in the same network environment:

[screenshot attached]

Question about the loading of osm data

Hello, thanks for your great work on OrienterNet. I have a question about the loading of OSM data. I found that reader.py supports loading OSMData via "from_dict", "from_json" and "from_xml", but the main format of OSM data is "xx.osm.pbf". I wonder whether the OpenStreetMap utility functions were written by yourselves or come from other resources. Could you provide a "from_pbf" function in the code?

Thank you very much!

Cam parameters

Hi, thanks for sharing! I'm trying to fine-tune the model, and I want to ask how to produce the parameters in calib_cam_to_cam.txt, calib_imu_to_velo.txt, and calib_velo_to_cam.txt. Will the fine-tuning be ineffective if I directly use these parameters as provided by the KITTI dataset?

ValueError: Need prior latlon

Why does the demo need the exact location in the JPG file?

ValueError                                Traceback (most recent call last)

[<ipython-input-3-cf99829e5c30>](https://localhost:8080/#) in <cell line: 20>()
     18 print(f"Using image {image_path} with location prior '{address}'")
     19 
---> 20 image, camera, gravity, proj, bbox, prior_latlon = read_input_image(
     21     image_path,
     22     prior_address=address,

/content/OrienterNet/maploc/demo.py in read_input_image(image_path, prior_latlon, prior_address, fov, tile_size_meters)
    110             logger.info("Could not find any prior location in EXIF.")
    111     if latlon is None:
--> 112         raise ValueError("Need prior latlon")
    113     latlon = np.array(latlon)
    114 

ValueError: Need prior latlon

A simple test with OrienterNet

Hi sarlin:

I made a small test sample following maploc/data/mapillary/prepare.py and
left all the key parameters at their defaults:

crop_size_meters: 64
z_max: 32
x_max: 32
pixel_per_meter: 2

It's a single test: I took 4 photos at the same place, one after another, as the following map indicates:
[map screenshot attached]
This is really where I stood, and my bearing. Then I used OrienterNet to get the results (in the same order):

[result visualizations attached]

The red arrow is a dummy position/direction, because next to a high building the GPS accuracy is really bad, so I manually filled in random data in image_infors/xxx.json.

I uploaded my testing files as ON_test and shared the access with you; for personal privacy reasons, I don't want to make it public.
Just type:
python test_single.py
If you have time please have a try.

The result is not ideal. I tried many modifications of the parameters but failed, such as:

  1. enlarge crop_size_meters
  2. Increase/decrease z_max/x_max/ppm

I think the problem may be the depth estimation: it's easy to infer relative depth, but not absolute depth, from a mono photo. For example, in the first photo there is a building about 200 meters away that appears in the photo and gets placed in the BEV, which is already far beyond the range of z_max (32 meters), so it's hard for the BEV to match the map at real scale. I tried to increase z_max, but my GPU only has 40GB of RAM; after adjusting z_max to 36 it crashed with OOM.
I know the current BEV-map registration is an FFT-optimized template matching where the rotation can be 64/256. Is it possible to add a scale mechanism, just like rotation, to the matching process (of course the resolution would have to be downsampled to fit the GPU RAM)?

Thanx!

BTW, why don't you use SuperGlue in this project?

A small bug

In maploc/data/mapillary/prepare.py, line 255:

shots_out = [(i, s) for i, ss in enumerate(shots_out) for s in ss if ss is not None]

If ss is None, this code will raise an error; after modifying it into:

shots_out = [(i, s) for i, ss in enumerate(shots_out) if ss is not None for s in ss]

it will be OK.

A small sample will show this problem:

a = [[1,2,3],[4,5,6],None,[7,8,9]]
b = [vi for i,v in enumerate(a) for vi in v if v is not None]

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
TypeError: 'NoneType' object is not iterable

b = [vi for i,v in enumerate(a) if v is not None for vi in v]
b
[1, 2, 3, 4, 5, 6, 7, 8, 9]

Generalize to Nuscenes is not good

Hi, thanks for the great work. I recently tried to apply your work to various autonomous driving datasets and found that the performance is much lower than the numbers you report in the paper (e.g. for KITTI), under the same max init error. For example, the position XY 5 m recall is only around 40% with an init error of 32.

Question about scale bins

Hi,

Thank you for this piece of art, it’s so well thought out and I enjoyed going through your paper and the code. I just had a small question about the number of scale bins - I might have missed these details but I just wanted to clarify this anyway.

In the paper, it’s mentioned that 32 scale bins are used. The code however has an argument set to 33.

num_scale_bins: 33

Is this some sort of dummy parameter or does it have a significant meaning to it that I’m missing?

Thank you for your time!

Guidance on the training method(code) using the KITTI dataset

Hello, Sarlin.

I have a question regarding the training method.
In table 3 of the paper, it seems that (c) is trained only on the KITTI dataset.

[Table 3 screenshot attached]

I attempted to train using only the KITTI dataset by modifying 'orienternet.yaml' > defaults > data: mapillary to kitti.
However, when I tried training, the results show -inf.
So, can you provide guidance on the training method using the KITTI dataset?

Additionally, considering the different input image sizes between Mapillary and KITTI, does the BEV's shape (size) also need to change?
I am curious whether there is any code to adaptively adjust the BEV shape based on the image size or focal length.

I appreciate your response. Thank you!
