Giter VIP home page Giter VIP logo

aann's Introduction

[WIP] aann

Approximate all nearest-neighbor ("aann") search using neighborhood graphs. Implemented in Rust with Python bindings.

Based on the paper ALL NEAREST NEIGHBOUR CALCULATION BASED ON DELAUNAY GRAPHS, Soudani & Karami, 2018 (arXiv) (link).

Problem

Given two point clouds Q and P, for each point q in Q find its nearest neighbor p among the points in P.

Solution

  1. Calculate Delaunay graphs for both point clouds.
  2. Start with a random vertex q in Q and traverse P using an A* search to find its nearest neighbor p.
  3. Move to a vertex adjacent to q and search P for its nearest neighbor using p as the start. Since we start the search where we have already established spatial proximity the A* search should finish quickly.
  4. Rinse-repeat until we found nearest neighbors for all points in Q.

TODOs

  • generalize to N-dimensions (currently only 3D)
  • implement k-nearest neighbors (currently only 1)
  • use SIMD (singe instruction multiple data) for distance calculations
  • test various neighborhood graphs
  • see if we can immplement additional parameters (e.g. distance_upper_bound or maybe a distance_lower_bound if that's useful)
  • implement alternative distance metrics (currently only Euclidean)
  • benchmarks

Usage

import aann
import numpy as np

N = 10000
x = Delaunay(np.random.rand(N, 3))
y = Delaunay(np.random.rand(N, 3))

distances, indices = aann.all_nearest_neighbours(x, y)

Build

  1. cd into directory
  2. Activate virtual environment: source .venv/bin/activate
  3. Run maturin build --release to build a wheel or use maturin develop to compile and install in development mode

SIMD

aann makes use of core::simd module which means:

  1. You need to use the nightly build:

    # Install and update nightly
    rustup install nightly
    rustup update nightly
    # Make sure you are in project directory
    cd aann
    # Tell the project to use nightly
    rustup override set nightly
  2. By default, only the oldest SIMD extension ssse2 is enabled during compilation. It is very likely that your processor supports newer extensions such as avx2 or even avx512f. To check what's supported run:

    $ cargo install cargo-simd-detect --force
    $ cargo simd-detect
    extension       width                   available       enabled
    sse2            128-bit/16-bytes        true            true
    avx2            256-bit/32-bytes        true            false
    avx512f         512-bit/64-bytes        true            false

    You can tell the compiler to use newer extensions by setting rust flags:

    # To activate a specific extension
    export RUSTFLAGS="-C target-feature=+avx2"
    
    # Alternatively to activate all available extensions
    export RUSTFLAGS="-C target-cpu=native"

Test

First make sure pytest and pandas are installed:

pip install pytest -U

Then run the test-suite like so:

pytest --verbose -s

Note that unless you compiled with maturin develop --release the timings will be much slower (up to 10x) than in a release build.

aann's People

Contributors

schlegelp avatar

Stargazers

Gregory Jefferis avatar

Watchers

 avatar  avatar

aann's Issues

Use Gabriel graphs

A Gabriel graph is a subset of a Delaunay graph where points are neighbours only if their diametral sphere is empty:

Screenshot 2024-02-03 at 13 32 49

This leads to much sparser graphs, i.e. smaller neighbourhoods to search.

Neighborhood-based-on-graph-structures-A-Delaunay-triangulation-and-B-Gabriel-graph

Algorithms for Fast Vector Quantization

The paper Algorithms for Fast Vector Quantization, Sunil Arya proposes a bunch of KDTree optimisations (some of which are used by e.g. pykdtree). A couple of them might be interesting for us too:

  1. partial distance calculation that exits early if the sum of the squared distances along the dimensions goes above the current minimum distance
  2. a search on (modified) relative neighbourhood graphs (see also this paper)
  3. a limit on the number of points visited

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.