siftmetal's Introduction

SIFTMetal

Luke Van In, 2023

An implementation of the Scale Invariant Feature Transform (SIFT) algorithm, for Apple devices, written in Swift using Metal compute.

SIFT is described in the paper "Distinctive Image Features from Scale-Invariant Keypoints" by David Lowe, published in 2004[1].

This implementation is based on the source code from "Anatomy of the SIFT Method" by Ives Rey-Otero and Mauricio Delbracio, published in the Image Processing On Line (IPOL) journal in 2014[3], and on source code by Rob Hess[2].

The scale-invariant feature transform (SIFT) is a computer vision algorithm to detect, describe, and match local features in images, invented by David Lowe in 1999. Applications include object recognition, robotic mapping and navigation, image stitching, 3D modeling, gesture recognition, video tracking, individual identification of wildlife and match moving.[4]

A novel Approximate K Nearest Neighbors algorithm is provided for matching SIFT descriptors, using a trie data structure. The complexity of the algorithm is:

  • Initial construction and update have linear, O(n), complexity.
  • Nearest neighbor search has constant, O(1), complexity.
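
The README does not describe how the trie is organised, so the following is only a minimal sketch of how a fixed-depth trie can give bounds of this shape, not the algorithm used in SIFTMetal. It assumes descriptors are fixed-length `[UInt8]` vectors and that each component is quantized to one bit against a hypothetical `threshold`; `TrieNode`, `DescriptorTrie` and `approximateNearest(to:)` are illustrative names. Inserting walks d levels per descriptor (O(n) overall for n descriptors of fixed length d), and a lookup walks d levels no matter how many descriptors are stored (O(1) in n).

```swift
// Hypothetical sketch: NOT the SIFTMetal implementation.
final class TrieNode {
    var children: [TrieNode?] = [nil, nil]   // one child per quantized bit
    var payload: Int? = nil                  // descriptor ID stored at the leaf
}

struct DescriptorTrie {
    private let root = TrieNode()
    let threshold: UInt8                     // assumed quantization threshold

    init(threshold: UInt8 = 128) { self.threshold = threshold }

    // Quantize each descriptor component to a single bit.
    private func key(for descriptor: [UInt8]) -> [Int] {
        descriptor.map { $0 >= threshold ? 1 : 0 }
    }

    // Walks (and creates) one node per component: O(d) per descriptor.
    func insert(_ descriptor: [UInt8], id: Int) {
        var node = root
        for bit in key(for: descriptor) {
            if node.children[bit] == nil { node.children[bit] = TrieNode() }
            node = node.children[bit]!
        }
        node.payload = id
    }

    // Follows the matching branch when it exists, otherwise the sibling.
    // Always visits exactly d levels, independent of the number of entries.
    func approximateNearest(to descriptor: [UInt8]) -> Int? {
        var node = root
        for bit in key(for: descriptor) {
            guard let next = node.children[bit] ?? node.children[1 - bit] else {
                return nil                   // empty trie
            }
            node = next
        }
        return node.payload
    }
}

// Usage with toy 4-component descriptors (real SIFT descriptors have 128).
let trie = DescriptorTrie()
trie.insert([200, 10, 220, 5], id: 0)
trie.insert([10, 240, 15, 250], id: 1)
let exactPath = trie.approximateNearest(to: [190, 20, 230, 90])
let fallbackPath = trie.approximateNearest(to: [10, 240, 15, 10])
```

Falling back to the sibling branch when the exact bit is missing is what makes the result approximate: the returned descriptor shares a key prefix with the query but is not guaranteed to be the closest in Euclidean distance.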

SIFT keypoints of objects are first extracted from a set of reference images[1] and stored in a database. An object is recognized in a new image by individually comparing each feature from the new image to this database and finding candidate matching features based on the Euclidean distance of their feature vectors. From the full set of matches, subsets of keypoints that agree on the object and its location, scale, and orientation in the new image are identified to select the good matches. The determination of consistent clusters is performed rapidly by using an efficient hash table implementation of the generalised Hough transform. Each cluster of 3 or more features that agree on an object and its pose is then subject to further detailed model verification, and outliers are subsequently discarded. Finally, the probability that a particular set of features indicates the presence of an object is computed, given the accuracy of fit and the number of probable false matches. Object matches that pass all these tests can be identified as correct with high confidence.[2]

[1]: https://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf "Distinctive Image Features from Scale-Invariant Keypoints", Lowe, International Journal of Computer Vision, 2004
[2]: https://github.com/robwhess/opensift OpenSIFT, Hess, GitHub, 2012
[3]: http://www.ipol.im/pub/art/2014/82/article.pdf "Anatomy of the SIFT Method", Rey-Otero & Delbracio, IPOL, 2014
[4]: https://en.wikipedia.org/wiki/Scale-invariant_feature_transform Scale-invariant feature transform, Wikipedia
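
The candidate-matching step in the recognition pipeline described above (comparing each new feature against the database by Euclidean distance) can be sketched as a brute-force search. This is an illustrative baseline only, assuming descriptors are equal-length `[Float]` vectors; `squaredDistance` and `nearestIndex(of:in:)` are hypothetical names, not SIFTMetal API.

```swift
// Squared Euclidean distance between two equal-length descriptors.
// The square root is skipped because it does not change the ordering.
func squaredDistance(_ a: [Float], _ b: [Float]) -> Float {
    precondition(a.count == b.count, "descriptors must have equal length")
    var sum: Float = 0
    for (x, y) in zip(a, b) {
        let d = x - y
        sum += d * d
    }
    return sum
}

// Index of the database descriptor closest to `query`, or nil if the
// database is empty. Cost is linear in the number of stored descriptors.
func nearestIndex(of query: [Float], in database: [[Float]]) -> Int? {
    database.indices.min(by: {
        squaredDistance(query, database[$0]) < squaredDistance(query, database[$1])
    })
}

// Usage with toy 2-component descriptors.
let database: [[Float]] = [[0, 0], [1, 1], [5, 5]]
let best = nearestIndex(of: [0.9, 1.2], in: database)
```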

siftmetal's Issues

About the LICENSE

Hello, first of all, thank you for sharing this wonderful code.
I'm considering using this feature/descriptor module (both extraction and matching) in my iOS project. However, I first need to check the LICENSE of this repository before testing the module's performance and adding it as a dependency framework. So I'd be grateful if you could let me know whether you have any plans to specify the LICENSE explicitly in the repository.
Thank you again for your kind work.

Runtime Error with ConvertSRGBToGrayscaleKernel.encode in SIFTMetal Package

Hello,

Thank you for this wonderful project. I tried using this package, but I couldn't get it to work. I would appreciate it if you could point out any mistakes in my usage.

MTLTexture format: MTLPixelFormat(rawValue: 70), width: 944, height: 624
fopen failed for data file: errno = 2 (No such file or directory)
Errors found! Invalidating cache...
𝜎(1, 0) = 1.2489997
GaussianKernel sigma=1.2489997 radius=5 size=11
octave 0 dimensions = IntegralSize(width: 1888, height: 1248) delta = 0.5 sigmas = [0.8, 1.0079368, 1.2699208, 1.6, 2.0158737, 2.5398417]
𝜌[0 → 1] = 1.2262733
𝜌[1 → 2] = 1.5450078
𝜌[2 → 3] = 1.946588
𝜌[3 → 4] = 2.4525466
𝜌[4 → 5] = 3.0900156
GaussianKernel sigma=1.2262733 radius=5 size=11
GaussianKernel sigma=1.5450078 radius=7 size=15
GaussianKernel sigma=1.946588 radius=8 size=17
GaussianKernel sigma=2.4525466 radius=10 size=21
GaussianKernel sigma=3.0900156 radius=13 size=27
octave 1 dimensions = IntegralSize(width: 944, height: 624) delta = 1.0 sigmas = [1.6, 2.0158737, 2.5398417, 3.2, 4.0317473, 5.0796833]
𝜌[0 → 1] = 1.2262733
𝜌[1 → 2] = 1.5450078
𝜌[2 → 3] = 1.946588
𝜌[3 → 4] = 2.4525466
𝜌[4 → 5] = 3.0900156
GaussianKernel sigma=1.2262733 radius=5 size=11
GaussianKernel sigma=1.5450078 radius=7 size=15
GaussianKernel sigma=1.946588 radius=8 size=17
GaussianKernel sigma=2.4525466 radius=10 size=21
GaussianKernel sigma=3.0900156 radius=13 size=27
octave 2 dimensions = IntegralSize(width: 472, height: 312) delta = 2.0 sigmas = [3.2, 4.0317473, 5.0796833, 6.4, 8.063495, 10.159367]
𝜌[0 → 1] = 1.2262733
𝜌[1 → 2] = 1.5450078
𝜌[2 → 3] = 1.946588
𝜌[3 → 4] = 2.4525466
𝜌[4 → 5] = 3.0900156
GaussianKernel sigma=1.2262733 radius=5 size=11
GaussianKernel sigma=1.5450078 radius=7 size=15
GaussianKernel sigma=1.946588 radius=8 size=17
GaussianKernel sigma=2.4525466 radius=10 size=21
GaussianKernel sigma=3.0900156 radius=13 size=27
octave 3 dimensions = IntegralSize(width: 236, height: 156) delta = 4.0 sigmas = [6.4, 8.063495, 10.159367, 12.8, 16.12699, 20.318733]
𝜌[0 → 1] = 1.2262733
𝜌[1 → 2] = 1.5450078
𝜌[2 → 3] = 1.946588
𝜌[3 → 4] = 2.4525466
𝜌[4 → 5] = 3.0900156
GaussianKernel sigma=1.2262733 radius=5 size=11
GaussianKernel sigma=1.5450078 radius=7 size=15
GaussianKernel sigma=1.946588 radius=8 size=17
GaussianKernel sigma=2.4525466 radius=10 size=21
GaussianKernel sigma=3.0900156 radius=13 size=27
octave 4 dimensions = IntegralSize(width: 118, height: 78) delta = 8.0 sigmas = [12.8, 16.12699, 20.318733, 25.6, 32.25398, 40.637466]
𝜌[0 → 1] = 1.2262733
𝜌[1 → 2] = 1.5450078
𝜌[2 → 3] = 1.946588
𝜌[3 → 4] = 2.4525466
𝜌[4 → 5] = 3.0900156
GaussianKernel sigma=1.2262733 radius=5 size=11
GaussianKernel sigma=1.5450078 radius=7 size=15
GaussianKernel sigma=1.946588 radius=8 size=17
GaussianKernel sigma=2.4525466 radius=10 size=21
GaussianKernel sigma=3.0900156 radius=13 size=27
octave 5 dimensions = IntegralSize(width: 59, height: 39) delta = 16.0 sigmas = [25.6, 32.25398, 40.637466, 51.2, 64.50796, 81.27493]
𝜌[0 → 1] = 1.2262733
𝜌[1 → 2] = 1.5450078
𝜌[2 → 3] = 1.946588
𝜌[3 → 4] = 2.4525466
𝜌[4 → 5] = 3.0900156
GaussianKernel sigma=1.2262733 radius=5 size=11
GaussianKernel sigma=1.5450078 radius=7 size=15
GaussianKernel sigma=1.946588 radius=8 size=17
GaussianKernel sigma=2.4525466 radius=10 size=21
GaussianKernel sigma=3.0900156 radius=13 size=27
octave 6 dimensions = IntegralSize(width: 29, height: 19) delta = 32.0 sigmas = [51.2, 64.50796, 81.27493, 102.4, 129.01591, 162.54987]
𝜌[0 → 1] = 1.2262733
𝜌[1 → 2] = 1.5450078
𝜌[2 → 3] = 1.946588
𝜌[3 → 4] = 2.4525466
𝜌[4 → 5] = 3.0900156
GaussianKernel sigma=1.2262733 radius=5 size=11
GaussianKernel sigma=1.5450078 radius=7 size=15
GaussianKernel sigma=1.946588 radius=8 size=17
GaussianKernel sigma=2.4525466 radius=10 size=21
GaussianKernel sigma=3.0900156 radius=13 size=27
v(1, 0) Convert texture to grayscale

This is the output log from the console.
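
The blur values in this log (the `sigmas` arrays, the five per-step values from 1.2262733 to 3.0900156, and the first value 1.2489997) and the kernel radii are consistent with the Gaussian scale-space construction in "Anatomy of the SIFT Method". The sketch below reproduces them under assumed parameters that are inferred from the log rather than read from the SIFTMetal source: a minimum blur of 0.8, an input blur of 0.5, a first-octave sampling distance of 0.5, three scales per octave, and a kernel radius of ceil(4 × sigma). All function and constant names are hypothetical.

```swift
import Foundation

// Assumed parameters, inferred from the logged values.
let sigmaMin = 0.8          // blur of the first scale of octave 0
let sigmaIn = 0.5           // assumed blur of the input image
let deltaMin = 0.5          // sampling distance of octave 0 (2x upsampled)
let scalesPerOctave = 3

// Sampling distance of octave o (0.5, 1.0, 2.0, ... in the log).
func delta(octave o: Int) -> Double {
    deltaMin * pow(2.0, Double(o))
}

// Absolute blur of scale s in octave o (the `sigmas` arrays in the log).
func sigma(octave o: Int, scale s: Int) -> Double {
    (delta(octave: o) / deltaMin) * sigmaMin
        * pow(2.0, Double(s) / Double(scalesPerOctave))
}

// Incremental blur from scale s to s + 1, in pixels of octave o.
// It is the same for every octave, which matches the repeated log lines.
func rho(octave o: Int, scale s: Int) -> Double {
    let a = sigma(octave: o, scale: s)
    let b = sigma(octave: o, scale: s + 1)
    return (b * b - a * a).squareRoot() / delta(octave: o)
}

// Kernel radius that matches the logged radius/size pairs (size = 2r + 1).
func kernelRadius(_ sigma: Double) -> Int {
    Int((4 * sigma).rounded(.up))
}

// Blur applied to the upsampled input to reach sigmaMin.
let seedBlur = (sigmaMin * sigmaMin - sigmaIn * sigmaIn).squareRoot() / deltaMin

print(seedBlur, rho(octave: 0, scale: 0), kernelRadius(rho(octave: 0, scale: 4)))
```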

SiftTest`ConvertSRGBToGrayscaleKernel.encode(commandBuffer:inputTexture:outputTexture:):
    0x102cb4590 <+0>:   sub    sp, sp, #0x80
    0x102cb4594 <+4>:   stp    x26, x25, [sp, #0x30]
    0x102cb4598 <+8>:   stp    x24, x23, [sp, #0x40]
    0x102cb459c <+12>:  stp    x22, x21, [sp, #0x50]
    0x102cb45a0 <+16>:  stp    x20, x19, [sp, #0x60]
    0x102cb45a4 <+20>:  stp    x29, x30, [sp, #0x70]
    0x102cb45a8 <+24>:  add    x29, sp, #0x70
    0x102cb45ac <+28>:  mov    x19, x2
    0x102cb45b0 <+32>:  mov    x21, x1
    0x102cb45b4 <+36>:  mov    x22, x0
    0x102cb45b8 <+40>:  adrp   x8, 68
    0x102cb45bc <+44>:  ldr    x9, [x8, #0x100]
    0x102cb45c0 <+48>:  add    x9, x9, #0x1
    0x102cb45c4 <+52>:  str    x9, [x8, #0x100]
    0x102cb45c8 <+56>:  adrp   x8, 68
    0x102cb45cc <+60>:  ldr    x9, [x8, #0x108]
    0x102cb45d0 <+64>:  add    x9, x9, #0x1
    0x102cb45d4 <+68>:  str    x9, [x8, #0x108]
    0x102cb45d8 <+72>:  adrp   x24, 60
    0x102cb45dc <+76>:  ldr    x1, [x24, #0xc78]
    0x102cb45e0 <+80>:  mov    x0, x21
    0x102cb45e4 <+84>:  bl     0x102cd80bc               ; symbol stub for: objc_msgSend
    0x102cb45e8 <+88>:  mov    x23, x0
    0x102cb45ec <+92>:  ldr    x1, [x24, #0xc78]
    0x102cb45f0 <+96>:  mov    x0, x19
    0x102cb45f4 <+100>: bl     0x102cd80bc               ; symbol stub for: objc_msgSend
    0x102cb45f8 <+104>: cmp    x23, x0
    0x102cb45fc <+108>: b.ne   0x102cb4798               ; <+520> [inlined] Swift runtime failure: precondition failure at <compiler-generated>
    0x102cb4600 <+112>: adrp   x8, 68
    0x102cb4604 <+116>: ldr    x9, [x8, #0x110]
    0x102cb4608 <+120>: add    x9, x9, #0x1
    0x102cb460c <+124>: str    x9, [x8, #0x110]
    0x102cb4610 <+128>: adrp   x25, 60
    0x102cb4614 <+132>: ldr    x1, [x25, #0xc80]
    0x102cb4618 <+136>: mov    x0, x21
    0x102cb461c <+140>: bl     0x102cd80bc               ; symbol stub for: objc_msgSend
    0x102cb4620 <+144>: mov    x23, x0
    0x102cb4624 <+148>: ldr    x1, [x25, #0xc80]
    0x102cb4628 <+152>: mov    x0, x19
    0x102cb462c <+156>: bl     0x102cd80bc               ; symbol stub for: objc_msgSend
    0x102cb4630 <+160>: cmp    x23, x0
    0x102cb4634 <+164>: b.ne   0x102cb479c               ; <+524> [inlined] Swift runtime failure: precondition failure at <compiler-generated>
    0x102cb4638 <+168>: adrp   x8, 68
    0x102cb463c <+172>: ldr    x9, [x8, #0x118]
    0x102cb4640 <+176>: add    x9, x9, #0x1
    0x102cb4644 <+180>: str    x9, [x8, #0x118]
    0x102cb4648 <+184>: adrp   x23, 60
    0x102cb464c <+188>: ldr    x1, [x23, #0xc70]
    0x102cb4650 <+192>: mov    x0, x21
    0x102cb4654 <+196>: bl     0x102cd80bc               ; symbol stub for: objc_msgSend
    0x102cb4658 <+200>: cmp    x0, #0x50
    0x102cb465c <+204>: b.ne   0x102cb47a0               ; <+528> [inlined] Swift runtime failure: precondition failure at <compiler-generated>
    0x102cb4660 <+208>: adrp   x8, 68
    0x102cb4664 <+212>: ldr    x9, [x8, #0x120]
    0x102cb4668 <+216>: add    x9, x9, #0x1
    0x102cb466c <+220>: str    x9, [x8, #0x120]
    0x102cb4670 <+224>: ldr    x1, [x23, #0xc70]
    0x102cb4674 <+228>: mov    x0, x19
    0x102cb4678 <+232>: bl     0x102cd80bc               ; symbol stub for: objc_msgSend
    0x102cb467c <+236>: cmp    x0, #0x37
    0x102cb4680 <+240>: b.ne   0x102cb47a4               ; <+532> [inlined] Swift runtime failure: precondition failure at <compiler-generated>
    0x102cb4684 <+244>: adrp   x8, 60
    0x102cb4688 <+248>: ldr    x1, [x8, #0xd30]
    0x102cb468c <+252>: mov    x0, x22
    0x102cb4690 <+256>: bl     0x102cd80bc               ; symbol stub for: objc_msgSend
    0x102cb4694 <+260>: mov    x29, x29
    0x102cb4698 <+264>: bl     0x102cd8164               ; symbol stub for: objc_retainAutoreleasedReturnValue
    0x102cb469c <+268>: cbz    x0, 0x102cb47b0           ; <+544> [inlined] Swift runtime failure: Unexpectedly found nil while unwrapping an Optional value at <compiler-generated>
    0x102cb46a0 <+272>: mov    x22, x0
    0x102cb46a4 <+276>: ldr    x2, [x20, #0x10]
    0x102cb46a8 <+280>: adrp   x8, 60
    0x102cb46ac <+284>: ldr    x1, [x8, #0xe48]
    0x102cb46b0 <+288>: bl     0x102cd80bc               ; symbol stub for: objc_msgSend
    0x102cb46b4 <+292>: adrp   x20, 60
    0x102cb46b8 <+296>: ldr    x1, [x20, #0xe90]
    0x102cb46bc <+300>: mov    x0, x22
    0x102cb46c0 <+304>: mov    x2, x19
    0x102cb46c4 <+308>: mov    x3, #0x0
    0x102cb46c8 <+312>: bl     0x102cd80bc               ; symbol stub for: objc_msgSend
    0x102cb46cc <+316>: ldr    x1, [x20, #0xe90]
    0x102cb46d0 <+320>: mov    x0, x22
    0x102cb46d4 <+324>: mov    x2, x21
    0x102cb46d8 <+328>: mov    w3, #0x1
    0x102cb46dc <+332>: bl     0x102cd80bc               ; symbol stub for: objc_msgSend
    0x102cb46e0 <+336>: ldr    x1, [x24, #0xc78]
    0x102cb46e4 <+340>: mov    x0, x19
    0x102cb46e8 <+344>: bl     0x102cd80bc               ; symbol stub for: objc_msgSend
    0x102cb46ec <+348>: adds   x20, x0, #0x10
    0x102cb46f0 <+352>: b.vs   0x102cb47a8               ; <+536> [inlined] Swift runtime failure: arithmetic overflow at <compiler-generated>
    0x102cb46f4 <+356>: ldr    x1, [x25, #0xc80]
    0x102cb46f8 <+360>: mov    x0, x19
    0x102cb46fc <+364>: bl     0x102cd80bc               ; symbol stub for: objc_msgSend
    0x102cb4700 <+368>: adds   x8, x0, #0x10
    0x102cb4704 <+372>: b.vs   0x102cb47ac               ; <+540> [inlined] Swift runtime failure: arithmetic overflow at <compiler-generated>
    0x102cb4708 <+376>: sub    x9, x8, #0x1
    0x102cb470c <+380>: sub    x10, x20, #0x1
    0x102cb4710 <+384>: add    x11, x20, #0xe
    0x102cb4714 <+388>: cmp    x10, #0x0
    0x102cb4718 <+392>: csel   x10, x11, x10, lt
    0x102cb471c <+396>: asr    x10, x10, #4
    0x102cb4720 <+400>: add    x8, x8, #0xe
    0x102cb4724 <+404>: cmp    x9, #0x0
    0x102cb4728 <+408>: csel   x8, x8, x9, lt
    0x102cb472c <+412>: asr    x8, x8, #4
    0x102cb4730 <+416>: adrp   x9, 60
    0x102cb4734 <+420>: ldr    x1, [x9, #0xd60]
    0x102cb4738 <+424>: stp    x10, x8, [sp, #0x18]
    0x102cb473c <+428>: mov    w8, #0x1
    0x102cb4740 <+432>: str    x8, [sp, #0x28]
    0x102cb4744 <+436>: mov    w9, #0x10
    0x102cb4748 <+440>: dup.2d v0, x9
    0x102cb474c <+444>: str    q0, [sp]
    0x102cb4750 <+448>: str    x8, [sp, #0x10]
    0x102cb4754 <+452>: add    x2, sp, #0x18
    0x102cb4758 <+456>: mov    x3, sp
    0x102cb475c <+460>: mov    x0, x22
    0x102cb4760 <+464>: bl     0x102cd80bc               ; symbol stub for: objc_msgSend
    0x102cb4764 <+468>: adrp   x8, 60
    0x102cb4768 <+472>: ldr    x1, [x8, #0xd70]
    0x102cb476c <+476>: mov    x0, x22
    0x102cb4770 <+480>: bl     0x102cd80bc               ; symbol stub for: objc_msgSend
    0x102cb4774 <+484>: mov    x0, x22
    0x102cb4778 <+488>: bl     0x102cd8764               ; symbol stub for: swift_unknownObjectRelease
    0x102cb477c <+492>: ldp    x29, x30, [sp, #0x70]
    0x102cb4780 <+496>: ldp    x20, x19, [sp, #0x60]
    0x102cb4784 <+500>: ldp    x22, x21, [sp, #0x50]
    0x102cb4788 <+504>: ldp    x24, x23, [sp, #0x40]
    0x102cb478c <+508>: ldp    x26, x25, [sp, #0x30]
    0x102cb4790 <+512>: add    sp, sp, #0x80
    0x102cb4794 <+516>: ret    
    0x102cb4798 <+520>: brk    #0x1
    0x102cb479c <+524>: brk    #0x1
->  0x102cb47a0 <+528>: brk    #0x1
    0x102cb47a4 <+532>: brk    #0x1
    0x102cb47a8 <+536>: brk    #0x1
    0x102cb47ac <+540>: brk    #0x1
    0x102cb47b0 <+544>: brk    #0x1

This is the thread where the error occurred.

import UIKit
import SIFTMetal
import Metal

class SiftDescriber {
   var device: MTLDevice!
   
   init() {
       // Create the Metal device
       self.device = MTLCreateSystemDefaultDevice()
       guard let device = self.device else {
           print("Failed to create Metal device")
           return
       }
       
       // Load the image data
       guard let image = loadImageFromDocuments() else {
           print("Failed to load image file")
           return
       }
       
       // Convert the image data to an MTLTexture
       guard let texture = texture(from: image, device: device) else {
           print("Failed to convert image to MTLTexture")
           return
       }
       
       // Log the MTLTexture format and size
       print("MTLTexture format: \(texture.pixelFormat), width: \(texture.width), height: \(texture.height)")
       
       // Configure SIFT
       let width = texture.width
       let height = texture.height
       let configuration = SIFT.Configuration(inputSize: SIFTMetal.IntegralSize(width: width, height: height))
       let sift = SIFT(device: device, configuration: configuration)
       
       // Get the keypoints
       let keypoints = sift.getKeypoints(texture)
       print("Success get keypoints: \(keypoints)")
   }
   
   func loadImageFromDocuments() -> UIImage? {
       let fileManager = FileManager.default
       guard let documentsURL = fileManager.urls(for: .documentDirectory, in: .userDomainMask).first else {
           return nil
       }
       let imagePath = documentsURL.appendingPathComponent("Photogrammetry/images/000.png")
       guard fileManager.fileExists(atPath: imagePath.path) else {
           print("File does not exist at path: \(imagePath.path)")
           return nil
       }
       return UIImage(contentsOfFile: imagePath.path)
   }
   
   func texture(from image: UIImage, device: MTLDevice) -> MTLTexture? {
       guard let cgImage = image.cgImage else {
           return nil
       }

       let width = cgImage.width
       let height = cgImage.height

       let textureDescriptor = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .rgba8Unorm, width: width, height: height, mipmapped: false)

       guard let texture = device.makeTexture(descriptor: textureDescriptor) else {
           return nil
       }

       let context = CGContext(data: nil,
                               width: width,
                               height: height,
                               bitsPerComponent: 8,
                               bytesPerRow: 4 * width,
                               space: CGColorSpaceCreateDeviceRGB(),
                               bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue)

       context?.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))

       guard let pixelData = context?.data else {
           return nil
       }

       texture.replace(region: MTLRegionMake2D(0, 0, width, height),
                       mipmapLevel: 0,
                       withBytes: pixelData,
                       bytesPerRow: 4 * width)

       return texture
   }
}

This is the code I tried.
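
A possible reading of the disassembly above: the trap at <+528> is the target of the branch at <+204>, which follows a comparison of what appears to be the input texture's pixel format against 0x50 (80, `MTLPixelFormat.bgra8Unorm`), while the log reports raw value 70 (`.rgba8Unorm`). The next check compares the output texture's format against 0x37 (55, `.r32Float`). Assuming that precondition is what fails, one thing to try is uploading the image as `.bgra8Unorm` with a matching byte order. The sketch below is untested against SIFTMetal; `makeBGRATexture(from:device:)` is a hypothetical helper, not part of the package.

```swift
import CoreGraphics
import Metal

// Hypothetical helper: uploads a CGImage into a .bgra8Unorm texture.
func makeBGRATexture(from cgImage: CGImage, device: MTLDevice) -> MTLTexture? {
    let width = cgImage.width
    let height = cgImage.height

    let descriptor = MTLTextureDescriptor.texture2DDescriptor(
        pixelFormat: .bgra8Unorm, width: width, height: height, mipmapped: false)
    descriptor.usage = [.shaderRead]
    guard let texture = device.makeTexture(descriptor: descriptor) else {
        return nil
    }

    // 32-bit little-endian with alpha first gives B, G, R, A bytes in memory,
    // which is the layout .bgra8Unorm expects.
    let bitmapInfo = CGBitmapInfo.byteOrder32Little.rawValue
        | CGImageAlphaInfo.premultipliedFirst.rawValue
    guard let context = CGContext(data: nil,
                                  width: width,
                                  height: height,
                                  bitsPerComponent: 8,
                                  bytesPerRow: 4 * width,
                                  space: CGColorSpaceCreateDeviceRGB(),
                                  bitmapInfo: bitmapInfo) else {
        return nil
    }
    context.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))
    guard let pixels = context.data else {
        return nil
    }

    // Use the context's own row stride in case it differs from 4 * width.
    texture.replace(region: MTLRegionMake2D(0, 0, width, height),
                    mipmapLevel: 0,
                    withBytes: pixels,
                    bytesPerRow: context.bytesPerRow)
    return texture
}
```

In the code above, this would replace the call to `texture(from:device:)`, passing `image.cgImage` after unwrapping it. Whether the package expects any further properties of the input texture is not visible from the disassembly.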
