
matft's People

Contributors

ensan-hcl, hartwoolery, jjjkkkjjj, rehatkathuria, sunkensplash

matft's Issues

Adding demo for image processing

Hi jjjkkkjjj,

Great work on bringing NumPy to Swift!

I am learning how to do inference within CoreML using Swift.

So far I have gotten the UIImage from an image picker and I need to do preprocessing
e.g. resize, transpose, normalize(mean=(0,0,0), std=(1,1,1))

After hours and hours of searching, Swift has proved not to be a friendly language for image processing.
Then I found your repo, which has all the amazing features I need.

So I think it would be very helpful if you could add a demo for this.

Cheers

Boolean Indexing support ?

Hi ! Thanks for open sourcing your code.

Would you mind suggesting the best way to do boolean indexing as in NumPy? For example, I can do this easily in NumPy:

import numpy as np

img = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
], dtype=np.uint8)
img[img > 3] = 10
print(img)

# [[ 1  2  3]
# [10 10 10]
# [10 10 10]]

I tried to do it element-wise, but I found the performance is significantly slower than using a Swift Array:

var testing = Array(repeating: 1, count: 160000)
var bar = MfArray(testing, shape: [400,400])
var start = Date()
for i in 0..<400 {
    for j in 0..<400 {
        bar[i,j] = bar[i,j] as! Int + 1
    }
}
print("\(start.timeIntervalSinceNow * -1) seconds elapsed")
//3.706043004989624 seconds elapsed


start = Date()
for i in 0..<400 {
    for j in 0..<400 {
        let index1D = i*400+j
        testing[index1D] = testing[index1D] + 1
    }
}
print("\(start.timeIntervalSinceNow * -1) seconds elapsed")
//0.05165994167327881 seconds elapsed

Thanks !
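
For comparison, the NumPy expression img[img > 3] = 10 can be sketched in plain Swift over a flat, row-major buffer. This is only an illustration of the operation being asked for; it does not use Matft, and the helper name assignWhere is hypothetical.

```swift
// Plain-Swift sketch of NumPy's `img[img > 3] = 10` on a flat, row-major buffer.
// `assignWhere` is a hypothetical helper, not part of Matft.
func assignWhere<T>(_ data: inout [T], _ predicate: (T) -> Bool, newValue: T) {
    // Visit every index once and overwrite the elements that match the predicate.
    for i in data.indices where predicate(data[i]) {
        data[i] = newValue
    }
}

var img: [UInt8] = [1, 2, 3,
                    4, 5, 6,
                    7, 8, 9]
assignWhere(&img, { $0 > 3 }, newValue: 10)
print(img)
```

Because the loop touches a typed buffer directly, it avoids the per-element Any casts that the MfArray subscript loop above goes through.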

Tips to convert an MfArray to MLMultiArray

Could you kindly tell me if there is a way to convert an MfArray to an MLMultiArray? I have found a method to convert the latter to the former, but I am not sure whether there is a similar method for the reverse operation. If such a method doesn't exist, could you please recommend the most effective/performant approach for accomplishing this?

My current approach is

extension MfArray {
    func toMLMultiArray() throws -> MLMultiArray {
        guard let array = self.astype(.Float).flatten().toArray() as? [Float] else {
              //throw some errors here
        }
        let arrShape = self.shape
        let mlShapedArray: MLShapedArray<Float> = MLShapedArray(scalars: array, shape: arrShape)
        return MLMultiArray(mlShapedArray)
    }
}

Thanks in advance!

Add fancy indexing

Hey, I'm just curious if there are any plans to implement "fancy indexing", where you can pass a list of indices to an MfArray and get back the items at those indices, like in NumPy. Thanks, it's a great library so far.

Boolean indexing doesn't support equality?

let img = MfArray([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]], mftype: .UInt8)
img[img == 3]

This fails to compile with the error:
Referencing operator function '==' on 'BinaryInteger' requires that 'MfArray' conform to 'BinaryInteger'

Subscript bug using `Matft.arange()`

Numpy version

a = np.zeros([88, 2, 2])
b = np.zeros([32, 88, 2])
print(a[np.arange(88), b[0, :, 0].astype(int), 0].shape) # Outputs to (88,)

Matft version

let a = Matft.nums(0, shape: [88, 2, 2])
let b = Matft.nums(0, shape: [32, 88, 2]).astype(.Int)
print(a[Matft.arange(start: 0, to: 88, by: 1), b[0, Matft.all, 0], 0].shape) // Outputs to [88, 88]

According to Numpy the shape of the MfArray should be [88], not [88, 88].
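
The NumPy behaviour referred to here pairs the index arrays element by element instead of taking their outer product. A minimal plain-Swift sketch of that pairing, using nested arrays rather than Matft (the helper name pairedIndex is hypothetical):

```swift
// Sketch of NumPy's paired integer-array indexing for a 3-D array:
// result[k] = a[rows[k]][cols[k]][last]. `pairedIndex` is a hypothetical helper.
func pairedIndex(_ a: [[[Double]]], rows: [Int], cols: [Int], last: Int) -> [Double] {
    precondition(rows.count == cols.count, "index arrays must have the same length")
    // One output element per (row, col) pair, so the result has rows.count elements.
    return zip(rows, cols).map { a[$0.0][$0.1][last] }
}

let a = [[[Double]]](repeating: [[Double]](repeating: [0.0, 0.0], count: 2), count: 88)
let rows = Array(0..<88)                     // plays the role of np.arange(88)
let cols = [Int](repeating: 0, count: 88)    // plays the role of b[0, :, 0].astype(int)
let picked = pairedIndex(a, rows: rows, cols: cols, last: 0)
print(picked.count)   // 88
```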

Numpy FFT implementation?

Hi there

Firstly, this library looks amazing. You've done a ton of work and it looks really promising. Thank you!

I was curious if there were any plans to implement NumPy's FFTs in vDSP? If not, do you do work for hire?

I did some work looking into this myself for work on porting OpenAI's Whisper to CoreML / Accelerate https://github.com/vade/OpenAI-Whisper-CoreML

And I documented some of my findings in this issue here: vade/OpenAI-Whisper-CoreML#1

It seems like Numpy, PyTorch, Rosa / RosaKit all use PocketFFT to do non power of 2 DFTs, which is why the output matches more or less exactly numerically.

PocketFFT doesn't use vDSP; it uses scalar code with no SIMD acceleration.

Given the rest of Matft's current implementation, doing the STFT and Log Mel work in Matft would just work. The only missing piece is a numerically equivalent implementation of the NumPy / Torch 'real to complex' (rfft) logic.

Thank you again for all the work on Matft!
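
As a reference point for checking numerical agreement, the real-to-complex transform can be written as a naive O(n^2) DFT that returns the n/2 + 1 non-negative-frequency bins, the same layout as np.fft.rfft. This is a sketch for verification only: it is neither PocketFFT nor a vDSP routine, and the name naiveRFFT is hypothetical.

```swift
import Foundation

// Naive O(n^2) real-to-complex DFT. Returns bins 0...n/2, matching the
// output layout of np.fft.rfft. Works for any length, including non powers of two.
func naiveRFFT(_ x: [Double]) -> [(re: Double, im: Double)] {
    let n = x.count
    var bins = [(re: Double, im: Double)]()
    guard n > 0 else { return bins }
    for k in 0...(n / 2) {
        var re = 0.0
        var im = 0.0
        for t in 0..<n {
            // X[k] = sum over t of x[t] * exp(-2*pi*i*k*t/n)
            let angle = -2.0 * Double.pi * Double(k) * Double(t) / Double(n)
            re += x[t] * cos(angle)
            im += x[t] * sin(angle)
        }
        bins.append((re: re, im: im))
    }
    return bins
}

let spectrum = naiveRFFT([1, 2, 3, 4, 5, 6])   // length 6 is not a power of two
for bin in spectrum {
    print(bin.re, bin.im)
}
```

Comparing a faster implementation against a direct sum like this, with a small tolerance, is one way to confirm that its output matches the NumPy convention.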

memory leak...

I wrote values through pointers without initializing the memory first, which caused a great many memory leaks :)

See ref
Use move or initialize first.

Bad code, e.g. here, here, here

Multidimensional MfArray with different length Arrays in axis 1

Hello, I need to port a lot of Python (NumPy) code to Swift, and that is how I found your library. It would be a great help. Is it possible to create MfArrays that have different lengths along axis=1? Here is an example of what I mean:

[
    [1, 2, 3, 4],
    [1, 2, 3],
    [1, 2, 3, 4, 5]
]

If so, how would I need to instantiate the array?

divide by zero error for empty arrays

Thank you for the wonderful Swift library. I have a relatively minor issue where something like MfArray([]) throws a divide by zero error. I fixed it in a fork by changing one line in the shape2strides func in mfstructure.swift:

ret[index] = prevAxisNum / max(shape[index],1)

(I simply made the divisor a minimum of 1)

Let me know if I should submit a pull request or if you would like to push the change. Thank you!
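
To show why the guard helps, here is a minimal row-major stride computation written around the same one-line fix. The function rowMajorStrides is a hypothetical stand-in; Matft's actual shape2strides may be structured differently.

```swift
// Sketch of row-major stride computation with the divisor clamped to at least 1.
// `rowMajorStrides` is a hypothetical name, not Matft's shape2strides.
func rowMajorStrides(shape: [Int]) -> [Int] {
    var ret = [Int](repeating: 0, count: shape.count)
    // Total element count; this is 0 as soon as any axis has length 0.
    var prevAxisNum = shape.reduce(1, *)
    for index in shape.indices {
        // Without max(..., 1), a zero-length axis would divide by zero here.
        ret[index] = prevAxisNum / max(shape[index], 1)
        prevAxisNum = ret[index]
    }
    return ret
}

print(rowMajorStrides(shape: [2, 3, 4]))   // [12, 4, 1]
print(rowMajorStrides(shape: [0]))         // [0]
```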

Complex support

According to the documentation, vDSP requires DSPSplitComplex.

According to this discussion, the BLAS package requires DSPComplex.

Is DSPComplex the ideal choice for supporting a complex type?
First I must check the difference between DSPComplex and DSPSplitComplex.

DSPComplex stores its values as consecutive floats:

Complex data are stored as ordered pairs of floating-point numbers. Because they are stored as ordered pairs, complex vectors require address strides that are multiples of two.

(from the documentation)

DSPSplitComplex, on the other hand, stores the real and imaginary parts in separate memory:

A structure that represents a single-precision complex vector with the real and imaginary parts stored in separate arrays.

(from the documentation)

So I need to implement a function that bridges this difference in memory layout.
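
The two layouts and the conversion between them can be sketched with plain Float arrays. The SplitComplex struct and both helper functions below are hypothetical stand-ins, used instead of Accelerate's DSPComplex and DSPSplitComplex so that the sketch has no platform dependency.

```swift
// Hypothetical stand-in for the split layout: real and imaginary parts in separate arrays.
struct SplitComplex {
    var real: [Float]
    var imag: [Float]
}

// Interleaved [re0, im0, re1, im1, ...] -> separate real and imaginary arrays.
func toSplit(interleaved: [Float]) -> SplitComplex {
    precondition(interleaved.count % 2 == 0, "interleaved data must hold re/im pairs")
    let n = interleaved.count / 2
    var split = SplitComplex(real: [Float](repeating: 0, count: n),
                             imag: [Float](repeating: 0, count: n))
    for i in 0..<n {
        split.real[i] = interleaved[2 * i]       // address stride of two
        split.imag[i] = interleaved[2 * i + 1]
    }
    return split
}

// Separate arrays -> interleaved pairs.
func toInterleaved(_ split: SplitComplex) -> [Float] {
    precondition(split.real.count == split.imag.count, "parts must have equal length")
    var out = [Float]()
    out.reserveCapacity(split.real.count * 2)
    for (re, im) in zip(split.real, split.imag) {
        out.append(re)
        out.append(im)
    }
    return out
}

let interleaved: [Float] = [1, 2, 3, 4, 5, 6]   // 1+2i, 3+4i, 5+6i
let split = toSplit(interleaved: interleaved)
print(split.real, split.imag)
print(toInterleaved(split) == interleaved)
```

In Accelerate itself, vDSP_ctoz and vDSP_ztoc perform these two conversions between interleaved and split complex vectors.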

Convert MfArray back to regular Swift's Array with a specific type

Hi @jjjkkkjjj . After I did some math transformations in a MfArray, I want to convert it back to a Swift's Array. What's the most efficient way to do it using your library?

For example, I want to convert the MfArray back to [Int]. Right now I'm doing it this way:

let swiftArrayAny = Array(someMfArray.data)

guard let swiftArrayInt32 = swiftArrayAny as? [Int32] else {
    fatalError()
}

let swiftArrayInt = swiftArrayInt32.map { Int($0) }

ToArray doesn't respect slices

Thank you for providing such a nice library. I've found some cases which might potentially be buggy. Could you help take a look?

Here is the test case

import XCTest
import Matft

final class MatftTest: XCTestCase {
    func testMatftCase1() throws {
        let a = MfArray([], mftype: .Float)
    }

    func testMatftCase2() throws {
        let b = MfArray([], mftype: .Float, shape: [0])
    }

    func testMatftCase3() throws {
        let a = MfArray([
            [1,2,3],
            [4,5,6]
        ], mftype: .Float)
        XCTAssertEqual(a[1], MfArray([4,5,6], mftype: .Float))
        XCTAssertEqual(a[1].toArray() as! [Float], [4.0,5,6])  //XCTAssertEqual failed: ("[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]") is not equal to ("[4.0, 5.0, 6.0]")
    }
}

Thanks!

How to get the content of the MfArray?

Hi, I'm trying to get the values inside an MfArray by indexing, similar to NumPy's item method. What is the best way to do it?

var a = MfArray([[1,2,3], [4, 5, 6]], mftype: .Int32)
a[0, 0] as! Int //works

a = MfArray([1,2,3], mftype: .Int32)
a[0] as! Int //doesn't work. I can do a[0][0][0][0][0][0][0] for infinite number of times

For now I'm converting the MfArray to an Array and taking the first element as a workaround.

Thanks in advance!

[Bug] Invalid access on some complex operation

Reason
DSPComplex is an ordinary struct. Therefore, when I want to advance this pointer (ptr: UnsafePointer<DSPComplex>), I must not use the + operator.

let newptr = ptr + 2 // invalid!!

Matft.math.pow is not working as expected

Hello,

Firstly, thank you for working on this! This is really helpful.

Matft.math.pow does not work like np.power().

np.power(a,3) raises each element of a to the third power. That is,

a = np.eye(3) * 3
np.power(a,3)
array([[27., 0., 0.],
[ 0., 27., 0.],
[ 0., 0., 27.]])

The above is returned as a result.

However, Matft.math.pow takes a float as the first argument and a matrix as the second, and behaves as shown in the attached screenshots:

[Screenshots: Screen Shot 2021-09-06 at 3 12 36 PM, Screen Shot 2021-09-06 at 3 14 38 PM]
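
The element-wise behaviour of np.power(a, 3) described in this issue can be sketched in plain Swift on nested arrays. The helper elementwisePow is hypothetical and is not a Matft function.

```swift
import Foundation

// Element-wise power on a nested array, mirroring np.power(a, 3).
// `elementwisePow` is a hypothetical helper, not a Matft function.
func elementwisePow(_ a: [[Double]], _ exponent: Double) -> [[Double]] {
    return a.map { row in row.map { pow($0, exponent) } }
}

let a: [[Double]] = [[3, 0, 0],
                     [0, 3, 0],
                     [0, 0, 3]]   // same values as np.eye(3) * 3
let cubed = elementwisePow(a, 3)
print(cubed)
```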

How to contribute - have some small numpy / scipy vDSP implementations

Hi there

I've got a few minor implementations of some numpy / scipy functions on vectors in vDSP, along with associated XCTests, that I think would make good contributions to Matft.

I'm curious how best to integrate these into Matft, as it seems like having a single home for them makes sense, and your code seems very well organized.

I'm not entirely sure of the code structure, or of the best place to implement the logic in a generic way that leverages Matft's existing code base.

Do you have any suggestions?

numpy.allclose() as an extension to Array (without NaN equality)

func allCloseTo(array: [Float], rtol: Float = 1e-5, atol: Float = 1e-8) -> Bool
    {
        precondition(self.count == array.count, "Arrays must have same size")
        
        let absDiff = vDSP.absolute( vDSP.subtract(self, array) )
            
        let maxAbsDiff = vDSP.maximum(absDiff)
        
        let scaledTol = Swift.max(atol, rtol * vDSP.maximum( vDSP.absolute(self) + vDSP.absolute(array) ) )
    
        return maxAbsDiff <= scaledTol
    }

scipy.spatial.distance.cosine

func CosineDistance(_ v1: [Float], _ v2: [Float]) -> Float
{
    precondition(v1.count == v2.count, "Arrays must have same size")

    var dotProduct: Float = 0.0
    var v1Norm: Float = 0.0
    var v2Norm: Float = 0.0
    
    let n = vDSP_Length(v1.count)
    
    // Calculate dot product of v1 and v2
    vDSP_dotpr(v1, 1, v2, 1, &dotProduct, n)
    
    // Calculate the Euclidean norm of v1
    vDSP_svesq(v1, 1, &v1Norm, n)
    v1Norm = sqrt(v1Norm)
    
    // Calculate the Euclidean norm of v2
    vDSP_svesq(v2, 1, &v2Norm, n)
    v2Norm = sqrt(v2Norm)
    
    // Calculate cosine distance
    let distance = 1.0 - (dotProduct / (v1Norm * v2Norm))
    
    return distance
}

and scipy.ndimage.gaussian_filter1d as array extensions, allowing one to cache the computed Gaussian kernel.

Note that so far I have only really implemented the default padding mode, reflect.

 static func generateGaussianKernel(sigma:Float, truncate:Float = 4.0) -> [Float]
    {
        let radius:Int = Int( ceil(truncate * sigma) )
        let sigma2 = sigma * sigma
        let x:[Float] = Array<Int>( ( -radius ... radius  ) ).map { Float( $0 ) }
        let x2 = vForce.pow(bases: x, exponents: [Float](repeating: 2.0, count: x.count) )
        let y = vDSP.multiply(-0.5 / sigma2, x2)
        let phi_x = vForce.exp(y)
        return vDSP.divide(phi_x, vDSP.sum(phi_x))
    }
    
    enum PaddingMode {
        case reflect
        case edge
    }

    private func padInputArray(_ input: [Float], sigma: Float, truncate: Float, paddingMode: PaddingMode) -> [Float] {
        var paddedInput = [Float]()
        let windowSize = Int(2.0 * sigma * truncate + 1.0)
        let padSize = Swift.max(windowSize - input.count, 0)

        if padSize > 0
        {
            switch (paddingMode)
            {
                case .reflect:
                                
                var paddingStart:[Float]
                var paddingEnd:[Float]
                
                // If we pad less than our input arrays count, we select what we need from the input array
                // This wont be a 'full' pad, as we wont have all items in the array
                if padSize <= input.count
                {
                    paddingStart = Array<Float>( input[ 0 ..< Int(padSize)].reversed() )
                    paddingEnd = Array<Float>( input[ input.count - Int(padSize) ..< input.count].reversed() )
                }
                // Otherwise, we repeat reflection until we accrue pad size
                else
                {
                    paddingStart = input.reversed()
                    paddingEnd = paddingStart
                    
                    while paddingStart.count <= padSize
                    {
                        paddingStart.insert(contentsOf: paddingStart.reversed(), at: 0)
                        paddingEnd.append(contentsOf: paddingEnd.reversed())
                        
                        paddingStart = paddingStart.reversed()
                        paddingEnd = paddingEnd.reversed()
                    }
                    
                    paddingStart = Array<Float>( paddingStart.suffix( Int(sigma * truncate)  ) )
                    paddingEnd = Array<Float>( paddingEnd.prefix( Int(sigma * truncate) ) )
                }
                                
                paddedInput.append(contentsOf: paddingStart)
                paddedInput.append(contentsOf: input)
                paddedInput.append(contentsOf: paddingEnd)
                
                break

            case .edge:
                let edge = input.first ?? 0.0
                paddedInput = Array(repeating: edge, count: padSize) + input + Array(repeating: edge, count: padSize)
                
            }
            return paddedInput
        }
        
        return input

    }
    
    // Make sure your Sigma and Truncate values match above:
    func gaussianFilter1D(kernel:[Float], sigma:Float, truncate:Float = 4.0, paddingMode:PaddingMode = .reflect) -> [Float]
    {
        let paddedInput = self.padInputArray(self, sigma:sigma, truncate:truncate, paddingMode:paddingMode)
        
        var output = [Float](repeating: 0.0, count: self.count)

        vDSP.convolve(paddedInput, withKernel: kernel, result: &output)

        // Technically is this needed, our sum is always 1 ?
//        vDSP.divide(output, sigma, result: &output)
//        let sum = vDSP.sum(kernel)
//        vDSP.multiply(sum, output, result: &output)

        return output
    }

Reshape bug (Column Order)

let a = MfArray([[1, 3, 5],
                             [2, -4, -1]], mforder: .Column)
print(a.reshape([3, 1, 2]))
/*
mfarray = 
[[[	1,		2]],

[[	3,		-4]],

[[	5,		-1]]], type=Int, shape=[3, 1, 2]
*/

but it should be

array([[[ 1,  3]],

       [[ 5,  2]],

       [[-4, -1]]])
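
The expected result follows from reading the elements in logical row-major order, regardless of how they are stored, and then filling the new shape. A small plain-Swift sketch of that reading order for a 2-D column-major buffer (the helper name logicalRowMajor is hypothetical):

```swift
// Read a column-major (Fortran-order) 2-D buffer in logical row-major order,
// the element order that NumPy's default reshape uses. Names are hypothetical.
func logicalRowMajor(columnMajor data: [Int], rows: Int, cols: Int) -> [Int] {
    precondition(data.count == rows * cols, "buffer size must match the shape")
    var out = [Int]()
    out.reserveCapacity(data.count)
    for r in 0..<rows {
        for c in 0..<cols {
            out.append(data[c * rows + r])   // column-major offset of element (r, c)
        }
    }
    return out
}

let stored = [1, 2, 3, -4, 5, -1]   // [[1, 3, 5], [2, -4, -1]] stored column by column
let flat = logicalRowMajor(columnMajor: stored, rows: 2, cols: 3)
print(flat)   // [1, 3, 5, 2, -4, -1]
```

Grouping this sequence into pairs gives [1, 3], [5, 2], [-4, -1], which matches the NumPy output for shape [3, 1, 2]. Grouping the raw storage order instead gives the [1, 2], [3, -4], [5, -1] result reported above.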

Invalid broadcasting

Hi @jjjkkkjjj

I'm trying to do the following broadcasting, but Matft treats it as an error:

let a = Matft.arange(start: 1, to: 7, by: 1, shape: [3, 2])
let b = Matft.arange(start: 1, to: 5, by: 1, shape: [2, 1, 2])
print(a - b)

Error message:

Fatal error: could not broadcast from shape 3, [2, 1, 2] into shape 3, [1, 3, 2]: file /Users/alwc/Library/Developer/Xcode/DerivedData/testnpy-afxfserzcwmbfdfgfaszjulxzrwa/SourcePackages/checkouts/Matft/Sources/Matft/core/function/conversion.swift, line 290

2020-07-21 19:05:02.261713+0800 testnpy[21523:5245927] Fatal error: could not broadcast from shape 3, [2, 1, 2] into shape 3, [1, 3, 2]: file /Users/alwc/Library/Developer/Xcode/DerivedData/testnpy-afxfserzcwmbfdfgfaszjulxzrwa/SourcePackages/checkouts/Matft/Sources/Matft/core/function/conversion.swift, line 290

In Python this is legal. For example,

x = np.random.randint(10, size=(3, 2))
y = np.random.randint(10, size=(2, 1, 2))

# Returns a shape (2, 3, 2) nd-array
print(x - y)      
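
The NumPy rule behind that result aligns the shapes from the right, pads the shorter one with 1s, and lets any axis of length 1 stretch to match the other. A minimal sketch of the shape computation in plain Swift (the helper broadcastShape is hypothetical and not part of Matft):

```swift
// Sketch of NumPy's broadcasting rule for shapes. `broadcastShape` is a hypothetical helper.
func broadcastShape(_ lhs: [Int], _ rhs: [Int]) -> [Int]? {
    let ndim = max(lhs.count, rhs.count)
    // Left-pad the shorter shape with 1s so both have the same rank.
    let l = [Int](repeating: 1, count: ndim - lhs.count) + lhs
    let r = [Int](repeating: 1, count: ndim - rhs.count) + rhs
    var result = [Int]()
    for (a, b) in zip(l, r) {
        if a == b || b == 1 {
            result.append(a)
        } else if a == 1 {
            result.append(b)
        } else {
            return nil   // incompatible axis lengths
        }
    }
    return result
}

print(broadcastShape([3, 2], [2, 1, 2]) ?? [])   // [2, 3, 2]
```

Note that neither input shape equals the result here: both operands have to be broadcast to the common shape [2, 3, 2], rather than one being broadcast into the other.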

Support for atan2

Thanks for the great library!

Is there any way to get atan2 from combining 2 MfArrays? Like this:

let R = MfArray([a, b ,c])

let x = atan2(R[2, 1], R[2, 2])

The issue for us is that we're porting code from NumPy, where this kind of functionality is supported. E.g. there we can retrieve a single Double from an expression like R[2, 1], so we're struggling with how to get similar behaviour.

Thanks in advance!
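
For reference, an element-wise atan2 over two equally sized arrays, in the spirit of np.arctan2(y, x), can be sketched with Foundation alone. The helper elementwiseAtan2 is hypothetical and is not a Matft function.

```swift
import Foundation

// Element-wise atan2 over two equally sized arrays, like np.arctan2(y, x).
// `elementwiseAtan2` is a hypothetical helper, not a Matft function.
func elementwiseAtan2(_ y: [Double], _ x: [Double]) -> [Double] {
    precondition(y.count == x.count, "inputs must have the same length")
    return zip(y, x).map { atan2($0, $1) }
}

let angles = elementwiseAtan2([0.0, 1.0, 0.0], [1.0, 0.0, -1.0])
print(angles)
```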

Passing strange arguments in subscription

let a = try! Matft.mfarray.broadcast_to(MfArray([[2, 5, -1],
                                                             [3, 1, 0]]), shape: [2,2,2,3])
let b = a[0~, ~1, ~~2]

b[0, ~1] = MfArray([222]) >>>>>>> Precondition failed: -2 is out of bounds for axis 1 with 1: file

The subscript arguments [0, ~1] were passed as the Int array [0, -2]...

It's strange
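
One possible explanation, stated here as an assumption rather than something the report confirms, is that ~1 was resolved as the standard library's bitwise NOT on an integer literal instead of as a slice operator. In plain Swift that operator gives exactly the value seen in the error:

```swift
// In standard Swift, prefix `~` on an integer is bitwise NOT, so `~1` is -2.
let plain: Int = ~1
print(plain)      // -2
print(~0, ~2)     // -1 -3
```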

Copy On Write implementation

I think it is easier for Matft to implement copy-on-write (COW) than I expected.
Because MfArray holds a data class, MfData, only two changes are needed. First, add the “mutating” keyword to the conversion methods and the subscript function. Second, check the _isView property in those “mutating” functions and, if _isView is true, replace the referenced MfData with a new one.
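
A minimal sketch of the general copy-on-write pattern for a struct that wraps a reference-type buffer is shown below. Storage and CowArray are hypothetical stand-ins for MfData and MfArray, and the check uses the standard library's isKnownUniquelyReferenced instead of an _isView flag, so this is a related technique and not the exact proposal above.

```swift
// Hypothetical stand-in for a reference-type data buffer such as MfData.
final class Storage {
    var values: [Double]
    init(_ values: [Double]) { self.values = values }
}

// Hypothetical stand-in for a value-type array wrapper such as MfArray.
struct CowArray {
    private var storage: Storage

    init(_ values: [Double]) { storage = Storage(values) }

    var values: [Double] { storage.values }

    subscript(index: Int) -> Double {
        get { storage.values[index] }
        set {
            // Copy the backing storage only if another value still shares it.
            if !isKnownUniquelyReferenced(&storage) {
                storage = Storage(storage.values)
            }
            storage.values[index] = newValue
        }
    }
}

let original = CowArray([1, 2, 3])
var copy = original          // shares storage until the first write
copy[0] = 99
print(original.values, copy.values)
```

The setter is implicitly mutating because CowArray is a struct, which is the role the “mutating” keyword plays in the description above.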

[Fatal error] Subscription of view MfArray

A fatal error occurred...
I think this was caused by extracting the view's base directly.

let a = Matft.mfarray.arange(start: 0, to: 27*2, by: 2, shape: [3,3,3], mftype: .Double, mforder: .Column)

            XCTAssertEqual(a[~-1], MfArray([[[ 0, 18, 36],
                                             [ 6, 24, 42],
                                             [12, 30, 48]],

                                            [[ 2, 20, 38],
                                             [ 8, 26, 44],
                                             [14, 32, 50]]], mftype: .Double))
            let b = a[~-1]
            

            
            XCTAssertEqual(b[~1, ~2], MfArray([[[18, 19, 20],
                                                [21, 22, 23]]], mftype: .Double)) >>>>>>>>Not equal!!
            
            XCTAssertEqual(b[0], MfArray([[18, 19, 20],
                                          [21, 22, 23],
                                          [24, 25, 26]], mftype: .Double))  >>>>>>>>Not equal!!

Add @inline

Adding @inline may make the code more efficient and improve performance.

Difference between lapack and vDSP

There is a difference between lapack and vDSP in how negative strides are handled.

In vDSP,

var a = [1,2,3,4.0]
var b = [5,6,7,2.0]
var c = [0,0,0,0.0]
        
vDSP_vaddD(&a, vDSP_Stride(1), &b, vDSP_Stride(-1), &c, vDSP_Stride(1), vDSP_Length(4))
//c -> [6.0, 2.0, 3.0, 4.0]
//cannot add properly!!!

The correct call is

vDSP_vaddD(&a, vDSP_Stride(1), &b + 3, vDSP_Stride(-1), &c, vDSP_Stride(1), vDSP_Length(4))

On the other hand, in cblas

cblas_dcopy(Int32(4), &b, Int32(-1), &c, Int32(1))
//c -> [2.0, 7.0, 6.0, 5.0]
//can copy properly!!!

This line (vDSP) must be

let bptr = bptr.baseAddress! + vDSPPrams.b_offset
let sptr = sptr.baseAddress! + vDSPPrams.s_offset
dstptrT = dstptrT + vDSPPrams.b_offset

instead of

let bptr = vDSPPrams.b_offset >= 0 ? bptr.baseAddress! + vDSPPrams.b_offset : bptr.baseAddress! + bigger_mfarray.offsetIndex + vDSPPrams.b_offset
let sptr = vDSPPrams.s_offset >= 0 ? sptr.baseAddress! + vDSPPrams.s_offset : sptr.baseAddress! + smaller_mfarray.offsetIndex + vDSPPrams.s_offset
dstptrT = vDSPPrams.b_offset >= 0 ? dstptrT + vDSPPrams.b_offset : dstptrT + bigger_mfarray.offsetIndex + vDSPPrams.b_offset

cblas must be

let srcptr = cblasPrams.s_stride >= 0 ? srcptr.baseAddress! + cblasPrams.s_offset : srcptr.baseAddress! - mfarray.offsetIndex + cblasPrams.s_offset
let dstptr = cblasPrams.b_stride >= 0 ? dstptr.baseAddress! + cblasPrams.b_offset : dstptr.baseAddress! - dsttmpMfarray.offsetIndex + cblasPrams.b_offset
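
The difference between the two conventions can be modelled in pure Swift without calling Accelerate. In this sketch, which uses hypothetical helper names, the vDSP-style convention reads first from whatever element the caller points at, while the BLAS-style convention moves the starting element to the far end when the increment is negative.

```swift
// vDSP-style: the element passed in is the first one read, whatever the stride sign.
func firstIndexVDSPStyle(offset: Int) -> Int {
    return offset
}

// BLAS-style: with a negative increment the routine itself starts from the far end.
func firstIndexBLASStyle(offset: Int, count: Int, stride: Int) -> Int {
    return stride >= 0 ? offset : offset + (count - 1) * (-stride)
}

// Collect `count` elements starting at `first`, stepping by `stride`.
func gather(_ data: [Double], first: Int, stride: Int, count: Int) -> [Double] {
    return (0..<count).map { data[first + $0 * stride] }
}

let b: [Double] = [5, 6, 7, 2]

// BLAS-style: passing the base address with increment -1 reads the array reversed.
let blasFirst = firstIndexBLASStyle(offset: 0, count: 4, stride: -1)
let blasResult = gather(b, first: blasFirst, stride: -1, count: 4)
print(blasResult)   // [2.0, 7.0, 6.0, 5.0]

// vDSP-style: for the same result the caller must point at the last element (b + 3).
let vdspFirst = firstIndexVDSPStyle(offset: 3)
let vdspResult = gather(b, first: vdspFirst, stride: -1, count: 4)
print(vdspResult)   // [2.0, 7.0, 6.0, 5.0]
```

This mirrors the two calls above: cblas_dcopy accepts the base address with increment -1, whereas vDSP_vaddD needs the pointer advanced by 3 before a stride of -1 reads valid memory.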

Subscript’s getter and setter

If I don’t use generics in MfArray, the subscript’s getter and setter must handle values as Any.
However, using the Any type causes unexpected errors or performance loss.

MfArray<MfType: MfTypable>{
    Hoge
}

//Initialization 
let a = MfArray<Int>([1,2,3])

//Getter and setter
//Note that scalar will be handled only
subscript(indices: Int...) -> MfType{
    hoge
}

//Note that MfArray will be handled only
subscript(mfslices: MfSlice...) -> MfArray{
    fuga
}

Incorrect string description

Hi @jjjkkkjjj, whenever the MfArray is too large, the print description will be incorrect. For example,

let a = Matft.arange(start: 1, to: 40001, by: 1, shape: [40000])

print(a)
/*
mfarray = 
[    1,        2,        3,        ...,        39997,        39998,        39999], type=Int, shape=[40000]
*/

print(a[-1])
/*
40000
*/

Wrong Shape after "ufuncReduce add"

Hello,

I think there is a bug in ufuncReduce. This code

Matft.ufuncReduce(mfarray: MfArray([1,2,3,4,5,6,7,8,9,10] as [Double]), ufunc: Matft.add)

returns

MfArray([55, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

I would expect that I would get scalar or array of shape [1]. What am I doing wrong please?
