jjjkkkjjj / matft

Numpy-like library in swift. (Multi-dimensional Array, ndarray, matrix and vector library)

License: BSD 3-Clause "New" or "Revised" License

Swift 90.94% Jupyter Notebook 1.02% Ruby 0.10% C 7.94%

swift math ndimensional-arrays ndarray matrix-library numpy complex-numbers image image-processing signal-processing

matft's Issues

Adding demo for image processing

Hi jjjkkkjjj,

Great work on bringing NumPy to Swift!

I am learning how to do inference within CoreML using Swift.

So far I have gotten the UIImage from an image picker and I need to do preprocessing
e.g. resize, transpose, normalize(mean=(0,0,0), std=(1,1,1))

After hours and hours of searching, Swift has proved not to be a friendly language for image processing.
Then I found your repo, which has all the features I need.

So I think it would be very helpful if you could add a demo for this.

Cheers
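
The transpose and normalize steps requested above can be sketched in plain Swift without relying on any Matft API. The helper name `normalizeToCHW` and the flat interleaved `[UInt8]` input layout are assumptions made for illustration, and resizing is left out.

```swift
// Hypothetical helper: converts interleaved HWC pixel bytes (e.g. RGBRGB...)
// into planar CHW floats, scaled to 0...1 and normalized per channel.
func normalizeToCHW(pixels: [UInt8], height: Int, width: Int, channels: Int,
                    mean: [Float], std: [Float]) -> [Float] {
    precondition(pixels.count == height * width * channels, "unexpected pixel count")
    precondition(mean.count == channels && std.count == channels, "one mean/std per channel")
    var out = [Float](repeating: 0, count: pixels.count)
    for c in 0..<channels {
        for y in 0..<height {
            for x in 0..<width {
                let src = (y * width + x) * channels + c   // index in HWC layout
                let dst = (c * height + y) * width + x     // index in CHW layout
                let scaled = Float(pixels[src]) / 255.0
                out[dst] = (scaled - mean[c]) / std[c]
            }
        }
    }
    return out
}

// Two pixels with three channels each: (255, 0, 0) and (0, 255, 0).
let chw = normalizeToCHW(pixels: [255, 0, 0, 0, 255, 0],
                         height: 1, width: 2, channels: 3,
                         mean: [0, 0, 0], std: [1, 1, 1])
print(chw)
```

The resulting flat buffer is in channel-first order, which is the layout many Core ML image models expect; the shape to attach to it would be [channels, height, width].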

Other Cubic spline Interpolation

So far I have implemented only the natural cubic spline. Other boundary conditions (clamped, not-a-knot, periodic) are not supported.
Ref: https://github.com/scipy/scipy/blob/v1.5.4/scipy/interpolate/_cubic.py#L464-L847

Boolean Indexing is slow

Regarding #17

Official boolean indexing code is
https://github.com/numpy/numpy/blob/cf1306a842d7b1064270bd06951a485121e60816/numpy/core/src/multiarray/mapping.c#L1010

SIMD function is
https://github.com/numpy/numpy/blob/45bc13e6d922690eea43b9d807d476e0f243f836/numpy/core/src/umath/loops_comparison.dispatch.c.src#L36

Tips to convert an MfArray to MLMultiArray

Could you kindly tell me whether there is a way to convert an MfArray to an MLMultiArray? I have found a method to convert the latter to the former, but I am not sure whether there is a similar method for the reverse operation. If such a method doesn't exist, could you please recommend the most effective/performant approach?

My current approach is

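import CoreML
import Matft

// Placeholder error type for this example.
enum MfArrayConversionError: Error {
    case notConvertibleToFloat
}

extension MfArray {
    func toMLMultiArray() throws -> MLMultiArray {
        // Flatten to a [Float]; a guard body must exit, so throw on a failed cast.
        guard let array = self.astype(.Float).flatten().toArray() as? [Float] else {
            throw MfArrayConversionError.notConvertibleToFloat
        }
        let arrShape = self.shape
        let mlShapedArray: MLShapedArray<Float> = MLShapedArray(scalars: array, shape: arrShape)
        return MLMultiArray(mlShapedArray)
    }
}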

Thanks in advance!

Add fancy indexing

Hey, I'm just curious whether there are any plans to implement "fancy indexing", where you can pass a list of indices to an MfArray and get back the items at those indices, like in NumPy. Thanks, it's a great library so far.
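
As a rough illustration of what fancy indexing means for a one-dimensional array, here is a plain-Swift sketch. The helper `take` is hypothetical and is not a Matft function; it only mirrors the NumPy behavior of gathering by an index list, including negative indices.

```swift
// Hypothetical helper: gathers elements at the given positions.
// Negative indices count from the end, as in NumPy.
func take<T>(_ values: [T], at indices: [Int]) -> [T] {
    return indices.map { index in
        let resolved = index < 0 ? index + values.count : index
        precondition(resolved >= 0 && resolved < values.count,
                     "index \(index) is out of bounds")
        return values[resolved]
    }
}

// Indices may repeat and may appear in any order.
let picked = take([10, 20, 30, 40, 50], at: [0, 3, -1, 3])
print(picked)
```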

Boolean indexing doesn't support equality?

let img = MfArray([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]], mftype: .UInt8)
img[img == 3]

This fails with:
Referencing operator function '==' on 'BinaryInteger' requires that 'MfArray' conform to 'BinaryInteger'
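
The behavior the snippet is after (build a boolean mask from an elementwise comparison, then select by that mask) can be sketched in plain Swift. The helpers `equalMask` and `select` are hypothetical names used for illustration and operate on a flattened array rather than on an MfArray.

```swift
// Hypothetical helper: elementwise comparison against a scalar.
func equalMask<T: Equatable>(_ values: [T], _ scalar: T) -> [Bool] {
    return values.map { $0 == scalar }
}

// Hypothetical helper: keeps the elements whose mask entry is true.
func select<T>(_ values: [T], where mask: [Bool]) -> [T] {
    precondition(values.count == mask.count, "mask must match the array length")
    return zip(values, mask).filter { $0.1 }.map { $0.0 }
}

let flat: [UInt8] = [1, 2, 3, 4, 5, 6, 7, 8, 9]
let mask = equalMask(flat, 3)
let selected = select(flat, where: mask)
print(selected)
```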

reshape no copy version

The current implementation copies the MfArray when reshaping,
but the copy may not be needed.

Subscript bug using `Matft.arange()`

Numpy version

a = np.zeros([88, 2, 2])
b = np.zeros([32, 88, 2])
print(a[np.arange(88), b[0, :, 0].astype(int), 0].shape) # Outputs to (88,)

Matft version

let a = Matft.nums(0, shape: [88, 2, 2])
let b = Matft.nums(0, shape: [32, 88, 2]).astype(.Int)
print(a[Matft.arange(start: 0, to: 88, by: 1), b[0, Matft.all, 0], 0].shape) // Outputs to [88, 88]

According to Numpy the shape of the MfArray should be [88], not [88, 88].
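
The NumPy result follows from index arrays being paired element by element rather than combined as an outer product: entry i of the result is a[rows[i], cols[i], 0]. A minimal plain-Swift sketch of that pairing, using nested arrays and a hypothetical helper `pairedGather`:

```swift
// Hypothetical helper: pairs the two index arrays elementwise,
// so the result has one entry per pair, not one per combination.
func pairedGather(_ a: [[[Double]]], rows: [Int], cols: [Int], last: Int) -> [Double] {
    precondition(rows.count == cols.count, "index arrays must have the same length")
    return zip(rows, cols).map { pair in a[pair.0][pair.1][last] }
}

// Same shapes as in the report: a is 88 x 2 x 2, both index arrays have 88 entries.
let a = [[[Double]]](repeating: [[Double]](repeating: [0.0, 0.0], count: 2), count: 88)
let rows = Array(0..<88)
let cols = [Int](repeating: 0, count: 88)
let result = pairedGather(a, rows: rows, cols: cols, last: 0)
print(result.count)
```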

Numpy FFT implementation?

Hi there

Firstly, this library looks amazing. You've done a ton of work and it looks really promising. Thank you!

I was curious whether there were any plans to implement NumPy's FFTs in vDSP? If not, do you do work for hire?

I did some work looking into this myself for work on porting OpenAI's Whisper to CoreML / Accelerate https://github.com/vade/OpenAI-Whisper-CoreML

And I documented some of my findings in this issue here: vade/OpenAI-Whisper-CoreML#1

It seems like NumPy, PyTorch, and Rosa / RosaKit all use PocketFFT to do non-power-of-2 DFTs, which is why the output matches more or less exactly numerically.

PocketFFT doesn't use vDSP; it is scalar code with no SIMD acceleration.

Given the rest of Matft's current implementation, doing the STFT and log-mel work in Matft would just work. The only missing piece is a numerically equivalent implementation of the NumPy / Torch 'real to complex' (rfft) logic.

Thank you again for all the work on Matft!

memory leak...

I wrote values through pointers without initializing the memory first, and a great many memory leaks occurred :)

See ref
Use move or initialize first.

Bad code, e.g. here, here, here #

Multidimensional MfArray with different length Arrays in axis 1

Hello, I need to port a lot of Python (NumPy) code to Swift, and that is how I found your library. It would be a great help. Is it possible to create MfArrays whose rows have different lengths along axis=1? Here is an example of what I mean:

[
    [1, 2, 3, 4],
    [1, 2, 3],
    [1, 2, 3, 4, 5]
]

If so, how would I need to instantiate the array?
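
Whether Matft accepts ragged input is not stated here. One common workaround, independent of Matft, is to pad the rows to a common length before building a rectangular array. The helper `padRows` and the choice of fill value are assumptions for this sketch.

```swift
// Hypothetical helper: pads every row on the right to the length of the longest row.
func padRows(_ rows: [[Int]], with fill: Int) -> [[Int]] {
    let width = rows.map { $0.count }.max() ?? 0
    return rows.map { row in
        row + [Int](repeating: fill, count: width - row.count)
    }
}

let ragged = [
    [1, 2, 3, 4],
    [1, 2, 3],
    [1, 2, 3, 4, 5]
]
let padded = padRows(ragged, with: 0)
print(padded)
```

Keeping the original row lengths in a separate array makes it possible to ignore the padding later.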

Support MLMultiArray

Pass the MLShapedArray to MfArray with shared memory, as done here.
(#39)

divide by zero error for empty arrays

Thank you for the wonderful Swift library. I have a relatively minor issue: something like MfArray([]) throws a divide-by-zero error. I fixed it in a fork by changing one line in the shape2strides function in mfstructure.swift:

ret[index] = prevAxisNum / max(shape[index],1)

(I simply made the divisor a minimum of 1)

Let me know if I should submit a pull request or if you would like to push the change. Thank you!
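
To show where the guard matters, here is a standalone row-major stride computation with the same `max(..., 1)` divisor. This is a sketch written for this note, not the body of Matft's shape2strides; the name `shapeToStrides` is hypothetical.

```swift
// Hypothetical stand-in for a row-major shape-to-strides routine.
// Clamping each dimension to at least 1 avoids dividing by zero for empty axes.
func shapeToStrides(_ shape: [Int]) -> [Int] {
    var strides = [Int](repeating: 0, count: shape.count)
    var remaining = shape.reduce(1) { $0 * max($1, 1) }
    for index in 0..<shape.count {
        remaining /= max(shape[index], 1)
        strides[index] = remaining
    }
    return strides
}

print(shapeToStrides([2, 3, 4]))
print(shapeToStrides([0]))
```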

Complex support

To use vDSP, DSPSplitComplex seems to be required according to the documentation.

To use the BLAS package, DSPComplex will be needed according to this discussion.

Is DSPComplex the ideal choice for supporting a complex type?
First I must check the difference between DSPComplex and DSPSplitComplex.

DSPComplex stores consecutive (interleaved) float values:

Complex data are stored as ordered pairs of floating-point numbers. Because they are stored as ordered pairs, complex vectors require address strides that are multiples of two.

(from the documentation)

DSPSplitComplex, on the other hand, stores the two parts in separate memory regions:

A structure that represents a single-precision complex vector with the real and imaginary parts stored in separate arrays.

(from the documentation)

So I need to implement a function that bridges this difference in memory layout.
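
The two layouts can be converted into each other with a simple loop. The sketch below uses plain Swift arrays and hypothetical helper names; Accelerate also offers vDSP_ctoz and vDSP_ztoc for this conversion, which is presumably what a real implementation would use.

```swift
// Interleaved layout: [re0, im0, re1, im1, ...]
// Split layout: real parts and imaginary parts in two separate arrays.

// Hypothetical helper: interleaved -> split.
func deinterleave(_ interleaved: [Float]) -> (real: [Float], imag: [Float]) {
    precondition(interleaved.count % 2 == 0, "expected (real, imaginary) pairs")
    var real = [Float]()
    var imag = [Float]()
    real.reserveCapacity(interleaved.count / 2)
    imag.reserveCapacity(interleaved.count / 2)
    for i in stride(from: 0, to: interleaved.count, by: 2) {
        real.append(interleaved[i])
        imag.append(interleaved[i + 1])
    }
    return (real, imag)
}

// Hypothetical helper: split -> interleaved.
func interleave(real: [Float], imag: [Float]) -> [Float] {
    precondition(real.count == imag.count, "real and imaginary parts must match")
    var out = [Float]()
    out.reserveCapacity(real.count * 2)
    for (r, i) in zip(real, imag) {
        out.append(r)
        out.append(i)
    }
    return out
}

let parts = deinterleave([1, 2, 3, 4])       // (1 + 2i), (3 + 4i)
let roundTrip = interleave(real: parts.real, imag: parts.imag)
print(parts.real, parts.imag, roundTrip)
```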

Convert MfArray back to regular Swift's Array with a specific type

Hi @jjjkkkjjj. After doing some math transformations on an MfArray, I want to convert it back to a regular Swift Array. What's the most efficient way to do that using your library?

For example, I want to convert the MfArray back to [Int]. Right now I'm doing it this way:

let swiftArrayAny = Array(someMfArray.data)

guard let swiftArrayInt32 = swiftArrayAny as? [Int32] else {
    fatalError()
}

let swiftArrayInt = swiftArrayInt32.map { Int($0) }

ToArray doesn't respect slices

Thank you for providing such a nice library. I've found some cases which might potentially be buggy. Could you help take a look?

Here is the test case

import XCTest
import Matft

final class MatftTest: XCTestCase {
    func testMatftCase1() throws {
        let a = MfArray([], mftype: .Float)
    }

    func testMatftCase2() throws {
        let b = MfArray([], mftype: .Float, shape: [0])
    }

    func testMatftCase3() throws {
        let a = MfArray([
            [1,2,3],
            [4,5,6]
        ], mftype: .Float)
        XCTAssertEqual(a[1], MfArray([4,5,6], mftype: .Float))
        XCTAssertEqual(a[1].toArray() as! [Float], [4.0,5,6])  //XCTAssertEqual failed: ("[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]") is not equal to ("[4.0, 5.0, 6.0]")
    }
}

Thanks!

How to get the content of the MfArray?

Hi, I'm trying to get the values inside an MfArray by indexing, similar to NumPy's item method. What is the best way to do it?

var a = MfArray([[1,2,3], [4, 5, 6]], mftype: .Int32)
a[0, 0] as! Int //works

a = MfArray([1,2,3], mftype: .Int32)
a[0] as! Int //doesn't work. I can do a[0][0][0][0][0][0][0] for infinite number of times

For now I'm converting the MfArray to an Array and taking the first element as a workaround.

Thanks in advance!

[Bug] Invalid access on some complex operation

Reason
DSPComplex is a plain struct. Therefore, when I want to advance a pointer to it (ptr: UnsafePointer<DSPComplex>), I must not use the + operator.

let newptr = ptr + 2 // invalid!!

Matft.math.pow is not working as expected

Hello,

Firstly, thank you for working on this! This is really helpful.

Matft.math.pow does not work like np.power().

np.power(a,3) raises each element of a to the third power. That is,

a = np.eye(3) * 3
np.power(a,3)
array([[27., 0., 0.],
[ 0., 27., 0.],
[ 0., 0., 27.]])

The above is returned as a result.

However, Matft.math.pow takes a float as the first argument and a matrix as the second, and behaves as shown in the attachment.
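
For comparison, the elementwise behavior of np.power described above can be sketched in plain Swift. The helper `elementwisePower` is hypothetical and uses Foundation's `pow`; it says nothing about how Matft.math.pow is implemented.

```swift
import Foundation

// Hypothetical helper: raises every element of a matrix to the same exponent.
func elementwisePower(_ matrix: [[Double]], _ exponent: Double) -> [[Double]] {
    return matrix.map { row in row.map { pow($0, exponent) } }
}

// Equivalent of np.eye(3) * 3.
let a: [[Double]] = [[3, 0, 0],
                     [0, 3, 0],
                     [0, 0, 3]]
let cubed = elementwisePower(a, 3)
print(cubed)
```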

How to contribute - have some small numpy / scipy vDSP implementations

Hi there

I've got a few minor implementations of some numpy / scipy functions on vectors in vDSP, along with associated XCTests, that I think would make good contributions to Matft.

I'm curious how to best / properly implement these into Matft, as it seems like having a single home for them makes sense, and your code seems very well organized.

I'm not entirely sure of the code structure or the best place to implement the logic in a generic way that leverages Matft's existing code base.

Do you have any suggestions?

numpy.allclose() as an extension to Array (without NaN equality)

func allCloseTo(array: [Float], rtol: Float = 1e-5, atol: Float = 1e-8) -> Bool
    {
        precondition(self.count == array.count, "Arrays must have same size")
        
        let absDiff = vDSP.absolute( vDSP.subtract(self, array) )
            
        let maxAbsDiff = vDSP.maximum(absDiff)
        
        let scaledTol = Swift.max(atol, rtol * vDSP.maximum( vDSP.absolute(self) + vDSP.absolute(array) ) )
    
        return maxAbsDiff <= scaledTol
    }
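
Note that the function above compares the largest difference against a single scaled tolerance, which is not the same test that numpy.allclose applies. NumPy checks |a - b| <= atol + rtol * |b| for every element. A plain-Swift sketch of that elementwise rule, with a hypothetical free function `allClose`:

```swift
// Elementwise tolerance test in the style of numpy.allclose:
// every pair must satisfy |a - b| <= atol + rtol * |b|.
// A NaN on either side makes the comparison false, so NaNs never count as equal.
func allClose(_ a: [Float], _ b: [Float],
              rtol: Float = 1e-5, atol: Float = 1e-8) -> Bool {
    precondition(a.count == b.count, "Arrays must have same size")
    return zip(a, b).allSatisfy { pair in
        abs(pair.0 - pair.1) <= atol + rtol * abs(pair.1)
    }
}

print(allClose([1.0, 2.0], [1.0, 2.00001]))
print(allClose([1.0], [1.1]))
```

Because the tolerance scales with the second argument only, the test is not symmetric in a and b, which matches NumPy's documented behavior.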

scipy.spatial.distance.cosine

func CosineDistance(_ v1: [Float], _ v2: [Float]) -> Float
{
    precondition(v1.count == v2.count, "Arrays must have same size")

    var dotProduct: Float = 0.0
    var v1Norm: Float = 0.0
    var v2Norm: Float = 0.0
    
    let n = vDSP_Length(v1.count)
    
    // Calculate dot product of v1 and v2
    vDSP_dotpr(v1, 1, v2, 1, &dotProduct, n)
    
    // Calculate the Euclidean norm of v1
    vDSP_svesq(v1, 1, &v1Norm, n)
    v1Norm = sqrt(v1Norm)
    
    // Calculate the Euclidean norm of v2
    vDSP_svesq(v2, 1, &v2Norm, n)
    v2Norm = sqrt(v2Norm)
    
    // Calculate cosine distance
    let distance = 1.0 - (dotProduct / (v1Norm * v2Norm))
    
    return distance
}

and scipy.ndimage.gaussian_filter1d as array extensions, allowing one to cache the computed Gaussian kernel.

Note that I have only really implemented the default reflect padding so far.

 static func generateGaussianKernel(sigma:Float, truncate:Float = 4.0) -> [Float]
    {
        let radius:Int = Int( ceil(truncate * sigma) )
        let sigma2 = sigma * sigma
        let x:[Float] = Array<Int>( ( -radius ... radius  ) ).map { Float( $0 ) }
        let x2 = vForce.pow(bases: x, exponents: [Float](repeating: 2.0, count: x.count) )
        let y = vDSP.multiply(-0.5 / sigma2, x2)
        let phi_x = vForce.exp(y)
        return vDSP.divide(phi_x, vDSP.sum(phi_x))
    }
    
    enum PaddingMode {
        case reflect
        case edge
    }

    private func padInputArray(_ input: [Float], sigma: Float, truncate: Float, paddingMode: PaddingMode) -> [Float] {
        var paddedInput = [Float]()
        let windowSize = Int(2.0 * sigma * truncate + 1.0)
        let padSize = Swift.max(windowSize - input.count, 0)

        if padSize > 0
        {
            switch (paddingMode)
            {
                case .reflect:
                                
                var paddingStart:[Float]
                var paddingEnd:[Float]
                
                // If we pad less than our input arrays count, we select what we need from the input array
                // This wont be a 'full' pad, as we wont have all items in the array
                if padSize <= input.count
                {
                    paddingStart = Array<Float>( input[ 0 ..< Int(padSize)].reversed() )
                    paddingEnd = Array<Float>( input[ input.count - Int(padSize) ..< input.count].reversed() )
                }
                // Otherwise, we repeat reflection until we accrue pad size
                else
                {
                    paddingStart = input.reversed()
                    paddingEnd = paddingStart
                    
                    while paddingStart.count <= padSize
                    {
                        paddingStart.insert(contentsOf: paddingStart.reversed(), at: 0)
                        paddingEnd.append(contentsOf: paddingEnd.reversed())
                        
                        paddingStart = paddingStart.reversed()
                        paddingEnd = paddingEnd.reversed()
                    }
                    
                    paddingStart = Array<Float>( paddingStart.suffix( Int(sigma * truncate)  ) )
                    paddingEnd = Array<Float>( paddingEnd.prefix( Int(sigma * truncate) ) )
                }
                                
                paddedInput.append(contentsOf: paddingStart)
                paddedInput.append(contentsOf: input)
                paddedInput.append(contentsOf: paddingEnd)
                
                break

            case .edge:
                let edge = input.first ?? 0.0
                paddedInput = Array(repeating: edge, count: padSize) + input + Array(repeating: edge, count: padSize)
                
            }
            return paddedInput
        }
        
        return input

    }
    
    // Make sure your Sigma and Truncate values match above:
    func gaussianFilter1D(kernel:[Float], sigma:Float, truncate:Float = 4.0, paddingMode:PaddingMode = .reflect) -> [Float]
    {
        let paddedInput = self.padInputArray(self, sigma:sigma, truncate:truncate, paddingMode:paddingMode)
        
        var output = [Float](repeating: 0.0, count: self.count)

        vDSP.convolve(paddedInput, withKernel: kernel, result: &output)

        // Technically is this needed, our sum is always 1 ?
//        vDSP.divide(output, sigma, result: &output)
//        let sum = vDSP.sum(kernel)
//        vDSP.multiply(sum, output, result: &output)

        return output
    }

Reshape bug (Column Order)

let a = MfArray([[1, 3, 5],
                 [2, -4, -1]], mforder: .Column)
print(a.reshape([3, 1, 2]))
/*
mfarray = 
[[[	1,		2]],

[[	3,		-4]],

[[	5,		-1]]], type=Int, shape=[3, 1, 2]
*/

but it should be

array([[[ 1,  3]],

       [[ 5,  2]],

       [[-4, -1]]])
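
The two results differ in which order the elements are read before regrouping. The sketch below, in plain Swift with hypothetical names, reads a column-major buffer in logical row-major order and then chunks it; chunking the raw storage instead reproduces the reported output, which suggests (but does not prove) that the reshape used storage order.

```swift
// Hypothetical helper: reads a column-major buffer in logical row-major order.
// Element (r, c) of a rows x cols array lives at c * rows + r in column-major storage.
func logicalRowMajor(columnMajorStorage: [Int], rows: Int, cols: Int) -> [Int] {
    var out = [Int]()
    out.reserveCapacity(rows * cols)
    for r in 0..<rows {
        for c in 0..<cols {
            out.append(columnMajorStorage[c * rows + r])
        }
    }
    return out
}

// Column-major storage of [[1, 3, 5], [2, -4, -1]].
let storage = [1, 2, 3, -4, 5, -1]
let flat = logicalRowMajor(columnMajorStorage: storage, rows: 2, cols: 3)

// Regroup into shape [3, 1, 2]: three blocks, each holding one row of two values.
let reshaped: [[[Int]]] = stride(from: 0, to: flat.count, by: 2).map { start in
    [Array(flat[start..<(start + 2)])]
}
// Chunking the raw storage instead gives the reported (incorrect) result.
let fromStorage: [[[Int]]] = stride(from: 0, to: storage.count, by: 2).map { start in
    [Array(storage[start..<(start + 2)])]
}
print(reshaped)
print(fromStorage)
```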

Refactoring pointer

I didn’t understand pointers… lol
I must refactor this!!!

Invalid broadcasting

Hi @jjjkkkjjj

I'm trying to do the following broadcasting, but Matft treats it as an error:

let a = Matft.arange(start: 1, to: 7, by: 1, shape: [3, 2])
let b = Matft.arange(start: 1, to: 5, by: 1, shape: [2, 1, 2])
print(a - b)

Error message:

Fatal error: could not broadcast from shape 3, [2, 1, 2] into shape 3, [1, 3, 2]: file /Users/alwc/Library/Developer/Xcode/DerivedData/testnpy-afxfserzcwmbfdfgfaszjulxzrwa/SourcePackages/checkouts/Matft/Sources/Matft/core/function/conversion.swift, line 290

2020-07-21 19:05:02.261713+0800 testnpy[21523:5245927] Fatal error: could not broadcast from shape 3, [2, 1, 2] into shape 3, [1, 3, 2]: file /Users/alwc/Library/Developer/Xcode/DerivedData/testnpy-afxfserzcwmbfdfgfaszjulxzrwa/SourcePackages/checkouts/Matft/Sources/Matft/core/function/conversion.swift, line 290

In Python this is legal. For example,

x = np.random.randint(10, size=(3, 2))
y = np.random.randint(10, size=(2, 1, 2))

# Returns a shape (2, 3, 2) nd-array
print(x - y)
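
The NumPy rule behind this result is: right-align the two shapes, pad the shorter one with leading 1s, and for each axis accept equal sizes or a size of 1 on either side. A plain-Swift sketch of that rule, with a hypothetical helper `broadcastShape`:

```swift
// Hypothetical helper: returns the broadcast shape, or nil if the shapes are incompatible.
func broadcastShape(_ lhs: [Int], _ rhs: [Int]) -> [Int]? {
    let rank = max(lhs.count, rhs.count)
    // Right-align both shapes by padding the shorter one with leading 1s.
    let l = [Int](repeating: 1, count: rank - lhs.count) + lhs
    let r = [Int](repeating: 1, count: rank - rhs.count) + rhs
    var result = [Int]()
    for (a, b) in zip(l, r) {
        if a == b || b == 1 {
            result.append(a)
        } else if a == 1 {
            result.append(b)
        } else {
            return nil   // sizes differ and neither is 1
        }
    }
    return result
}

// [3, 2] is treated as [1, 3, 2], which is compatible with [2, 1, 2].
let shape = broadcastShape([3, 2], [2, 1, 2])
print(shape ?? [])
```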

not found np.unique()

Support for atan2

Thanks for the great library!

Is there any way to get atan2 from combining 2 MfArrays? Like this:

let R = MfArray([a, b ,c])

let x = atan2(R[2, 1], R[2, 2])

The issue for us is that we're porting code based on NumPy logic, where this is supported. E.g. there we can retrieve a single Double from an expression like R[2, 1], so we're struggling to get similar behaviour here.

Thanks in advance!
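
Until the library offers this directly, an elementwise atan2 over values already extracted into Swift arrays can be sketched with Foundation. The helper `elementwiseAtan2` is hypothetical and is not a Matft API.

```swift
import Foundation

// Hypothetical helper: applies atan2(y, x) to each pair of elements.
func elementwiseAtan2(_ y: [Double], _ x: [Double]) -> [Double] {
    precondition(y.count == x.count, "inputs must have the same length")
    return zip(y, x).map { pair in atan2(pair.0, pair.1) }
}

let angles = elementwiseAtan2([0.0, 1.0], [1.0, 0.0])
print(angles)
```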

Passing strange arguments in subscription

let a = try! Matft.mfarray.broadcast_to(MfArray([[2, 5, -1],
                                                             [3, 1, 0]]), shape: [2,2,2,3])
let b = a[0~, ~1, ~~2]

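b[0, ~1] = MfArray([222]) // Precondition failed: -2 is out of bounds for axis 1 with 1: file

The subscript arguments [0, ~1] were passed as the Int array [0, -2]. Since -2 is what Swift's built-in bitwise NOT yields for ~1, this suggests the Int operator was chosen over the slice operator.

It's strange.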

Copy On Write implementation

I think it is easier for Matft to implement copy-on-write (COW) than I expected.
Because MfArray has a data class, MfData, only two changes are needed. First, add the “mutating” keyword to the conversion methods and the subscript function. Second, check the _isView property in those “mutating” functions and, if _isView is true, replace the referenced MfData with a new one.
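
A minimal copy-on-write sketch of this shape is shown below. It swaps in Swift's standard `isKnownUniquelyReferenced` check where the text proposes an `_isView` flag, and the types `Storage` and `CowArray` are hypothetical stand-ins for MfData and MfArray.

```swift
// Hypothetical stand-in for the reference-type data holder.
final class Storage {
    var values: [Double]
    init(_ values: [Double]) { self.values = values }
}

// Hypothetical stand-in for the value-type array wrapper.
struct CowArray {
    private var storage: Storage

    init(_ values: [Double]) {
        storage = Storage(values)
    }

    subscript(index: Int) -> Double {
        get { return storage.values[index] }
        set {
            // Copy the backing storage only if another value still shares it.
            if !isKnownUniquelyReferenced(&storage) {
                storage = Storage(storage.values)
            }
            storage.values[index] = newValue
        }
    }
}

var first = CowArray([1, 2, 3])
let second = first      // shares storage, no copy yet
first[0] = 99           // triggers the copy; second is unaffected
print(first[0], second[0])
```

The uniqueness check means that a value with no other owners mutates in place, so the copy cost is only paid when storage is actually shared.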

[Fatal error] Subscription of view MfArray

A fatal error occurred...
I think this was caused by extracting the view's base directly.

let a = Matft.mfarray.arange(start: 0, to: 27*2, by: 2, shape: [3,3,3], mftype: .Double, mforder: .Column)

            XCTAssertEqual(a[~-1], MfArray([[[ 0, 18, 36],
                                             [ 6, 24, 42],
                                             [12, 30, 48]],

                                            [[ 2, 20, 38],
                                             [ 8, 26, 44],
                                             [14, 32, 50]]], mftype: .Double))
            let b = a[~-1]
            

            
            XCTAssertEqual(b[~1, ~2], MfArray([[[18, 19, 20],
                                                [21, 22, 23]]], mftype: .Double)) // Not equal!!
            
            XCTAssertEqual(b[0], MfArray([[18, 19, 20],
                                          [21, 22, 23],
                                          [24, 25, 26]], mftype: .Double)) // Not equal!!

Creation function

Use vDSP_vrampmul and vDSP_vgen instead of Array and Strides?

Add @inline

Adding @inline may improve performance.

Difference between lapack and vDSP

There’s a difference in how lapack and vDSP handle a negative stride.

In vDSP,

var a = [1,2,3,4.0]
var b = [5,6,7,2.0]
var c = [0,0,0,0.0]
        
vDSP_vaddD(&a, vDSP_Stride(1), &b, vDSP_Stride(-1), &c, vDSP_Stride(1), vDSP_Length(4))
//c -> [6.0, 2.0, 3.0, 4.0]
//cannot add properly!!!

The correct call is

vDSP_vaddD(&a, vDSP_Stride(1), &b + 3, vDSP_Stride(-1), &c, vDSP_Stride(1), vDSP_Length(4))

On the other hand, in cblas

cblas_dcopy(Int32(4), &b, Int32(-1), &c, Int32(1))
//c -> [2.0, 7.0, 6.0, 5.0]
//can copy properly!!!

So this line (vDSP) must be

let bptr = bptr.baseAddress! + vDSPPrams.b_offset
let sptr = sptr.baseAddress! + vDSPPrams.s_offset
dstptrT = dstptrT + vDSPPrams.b_offset

instead of

let bptr = vDSPPrams.b_offset >= 0 ? bptr.baseAddress! + vDSPPrams.b_offset : bptr.baseAddress! + bigger_mfarray.offsetIndex + vDSPPrams.b_offset
let sptr = vDSPPrams.s_offset >= 0 ? sptr.baseAddress! + vDSPPrams.s_offset : sptr.baseAddress! + smaller_mfarray.offsetIndex + vDSPPrams.s_offset
dstptrT = vDSPPrams.b_offset >= 0 ? dstptrT + vDSPPrams.b_offset : dstptrT + bigger_mfarray.offsetIndex + vDSPPrams.b_offset

and the cblas code must be

let srcptr = cblasPrams.s_stride >= 0 ? srcptr.baseAddress! + cblasPrams.s_offset : srcptr.baseAddress! - mfarray.offsetIndex + cblasPrams.s_offset
let dstptr = cblasPrams.b_stride >= 0 ? dstptr.baseAddress! + cblasPrams.b_offset : dstptr.baseAddress! - dsttmpMfarray.offsetIndex + cblasPrams.b_offset
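
The two conventions described above can be modeled on plain Swift arrays. The functions below are simplified models written for this note, not calls into Accelerate: the vDSP-style reader starts wherever the caller points it, while the BLAS-style reader moves to the far end by itself when the stride is negative.

```swift
// Model of vDSP addressing: element i is read at start + i * stride,
// so for a negative stride the caller must pass the position of the last element.
func vdspStyleRead(_ values: [Double], start: Int, stride: Int, count: Int) -> [Double] {
    return (0..<count).map { values[start + $0 * stride] }
}

// Model of BLAS addressing: for a negative stride the routine itself begins
// at the far end of the vector, so the caller passes the base unchanged.
func blasStyleRead(_ values: [Double], stride: Int, count: Int) -> [Double] {
    let start = stride >= 0 ? 0 : (count - 1) * (-stride)
    return (0..<count).map { values[start + $0 * stride] }
}

let b = [5.0, 6.0, 7.0, 2.0]
let viaBlas = blasStyleRead(b, stride: -1, count: 4)
let viaVdsp = vdspStyleRead(b, start: 3, stride: -1, count: 4)   // the "&b + 3" case
print(viaBlas, viaVdsp)
```

Passing `start: 0` with a stride of -1 to the vDSP-style reader would index before the buffer, which corresponds to the failing vDSP_vaddD call shown earlier.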

Feature Request: Numpy Dot

Thank you for all your hard work on this framework! Will there be support for dot in the future?

https://numpy.org/doc/stable/reference/generated/numpy.dot.html?highlight=dot#numpy-dot

Equality

Use vDSP_veqvi(_:_:_:_:_:_:_:) for UnsafeRawPointer.
XNOR truth table:
a | b | ret
0 | 0 | 1
0 | 1 | 0
1 | 0 | 0
1 | 1 | 1
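
The truth table is what a bitwise XNOR produces for each bit. A plain-Swift sketch on Int32 values, with a hypothetical helper `xnor`, shows the same table; for whole integers the result is all ones (-1) exactly when the two inputs are equal.

```swift
// Hypothetical helper: bitwise XNOR of two Int32 vectors, element by element.
func xnor(_ a: [Int32], _ b: [Int32]) -> [Int32] {
    precondition(a.count == b.count, "inputs must have the same length")
    return zip(a, b).map { pair in ~(pair.0 ^ pair.1) }
}

// Lowest bit only, to reproduce the single-bit truth table.
let bits = xnor([0, 0, 1, 1], [0, 1, 0, 1]).map { $0 & 1 }

// Whole-integer comparison: -1 (all bits set) marks equal elements.
let equalFlags = xnor([7, 8, 9], [7, 0, 9]).map { $0 == -1 }
print(bits, equalFlags)
```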

Image processing

Create an OpenCV Mat as described in https://stackoverflow.com/questions/39579398/opencv-how-to-create-mat-from-uint8-t-pointer

and pass it using “with” statements.

Simple image processing functions are available in the vImage module.

In vImage function names, FFFF denotes float channels (8888 denotes UInt8 channels).

https://developer.apple.com/documentation/accelerate/1515929-vimageconvolve_argbffff

Subscript’s getter and setter

If I don’t use generics in MfArray, the subscript’s getter and setter must handle values as Any.
However, using the Any type causes unexpected errors or performance loss.

MfArray<MfType: MfTypable>{
    Hoge
}

//Initialization 
let a = MfArray<Int>([1,2,3])

//Getter and setter
//Note that this overload handles scalars only
subscript(indices: Int...) -> MfType{
    hoge
}

//Note that this overload handles MfArray only
subscript(mfslices: MfSlice...) -> MfArray{
    fuga
}
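
The idea above can be made concrete with a small generic container. Everything in this sketch is hypothetical: `TypedArray` stands in for a generic MfArray, the `MfTypable` protocol is declared empty here, and a Swift `Range<Int>` stands in for MfSlice.

```swift
// Placeholder protocol marking the element types the container accepts.
protocol MfTypable {}
extension Int: MfTypable {}
extension Double: MfTypable {}

// Hypothetical generic container standing in for MfArray<MfType>.
struct TypedArray<Element: MfTypable> {
    private var storage: [Element]

    init(_ values: [Element]) {
        storage = values
    }

    var count: Int { return storage.count }

    // Scalar access: returns the element type directly, with no Any cast.
    subscript(index: Int) -> Element {
        get { return storage[index] }
        set { storage[index] = newValue }
    }

    // Range access: returns another container of the same element type.
    subscript(range: Range<Int>) -> TypedArray<Element> {
        return TypedArray(Array(storage[range]))
    }
}

var a = TypedArray<Int>([1, 2, 3])
let firstValue: Int = a[0]
a[1] = 20
let tail = a[1..<3]
print(firstValue, tail.count, tail[0])
```

With the element type known at compile time, the `as!` casts seen in the other reports become unnecessary.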

Incorrect string description

Hi @jjjkkkjjj, whenever the MfArray is too large, the print description will be incorrect. For example,

let a = Matft.arange(start: 1, to: 40001, by: 1, shape: [40000])

print(a)
/*
mfarray = 
[    1,        2,        3,        ...,        39997,        39998,        39999], type=Int, shape=[40000]
*/

print(a[-1])
/*
40000
*/
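
For reference, a summary printer that keeps a fixed number of items from each end would end with the true last element. The function `summarized` below is a hypothetical plain-Swift sketch, not Matft's description code.

```swift
// Hypothetical summary printer: keeps edgeItems values from each end
// and replaces the middle with "..." when the array is long.
func summarized(_ values: [Int], edgeItems: Int = 3) -> String {
    guard values.count > 2 * edgeItems else {
        return "[" + values.map { String($0) }.joined(separator: ", ") + "]"
    }
    let head = values.prefix(edgeItems).map { String($0) }
    let tail = values.suffix(edgeItems).map { String($0) }
    return "[" + (head + ["..."] + tail).joined(separator: ", ") + "]"
}

let summary = summarized(Array(1...40000))
print(summary)
```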

Wrong Shape after "ufuncReduce add"

Hello,

I think there is a bug in ufuncReduce. This code

Matft.ufuncReduce(mfarray: MfArray([1,2,3,4,5,6,7,8,9,10] as [Double]), ufunc: Matft.add)

returns

MfArray([55, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

I would expect that I would get scalar or array of shape [1]. What am I doing wrong please?
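
For a one-dimensional input, np.add.reduce returns a single scalar, so the expectation above matches NumPy. The same reduction in plain Swift, with a hypothetical wrapper `reduceAll`:

```swift
// Hypothetical wrapper: folds all elements into one value with a binary operation.
func reduceAll(_ values: [Double], initial: Double,
               _ operation: (Double, Double) -> Double) -> Double {
    return values.reduce(initial, operation)
}

let total = reduceAll([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], initial: 0, +)
print(total)
```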
