
MLKernels.jl's People

Contributors

devmotion, evizero, gdkrmr, roger-luo, st--, trthatcher


MLKernels.jl's Issues

Derivatives (wrt parameters and input values)

Hi,

To be able to use this kernel module in my Gaussian process regression code - instead of rolling my own covariance kernels, which do basically the same thing, just not as fancy as your code - I will need the derivatives of kernels, both with respect to the parameters (for optimizing the log marginal likelihood) and with respect to the input values (for training on and predicting derivatives of function values, not just the function values themselves). What are your thoughts (if any) on an interface for that?

Have a look at https://github.com/st--/juliagp/blob/master/covariances.jl for how I've implemented it so far...
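For concreteness, here's a rough sketch (my own helper names, not a proposed API) of the two kinds of derivatives I mean, for a Gaussian kernel k(x, y) = exp(-alpha * norm(x - y)^2):

function kernel_dalpha(alpha, x, y)
    d2 = sumabs2(x - y)
    -d2 * exp(-alpha * d2)                            # derivative with respect to the parameter alpha
end

function kernel_dx(alpha, x, y)
    -2alpha * (x - y) * exp(-alpha * sumabs2(x - y))  # gradient with respect to the input x
end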

-ST

Speed vs code clarity

So quite a bit of the code looks like it's rather optimised for some things - e.g. euclidean_distance() with BLAS.axpy! and BLAS.dot - is that actually so much faster than Base.dot(x, y) that it's worth it? And what happens if x and y happen to have different lengths (shouldn't happen, but you never know)? Base.dot() would just throw a DimensionMismatch. The BLAS routines don't complain, they just compute a different number...

On a very similar note, is it worth writing convert(T,2) instead of just 2? Doesn't the compiler automatically convert types?
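For what it's worth, a quick way to check whether the BLAS route pays off - just a sketch, not a rigorous benchmark, and the helper names are made up:

# Squared euclidean distance two ways; time both on large vectors.
function sqdist_blas(x::Vector{Float64}, y::Vector{Float64})
    d = copy(y)
    Base.LinAlg.BLAS.axpy!(-1.0, x, d)   # d = y - x
    Base.LinAlg.BLAS.dot(d, d)
end

sqdist_base(x, y) = (d = x - y; dot(d, d))

x, y = randn(10^7), randn(10^7)
@time sqdist_blas(x, y)
@time sqdist_base(x, y)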

And on a not so related note, but too small to bother opening a separate issue: the Coveralls thing seems to think that kernelize_scalar() for the MercerSigmoidKernel (separablekernels.jl) isn't covered - but I think it should be ? When I add a println() statement to the function, it gets printed out lots of times. Any idea what's going on?

Random Fourier Features

Random Fourier features is a technique for approximating inner products in the RKHS using a randomized feature map. It would be great to have this in MLKernels.

Here's a paper introducing them:
Random Features for Large-Scale Kernel Machines

A blog post demonstrating their use:
Random Fourier Features for Kernel Density Estimation

The kernel being approximated needs to have some specific properties. Mainly it needs to be stationary (shift-invariant) and scaled properly.

I need to look into this a bit more, but it seems like it might not be too difficult to implement. The main part would be a function spectraldensity, which takes a kernel as an argument and returns a distribution to sample from.
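To make the idea concrete, here's a minimal sketch of random Fourier features for a Gaussian kernel of the form k(x, y) = exp(-norm(x - y)^2 / (2sigma^2)) - the parameterization and names are mine, not the package's:

# Z' * Z approximates the Gaussian kernel matrix of the columns of X.
function rff_features(X::Matrix{Float64}, D::Int, sigma::Float64)
    d, n = size(X)                       # columns of X are observations
    W = randn(d, D) / sigma              # frequencies from the spectral density N(0, I/sigma^2)
    b = 2pi * rand(D)                    # random phases in [0, 2pi)
    Z = sqrt(2 / D) * cos(W' * X .+ b)   # D x n feature matrix
    return Z
end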

Roadmap

@st--

I think we should establish a roadmap. For the most part, I already have what I need from this package. All I need to do is expand/finish the approximations.

We have a good foundation for the kernel derivatives defined for the Gaussian kernel. I can expand it to include all the other kernels, it's just a matter of sinking some time into it.

What are your priorities in terms of features? How do you want to approach it?

Limits on parameter ranges: PolynomialKernel and others

PolynomialKernel requires d to be an integer. ExponentialKernel, RationalQuadraticKernel, PowerKernel, and LogKernel all require gamma to be <= 1. What's the reason for that? (Maybe a kernel would not be Mercer if gamma > 1 (or if d is real, for PolynomialKernel), but it might still be useful. We could just amend ismercer() so that it actually checks the parameter value - see the sketch below?)
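Something along these lines (a sketch; the exact field names are assumptions on my part):

using MLKernels
import MLKernels: ismercer

# Let the constructors accept the wider range and have ismercer() report the truth.
ismercer(k::PolynomialKernel) = isinteger(k.d)
ismercer(k::ExponentialKernel) = zero(k.gamma) < k.gamma <= one(k.gamma)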

Kernels for HAC estimators?

I am unsure if this package is the right place to implement these, but wanted to ask. For robust variance-covariance estimators, kernels are sometimes used to account for spatial or temporal correlation of an assumed or estimated form (Heteroskedasticity and Autocorrelation Consistent, or HAC, variance-covariance estimators). Would this package be a good place to have these kernels implemented? Here is a reference for the R implementation, sandwich, and here is another implementation in Julia, CovarianceMatrices.jl.

kernel function

I just started using your latest version of MLKernels. I'm having an issue which may just be a syntax problem on my end. I'm attempting to use the kernel function like I have in the past but here's the result:

julia> kernel(GaussianKernel(), 2., 4.)
ERROR: type KernelComposition has no field k
 in kernel at /Users/joshualeond/.julia/v0.4/MLKernels/src/kernelfunction.jl:14

I usually use the kernel function to compare arrays and get the same result, except the error is at line 16. The version that isn't working for me is v0.1.0+, which I cloned from GitHub; the previous version I was using was v0.1.0 installed from METADATA.

Composited composite kernels

Currently, a CompositeKernel can only be made out of StandardKernels, which means something like GaussianKernel()+GaussianKernel()*GaussianKernel() doesn't work. Is there a good reason for that?

Hyperparameters

The parameters for Kernels should be abstracted to a HyperParameter type that can be used to aid in optimization and ensure consistency of constraints.
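One possible shape for this, as a sketch only (names and fields are not a final design):

# A parameter value bundled with its admissible interval, so constructors and
# optimizers can share the same constraint logic.
immutable Interval{T<:Real}
    lower::T
    upper::T
    lower_open::Bool
    upper_open::Bool
end

type HyperParameter{T<:Real}
    value::T
    bounds::Interval{T}
end

function checkvalue(bounds::Interval, x::Real)
    lower_ok = bounds.lower_open ? x > bounds.lower : x >= bounds.lower
    upper_ok = bounds.upper_open ? x < bounds.upper : x <= bounds.upper
    lower_ok && upper_ok
end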

ARD parameter (weights) derivative

Just realised that kernelmatrix_dp was missing, and added it in ab3c092 - piggy-backing on kernelmatrix seemed easiest. The only issue with this is that, currently, kernel_dp(k::ARD, :weights, x, y) returns an array with the length of k.weights. From a computational POV it makes sense to calculate all the dk/dw[i] at once (and usually you'd want all the derivatives, not just one of them, I think), but from an interface/API POV it would be better if kernel_dp always returned a scalar... how do we reconcile that?

Re-implementing Base functions

In one of your recent commits you made use of (Base.)scale!() which I hadn't come across before, but it looks like it's basically the same functionality as gdmm!/dgmm! in matrixfunctions.jl - the only difference is that the Base version doesn't use @inbounds... does that macro make things so much faster that it's worth keeping an in-package version of matrix-vector scaling ?

Equality test for Kernels

julia> GaussianKernel(1.0) == GaussianKernel(1.0)
false

my best guess:

julia> isimmutable(GaussianKernel(1.0))
true

julia> isimmutable(GaussianKernel(1.0).alpha)
false

confuses Julia (the default == falls back to ===, which compares the mutable alpha field by reference)
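A possible workaround, sketched, is to define == for kernels field-by-field (note the parameter type itself would also need its own == if it's a mutable wrapper):

using MLKernels
import Base: ==

function ==(k1::Kernel, k2::Kernel)
    typeof(k1) === typeof(k2) || return false
    for f in fieldnames(k1)
        getfield(k1, f) == getfield(k2, f) || return false
    end
    return true
end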

sqdist_d*() and scprod_d*()

It seems like each version is used in only one place (the derivative for the corresponding class of kernel, including ARD). Making use of these functions requires at least two - currently three - scans of the array that the kernel function returns.

Are these functions providing any real value? It seems like there are two definitions with multiple array scans where a single definition and one scan would suffice.

Kernel Derivatives

There are two components to this enhancement.

Optimization

Define a theta and an eta (inverse theta) function to transform parameters from an interval with finite open bounds to one without them (or to eliminate the bounds entirely) for use in optimization methods. This is similar to how link functions work in logistic regression: unconstrained optimization is used to set a parameter value in the interval (0,1) via the logit link function. A small sketch follows the list below.

  • theta - given an interval and a value, applies a transformation that eliminates finite open bounds
  • eta - given an interval and a value, reverses the value back to the original parameter space
  • gettheta returns the theta transformed variable when applied to HyperParameters and a vector of theta transformed variables when used on a Kernel
  • settheta! this function is used to update HyperParameters or Kernels given a vector of theta-transformed variables
  • checktheta used to check if the provided vector (or scalar if working with a HyperParameter) is a valid update
  • upperboundtheta returns the theta-transformed upper bound. For example, in the case that a parameter is restricted to (0,1], the transformed upper bound will be log(1)
  • lowerboundtheta returns the theta-transformed lower bound. For example, in the case that a parameter is restricted to (0,1], the transformed lower bound will be -Infinity
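As a sketch, for a parameter restricted to (0, 1] and a simple log transform (whether log is the right link in each case is exactly what would need deciding):

theta(x::Real) = log(x)          # maps (0, 1] onto (-Inf, 0]
eta(t::Real)   = exp(t)          # inverse transform back to the parameter space
upperboundtheta() = log(1.0)     # 0.0
lowerboundtheta() = -Inf
checktheta(t::Real) = t <= upperboundtheta()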

Derivatives

Derivatives will be with respect to theta as described above.

  • gradeta derivative of the eta function. Using the chain rule, this is applied to gradkappa to get the derivative with respect to theta. Not exported.
  • gradkappa derivative of the scalar part of a Kernel. This must be defined for each kernel. It will be manual, so the derivative will be analytical or a hand-coded numerical derivative. It will only be defined for parameters of the kernel. Not exported. Ex. gradkappa(k, Val{:alpha}, z)
  • gradkernel derivative of the kernel. The second argument will be the variable the derivative is with respect to. A value type with the field name as a parameter will be used. Ex. gradkernel(k, Val{:alpha}, x, y)
  • gradkernelmatrix derivative matrix.

Nystrom kernel approximation

I believe the Nystrom kernel approximation should satisfy the following invariant:
nystrom(kernel, X, [1:size(X,1)]) == kernelmatrix(kernel, X) (up to floating point inaccuracies)
The current implementation is definitely broken.
If I use just Base routines, as follows, I get what I believe is the correct approximation:

function basenystrom(kernel::Kernel, X::Matrix, xs::Vector)
    C = kernelmatrix(kernel, X, X[xs,:])           # kernel between all points and the sampled subset
    D = C[xs,:]                                    # kernel matrix of the sampled subset itself
    SVD = svdfact(D)
    DVC = diagm(1./sqrt(SVD[:S])) * SVD[:Vt] * C'  # D^(-1/2) * C'
    MLKernels.syml(BLAS.syrk('U', 'T', 1, DVC))    # DVC' * DVC = C * D^(-1) * C', symmetrised
end

But I don't understand your optimised version well enough to figure out where it's going wrong...

I've added this as test/test_approx.jl; have a look and have fun fiddling with your implementation until that test passes. ;-) It's not yet added to runtests.jl so it doesn't break the overall test.
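The test itself is roughly the following (assuming the basenystrom helper above):

X = rand(20, 4)
k = GaussianKernel(1.0)
K_exact   = kernelmatrix(k, X)
K_nystrom = basenystrom(k, X, collect(1:size(X, 1)))   # all rows as landmarks
@assert isapprox(K_exact, K_nystrom)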

Remove Separable Kernel (Mercer Sigmoid Kernel)

The separable kernel is the dot product kernel applied to an element-wise transformation of the original data vectors. The element-wise transformation is 100% equivalent to pre-processing one's data.

I don't think it's adding any value - any opposition to removal of this type/kernel?

Kernel derivatives need to be a separate package

@st--

I'm trying to be pragmatic when it comes to this package. I originally envisioned that this package would provide the following:

  • A vetted set of mainstream machine learning kernels (with some simple combination rules)
  • The ability to compute a kernel matrix quickly
  • The ability to compute a kernel matrix approximation

Unfortunately with the kernel derivatives, I feel the package moves too far away from that. My concerns are that:

  • Derivatives may not be defined and vector parameters add a layer of complexity
  • Many derivatives that do exist are intractable - reliance on analytic derivatives is complex or infeasible in many cases and raises floating-point accuracy concerns in others
  • Derivatives appear to be catering to a very specific need - it's a specialized component being added to a generic package.

As such, it's bespoke code and necessitates its own package.

That being said, I appreciate all the help and I'm more than willing to help you out wherever I can.

Using Distances.jl and automatic differentiation

I just made a PR, JuliaStats/Distances.jl#72, to make Distances.jl compatible with automatic differentiation. By using Distances.jl we could get:

  • ARD kernels using the weighted metrics
  • easy higher order derivatives with automatic differentiation

Using a more general Metric type in the kernel computations could also enable kernels to be computed over other metric spaces than R^n (e.g. graphs).

Relevant issues:
#46 #53 #52
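A sketch of what this would buy (using names from Distances.jl and ForwardDiff.jl; the AD part assumes the PR above is merged):

using Distances, ForwardDiff

w = [1.0, 0.5, 2.0]                      # ARD weights
x, y = rand(3), rand(3)

# ARD-style squared-exponential kernel via a weighted metric
ard_sqexp(x, y, w) = exp(-evaluate(WeightedSqEuclidean(w), x, y))

# Gradient of the kernel with respect to x by automatic differentiation
g = ForwardDiff.gradient(z -> ard_sqexp(z, y, w), x)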

More detailed categorization of kernels

I think it would be useful to have a more detailed categorization of the kernels. In chapter 4 of this book, kernels (covariance functions) are categorized by the types of their inputs. This would enable specifying the pairwise functions in the type hierarchy instead of specifying them for each kernel separately (a sketch follows the examples below).

For example:

  • stationary kernels satisfy K(x, y) = k(x-y)
  • isotropic kernels satisfy K(x, y) = k(|x-y|)
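The sketch, with placeholder names (not the current type tree), where a single pairwise definition would cover every isotropic kernel:

abstract StationaryKernel{T} <: Kernel{T}             # K(x, y) = k(x - y)
abstract IsotropicKernel{T} <: StationaryKernel{T}    # K(x, y) = k(|x - y|)

# kappa is assumed here to be the kernel's scalar function
pairwise{T}(k::IsotropicKernel{T}, x::Vector{T}, y::Vector{T}) =
    kappa(k, sqrt(sumabs2(x - y)))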

Use of triangular dispatch invalid

This definition for ARD seems to be invalid in Julia 0.4, and I think it should be invalid in 0.3 too, but might not be flagged due to a change.

immutable ARD{T<:FloatingPoint,K<:StandardKernel{T}} <: SimpleKernel{T}

In particular the K part. I can't find exactly where it was changed, but here is related discussion

JuliaLang/julia#6620
JuliaLang/julia#3766
JuliaLang/julia#8974 (comment)

I'm a bit confused on how this all works internally, might be worth posting on julia-users if you want to get it working on 0.4 at some point.

GammaRationalQuadraticKernel parameters

So GammaRationalQuadraticKernel(alpha, beta, gamma) is a generalised form of RationalQuadraticKernel(alpha, beta), which in turn is a generalised form of InverseQuadraticKernel(alpha):
InverseQuadraticKernel(alpha) = RationalQuadraticKernel(alpha, beta=1), and
RationalQuadraticKernel(alpha, beta) = GammaRationalQuadraticKernel(alpha, beta, gamma=1).
But the default value of GammaRationalQuadraticKernel is gamma=0.5. Just checking whether that is intentional?

Kernel/Kernel Matrix Computation

@st--

So I was thinking a bit about the approach to the kernel/kernel matrix computation.

If you have a function f: RxR -> R, then you can apply it element-wise to two vectors and sum the results. So for x = [x1,x2] and y = [y1,y2], the dot product corresponds to f(x,y) = xy; then k(x,y) = f(x1,y1) + f(x2,y2) = x1*y1 + x2*y2. The squared distance is just f(x,y) = (x-y)^2.

If f is a valid positive or negative definite kernel on RxR, then we have a way to extend the kernel to R^n x R^n by summing the element-wise results.

I was thinking we could take a more modular approach. Technically, the kernels we have now are a composition on top of either a positive definite kernel (the polynomial kernel takes the dot product) or a negative definite, positive-valued kernel (the euclidean distance in what we have implemented), and we could define these 'input' kernels in terms of f.

I was thinking we could abstract kernelmatrix into two functions: generic kernelize (the way it currently operates) and a generic pairwise which operates on two vectors/matrices. We would create a new class of kernels and pairwise would dispatch on those types. Examples of those new base kernels would be SquaredDistance and ScalarProduct. However, we can easily extend it:

  • f(x,y) = sin(x-y)^2 (sine squared kernel - also negative definite)
  • f(x,y) = (x-y)^2/(x+y) (chi squared kernel for R+ x R+)

These can all be extended naturally to cover weights for the periodic kernel... I was thinking f could be defined with two weight vectors, like this:

k(x,y) = u1*sin(w1*x1 - w1*y1)^2 + u2*sin(w2*x2 - w2*y2)^2

Where u and w are your weight vectors. (I know this could add redundancy - just an idea)

Long story short, three levels:

  • 'Base' simple kernels that will be defined for pairwise - ARD would be defined at this level
  • Derived kernels that are a function of the matrix that pairwise returns
  • Composite kernels as we have them now.

Anyway, I think an approach along these lines would also give us a nice modular approach to the derivatives. We can always ensure there are generic fall-back methods, basically exactly as we have them now. A rough sketch is below - let me know your thoughts.
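The sketch (names are placeholders, not a final API):

# Level 1: 'base' kernels defined elementwise and summed by a generic pairwise.
abstract BaseKernel{T}
immutable SquaredDistance{T} <: BaseKernel{T} end
immutable ScalarProduct{T}   <: BaseKernel{T} end

base_evaluate{T}(::SquaredDistance{T}, x::T, y::T) = (x - y)^2
base_evaluate{T}(::ScalarProduct{T},   x::T, y::T) = x * y

function pairwise{T}(f::BaseKernel{T}, x::Vector{T}, y::Vector{T})
    s = zero(T)
    @inbounds for i in eachindex(x)
        s += base_evaluate(f, x[i], y[i])
    end
    s
end

# Level 2: a derived kernel is a scalar function of what pairwise returns,
# e.g. the Gaussian kernel is exp(-alpha * pairwise(SquaredDistance{T}(), x, y)).
# Level 3: composite kernels as we have them now.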

Kernel function discussion/kernels with variable-dimensional parameters (ARD)

I just pushed an implementation of integer-based parameter derivatives based on calling names() on the kernel object to get its field. After all that I remembered an important use case in Gaussian Processes: the "Automatic Relevance Determination" (ARD) kernel. This is basically a Gaussian kernel, but with a sigma vector (same length as data dimension):

Instead of the single-lengthscale Gaussian, exp(-vecnorm(x - y)^2 / (2k.sigma^2)), you would use exp(-vecnorm((x - y) ./ k.sigmas)^2 / 2), where k.sigmas is a vector of the same dimension as x and y.

This crops up not just for euclidean distance kernels, but also scalar product kernels - e.g. a LinearKernel with different scaling for each dimension...

Any idea on how to best implement that ?
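A sketch of the ARD-Gaussian case, just to pin down the formula (type and field names are placeholders):

immutable ARDGaussian{T<:Real}
    sigmas::Vector{T}     # one lengthscale per data dimension
end

ard_kernel{T}(k::ARDGaussian{T}, x::Vector{T}, y::Vector{T}) =
    exp(-vecnorm((x - y) ./ k.sigmas)^2 / 2)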

Memory usage / in-place covariance matrix calculations

Reading up on Coverage.jl I was curious to check the memory allocation behaviour of the code. Most of it is constructing Kernels, which I think is fine. But it occurred to me that e.g. the second derivatives construct n x n matrices, and memory allocation is one of the biggest speed hits. So in the future it might be worth rewriting the kernel functions such that they can write directly into the covariance matrix. (This will require some more careful thought, and isn't urgent, but I wanted to bring it up now before I forget again!)

performance

The following code shaves off more than 0.5 seconds when X has 20_000 columns and 10 rows:

n = 20000
x = randn(10, n)

function gauss{T}(X::Matrix{T}, alpha::T)
    n = size(X, 2)
    xy = LinAlg.BLAS.syrk('U', 'T', one(T), X)       # upper triangle of X'X
    x2 = [ sum(X[:, i] .^ 2) for i in 1:n ]          # squared norms of the columns

    # xy <- x2*1' + 1*x2' - 2*X'X, i.e. pairwise squared distances (upper triangle)
    LinAlg.BLAS.syr2k!('U', 'N', one(T), x2, ones(T, n), T(-2), xy)

    @inbounds for i in 1:n
        for j in 1:i
            xy[j, i] = exp(-alpha * xy[j, i])        # apply the Gaussian kernel elementwise
        end
    end
    LinAlg.copytri!(xy, 'U')                         # mirror the upper triangle into the lower
end
gauss(x, 1.0)

compared to

kernelmatrix!(
    ColumnMajor(),
    Matrix{Float64}(n, n), 
    GaussianKernel(1.0), 
    x, true
)

I think one of the reasons is that LinAlg.syrk_wrapper! always copies the triangle in the matrix, even though you try to avoid that. I am not sure how much using syr2k saves.

DomainError in tests

in v0.2.0

INFO: Testing MLKernels.kernelmatrix!
ERROR: LoadError: LoadError: DomainError:
 in exponentialkernel at /home/me/.julia/v0.5/MLKernels/src/kernel.jl:110 [inlined]

z can be smaller than zero; I think this is due to calculating the squared euclidean distance as x^2 - 2xy + y^2, which can come out slightly negative in floating point.
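One possible mitigation, sketched, is to clamp the computed squared distance at zero before it reaches functions with restricted domains:

x, y = rand(3), rand(3)
z = dot(x, x) - 2*dot(x, y) + dot(y, y)   # mathematically >= 0, but can be a tiny negative number
z = max(z, zero(z))                       # clamp before applying exp/sqrt/^gamma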

N vs. T

How useful is it to basically duplicate the code to allow both row-wise and column-wise data matrices? From an implementation point of view it would be a lot simpler to just decide on one and require users to call e.g. kernelmatrix(k, X', Y') when needed, which is a one-off overhead.

The only difference is whether the access is X[i,:] and Y[j,:] or X[:,i] and Y[:,j]... so I played around with macros and came up with one which takes a condition, a tuple of symbols, and a code block and transforms all array references to objects listed in the tuple of symbols (84ca8bf, 1564f8a). But that doesn't work so well for the cases where you have e.g. N_sqdist and T_sqdist...

kernelmatrix(k, X, Y) calculates full matrix, not only upper-right triangle as commented

Now the question is, are there going to be any kernels which are non-symmetric, for which k(x, y) != k(y, x)? If not, then kernelmatrix(k, X, Y) should only calculate half the matrix, as already the case for kernelmatrix(k, X) [which calculates kernelmatrix(k, X, X)].

On second thought, why is there a special case for kernelmatrix(k, X, X)? Should be the same code as for kernelmatrix(k, X, Y)...

LoadError with 0.5

I get the following syntax error and deprecation warnings with version 0.5.1 of Julia.

julia> Pkg.free("MLKernels")
INFO: Freeing MLKernels
INFO: No packages to install, update or remove

julia> using MLKernels
WARNING: super(T::DataType) is deprecated, use supertype(T) instead.
 in super(::DataType) at ./deprecated.jl:50
 in supertypes(::DataType) at .../.julia/v0.5/MLKernels/src/meta.jl:14
 in macro expansion; at .../.julia/v0.5/MLKernels/src/kernels.jl:164 [inlined]
 in anonymous at ./<missing>:?
 in eval_user_input(::Any, ::Base.REPL.REPLBackend) at ./REPL.jl:64
 in macro expansion at ./REPL.jl:95 [inlined]
 in (::Base.REPL.##3#4{Base.REPL.REPLBackend})() at ./event.jl:68
while loading .../.julia/v0.5/MLKernels/src/kernels.jl, in expression starting on line 161
WARNING: super(T::DataType) is deprecated, use supertype(T) instead.
 in super(::DataType) at ./deprecated.jl:50
 in supertypes(::DataType) at .../.julia/v0.5/MLKernels/src/meta.jl:17
 in macro expansion; at .../.julia/v0.5/MLKernels/src/kernels.jl:164 [inlined]
 in anonymous at ./<missing>:?
 in eval_user_input(::Any, ::Base.REPL.REPLBackend) at ./REPL.jl:64
 in macro expansion at ./REPL.jl:95 [inlined]
 in (::Base.REPL.##3#4{Base.REPL.REPLBackend})() at ./event.jl:68
while loading .../.julia/v0.5/MLKernels/src/kernels.jl, in expression starting on line 161
ERROR: LoadError: LoadError: syntax: space before "(" not allowed in "? ("
 in eval_user_input(::Any, ::Base.REPL.REPLBackend) at ./REPL.jl:64
 in macro expansion at ./REPL.jl:95 [inlined]
 in (::Base.REPL.##3#4{Base.REPL.REPLBackend})() at ./event.jl:68
while loading .../.julia/v0.5/MLKernels/src/pairwise.jl, in expression starting on line 45
while loading .../.julia/v0.5/MLKernels/src/MLKernels.jl, in expression starting on line 57

These do not appear in the dev branch or when using version 0.4.5 of Julia.

@inbounds

Various functions, particularly in vectorfunctions.jl and kernelderiv.jl, use @inbounds without making sure all the arguments have the right length etc. Is this fine, because we know how we call them, and we do check array dimensions etc. in the parent routines? Or should we add additional bounds checks in there?

MultiQuadraticKernel missing

Maybe that's intentional, but I just noticed that the MultiQuadraticKernel in master doesn't have any equivalent in developments. The InverseMultiQuadraticKernel maps to RationalQuadraticKernel with beta=0.5, but the MultiQuadraticKernel would be beta=-0.5 and you only allow positive values for beta...

positive definiteness of composite kernels

Are you sure that the sum of one positive definite and one NOT positive definite kernel is still positive definite? That's what the code says: isposdef(ψ::KernelSum) = isposdef(ψ.k1) | isposdef(ψ.k2), but I really wouldn't have thought so...
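For reference, the conservative check - true only when both summands are known to be positive definite - would presumably be:

isposdef(ψ::KernelSum) = isposdef(ψ.k1) & isposdef(ψ.k2)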

Only integer exponents?

Does the exponent, e.g. for PowerKernel, need to be an integer? Can it not interpolate with real exponents?

0.7 compat

Is there any work for 0.7 compatibility on the way?

Segfaults

I was going to add @test_throws tests for the various ArgumentErrors, e.g. in kernelmatrix. To figure out which way around things should go, I played around in the REPL with things like MLKernels.kernelmatrix!(zeros(4,4), ExponentialKernel(), ones(4,6), true), transposed dimensions for X, different values for is_trans, and leaving out is_trans entirely - which sometimes led to instant segfaults, sometimes to ones with a bit of delay. I don't have the time any more to investigate, but I suspect it's somewhere we used @inbounds without enough checks to make sure we weren't going to exceed the array bounds!

Why split up test_standardkernels?

The kernels are all very similar in their tests - wouldn't splitting it up risk missing some tests for some of them / testing them vaguely differently? It's a lot of code duplication for which I can't see the gain... maybe you've got a good reason though ? Mainly curious.

Approach to derivatives

Types can be parameterized by symbols, which means we can take a different approach to derivatives. I've included a gist to illustrate how we could take advantage of this ("Approach 2"); a rough sketch is also below.

It would cut down on the number of functions and it seems to perform better on my machine (and definitely no worse).
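Roughly, the Val-based dispatch looks like this (a sketch; the field name alpha and the scalar form exp(-alpha*z) are assumptions):

# Dispatch the parameter derivative on Val{symbol} instead of an integer index.
function gradkappa{T}(k::GaussianKernel{T}, ::Type{Val{:alpha}}, z::T)
    -z * exp(-k.alpha * z)     # derivative of kappa(z) = exp(-alpha*z) with respect to alpha
end

gradkappa(GaussianKernel(2.0), Val{:alpha}, 1.5)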

Definiteness of kernels

I had assumed the Sigmoid kernel was c.p.d., but according to the paper I just read about the Mercer Sigmoid kernel, apparently that's only true for some parameter values, depending on the data set... so I suppose we should have iscondposdef(::SigmoidKernel) = false?

Also, I think we might've been working with isposdef(::Kernel) meaning "positive semi-definite", but Base.isposdef means "strictly positive definite" (e.g. isposdef(0) == false...), and if we extend a Base function we should follow its semantics. For kernels it doesn't seem to make much difference whether it's positive semi-definite or strictly positive definite, so maybe move to our own function? We could have ismercer()... as positive semi-definiteness is sufficient and necessary for a kernel being a Mercer kernel.

Consistency in Exceptions: ArgumentError vs DimensionMismatch

kernelmatrix! for StandardKernel throws ArgumentError, kernelmatrix! for SquaredDistanceKernel or ScalarProductKernel throws DimensionMismatch (by scprodmatrix!/sqdistmatrix!). matrix_prod! and matrix_sum! for generic Arrays call error(), but for matrices with is_upper argument the former throws DimensionMismatch and the latter throws ArgumentError. I think this should all be made a bit more consistent.

Is there a specific reason to use error() rather than throw() in some cases ?

(NB. there should be @test_throws as well, which I'm about to add.)

Positive Definite vs Mercer

I see that you treat positive definite kernels as being synonymous with Mercer kernels, while as far as I understand there is a slight difference, in that Mercer is more restrictive (e.g. continuity).

Was this an oversight, or a conscious design decision (for simplification maybe)?

source: here at 36:40

Edge cases

LogKernel is not pos. def. anyway; the constructor says gamma has to be in (0,1] but only checks for gamma > 0. Is there a reason for gamma <= 1? If not, the error message should be adjusted; otherwise the code & test...

Release

@st--

I would really like to finish up the non-derivative portion of the package ASAP (next 10 days or so) so that I may use it in my other projects and so that I can release a version of this package to http://pkg.julialang.org/. Originally, this would have happened last week, but the derivatives and ARD obviously added to that time. That's okay (:

I'm going through all the standard kernel definitions and correcting/refining all definitions to ensure they are clean and correct. I'm also going to revisit the standard kernel tests (broken on my last commit - don't worry, will fix up). The composite kernels will need to be sorted for the recursive definition.

Regarding the kernel derivatives, there are a couple of options since they probably won't be ready. First, exclude them from the first release and just keep them in a development branch. Alternatively, they could be included in the package code, but not exported/documented - you would need to explicitly import them. Lastly, they could be broken out into their own package with this one as a dependency (you would be the owner of the derivatives package, but I could still co-maintain).

How would you like to approach it?

Kernel combination implementation and derivatives

Hello,
It would be amazing for the package to feature simple kernel combinations such as kernel sums and products.
I wrote my own kernel function module because I needed it urgently, but yours is a lot more robust; maybe I could implement this given some directions (a rough sketch of what I mean is below).
Thanks!
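The basic shape I have in mind, as a sketch only (not MLKernels' actual implementation; CombinableKernel and evaluate stand in for the package's kernel supertype and evaluation function):

abstract CombinableKernel

immutable Gaussian <: CombinableKernel
    alpha::Float64
end
evaluate(k::Gaussian, x, y) = exp(-k.alpha * sumabs2(x - y))

immutable KernelSum <: CombinableKernel
    k1::CombinableKernel
    k2::CombinableKernel
end
immutable KernelProduct <: CombinableKernel
    k1::CombinableKernel
    k2::CombinableKernel
end

import Base: +, *
+(k1::CombinableKernel, k2::CombinableKernel) = KernelSum(k1, k2)
*(k1::CombinableKernel, k2::CombinableKernel) = KernelProduct(k1, k2)

evaluate(ψ::KernelSum, x, y)     = evaluate(ψ.k1, x, y) + evaluate(ψ.k2, x, y)
evaluate(ψ::KernelProduct, x, y) = evaluate(ψ.k1, x, y) * evaluate(ψ.k2, x, y)

# Usage: k = Gaussian(1.0) + Gaussian(2.0) * Gaussian(0.5)
#        evaluate(k, rand(3), rand(3))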

MLKernels with julia v0.5

MLKernels is nice, but there are a few issues when using it with Julia 0.5.
Here is what I've found:

  • In kernelapproximation.jl: replace Base.blasfunc with Base.LinAlg.BLAS.@blasfunc and chksquare with checksquare
  • In meta.jl: replace super with supertype
  • In pairwise.jl: replace text like i = store_upper ? (1:j) : (j:n) with i = (store_upper ? (1:j) : (j:n))
  • In kernelfunctions.jl: in the first 4 call statements, there is the issue of adding methods to an abstract type in Julia 0.5 e.g. sisl/BayesNets.jl#28 . What functionality in MLKernels would be lost if these statements were excluded?
