accord-net / framework

Machine learning, computer vision, statistics and general scientific computing for .NET

License: GNU Lesser General Public License v2.1

Shell 0.01% C# 93.76% Smalltalk 0.82% PowerShell 0.01% F# 0.03% C++ 1.25% C 3.71% Inno Setup 0.01% Batchfile 0.06% Makefile 0.01% M4 0.01% Objective-C 0.01% HTML 0.29% CSS 0.01% Visual Basic .NET 0.03%

c-sharp computer-vision ffmpeg framework image-processing machine-learning neural-network nuget statistics support-vector-machines unity3d visual-studio

framework's Issues

Add Zero-phase Component Analysis

It would be interesting to incorporate Zero-phase Component Analysis (ZCA) in the framework as described in:

C Zhu, Optimization based whitening; available at:
http://cs229.stanford.edu/proj2010/Zhu-OptimizationBasedWhitening.pdf
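
For reference, the usual textbook definition of ZCA whitening is sketched below. It is stated from the general literature rather than quoted from the paper above, and the small regularization constant \varepsilon is an assumption added here for numerical stability.

```latex
\Sigma = \frac{1}{n}\sum_{i=1}^{n} (x_i-\mu)(x_i-\mu)^{\top} = U \Lambda U^{\top},
\qquad
W_{\mathrm{ZCA}} = U\,(\Lambda + \varepsilon I)^{-1/2}\,U^{\top},
\qquad
\tilde{x}_i = W_{\mathrm{ZCA}}\,(x_i - \mu)
```

Compared with PCA whitening, which uses (\Lambda + \varepsilon I)^{-1/2} U^{\top}, the extra left multiplication by U rotates the whitened data back to the original axes.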

C4.5Learning: tree depends on the sorting order when the decision variable is continuous

For DecisionVariableKind=Continuous, candidates for cut values are determined by this code:
if (o[j] != o[j + 1])
candidates.Add((v[j] + v[j + 1]) / 2.0);

The order of the output values is determined by the sorting order of the input values, so it can change whenever several input values are equal.
For example, after sorting:
Values are: 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 24, 24, 26, 26, 28, 29, 30, 30
Labels are: 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1

Candidates for split in this case are: 2, 2, 2, 3, 3, 3

Another option for sorting would be:
Values are: 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 24, 24, 26, 26, 28, 29, 30, 30
Labels are: 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1
(I've just switched the order of two equal values)

In that case, although the values are exactly the same, the candidates for split would be: 2, 2, 2, 3, 3, 3, 3, 13.5.

This behavior may result in the same decision tree no matter if the values are:
2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 24, 24, 26, 26, 28, 29, 30, 30 OR
2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 240, 240, 260, 260, 280, 290, 300, 300 OR
2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 2400, 2400, 2600, 2600, 2800, 2900, 3000, 3000 OR..
Since both the last 3 and the 24 are labeled as 1 (other 3's are labeled as 0)

Any suggestions on what can be done?

Thanks a lot,
Sivan.
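
One possible direction, sketched under assumptions: group the samples by distinct value first, and only place candidates between adjacent distinct values, so the result no longer depends on how ties are ordered. `SplitCandidates` and `Compute` are hypothetical names, not part of Accord.NET, and this is not necessarily how the framework resolved the issue.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SplitCandidates
{
    // Hypothetical helper (not an Accord.NET API): returns candidate thresholds
    // that depend only on the (value, label) pairs, not on the order of ties.
    public static List<double> Compute(double[] values, int[] labels)
    {
        // Group samples by distinct value and collect the labels seen at each value.
        var groups = values
            .Select((v, i) => new { Value = v, Label = labels[i] })
            .GroupBy(p => p.Value)
            .OrderBy(g => g.Key)
            .Select(g => new { Value = g.Key, Labels = new HashSet<int>(g.Select(p => p.Label)) })
            .ToList();

        var candidates = new List<double>();
        for (int j = 0; j + 1 < groups.Count; j++)
        {
            var a = groups[j].Labels;
            var b = groups[j + 1].Labels;

            // Skip a boundary only when both neighbours carry the same single label.
            bool samePureLabel = a.Count == 1 && b.Count == 1 && a.SetEquals(b);
            if (!samePureLabel)
                candidates.Add((groups[j].Value + groups[j + 1].Value) / 2.0);
        }
        return candidates;
    }
}
```

For both orderings in the example above, this yields the candidates 2.5 and 13.5, since the values 2 and 3 each carry mixed labels and everything from 24 upwards is labeled 1.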

Unexplained Divide By Zero exception

I'm unable to create a code book for the images from the National Data Science bowl; I get a DivideByZero exception during the Compute phase.

The stack trace indicates the problem is in the ProcessImage method of the SURF detector.

I've written an easily runnable linqpad example and posted it as a GIST here: https://gist.github.com/Crisfole/bb636955891327f8b057

To reproduce:

Download the train.zip data files from the National Data Science bowl. (You may have to create a kaggle account to do this, TOS seem to be ok w/ practicing on their data).
Extract the training images somewhere on your computer.
Copy-Paste the Gist into Linq-Pad and replace the directory at the top of the Main() method with yours.
Run.

Consider strategies to support computing bag-of-visual words in large datasets

The current bag-of-visual-words implementation assumes that images, resources and everything else are already in main memory. However, it should also be possible to create an alternate mechanism so images can be loaded from disk in batches. A new clustering algorithm may also need to be implemented (such as an online version of k-means).

https://groups.google.com/forum/#!topic/accord-net/AGeG___EWNg
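
As a rough illustration of what an online k-means could look like, the sketch below updates one centroid per incoming sample, so descriptors can be streamed from disk instead of held in memory. `OnlineKMeansSketch` is a hypothetical class, not an existing Accord.NET type, and the 1/n learning rate is one common choice among several.

```csharp
using System;

public sealed class OnlineKMeansSketch
{
    private readonly double[][] centroids;
    private readonly int[] counts;

    // initialCentroids: one starting point per cluster (for example, the first k samples).
    public OnlineKMeansSketch(double[][] initialCentroids)
    {
        centroids = new double[initialCentroids.Length][];
        for (int i = 0; i < centroids.Length; i++)
            centroids[i] = (double[])initialCentroids[i].Clone();
        counts = new int[centroids.Length];
    }

    public double[][] Centroids { get { return centroids; } }

    // Processes one sample: finds the nearest centroid and moves it toward
    // the sample by 1/n, where n is the number of samples assigned so far.
    // Note that the first sample assigned to a cluster replaces its initial centroid.
    public int Update(double[] x)
    {
        int best = 0;
        double bestDistance = double.MaxValue;
        for (int k = 0; k < centroids.Length; k++)
        {
            double d = 0;
            for (int j = 0; j < x.Length; j++)
            {
                double diff = x[j] - centroids[k][j];
                d += diff * diff;
            }
            if (d < bestDistance) { bestDistance = d; best = k; }
        }

        counts[best]++;
        double rate = 1.0 / counts[best];
        for (int j = 0; j < x.Length; j++)
            centroids[best][j] += rate * (x[j] - centroids[best][j]);

        return best;
    }
}
```

A caller would read one batch of images, extract the descriptors, and call `Update` for each descriptor before discarding the batch.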

GC-5: Maximum cross-correlation formula

In the correlationMatching.cs, there is a formula for Normalized Cross Correlation matrix like:

               for (int i = 0; i < windowSize; i++)
               {
                   for (int j = 0; j < windowSize; j++)
                   {
                       sum1 += w1[i, j] * w2[i, j];
                       sum2 += w2[i, j] * w2[i, j];
                   }
               }
               matrix[n1, n2] = sum1 / System.Math.Sqrt(sum2);

But in the reference pages I found that sum2 should instead be computed as: sum2 += w1[i, j] * w1[i, j] * w2[i, j] * w2[i, j];
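
For comparison, the commonly cited definition of normalized cross-correlation divides by the square root of the product of the two window energies, which differs from both the quoted code and the proposed sum2. The sketch below is a standalone illustration with hypothetical names and is not the framework's code. If w1 has already been scaled to unit energy elsewhere, dividing by Math.Sqrt(sum2) alone gives the same result; whether that is the case is not visible in the excerpt above.

```csharp
using System;

public static class CrossCorrelationSketch
{
    // Normalized cross-correlation of two equally sized windows:
    // sum(w1 * w2) / sqrt(sum(w1^2) * sum(w2^2)).
    // Any mean subtraction is assumed to have been applied to the windows beforehand.
    public static double Normalized(double[,] w1, double[,] w2)
    {
        if (w1.GetLength(0) != w2.GetLength(0) || w1.GetLength(1) != w2.GetLength(1))
            throw new ArgumentException("Windows must have the same size.");

        double cross = 0, energy1 = 0, energy2 = 0;
        for (int i = 0; i < w1.GetLength(0); i++)
        {
            for (int j = 0; j < w1.GetLength(1); j++)
            {
                cross += w1[i, j] * w2[i, j];
                energy1 += w1[i, j] * w1[i, j];
                energy2 += w2[i, j] * w2[i, j];
            }
        }

        double denominator = Math.Sqrt(energy1 * energy2);
        return denominator == 0 ? 0 : cross / denominator; // all-zero window: define as 0
    }
}
```

With this definition a window correlated with itself, or with any positive multiple of itself, scores 1.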

Improve stopping criteria for the SVM learning algorithms

Event handlers could be added to keep track of the SVM training.

Implement Grubbs' test

In short, a new GrubbTest class should be created in the Accord.Statistics.Testing namespace, implementing the HypothesisTest abstract class.

Details are given here: http://www.itl.nist.gov/div898/handbook/eda/section3/eda35h1.htm
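
As a starting point, the two-sided Grubbs statistic described on the NIST page is G = max|x_i - mean| / s, where s is the sample standard deviation. The sketch below computes only that statistic. `GrubbsSketch` is a hypothetical helper, it does not derive from HypothesisTest, and the comparison against the critical value (which needs Student's t quantiles) is left out.

```csharp
using System;
using System.Linq;

public static class GrubbsSketch
{
    // Two-sided Grubbs statistic: largest absolute deviation from the mean,
    // in units of the sample standard deviation (n - 1 in the denominator).
    public static double Statistic(double[] samples)
    {
        if (samples == null || samples.Length < 3)
            throw new ArgumentException("At least three samples are required.", "samples");

        double mean = samples.Average();
        double sumOfSquares = samples.Sum(x => (x - mean) * (x - mean));
        double s = Math.Sqrt(sumOfSquares / (samples.Length - 1));

        if (s == 0)
            return 0; // all samples identical: no deviation to report

        return samples.Max(x => Math.Abs(x - mean)) / s;
    }
}
```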

NaiveBayes 5th constructor incorrect check on priors

in NaiveBayes`1.cs:293 we see the following

public NaiveBayes(int classes, int inputs, TDistribution[,] priors, double[] classPriors)
{
    if (classes <= 0) throw new ArgumentOutOfRangeException("classes");
    if (priors == null) throw new ArgumentNullException("priors");
    if (priors.Length != classes) throw new DimensionMismatchException("priors");
    if (classPriors.Length != classes) throw new DimensionMismatchException("classPriors");

The condition in the second line from bottom is incorrect for multidimensional arrays, whose .Length property returns the total number of elements in the array. What we want here instead is .GetLength(0) to get the size of the first dimension.

Pull request with fix & test coming.
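
The difference between the two properties can be checked in isolation. The sketch below uses a plain double[,] in place of TDistribution[,], and `ArrayLengthDemo` is a hypothetical name used only for illustration.

```csharp
using System;

public static class ArrayLengthDemo
{
    // Returns { Length, GetLength(0), GetLength(1) } for a classes-by-inputs array.
    public static int[] Describe(int classes, int inputs)
    {
        var priors = new double[classes, inputs];

        return new[]
        {
            priors.Length,        // total number of elements: classes * inputs
            priors.GetLength(0),  // size of the first dimension: classes
            priors.GetLength(1)   // size of the second dimension: inputs
        };
    }
}
```

For 3 classes and 4 inputs, `Length` is 12 while `GetLength(0)` is 3, so a check of the form `priors.Length != classes` rejects any valid array with more than one input.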

Add support for reading Attribute-Relation File Format (ARFF) files

This is the file format supported by Weka. More information can be found at http://www.cs.waikato.ac.nz/ml/weka/arff.html

ComplexSignalConstructor unit test fails

After pulling in the recent commit originally requested by @webner, the ComplexSignalConstructor unit test (in the ComplexSignalTest class and the Accord.Tests.Audio project) fails with the following message:

Test method Accord.Tests.Audio.ComplexSignalTest.ComplexSignalConstructor threw exception:
System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
at Accord.Audio.Windows.WindowBase.Apply(Signal signal, Int32 sampleIndex) in WindowBase.cs: line 125
at Accord.Audio.Windows.Extensions.Split(Signal signal, IWindow window, Int32 step) in Extensions.cs: line 115
at Accord.Tests.Audio.ComplexSignalTest.ComplexSignalConstructor() in ComplexSignalTest.cs: line 106

Now, I cannot judge whether the unit test is inconsistent with the WindowBase commit, or whether the code change in the commit is actually incorrect (at first glance it looks reasonable). Nevertheless, I hope someone can shed some light on this issue, and ideally propose a solid fix.

Best regards,
Anders @ Cureos

Integrate JMatIO codebase for reading MATLAB's .mat files in C#

It would be interesting to incorporate the JMatIO codebase into Accord.NET so the framework could read from and write to .mat files. JMatIO's license is compatible with the framework's license, and the code should be relatively straightforward to port.

Details can be found here: http://sourceforge.net/p/jmatio/code/HEAD/tree/trunk/JMatIO/

Add a HistogramBox viewer in the same spirit as ScatterplotBox and DataGridBox

The framework currently offers controls that mimic Windows Form's MessageBox functionality to display charts and graphs. It would be interesting to add a similar mechanism to display histograms.

Issue with string parsing into Quadratic objective function

Hi,

I am trying to parse a string into a quadratic objective function.

This is the expression: 0.5(63x0.012 + 1.41724668855095x² +39.69)
and these are the constraints: x0.2 + 6.3 = 0 and x + 63- 1 = 100

The parsing systematically fails.

At first I thought it was because my decimal separator was a comma instead of a point, but after changing that it still does not work.

I have also tried to parse a very simple expression, "2x² -5", but that does not work either.

I am looking for some help with how expressions should be written: white space, signs, coefficient positions, and so on.

Thank you so much for any help.

Regards,

Iris

GC-4: Blend doesn't fully account for Alpha channel when two 32 bit images are combined

'''What steps will reproduce the problem?'''

1. Create two images with fully transparent areas that can be combined.
2. The first image blends properly.
3. The second image's alpha channel is blended in. If it is fully transparent, those bits should not be copied.

'''What is the expected output? What do you see instead?'''
100% transparent bits should not blend in.

'''What version of the product are you using? On what operating system?'''
Accord 2.1.4

'''Please provide any additional information below:'''
Attached code fixes for 100% alpha, but doesn't attempt anything fancy for partial transparency
{{{
                    // validate source pixel's coordinates
                    if ((ox >= 0) && (oy >= 0) && (ox < width) && (oy < height))
                    {
                        int c = oy * srcStride + ox * srcPixelSize;

                        // fill destination image with pixel from source image
                        if (srcPixelSize == 4 && src[c + 3] == 0)
                        {
                            // nothing to copy in
                            //NOP
                        }
                        else if (dst[3] > 0)
                        {
                            // there is a pixel from the other image here, blend
                            double d1 = distance(ox, oy, center1.X, center1.Y);
                            double d2 = distance(ox, oy, center2.X, center2.Y);
                            double f = Accord.Math.Tools.Scale(0, dmax, 0, 1, d1 - d2);

                            if (f < 0) f = 0;
                            if (f > 1) f = 1;
                            double f2 = (1.0 - f);

                            dst[0] = (byte)(src[c] * f2 + dst[0] * f);

}}}

Implement the generalized (extreme Studentized deviate) ESD test

In short, a new GeneralizedExtremeStudentizedDeviateTest class should be created in the Accord.Statistics.Testing namespace, implementing the HypothesisTest abstract class.

Details are given here: http://www.itl.nist.gov/div898/handbook/eda/section3/eda35h3.htm

Upgrade to VS2013 and target .NET45

Analysis classes should work with jagged arrays instead of multidimensional matrices

All analysis classes and their interfaces should be refactored to work with jagged arrays instead of multidimensional matrices. This might involve updating some matrix decomposition classes to work with jagged arrays.
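
To make the two representations concrete, the sketch below converts a multidimensional matrix into a jagged array. `JaggedConversionSketch` is a hypothetical helper written for illustration, not a reference to an existing framework method.

```csharp
using System;

public static class JaggedConversionSketch
{
    // Copies a rectangular double[,] into a double[][] with one inner array per row.
    public static double[][] ToJagged(double[,] matrix)
    {
        int rows = matrix.GetLength(0);
        int cols = matrix.GetLength(1);

        var result = new double[rows][];
        for (int i = 0; i < rows; i++)
        {
            result[i] = new double[cols];
            for (int j = 0; j < cols; j++)
                result[i][j] = matrix[i, j];
        }
        return result;
    }
}
```

With jagged arrays each row is an ordinary double[], so a single observation can be passed around without copying, which is one practical argument for the refactoring.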

SVM code that worked with version 2.11.0.0, fails to converge with 2.14.0

I had an F# sample for multiclass SVM that worked perfectly with version 2.11.0.0, but when I run the same code after upgrading to version 2.14.0, the SVM fails to converge.

The sample code I am using is here (relevant part highlighted). It is a multi-class SVM model on a subset of the MNIST digit recognition dataset:
https://github.com/mathias-brandewinder/Presentations/blob/master/Machine-Learning-With-FSharp/code/Classification/Script.fsx#L49-67

When I upgraded to 2.14.0, I get the error message below. Was there any significant change to the library that would explain this?

Thanks in advance!

Mathias

System.AggregateException: One or more errors occurred. ---> System.AggregateException: One or more errors occurred. ---> Accord.ConvergenceException: Convergence could not be attained. Please reduce the cost of misclassification errors by reducing the complexity parameter C or try a different kernel function.
at Accord.MachineLearning.VectorMachines.Learning.SequentialMinimalOptimization.Run(CancellationToken token, Double[] c)
at Accord.MachineLearning.VectorMachines.Learning.BaseSupportVectorLearning.Run(Boolean computeError, CancellationToken token)
at Accord.MachineLearning.VectorMachines.Learning.MulticlassSupportVectorLearning.<>c__DisplayClass4.<>c__DisplayClass6.b__1(Int32 j)
at System.Threading.Tasks.Parallel.<>c__DisplayClassf1.<ForWorker>b__c() at System.Threading.Tasks.Task.InnerInvoke() at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask) at System.Threading.Tasks.Task.<>c__DisplayClass11.<ExecuteSelfReplicating>b__10(Object param0) --- End of inner exception stack trace --- at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions) at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken) at System.Threading.Tasks.Task.Wait() at System.Threading.Tasks.Parallel.ForWorker[TLocal](Int32 fromInclusive, Int32 toExclusive, ParallelOptions parallelOptions, Action1 body, Action2 bodyWithState, Func4 bodyWithLocal, Func1 localInit, Action1 localFinally)
at System.Threading.Tasks.Parallel.For(Int32 fromInclusive, Int32 toExclusive, Action1 body) at Accord.MachineLearning.VectorMachines.Learning.MulticlassSupportVectorLearning.<>c__DisplayClass4.<Run>b__0(Int32 i) at System.Threading.Tasks.Parallel.<>c__DisplayClassf1.b__c()
at System.Threading.Tasks.Task.InnerInvoke()
at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
at System.Threading.Tasks.Task.<>c__DisplayClass11.b__10(Object param0)
--- End of inner exception stack trace ---
at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
at System.Threading.Tasks.Task.Wait()
at System.Threading.Tasks.Parallel.ForWorker[TLocal](Int32 fromInclusive, Int32 toExclusive, ParallelOptions parallelOptions, Action1 body, Action2 bodyWithState, Func4 bodyWithLocal, Func1 localInit, Action1 localFinally) at System.Threading.Tasks.Parallel.For(Int32 fromInclusive, Int32 toExclusive, Action1 body)
at Accord.MachineLearning.VectorMachines.Learning.MulticlassSupportVectorLearning.Run(Boolean computeError, CancellationToken token)
at <StartupCode$FSI_0002>.$FSI_0002.main@() in c:\Users\Mathias Brandewinder\Documents\Visual Studio 2013\Projects\AccordDemo\AccordDemo\Script.fsx:line 59
---> (Inner Exception #0) System.AggregateException: One or more errors occurred. ---> Accord.ConvergenceException: Convergence could not be attained. Please reduce the cost of misclassification errors by reducing the complexity parameter C or try a different kernel function.
at Accord.MachineLearning.VectorMachines.Learning.SequentialMinimalOptimization.Run(CancellationToken token, Double[] c)
at Accord.MachineLearning.VectorMachines.Learning.BaseSupportVectorLearning.Run(Boolean computeError, CancellationToken token)
at Accord.MachineLearning.VectorMachines.Learning.MulticlassSupportVectorLearning.<>c__DisplayClass4.<>c__DisplayClass6.b__1(Int32 j)
at System.Threading.Tasks.Parallel.<>c__DisplayClassf1.<ForWorker>b__c() at System.Threading.Tasks.Task.InnerInvoke() at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask) at System.Threading.Tasks.Task.<>c__DisplayClass11.<ExecuteSelfReplicating>b__10(Object param0) --- End of inner exception stack trace --- at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions) at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken) at System.Threading.Tasks.Task.Wait() at System.Threading.Tasks.Parallel.ForWorker[TLocal](Int32 fromInclusive, Int32 toExclusive, ParallelOptions parallelOptions, Action1 body, Action2 bodyWithState, Func4 bodyWithLocal, Func1 localInit, Action1 localFinally)
at System.Threading.Tasks.Parallel.For(Int32 fromInclusive, Int32 toExclusive, Action1 body) at Accord.MachineLearning.VectorMachines.Learning.MulticlassSupportVectorLearning.<>c__DisplayClass4.<Run>b__0(Int32 i) at System.Threading.Tasks.Parallel.<>c__DisplayClassf1.b__c()
at System.Threading.Tasks.Task.InnerInvoke()
at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
at System.Threading.Tasks.Task.<>c__DisplayClass11.b__10(Object param0)
---> (Inner Exception #0) Accord.ConvergenceException: Convergence could not be attained. Please reduce the cost of misclassification errors by reducing the complexity parameter C or try a different kernel function.
at Accord.MachineLearning.VectorMachines.Learning.SequentialMinimalOptimization.Run(CancellationToken token, Double[] c)
at Accord.MachineLearning.VectorMachines.Learning.BaseSupportVectorLearning.Run(Boolean computeError, CancellationToken token)
at Accord.MachineLearning.VectorMachines.Learning.MulticlassSupportVectorLearning.<>c__DisplayClass4.<>c__DisplayClass6.b__1(Int32 j)
at System.Threading.Tasks.Parallel.<>c__DisplayClassf`1.b__c()
at System.Threading.Tasks.Task.InnerInvoke()
at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
at System.Threading.Tasks.Task.<>c__DisplayClass11.b__10(Object param0)<---
<---

---> (Inner Exception #1) System.AggregateException: One or more errors occurred. ---> Accord.ConvergenceException: Convergence could not be attained. Please reduce the cost of misclassification errors by reducing the complexity parameter C or try a different kernel function.
at Accord.MachineLearning.VectorMachines.Learning.SequentialMinimalOptimization.Run(CancellationToken token, Double[] c)
at Accord.MachineLearning.VectorMachines.Learning.BaseSupportVectorLearning.Run(Boolean computeError, CancellationToken token)
at Accord.MachineLearning.VectorMachines.Learning.MulticlassSupportVectorLearning.<>c__DisplayClass4.<>c__DisplayClass6.b__1(Int32 j)
at System.Threading.Tasks.Parallel.<>c__DisplayClassf1.<ForWorker>b__c() at System.Threading.Tasks.Task.InnerInvoke() at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask) at System.Threading.Tasks.Task.<>c__DisplayClass11.<ExecuteSelfReplicating>b__10(Object param0) --- End of inner exception stack trace --- at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions) at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken) at System.Threading.Tasks.Task.Wait() at System.Threading.Tasks.Parallel.ForWorker[TLocal](Int32 fromInclusive, Int32 toExclusive, ParallelOptions parallelOptions, Action1 body, Action2 bodyWithState, Func4 bodyWithLocal, Func1 localInit, Action1 localFinally)
at System.Threading.Tasks.Parallel.For(Int32 fromInclusive, Int32 toExclusive, Action1 body) at Accord.MachineLearning.VectorMachines.Learning.MulticlassSupportVectorLearning.<>c__DisplayClass4.<Run>b__0(Int32 i) at System.Threading.Tasks.Parallel.<>c__DisplayClassf1.b__c()
at System.Threading.Tasks.Task.InnerInvoke()
at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
at System.Threading.Tasks.Task.<>c__DisplayClass11.b__10(Object param0)
---> (Inner Exception #0) Accord.ConvergenceException: Convergence could not be attained. Please reduce the cost of misclassification errors by reducing the complexity parameter C or try a different kernel function.
at Accord.MachineLearning.VectorMachines.Learning.SequentialMinimalOptimization.Run(CancellationToken token, Double[] c)
at Accord.MachineLearning.VectorMachines.Learning.BaseSupportVectorLearning.Run(Boolean computeError, CancellationToken token)
at Accord.MachineLearning.VectorMachines.Learning.MulticlassSupportVectorLearning.<>c__DisplayClass4.<>c__DisplayClass6.b__1(Int32 j)
at System.Threading.Tasks.Parallel.<>c__DisplayClassf`1.b__c()
at System.Threading.Tasks.Task.InnerInvoke()
at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
at System.Threading.Tasks.Task.<>c__DisplayClass11.b__10(Object param0)<---
<---

Stopped due to error

GC-18: Normalize Statistic models to use a common convergence controlling mechanism

The learning of Logistic Regression models is done in the same fashion as AForge.NET's Neural Networks: a Run method has to be called to run iterations (or batch of iterations) manually, until the user decides convergence has been reached.

The Cox's models, the Hidden Markov Models and Hidden Conditional Random Fields, on the other hand, control the iteration themselves. You specify the number of iterations and tolerance threshold and call them once.

We need to decide which approach is most appropriate and enforce it as a common standard for all learning methods.
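
One way to picture a shared mechanism is a small object that every learning algorithm feeds with its current objective value and that decides when to stop. The sketch below is purely illustrative: `ConvergenceSketch` and its members are hypothetical, and the relative-change criterion is only one of several reasonable choices.

```csharp
using System;

public sealed class ConvergenceSketch
{
    private double previous = double.NaN;

    public int MaxIterations { get; private set; }  // 0 means no iteration limit
    public double Tolerance { get; private set; }   // relative change threshold
    public int Iterations { get; private set; }

    public ConvergenceSketch(int maxIterations, double tolerance)
    {
        MaxIterations = maxIterations;
        Tolerance = tolerance;
    }

    // Reports the objective value of the latest iteration.
    // Returns true when the learning loop should stop.
    public bool HasConverged(double value)
    {
        Iterations++;

        bool smallChange = !double.IsNaN(previous)
            && Math.Abs(value - previous) <= Tolerance * Math.Abs(previous);
        previous = value;

        bool limitReached = MaxIterations > 0 && Iterations >= MaxIterations;
        return smallChange || limitReached;
    }
}
```

A learning method that controls its own iterations would loop until `HasConverged` returns true, while a manually driven one could expose the same object so the caller can inspect `Iterations` between calls to Run.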

Exception in KPCA

hello cesar,

when applying a transform to a double[] the Transform function throws an exception. try this adapted code from the documentation using jagged arrays:

        double[][] sourceMatrix = new double[][]
        {
            new double[] { 2.5,  2.4 },
            new double[] { 0.5,  0.7 },
            new double[] { 2.2,  2.9 },
            new double[] { 1.9,  2.2 },
            new double[] { 3.1,  3.0 },
            new double[] { 2.3,  2.7 },
            new double[] { 2.0,  1.6 },
            new double[] { 1.0,  1.1 },
            new double[] { 1.5,  1.6 },
            new double[] { 1.1,  0.9 }
        };

        // Create a new linear kernel
        IKernel kernel = new Linear();

        // Creates the Kernel Principal Component Analysis of the given data
        var kpca = new KernelPrincipalComponentAnalysis(sourceMatrix, kernel);

        // Compute the Kernel Principal Component Analysis
        kpca.Compute();

        // The following statement throws an exception:
        // An unhandled exception of type 'System.IndexOutOfRangeException' occurred in Accord.Math.dll
        double[] transformed = kpca.Transform(sourceMatrix[0]);

when using a plain PrincipalComponentAnalysis everything runs fine as expected.
is this a bug?

thank you,
matthias

Unable to build 2.13.1

Just downloaded latest version 2.13.1 and attempted to build using Visual Studio 2013. After conversion by VS to 2013 solution, fails to build successfully with 35 errors including:

C#: Several methods are applicable to '(double[][])': 'Accord.Math.Matrix.ToMatrix<T>(T[])' and 'Accord.Math.Matrix.ToMatrix<T>(T[], bool)' and 'Accord.Math.Matrix.ToMatrix<T>(T[][])' MatrixFormatter.cs  Accord.Math 334 35  C#  E:\Development\DevCsharp\Common\Accord.NET 2.13.1\Framework\Sources\Accord.Math\Formats\Base\MatrixFormatter.cs

There are multiple instances of the above error.

C#: Unexpected end of expression    Tools.cs    Accord.Math 555 30  C#  E:\Development\DevCsharp\Common\Accord.NET 2.13.1\Framework\Sources\Accord.Math\Tools.cs

Source: Int32 i = (Int32)&f;

Also same error occurs in
Accord.Neuro.LevenbergMarquardtLearning.cs;
Accord.Neuro.ParallelResilientBackPropagationLearning.cs
in the Parallel.For construction

Any thoughts gratefully received.

LibBFGSComparisonTest failures

For a while now, 4 of the 7 unit tests in the LibBFGSComparisonTest class,

ParameterBatchTest
ParameterBatchTest2
ParameterRangeTest
ParameterTest1

are failing with the following failure message:

Assert.AreEqual failed. Expected: <LBFGS_CONVERGENCE>. Actual:<LBFGS_SUCCESS>.

For all but one failing tests the stack trace starts with the following line:

at Accord.Tests.Math.LibBFGSComparisonTest.compute(List`1 problems, LBFGSComparer cmp) in LibBFGSComparisonTest.cs: line 301

For the remaining failing unit test (ParameterRangeTest) this stack trace is displayed:

at Accord.Tests.Math.LibBFGSComparisonTest.ParameterRangeTest() in LibBFGSComparisonTest.cs: line 272

NOTE! These failures have been observed when running unit tests for the PCL version of Accord.NET Framework (I am currently having a few issues when trying to run the main repository unit tests). Nevertheless, the code bases should be equivalent, and I expect that the failures would also show up in the main repository.

Regards,
Anders @ Cureos

Update NuGet packages

The NuGet packages need to be upgraded so we can provide framework-specific assembly versions. The Accord.MachineLearning.GPL assembly is also missing from the current distribution and should be added.

Add new HMM methods to compute the probability of a state inside an observation sequence

Currently, it is possible to compute the probability of a state at the end of a sequence using the Forward/Viterbi methods. However, it should also be possible to compute the probability of a state in the middle of an observation sequence using both the forward and backward matrices.

https://groups.google.com/forum/#!topic/accord-net/D6nOhxngcW0
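
For reference, the standard forward-backward expression for the probability of being in state i at time t, given the whole observation sequence O = o_1 ... o_T and the model \lambda, is shown below. The notation follows the usual textbook convention and is not taken from the framework's source.

```latex
\alpha_t(i) = P(o_1,\dots,o_t,\; q_t = i \mid \lambda), \qquad
\beta_t(i) = P(o_{t+1},\dots,o_T \mid q_t = i,\; \lambda),
\qquad
\gamma_t(i) = P(q_t = i \mid O, \lambda)
            = \frac{\alpha_t(i)\,\beta_t(i)}{\sum_{j=1}^{N} \alpha_t(j)\,\beta_t(j)}
```

At t = T the backward term \beta_T(i) equals 1 for every state, which recovers the end-of-sequence case already supported through the forward matrix.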

Add more statistical measures to distribution classes

It would be nice to add some missing distribution metrics, such as Skewness, Kurtosis, moment generation functions and others to most of the distribution classes.

Implement more statistical distributions

Implement the missing distributions described in http://www.itl.nist.gov/div898/handbook/eda/section3/eda366.htm

Specialized regression classes

It would be interesting to offer specialized regression classes for exponential, logarithmic and power regressions, in the same way PolynomialRegression is currently handled.

K-medoids algorithm

Add the k-medoids algorithm in the clustering module.

A description of the k-medoids algorithm can be found on Wikipedia, along with worked examples that would help in testing and verifying an actual implementation:

http://en.wikipedia.org/wiki/K-medoids

For practical purposes, there is also a k-medoids implementation available on MATLAB Central. Since this implementation is available under the BSD license, anyone willing to work on this feature can leverage the source code on this page and include the original copyright text for the original author, Mo Chen.

In order to implement a new clustering algorithm, start by copying all the contents of the existing KMeans.cs file, and simply rename the class to KMedoids. Then, replace the core algorithm with the k-medoids one. This will make it easier to get the IUnsupervisedLearning interface implementation right.
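
The part that differs most from k-means is the centre update: instead of averaging, the new centre of a cluster is the member with the smallest total distance to all other members. The sketch below shows only that step, with hypothetical names and Euclidean distance assumed. It does not implement IUnsupervisedLearning or the full assignment loop.

```csharp
using System;

public static class MedoidSketch
{
    // Returns the index of the point whose summed distance to all other
    // points in the cluster is smallest (ties resolve to the lowest index).
    public static int FindMedoid(double[][] points)
    {
        if (points == null || points.Length == 0)
            throw new ArgumentException("The cluster must contain at least one point.", "points");

        int best = 0;
        double bestCost = double.MaxValue;
        for (int i = 0; i < points.Length; i++)
        {
            double cost = 0;
            for (int j = 0; j < points.Length; j++)
                cost += Euclidean(points[i], points[j]);

            if (cost < bestCost) { bestCost = cost; best = i; }
        }
        return best;
    }

    private static double Euclidean(double[] a, double[] b)
    {
        double sum = 0;
        for (int k = 0; k < a.Length; k++)
            sum += (a[k] - b[k]) * (a[k] - b[k]);
        return Math.Sqrt(sum);
    }
}
```

Because the medoid is always an actual data point, the same step works with any distance function, which is the main practical difference from k-means.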

Bug in Singular Value Decomposition

There is an example of using SVD for LSI at http://nlp.stanford.edu/IR-book/html/htmledition/latent-semantic-indexing-1.html

When I use the SingularValueDecomposition class to compute the SVD, I get different results.
Here is a program that reproduces the example:

    internal class Program
    {
        private static void Main(string[] args)
        {
            SvdTest();
            Console.ReadLine();
        }


        private static void SvdTest()
        {
            var matrix = new double[,]
                         {
                             {1, 0, 1, 0, 0, 0}, 
                             {0, 1, 0, 0, 0, 0},
                             {1, 1, 0, 0, 0, 0},
                             {1, 0, 0, 1, 1, 0},
                             {0, 0, 0, 1, 0, 1},

                         };
            PrintMatrix(matrix, "Original");

            var svd = new SingularValueDecomposition(matrix,true,true,true,true);


            var u = svd.LeftSingularVectors;
            PrintMatrix(u, "U");

            var s = svd.DiagonalMatrix;
            PrintMatrix(s, "S");

            var vt = svd.RightSingularVectors.Transpose();
            PrintMatrix(vt, "Vt");

            var powerOfsigma = svd.Diagonal.Length;
            const int reducedPower = 2;

            var toRemove = powerOfsigma - reducedPower;

            var indexes = GetIndexes(powerOfsigma, toRemove);

            var semanticSpace = s.Remove(indexes, indexes)
                .Multiply(vt.Remove(indexes, null));

            PrintMatrix(semanticSpace, "reduced document semantic space model");
        }


        private static int[] GetIndexes(int value, int nTimes)
        {
            return IterateFromValue(value, nTimes).ToArray();
        }

        private static IEnumerable<int> IterateFromValue(int value, int nTimes)
        {
            for (int i = 0; i < nTimes; i++)
                yield return --value;
        }

        private static void PrintMatrix(double[,] doubles, string name)
        {
            Console.WriteLine(name);
            for (int i = 0; i < doubles.GetLength(0); i++)
            {
                for (int j = 0; j < doubles.GetLength(1); j++)
                    Console.Write(doubles[i, j].ToString("0.00") + "\t");
                Console.WriteLine();
            }
            Console.WriteLine();
        }
    }

Unsafe index out of range in InfiniteAdaptiveGaussKronrod

The current InfiniteAdaptiveGaussKronrod implementation has been translated/adapted from FORTRAN with the help of a Fortran to C++ converter. It has then been converted to C# using unsafe blocks to make handling pointers (as in the original code) easier.

However, there seems to be an index-out-of-range error in the allocated unsafe memory blocks, which might have been corrupting the stack and/or the heap. For the time being, the allocated memory is being padded with extra leading and trailing zeros so the out-of-range accesses don't touch protected memory.

Multilabel SVM - Different outputs with same SMO learning parameters

Hi Cesar,

I just noticed something weird when training Multilabel SVMs and am not sure if there is a bug involved here. When taking the example code from the documentation to setup the machines

teacher.Algorithm = (svm, classInputs, classOutputs, i, j) =>
    new SequentialMinimalOptimization(svm, classInputs, classOutputs);

for the given examples of [0,3,1,2] I get the following output:

 -1 1 -1
 -1 -1 -1
 -1 1 -1
 -1 -1 -1

When configuring the training algorithm like this (setting the values of complexity, epsilon and tolerance to the same values is just for making the point):

        teacher.Algorithm = (svm, classInputs, classOutputs, a, b) =>
           {
               var smo = new SequentialMinimalOptimization(svm, classInputs, classOutputs);
               smo.Complexity = smo.Complexity;
               smo.Epsilon = smo.Epsilon;
               smo.Tolerance = smo.Tolerance;
               return smo;
           };

the output is now:

-1 1 -1
-1 -1 1
-1 1 -1
-1 -1 -1

(notice the different value in the second row, third column)
Even though the configured parameters of the learning algorithm should not have changed, the output of the machine is different.
Is this intended behaviour?

thank you for your great work!

Organize unit tests dependencies

In the main project, certain projects don't reference each other for obvious reasons. For example, Accord.Math cannot reference Accord.MachineLearning, but actually it is Accord.MachineLearning that should reference Accord.Math.

However, the same cannot be said for the unit tests. In the current version, the unit tests have references among themselves, which might be a bit confusing, especially if you are trying to disentangle them from the main project.

As such, uncommon dependencies must be removed, and the resulting broken tests and references should be fixed.

Term Frequency - Inverse Document Frequency (TF-IDF)

It would be a nice addition for those using Accord.NET in text applications, especially now that more linear optimization algorithms are available.

Kory Becker has created a nice implementation (under a compatible license) which can be used as a basis for the implementation. The current code can be found here:

https://github.com/primaryobjects/TFIDF/blob/master/TFIDFExample/TFIDF.cs

However, it seems the implementation needs a stemmer. In this case, the stemmer could be incorporated in the project by specifying a new ITextStemmer interface. Different stemmers could then be created using the Snowball project:

https://github.com/cesarsouza/snowball

It would be cool to add a new text generator in Snowball for C#. It shouldn't be that difficult given that there are working Java generators available.
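To make the proposal concrete, here is a minimal sketch of how a TF-IDF transform could sit on top of an ITextStemmer interface. Only the interface name comes from the text above; its single Stem method, the LowercaseStemmer placeholder and the TfIdfSketch class are assumptions for illustration, and the weighting used (relative term frequency times log(N / df)) is just one common variant, not necessarily the one in the linked implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape for the ITextStemmer interface suggested above.
public interface ITextStemmer
{
    string Stem(string word);
}

// Placeholder stemmer that only lower-cases its input;
// a Snowball-generated stemmer would take its place.
public sealed class LowercaseStemmer : ITextStemmer
{
    public string Stem(string word)
    {
        return word.ToLowerInvariant();
    }
}

public static class TfIdfSketch
{
    // Returns one (term -> weight) dictionary per input document.
    public static List<Dictionary<string, double>> Transform(
        IList<string> documents, ITextStemmer stemmer)
    {
        var separators = new[] { ' ', '\t', '\n', '\r', '.', ',', ';', ':', '!', '?' };

        // Tokenize and stem every document.
        var tokenized = documents
            .Select(d => d.Split(separators, StringSplitOptions.RemoveEmptyEntries)
                          .Select(w => stemmer.Stem(w))
                          .ToList())
            .ToList();

        // Document frequency: number of documents containing each term.
        var documentFrequency = new Dictionary<string, int>();
        foreach (var tokens in tokenized)
        {
            foreach (var term in tokens.Distinct())
            {
                int count;
                documentFrequency.TryGetValue(term, out count);
                documentFrequency[term] = count + 1;
            }
        }

        // Weight = (term count / document length) * log(N / document frequency).
        var result = new List<Dictionary<string, double>>();
        foreach (var tokens in tokenized)
        {
            var weights = new Dictionary<string, double>();
            foreach (var group in tokens.GroupBy(t => t))
            {
                double tf = (double)group.Count() / tokens.Count;
                double idf = Math.Log((double)documents.Count / documentFrequency[group.Key]);
                weights[group.Key] = tf * idf;
            }
            result.Add(weights);
        }
        return result;
    }
}
```

With the documents "The cat sat", "The dog sat" and "The bird flew", the term "the" appears in every document and therefore gets weight 0, while "cat" only appears in the first one and gets a positive weight there.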

K-means serialization compatibility

K-Means objects serialized using v2.10 cannot be deserialized in v2.14.

SauvolaThreshold and NiblackThreshold filters always produce an entirely black image

The code is very simple:

    // var filter = new SauvolaThreshold();
    var filter = new NiblackThreshold();
    using (var b = new Bitmap(baseDir + "demo.jpg"))
    {
        //filter.K = 0.1;
        //filter.R = 110;
        //filter.Radius = 10;
        ImageBox.Show(filter.Apply(b));
    }

demo.jpg is the "Original image" extracted from the link below:
http://fiji.sc/wiki/index.php/Auto_Local_Threshold#Try_all
I tried the default parameters and many others; the result is always the same.
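For reference when checking the output by hand, the sketch below evaluates the textbook Niblack and Sauvola thresholds for a single window of gray values. These are the commonly published formulas (Niblack: mean + k * stddev; Sauvola: mean * (1 + k * (stddev / R - 1))); whether the framework's filters use exactly these definitions, the same sign convention for k, or the same pixel value range is not shown in this report, so treat the class and method names as hypothetical.

```csharp
using System;

public static class LocalThresholdSketch
{
    // Population mean and standard deviation of the gray values in one window.
    public static void MeanAndStdDev(double[] window, out double mean, out double stdDev)
    {
        double sum = 0.0;
        foreach (double v in window) sum += v;
        mean = sum / window.Length;

        double squares = 0.0;
        foreach (double v in window) squares += (v - mean) * (v - mean);
        stdDev = Math.Sqrt(squares / window.Length);
    }

    // Textbook Niblack threshold.
    public static double Niblack(double mean, double stdDev, double k)
    {
        return mean + k * stdDev;
    }

    // Textbook Sauvola threshold; r is the assumed dynamic range of the standard deviation.
    public static double Sauvola(double mean, double stdDev, double k, double r)
    {
        return mean * (1.0 + k * (stdDev / r - 1.0));
    }

    public static void Main()
    {
        double mean, stdDev;
        MeanAndStdDev(new double[] { 100, 120, 140, 160 }, out mean, out stdDev);
        Console.WriteLine("Niblack: " + Niblack(mean, stdDev, -0.2));
        Console.WriteLine("Sauvola: " + Sauvola(mean, stdDev, 0.5, 128));
    }
}
```

If every pixel of the image falls on the same side of thresholds computed this way, the parameters are the likely culprit; if not, the all-black output would point at the filter itself.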

Add Local Contrast Normalization

It would be interesting to add Local Contrast Normalization, as referenced in:

Jarrett, Kavukcuoglu, Ranzato and LeCun (2009); What is the Best Multi-Stage Architecture for Object Recognition? Available at http://yann.lecun.com/exdb/publis/pdf/jarrett-iccv-09.pdf

Pinto, Cox and DiCarlo (2008); Why is Real-World Visual Object Recognition Hard? Available at http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.0040027

Multiple Linear Regression Without Intercept term

Hi all,

First, congratulations on these libraries.
I'm new and I'm facing a problem (or bug?) whose cause I can't identify. When I run the sample application "Regression", I load an Excel file and, in the code-behind file, change the boolean to false when creating a new multiple linear regression, so I have:

mlr = new MultipleLinearRegressionAnalysis(input, output, independentNames, dependentName, false);

But when I call the Compute method, I get an ArgumentOutOfRangeException (value must be between 0 and 1), whereas the same data works in Statgraphics.

Thanks in advance.
Stéphane.

Distribution issue?

I used the latest installer (Accord.NET-2.13.1-installer) and the Accord.Controls.Audio.dll assembly in the Accord.NET\Framework\Release\net35 folder seems to have been built with .NET 4. Thus, I cannot use it in a project targeting .NET 3.5. I don't know if this applies to other assemblies as well.

Accord.Extensions class

The newly introduced Extensions class collides with the Accord.NET Extensions Framework, whose base namespace is Accord.Extensions. This causes major usability problems when the two libraries are used together.

Simply renaming the Extensions class (I see that it only contains extension methods) would resolve the problem without breaking compatibility.

Implement Tietjen-Moore Test

In short, a new TietjenMooreTest class should be created in the Accord.Statistics.Testing namespace. However, implementing this test should be a bit harder than #19, since the critical value is determined by simulation. Perhaps a new TietjenMooreDistribution should be created for this statistic so the test class can inherit from HypothesisTest.

Details are given here: http://www.itl.nist.gov/div898/handbook/eda/section3/eda35h2.htm

Merge with AForge.NET

In the early days, Accord.NET was created and designed as an extension framework for AForge.NET. However, public support for AForge.NET ended in 2012. Thus, the current plan is to incorporate AForge.NET directly into Accord.NET, merging the two.

This will solve accord-net/base#3

Add support for batch PLS (also called extended PLS or multiway PLS)

It would be nice to add support for batch PLS. Some papers discussing the approach are available here:

http://automatica.dei.unipd.it/public/Schenato/PSC/2010_2011/gruppo4-Building_termo_identification/IdentificazioneTermodinamica20072008/Biblio/sdarticle4.pdf

http://upcommons.upc.edu/e-prints/bitstream/2117/11636/1/Paper_369_Mujica.pdf

KNearestNeighborMatching should accept a main anchor image to create the k-NN model

Currently, the KNearestNeighborMatching class matches two images by creating a k-NN model from the image with the largest number of feature points, to speed up computations. However, this behavior isn't always desired. It would be more useful to let the caller provide the most important image in the class constructor, or to offer this swapping as an option.

Binomial probability mass function result differs from Excel result

I'm trying to port some functionality that was formerly provided via an Excel sheet into a C# application, but the probability mass function differs from the Excel function for some reason.

In Excel, the probability mass function is used this way:

=BINOM.DIST(250; 3779; 0.0638; FALSE)

Result:
0.021944019794458

When I try it with Accord.NET:

var binom = new BinomialDistribution(3779, 0.0638);
binom.ProbabilityMassFunction(250);

Result:
Infinity

But the cumulative distribution seems to work properly (except for the last few digits, but I assumed this is just some kind of precision error)

=BINOM.DIST(250; 3779; 0.0638; TRUE)

Result:
0.736156366002849

var binom = new BinomialDistribution(3779, 0.0638);
binom.DistributionFunction(250);

Result:
0.736156366002318

Why are the results so different? And is there a way to get the Excel result with Accord?

EDIT: Extreme.Numerics calculates the same result as Excel, but I don't want to use this library, as the license system of this library always led to trouble in the past.
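One plausible explanation for the Infinity result, stated here as an assumption since the framework's implementation is not shown in this report, is an overflow in an intermediate value: the binomial coefficient C(3779, 250) is far larger than the largest double (about 1.8e308), even though the final probability is an ordinary number. The sketch below sidesteps that by working in log space. It is a standalone illustration, not the framework's code, and the class and method names are hypothetical.

```csharp
using System;

public static class BinomialPmfSketch
{
    // log C(n, k) computed as a sum of logarithms, so the huge coefficient is never formed.
    public static double LogChoose(int n, int k)
    {
        if (k < 0 || k > n) return double.NegativeInfinity;
        k = Math.Min(k, n - k); // use the symmetry C(n, k) = C(n, n - k)
        double sum = 0.0;
        for (int i = 1; i <= k; i++)
            sum += Math.Log(n - k + i) - Math.Log(i);
        return sum;
    }

    // P(X = k) for X ~ Binomial(n, p), assuming 0 < p < 1.
    public static double Pmf(int k, int n, double p)
    {
        if (k < 0 || k > n) return 0.0;
        double logPmf = LogChoose(n, k)
                      + k * Math.Log(p)
                      + (n - k) * Math.Log(1.0 - p);
        return Math.Exp(logPmf);
    }

    public static void Main()
    {
        // Same arguments as the Excel call quoted above.
        Console.WriteLine(Pmf(250, 3779, 0.0638));
    }
}
```

For the arguments in this report the result should agree with the Excel value quoted above up to rounding in the last digits.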

How to get the matching corners to extract one image from another with QuadrilateralTransformation?

Hi, I'm trying to detect an image inside another image. I use the FREAK detector, KNearestNeighborMatching and RansacHomographyEstimator, and I have the homography matrix (MatrixH).

I can't find a way to obtain the corners to use with the AForge filter QuadrilateralTransformation() in order to extract the image.
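One way to get the four corners is to push the corners of the smaller (template) image through the homography and apply the perspective division. The sketch below does this on a plain 3x3 array so that it stays independent of any framework type; the class and method names are hypothetical. It assumes the matrix maps template coordinates to scene coordinates. If the estimator was given the point sets in the other order, the matrix would have to be inverted first.

```csharp
using System;

public static class HomographyCornersSketch
{
    // Applies a 3x3 homography h to the point (x, y), including the perspective division.
    public static double[] Project(double[,] h, double x, double y)
    {
        double w = h[2, 0] * x + h[2, 1] * y + h[2, 2];
        double px = (h[0, 0] * x + h[0, 1] * y + h[0, 2]) / w;
        double py = (h[1, 0] * x + h[1, 1] * y + h[1, 2]) / w;
        return new[] { px, py };
    }

    // Projects the corners of a width x height template, in the order
    // top-left, top-right, bottom-right, bottom-left.
    public static double[][] ProjectCorners(double[,] h, int width, int height)
    {
        return new[]
        {
            Project(h, 0, 0),
            Project(h, width, 0),
            Project(h, width, height),
            Project(h, 0, height)
        };
    }

    public static void Main()
    {
        // A pure translation by (10, 20), used here only as an easy-to-check example.
        var h = new double[,] { { 1, 0, 10 }, { 0, 1, 20 }, { 0, 0, 1 } };
        foreach (var corner in ProjectCorners(h, 100, 50))
            Console.WriteLine(corner[0] + ", " + corner[1]);
    }
}
```

The four projected points, rounded to integer coordinates, describe the quadrilateral that the template occupies in the larger image, which is the kind of input a quadrilateral extraction filter needs.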

GC-25: Implement the implicitly restarted Lanczos method for finding Eigenvalues and Eigenvectors

It would be useful to have an implementation of the implicitly restarted Lanczos method (http://en.wikipedia.org/wiki/Lanczos_algorithm). It could improve the efficiency of methods which require only the largest, or the first few largest, eigenvalues to be computed.

HiddenConditionalRandomField is not correctly serialized

How to reproduce the issue:

Process 1:
1. Train a HiddenConditionalRandomField
2. Save the HiddenConditionalRandomField

Process 2:
1. Load the HiddenConditionalRandomField
2. Lots of fields, such as HiddenConditionalRandomField.Function.Factors.EdgeParameters, are initialized to 0, which is apparently not correct.

Incorrect calculation of WaveDecoder.AverageBitsPerSecond

In the WaveDecoder class, AverageBitsPerSecond is derived from WaveStream.Format.AverageBytesPerSecond. However, AverageBytesPerSecond only needs to be multiplied by 8 to yield AverageBitsPerSecond, nothing else.

As far as I can tell, this error has very little impact. I have only been able to identify two usages of the AverageBitsPerSecond property in the framework solution, both in one WaveEncoder unit test. In these lines, the expected assertion values should be changed to 1411200.
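As a quick check of the expected value: 1411200 bits per second is what the multiply-by-8 rule gives for 176400 bytes per second, which would correspond to 16-bit stereo audio sampled at 44.1 kHz. That audio format is an assumption, since the issue does not say which file the unit test uses, and the helper names below are hypothetical rather than framework members.

```csharp
using System;

public static class WaveRateSketch
{
    // Average bytes per second of uncompressed PCM audio.
    public static int AverageBytesPerSecond(int sampleRate, int channels, int bitsPerSample)
    {
        return sampleRate * channels * (bitsPerSample / 8);
    }

    // The conversion described above: bits per second = bytes per second * 8.
    public static int AverageBitsPerSecond(int averageBytesPerSecond)
    {
        return averageBytesPerSecond * 8;
    }

    public static void Main()
    {
        int bytes = AverageBytesPerSecond(44100, 2, 16); // assumed format: 44.1 kHz, stereo, 16-bit
        Console.WriteLine(bytes);                        // 176400
        Console.WriteLine(AverageBitsPerSecond(bytes));  // 1411200
    }
}
```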

GC-24: Add easier creating and handling of factors for categorical variables

In order to use some models, such as logistic regression, with categorical data, it is first necessary to convert the categories into factors. It would be nice to have an easier or more standard way to create such variables.
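To illustrate what such a helper might do, the sketch below assigns an integer level to each distinct category and then dummy-codes a column, dropping the first level as the reference category so the resulting columns can be fed to a regression with an intercept. This is only one possible design; the FactorSketch class and its methods are hypothetical and not part of the framework.

```csharp
using System;
using System.Collections.Generic;

public static class FactorSketch
{
    // Assigns each distinct category an integer level, in order of first appearance.
    public static Dictionary<string, int> BuildLevels(IEnumerable<string> values)
    {
        var levels = new Dictionary<string, int>();
        foreach (string value in values)
        {
            if (!levels.ContainsKey(value))
                levels[value] = levels.Count;
        }
        return levels;
    }

    // Dummy coding: one 0/1 column per level except level 0, which acts as the reference.
    public static double[][] DummyCode(IList<string> values, Dictionary<string, int> levels)
    {
        var rows = new double[values.Count][];
        for (int i = 0; i < values.Count; i++)
        {
            rows[i] = new double[levels.Count - 1];
            int level = levels[values[i]];
            if (level > 0)
                rows[i][level - 1] = 1.0;
        }
        return rows;
    }

    public static void Main()
    {
        var colors = new List<string> { "red", "green", "blue", "green" };
        var levels = BuildLevels(colors);
        foreach (double[] row in DummyCode(colors, levels))
            Console.WriteLine(string.Join(" ", row));
    }
}
```

Here "red" becomes the reference level and is encoded as all zeros, while "green" and "blue" each get their own indicator column.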

Regularization breaks LogisticRegressionAnalysis in 2.14.0

I have a set of data which I could calculate a logistic regression for in 2.13.1, but not in 2.14.0.
I still get a 'converged' result from Compute, but the coefficients are all -1.2748188443139669E+194.
If I follow the same procedure used in LogisticRegressionAnalysis, but with an IterativeReweightedLeastSquares whose Regularization is set to 0, everything works as in 2.13.1.

I've uploaded the data that shows the problem