goml's People

Contributors

arianht, ashnair1, cdipaolo, jrbarron, juandes, mexeniz, piazzamp, vikashvverma

goml's Issues

Silhouette validation for clustering

Are you planning to implement the silhouette method for validating clustering results? Is this a wanted feature? I could implement it if you like.
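For reference, a minimal sketch of the mean silhouette coefficient, assuming Euclidean distance and integer cluster labels. The function names and signature are hypothetical and not part of goml. For each point, a is the mean distance to the other points in its own cluster, b is the lowest mean distance to the points of any other cluster, and the score is (b - a) / max(a, b).

```go
package main

import (
	"fmt"
	"math"
)

// dist returns the Euclidean distance between two equal-length vectors.
func dist(a, b []float64) float64 {
	var sum float64
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return math.Sqrt(sum)
}

// silhouette returns the mean silhouette coefficient for points whose
// cluster labels lie in [0, k). Points that are alone in their cluster,
// or that have no other cluster to compare against, contribute 0.
func silhouette(points [][]float64, labels []int, k int) float64 {
	if len(points) == 0 {
		return 0
	}
	var total float64
	for i, p := range points {
		// Accumulate distance sums and counts per cluster, skipping p itself.
		sums := make([]float64, k)
		counts := make([]int, k)
		for j, q := range points {
			if i == j {
				continue
			}
			sums[labels[j]] += dist(p, q)
			counts[labels[j]]++
		}
		own := labels[i]
		if counts[own] == 0 {
			continue // singleton cluster: score 0
		}
		a := sums[own] / float64(counts[own])
		b := math.Inf(1)
		for c := 0; c < k; c++ {
			if c != own && counts[c] > 0 {
				b = math.Min(b, sums[c]/float64(counts[c]))
			}
		}
		if math.IsInf(b, 1) || math.Max(a, b) == 0 {
			continue // no other cluster, or all distances are zero
		}
		total += (b - a) / math.Max(a, b)
	}
	return total / float64(len(points))
}

func main() {
	points := [][]float64{{0, 0}, {0, 1}, {10, 0}, {10, 1}}
	labels := []int{0, 0, 1, 1}
	fmt.Printf("%.3f\n", silhouette(points, labels, 2))
}
```

Scores close to 1 suggest well-separated clusters; scores near 0 or below suggest overlapping or misassigned points. This version is quadratic in the number of points, which may matter for large data sets.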

Allow users to disable stdout

It would be appreciated if there were a global flag to disable logging to standard out. When creating models, it isn't always desirable to fill the screen with model output.

Alternatively, it'd be nice if, instead of printing by default, you could call methods that return the values, like

model.OptimizationMethod() -> "Batch Gradient Ascent"
model.TrainingExamples() -> 4000

or return a struct with all the information, so that callers can decide what and where they want to print that information.
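A rough sketch of the struct-returning variant described above. TrainingSummary and its fields are hypothetical names chosen for illustration; goml does not necessarily expose anything like them.

```go
package main

import "fmt"

// TrainingSummary is a hypothetical value a model could return
// instead of printing its configuration to stdout.
type TrainingSummary struct {
	OptimizationMethod string
	TrainingExamples   int
}

// String formats the summary; the caller decides whether and where to print it.
func (s TrainingSummary) String() string {
	return fmt.Sprintf("method=%q examples=%d", s.OptimizationMethod, s.TrainingExamples)
}

func main() {
	s := TrainingSummary{
		OptimizationMethod: "Batch Gradient Ascent",
		TrainingExamples:   4000,
	}
	fmt.Println(s) // printing is now the caller's choice
}
```

Returning a plain value keeps the library silent by default and leaves formatting and destination to the caller.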

fmt.Errorf format %v reads arg #2, but call has 1 arg

This line in kmeans fails with "fmt.Errorf format %v reads arg #2, but call has 1 arg" when running tests.

A simple fix would be to replace the line in question

errors <- fmt.Errorf("ERROR: point.X must have the same dimensions as clusters (len %v). Point: %v", point)

with this

errors <- fmt.Errorf("ERROR: point.X must have the same dimensions as clusters (len %v). Point: %v", centroids, point)
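To illustrate what the mismatch does at runtime, here is a small standalone sketch. The message text and the values are made up for the example and are not goml's. The format string is held in a variable only so that the deliberately wrong call still passes the vet check that produced the error quoted above.

```go
package main

import "fmt"

// messages builds the same message twice: once with too few arguments
// and once with one argument per verb.
func messages() (bad, good string) {
	// Held in a variable so vet does not reject the deliberately wrong call.
	format := "point.X must match cluster dimensions (len %v). Point: %v"
	point := []float64{1, 2} // hypothetical data point
	clusterLen := 3          // hypothetical cluster dimension

	// One argument for two verbs: the second verb has nothing to read.
	bad = fmt.Errorf(format, point).Error()
	// Two arguments for two verbs.
	good = fmt.Errorf(format, clusterLen, point).Error()
	return bad, good
}

func main() {
	bad, good := messages()
	fmt.Println(bad)  // ends in %!v(MISSING)
	fmt.Println(good) // both values filled in
}
```

With too few arguments the program still compiles and runs, but the point lands in the "len" slot and the last verb is rendered as %!v(MISSING), so the error message is misleading rather than fatal.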

Follow-up question: is this project in active development?

TFIDF doesn't work

TFIDF doesn't work unless we actually save the DocsSeen value in the Bayes model.

Currently the struct for Word doesn't do this.

type Word struct {
    Count    []uint64
    Seen     uint64
    DocsSeen uint64 `json:"-"`
}

Should be:

type Word struct {
    Count    []uint64
    Seen     uint64
    DocsSeen uint64
}

Concurrent map access during NaiveBayes's OnlineLearning sequence

In the OnlineLearn method, words are written to the model's word-count map while the Predict, Probability, and TFIDF.InverseDocumentFrequency methods read from that same map. The only indication that training is done is that the errors channel passed to OnlineLearn is closed; only then is it safe to use the model. Calling those methods any earlier causes a runtime error from the concurrent map reads and writes.
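One common way to make this pattern safe is to guard the map with a sync.RWMutex. The sketch below uses a hypothetical wordCounts type as a stand-in for the model's map; it is not goml's implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// wordCounts is a hypothetical stand-in for a model's word map.
type wordCounts struct {
	mu     sync.RWMutex
	counts map[string]uint64
}

func newWordCounts() *wordCounts {
	return &wordCounts{counts: make(map[string]uint64)}
}

// Learn records one occurrence of word (the writer path).
func (w *wordCounts) Learn(word string) {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.counts[word]++
}

// Seen returns how often word has been learned (the reader path).
func (w *wordCounts) Seen(word string) uint64 {
	w.mu.RLock()
	defer w.mu.RUnlock()
	return w.counts[word]
}

func main() {
	w := newWordCounts()
	var wg sync.WaitGroup
	// Interleave writers and readers; without the mutex this could
	// abort with a concurrent map access error.
	for i := 0; i < 100; i++ {
		wg.Add(2)
		go func() { defer wg.Done(); w.Learn("spam") }()
		go func() { defer wg.Done(); _ = w.Seen("spam") }()
	}
	wg.Wait()
	fmt.Println(w.Seen("spam")) // 100
}
```

An RWMutex lets many readers proceed together while a writer gets exclusive access, which suits a model that is read far more often than it is updated.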

Copying lock values TFIDF

When trying to cast the NaiveBayes model to the TFIDF model, I get a go vet warning saying "TFIDF copies lock value".

There was another related issue with a fix that allowed access to the concurrent map; however, I can't find a way to cast one model to the other without this warning.

I get the same warning when running tfidf_test.go.
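One possible way around the warning, sketched with hypothetical model and view types rather than goml's own: when two named types share the same underlying struct, Go allows converting a pointer to one into a pointer to the other, so the struct (and the lock inside it) is never copied. Whether this fits goml's API is an assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// model is a hypothetical stand-in for a type that contains a lock.
type model struct {
	mu    sync.RWMutex
	words map[string]uint64
}

// view has the same underlying type as model, so *model converts to *view.
type view model

// DocsWith reads through the lock shared with the original model.
func (v *view) DocsWith(word string) uint64 {
	v.mu.RLock()
	defer v.mu.RUnlock()
	return v.words[word]
}

// asView converts the pointer; the struct and its mutex are not copied.
func asView(m *model) *view {
	return (*view)(m)
}

func main() {
	m := &model{words: map[string]uint64{"go": 3}}
	v := asView(m)
	fmt.Println(v.DocsWith("go")) // 3
}
```

A value conversion such as view(*m) would copy the whole struct, mutex included, which is what the copylocks check in go vet reports. The pointer conversion keeps both names pointing at the same data and the same lock.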

Comparison with Weka, others?

It would be very useful to compare performance (run time, memory used) with other commonly used machine learning libraries and frameworks, such as Weka and Apache Mahout.

Text models, uint8 for number of classes?

I don't know that much at the moment about ML so pardon me if this is ignorant. Is there a reason that the number of classes for text classification is limited to 255 via uint8? Would it be possible to increase this?

Roadmap / Comparison to other Go ML libraries

How does goml compare to some of the other Go libraries in terms of product vision / roadmap?

There's a decent amount of overlap in terms of the implemented algorithms / models. Is your goal to eventually include all of the other types (neural networks, collaborative filtering, etc.)? The stated goal of being more stream-oriented than batch-oriented seems to differentiate this library too.

At the end of the day, this seems like the most active repo with an exciting direction. I'm very curious to know where you plan on taking things.

Examples

I'd like to learn more about machine learning and this library looks like a good place to start building something with. Are there any examples you could post to demonstrate some simple use cases?

Bug in k nearest neighbors

In the Predict function in knn.go you "initialize" the neighbors array with random elements from k.trainingSet and then use insertSorted to insert new data into the neighbors array.

This is a problem because insertSorted requires that the array you are inserting into be sorted; it uses binary search. The random data you initialize the neighbors vector with may not be sorted.

A possible fix is to get rid of the rand package altogether, initialize the neighbors vector with the first k.K elements from k.trainingSet, and sort neighbors before calculating the nearest neighbors.

I can submit a pull request if you like.
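The proposed fix can be sketched as follows. To keep the example short it works on plain distances rather than on goml's data points, and kNearest is a hypothetical helper, not a function from knn.go.

```go
package main

import (
	"fmt"
	"sort"
)

// kNearest returns the k smallest distances in ascending order.
// It seeds the result with the first k values, sorts them, and only
// then uses binary search to insert the remaining values.
func kNearest(distances []float64, k int) []float64 {
	if k > len(distances) {
		k = len(distances)
	}
	nearest := append([]float64(nil), distances[:k]...)
	sort.Float64s(nearest) // binary search below requires sorted input

	for _, d := range distances[k:] {
		i := sort.SearchFloat64s(nearest, d)
		if i == k {
			continue // d is not closer than any kept neighbor
		}
		// Shift the tail right by one, dropping the farthest neighbor.
		copy(nearest[i+1:], nearest[i:k-1])
		nearest[i] = d
	}
	return nearest
}

func main() {
	fmt.Println(kNearest([]float64{5, 1, 4, 2, 3}, 2)) // [1 2]
}
```

Sorting the seed values once costs O(k log k) and makes every later binary-search insertion valid, which a randomly ordered seed does not guarantee.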

Remove `fmt.Printf`s?

Hello!

Great library. I noticed during tests that the code decides to just fmt.Printf. I don't want the ML lib in my app to be outputting to the console without me knowing. Can we disable that? Or provide a way to provide an alternate io.Writer?

Thanks!
