RandNLA

Introduction

Randomized matrix algorithms have been an active research topic in recent years. Recent developments have shown their utility in large-scale machine learning and statistical data analysis applications.

RandNLA is an implementation of many randomized algorithms for numerical linear algebra on top of NumPy/SciPy.

Some of these methods are being implemented in libraries like SciPy or scikit-learn. However, I could not find any widely used library dedicated to them, so I decided to implement one.

Motivation

Sketching is a way to compress matrices while preserving their essential properties. For some problems, sketches lead to faster ways of finding high-precision solutions to the original problem. The technique can be used for least-squares and robust regression, eigenvector analysis, non-negative matrix factorization, and more.
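As a rough illustration of the idea, here is a minimal sketch-and-solve example for least squares in plain NumPy. It is not this library's API: the function name `sketched_lstsq`, the dense Gaussian sketch, and the problem sizes are all assumptions made for the example. Note that sketch-and-solve only gives an approximate solution; high-precision methods typically use the sketch to build a preconditioner instead.

```python
import numpy as np

def sketched_lstsq(A, b, sketch_size, seed=0):
    """Approximately solve min ||Ax - b|| using a Gaussian sketch (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # Dense Gaussian sketching matrix S (sketch_size x n), scaled so E[S.T @ S] = I.
    S = rng.standard_normal((sketch_size, n)) / np.sqrt(sketch_size)
    # Solve the much smaller compressed problem min ||(SA)x - Sb||.
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x

# Tall, overconstrained test problem (sizes chosen only for illustration).
rng = np.random.default_rng(42)
A = rng.standard_normal((2000, 10))
x_true = rng.standard_normal(10)
b = A @ x_true + 0.01 * rng.standard_normal(2000)

x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
x_sketch = sketched_lstsq(A, b, sketch_size=200)

# Residuals on the ORIGINAL problem: the sketched one is slightly larger.
res_exact = np.linalg.norm(A @ x_exact - b)
res_sketch = np.linalg.norm(A @ x_sketch - b)
print(res_exact, res_sketch)
```

The sketched problem has 200 rows instead of 2000, and its solution is evaluated against the original residual to show how little is lost.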

The main idea of sketching matrices is not new. One of the best-known results behind the efficiency of random projection is the Johnson-Lindenstrauss lemma, which has a "crude" implementation in scikit-learn.
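The effect described by the lemma can be observed numerically. The following sketch uses plain NumPy rather than scikit-learn; the point count, dimensions, and the helper `pairwise_sq_dists` are assumptions chosen for illustration. It projects random points to a lower dimension with a scaled Gaussian matrix and compares pairwise squared distances before and after.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points, dim, target_dim = 50, 2000, 500

# Random high-dimensional points.
X = rng.standard_normal((n_points, dim))

# Gaussian random projection, scaled so squared lengths are preserved in expectation.
R = rng.standard_normal((dim, target_dim)) / np.sqrt(target_dim)
Y = X @ R

def pairwise_sq_dists(Z):
    """Squared Euclidean distances for all distinct pairs of rows of Z."""
    sq = np.sum(Z ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T)
    return D[np.triu_indices(len(Z), k=1)]

# Ratio of projected to original squared distance, one value per pair.
ratios = pairwise_sq_dists(Y) / pairwise_sq_dists(X)
print(ratios.min(), ratios.max())
```

All ratios stay close to 1, and the spread shrinks as `target_dim` grows.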

More recent work has been developed by Kenneth Clarkson and David Woodruff. Their paper "Low Rank Approximation and Regression in Input Sparsity Time" defines a new family of subspace embedding matrices and shows how those matrices can be used to obtain the fastest known algorithms for overconstrained least-squares regression, low-rank approximation, approximating all leverage scores, and p-regression.
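The embedding from that paper is commonly known as CountSketch: each input row is multiplied by a random sign and added into one randomly chosen output row, so applying it costs time proportional to the number of nonzeros in the input. Below is a minimal hand-written version in NumPy, meant only to show the mechanics; the function name `countsketch` and the sizes are assumptions, and SciPy offers a ready-made routine as `scipy.linalg.clarkson_woodruff_transform`.

```python
import numpy as np

def countsketch(A, sketch_size, seed=0):
    """Apply a CountSketch-style transform to the rows of a 2-D array (illustrative helper)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    # Each input row is hashed to one output bucket and given a random sign.
    buckets = rng.integers(0, sketch_size, size=n)
    signs = rng.choice([-1.0, 1.0], size=n)
    SA = np.zeros((sketch_size, d))
    # Unbuffered scatter-add, so rows landing in the same bucket accumulate.
    np.add.at(SA, buckets, signs[:, None] * A)
    return SA

rng = np.random.default_rng(1)
A = rng.standard_normal((5000, 5))
SA = countsketch(A, sketch_size=1000)

# A subspace embedding should roughly preserve ||Ax|| for vectors x.
x = rng.standard_normal(5)
norm_ratio = np.linalg.norm(SA @ x) / np.linalg.norm(A @ x)
print(SA.shape, norm_ratio)
```

The sketch matrix is never formed explicitly; only the bucket index and sign per row are stored.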

During my time at IBM Research Almaden, I have been working for the last year on an xdata open source project called libSkylark. The library is suitable for general statistical data analysis and optimization applications, but it is heavily focused on distributed systems. The quality of the project is high, but libSkylark is not as developer friendly as I would like. Even with Python bindings, many people had trouble using the library.

Contributing

First off, thanks for taking the time to contribute!

Now, take a moment to be sure your contributions make sense to everyone else, and please read the Contributing Guide before making a pull request.

Issue tracker

Found a problem? Want a new feature? First of all, check whether your issue or idea has already been reported. If it hasn't, open a new, clear, and descriptive issue.

License

See the file LICENSE for information on the history of this software, terms & conditions for usage, and a DISCLAIMER OF ALL WARRANTIES.

randnla's Issues

Set up a CI Server

Given the workflow of this project, it would be great to have a build triggered each time we want to merge something into the development branch.

Travis CI is a great tool for this, and it is free for free software projects (like this one :D).

Clean Code (PEP8)

Merging some of this library's code into the official SciPy repository caused some trouble: I had to fix formatting problems because the CI did not pass. This project must follow best practices in order to ensure a high level of code quality.

For that reason, a pep8 check is a must before submitting any code. It has to be integrated into our CI too.
