
Basalt

A Machine Learning framework from scratch in pure Mojo 🔥

About The Project

Basalt is a stand-alone machine learning framework that leverages the power of Mojo.

As discussed by Modular, Mojo is a language for the future of AI development. Built on top of MLIR technology, rather than existing GCC and LLVM approaches, Mojo looks and feels like Python code, yet performs much closer to languages like Rust or C++. Parametric functions and compile-time parameters allow the graph to be statically compiled, and having a static graph opens the door to much more aggressive performance optimizations.
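To illustrate the idea of a statically known graph (a toy sketch in plain Python, not Basalt's actual API): when every operation and its wiring are fixed before execution, running the model is a single linear pass over the node list, and a compiler could inspect, fuse, or reorder the whole program ahead of time.

```python
# Toy "define-then-run" graph: all nodes are known before execution,
# which is the property that enables ahead-of-time optimization.
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str            # operation name: "input", "add", or "mul"
    inputs: tuple = () # ids of the nodes this one consumes

@dataclass
class Graph:
    nodes: list = field(default_factory=list)

    def add(self, op, *inputs):
        # Record a node and return its id; nothing executes yet.
        self.nodes.append(Node(op, inputs))
        return len(self.nodes) - 1

    def run(self, feeds):
        # Because the node list is fixed, evaluation is one linear pass.
        values = []
        for i, n in enumerate(self.nodes):
            if n.op == "input":
                values.append(feeds[i])
            elif n.op == "add":
                values.append(values[n.inputs[0]] + values[n.inputs[1]])
            elif n.op == "mul":
                values.append(values[n.inputs[0]] * values[n.inputs[1]])
        return values[-1]

g = Graph()
x = g.add("input")
y = g.add("input")
z = g.add("mul", g.add("add", x, y), y)  # computes (x + y) * y
print(g.run({x: 2, y: 3}))  # 15
```

In Basalt the same idea is pushed further: the graph is built at compile time via parametric functions, so the "pass over the nodes" happens in the compiler rather than at runtime.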

Basalt, while still in its infancy, already achieves speeds comparable to well-established frameworks like PyTorch. Below is a snapshot of the current benchmarks. Stay posted: there is much more room for improvement, and we are upgrading the project on a daily basis.

[Benchmark chart: basalt_benchmark]

Quick Start

Try out the benchmarks yourself:

mojo -I . examples/housing.mojo
mojo -I . examples/sin_estimate.mojo
mojo -I . examples/mnist.mojo

Compare to the alternative PyTorch implementation:
Make sure to install the requirements from python-requirements.txt in your Python environment.

python examples/housing.py
python examples/sin_estimate.py
python examples/mnist.py

Roadmap

v0.1.0 ✅

  • Improve matrix multiplication and convolution kernels
  • Switch to custom Tensor and TensorShape implementations
  • Improve benchmarks and overall model execution performance
  • Add profiling and additional performance tests

v0.2.0 (WIP)

  • Add additional operators: Slice, (Un)Squeeze, Concat, Clip, Gather, Split, FMA ...
  • Better layer support and more activation functions
  • Graph submodules & graph concatenation
  • Computer vision benchmark

Long-Term

  • Better parallelization
  • GPU support
  • Reworked Dataloader
  • Autotuning and related features
  • Graph compilation optimizations
  • Operator fusion
  • ONNX / Max compatibility

Contributing

Basalt is built by community effort and relies on your expertise and enthusiasm!
Small fixes and improvements are much appreciated. If you are considering larger contributions, feel free to contact us on Discord for a smoother communication channel. If you find a bug or have an idea for a feature, please use our issue tracker. Before creating a new issue:

  • Check if the issue already exists. If an issue is already reported, you can contribute by commenting on the existing issue.
  • If not, create a new issue and include all the necessary details to understand/recreate the problem or feature request.

Creating A Pull Request

  1. Fork the Project
  2. Create your Feature Branch
  3. Commit your Changes
  4. Push to the Branch
  5. Open a Pull Request

Once your changes are pushed, navigate to your fork on GitHub and create a pull request against the original basalt-org/basalt repository.

  • Before creating a PR, make sure it doesn't break any of the unit tests (e.g. mojo run -I . test/test_ops.mojo).
  • New major features require a new test!
  • In the pull request, provide a detailed description of the changes and why they're needed. Link any relevant issues.
  • If there are any specific instructions for testing or validating your changes, include those as well.

License

Distributed under the Apache 2.0 License with LLVM Exceptions. See LICENSE and the LLVM License for more information.

Acknowledgements

basalt's People

Contributors

andresnowak, benny-nottonson, jackos, stijn-woestenborghs, stijnwoestenborghs


basalt's Issues

Debugging examples

I don't think this is the best place to ask for help on this, but ...

I want to debug the file ./examples/housing.mojo, which imports the package ./basalt. When I try to debug the file with Mojo's VSCode debugger, I get the error unable to locate module 'basalt', the same error that is thrown if I run ./examples/housing.mojo without the path argument (-I). So it seems the debugger also needs the path argument. I tried updating the launch configuration by setting the args property to ["-I", "../"] (I have also tried "." and "./basalt"), which does show up in the final command, but it still fails with the same error.

So my question is how to set up the correct debugging profile to debug the basalt package in Mojo.
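One configuration worth trying (a sketch; the "mojo-lldb" type and "mojoFile" property are what the Mojo VS Code extension used at the time of writing, so verify them against the extension's current documentation) is to launch from the repository root so the relative include path resolves:

```json
{
    "version": "0.2.0",
    "configurations": [
        {
            "type": "mojo-lldb",
            "request": "launch",
            "name": "Debug housing example",
            "mojoFile": "${workspaceFolder}/examples/housing.mojo",
            "cwd": "${workspaceFolder}",
            "args": []
        }
    ]
}
```

Note that "args" here are runtime arguments passed to the program; a compile-time include path like -I may need to go through a dedicated compiler-arguments setting of the extension instead, if one exists.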

Mojopkg gh action

I could create some GitHub Actions to run the tests and build mojopkg files for release :)
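Such a workflow might be sketched as follows (the Mojo installation step is a placeholder, since the distribution method depends on Modular's current tooling; the test command is the one given in the Contributing section above):

```yaml
name: CI
on: [push, pull_request]

jobs:
  test-and-package:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Placeholder: install the Mojo toolchain here using Modular's
      # current installation method.
      - name: Install Mojo
        run: echo "install Mojo toolchain (placeholder)"

      - name: Run unit tests
        run: mojo run -I . test/test_ops.mojo

      - name: Build mojopkg
        run: mojo package basalt -o basalt.mojopkg
```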

Constructive Critique and Improvement Suggestions for Basalt's Frontend APIs

I have tried to build a simple model with Basalt to get a sense of its frontend APIs and how it works and I have a few comments I would like to share.

It is a simple linear regression model that has the following code:

from basalt import Graph, Tensor, TensorShape, nn, dtype
from basalt.utils import tensorutils
from random import randn


fn create_model_graph() -> Graph:
    var g = Graph()

    var x = g.input(TensorShape(800, 2))
    var y_pred = nn.Linear(g, x, 1)
    g.out(y_pred)
    var y_true = g.input(TensorShape(800, 1))
    var loss = nn.MSELoss(g, y_pred, y_true)
    g.loss(loss)

    return g


fn main() raises:
    alias graph = create_model_graph()
    var model = nn.Model[graph]()
    var optim = nn.optim.Adam[graph](lr=0.01)
    optim.allocate_rms_and_momentum(model.parameters)

    var X_train_data = DTypePointer[dtype].alloc(800 * 2)
    randn(X_train_data, 800 * 2)
    var X_train = Tensor[dtype](X_train_data, TensorShape(800, 2))

    var true_weights = Tensor[dtype](TensorShape(2, 1))
    true_weights[0] = 2
    true_weights[1] = 3

    var y_train = Tensor[dtype](TensorShape(800, 1))
    tensorutils.dot[TensorShape(800, 2), TensorShape(2, 1)](
        y_train, X_train, true_weights
    )

    var X_test_data = DTypePointer[dtype].alloc(200 * 2)
    randn(X_test_data, 200 * 2)
    var X_test = Tensor[dtype](X_test_data, TensorShape(200, 2))

    var y_test = Tensor[dtype](TensorShape(200, 1))
    tensorutils.dot[TensorShape(200, 2), TensorShape(2, 1)](
        y_test, X_test, true_weights
    )

    for epoch in range(1000):
        var loss = model.forward(X_train, y_train)[0]
        optim.zero_grad(model.parameters)
        model.backward()
        optim.step(model.parameters)
        if epoch == 0 or (epoch + 1) % 100 == 0:
            print("Epoch: ", epoch + 1, ", Loss: ", loss, sep="")

    var test_loss = model.forward(X_test, y_test)[0]
    print("Test Loss: ", test_loss, sep="")
    print("Params: ", model.parameters.params.data[2])

My main point is that the frontend offers a poor user experience. It isn't clear at first glance what the Graph API is doing, and once you realize it is a DSL for describing your model before populating it, it doesn't get any easier to use. It isn't intuitive that you have to describe your model's graph in terms of input, out, loss, param, etc. nodes, and the APIs for specifying them aren't great either.
Another pain point is the API for initializing tensors. I think my code makes the problem clear: most of the main function's body is spent initializing simple train and test tensors.

I think the frontend can be a lot simpler. A good north star to work towards is PyTorch's frontend; tinygrad is an example of a recent framework that is applying this approach successfully. The result is a simple and intuitive frontend that users are either already familiar with or can easily pick up. Here is the code for the same model in PyTorch:

import torch

X = torch.randn(1000, 2)
y = X.mv(torch.tensor([2, 3], dtype=torch.float))
X_train = X[:800]
y_train = y[:800]
X_test = X[800:]
y_test = y[800:]

class LinearRegression(torch.nn.Module):
    def __init__(self):
        super(LinearRegression, self).__init__()
        self.linear = torch.nn.Linear(2, 1)

    def forward(self, x):
        return self.linear(x)

model = LinearRegression()
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(1000):
    optimizer.zero_grad()
    outputs = model(X_train)
    loss = criterion(outputs.squeeze(), y_train)
    loss.backward()
    optimizer.step()
    if epoch == 0 or (epoch + 1) % 100 == 0:
        print(f'Epoch {epoch + 1}, Loss: {loss.item()}')

with torch.no_grad():
    y_pred = model(X_test)
    loss = criterion(y_pred.squeeze(), y_test)
    print(f'Test loss: {loss.item()}')
    print(f'Model params: {model.linear.weight[0].numpy()} vs True weights: [2, 3]')

I'm new to Mojo, so I don't know how easy this would be to pull off. However, I think it is necessary if the project hopes to compete with existing alternatives (it is not enough that the code is written in Mojo), and the current frontend seems far from the simplest form it could take.

These are my two cents; I hope you find them useful.
