llama-saas

A real-time client and server for LLaMA.

  • 🚀 Runs on any CPU machine, with no need for GPU 🚀
  • The server is written in Go.
  • The client is written in Python and uses requests to stream the response in real time.
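
As a rough illustration of the streaming client described in the list above, here is a minimal sketch. It is not the project's actual llama.py: the server address, the use of POST, and the idea that the prompt is sent as the raw request body are all assumptions made for the example. The only parts taken from the description are the use of requests and reading the response as it arrives.

```python
import codecs
from typing import Iterable, Iterator


def decode_chunks(chunks: Iterable[bytes]) -> Iterator[str]:
    """Decode a byte stream incrementally, so a multi-byte UTF-8
    character split across two chunks is not corrupted."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush whatever the decoder is still holding at end of stream.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


def stream_completion(url: str, prompt: str) -> Iterator[str]:
    """POST the prompt and yield text pieces as the server sends them.

    Assumption: the server accepts the prompt as the raw request body
    and streams plain text back. The real client may differ.
    """
    import requests  # third-party: python3 -m pip install requests

    # stream=True tells requests not to buffer the whole body first.
    # timeout=(5, None): 5 s to connect, no limit while tokens trickle in.
    with requests.post(url, data=prompt.encode("utf-8"),
                       stream=True, timeout=(5, None)) as resp:
        resp.raise_for_status()
        yield from decode_chunks(resp.iter_content(chunk_size=None))


# Usage (hypothetical address):
# for piece in stream_completion("http://localhost:8080/", "elaborate about GitHub"):
#     print(piece, end="", flush=True)
```

Decoding incrementally matters here because the server may flush output in the middle of a multi-byte character; decoding each chunk on its own would garble such text.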

I personally used the smallest model, 7B, on an Intel PC and a MacBook Pro. It is ~4.8G when quantized to 4 bits, or ~13G in full precision.
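
For a back-of-the-envelope check of those sizes, the sketch below computes the raw weight storage at a given bit width. The parameter count of roughly 6.7 billion for the 7B model is an assumption brought in for the example, as is reading "full precision" as 16 bits per weight.

```python
def weight_bytes(n_params: int, bits_per_weight: int) -> int:
    """Raw storage for the weights alone, ignoring any file-format overhead."""
    return n_params * bits_per_weight // 8


# Assumption: approximate parameter count of the 7B model.
N_PARAMS_7B = 6_700_000_000

for bits in (4, 16):
    size_gb = weight_bytes(N_PARAMS_7B, bits) / 1e9
    print(f"{bits:>2}-bit: {size_gb:.2f} GB")
```

Under these assumptions the 16-bit figure lands close to the ~13G quoted above. The raw 4-bit figure is only a lower bound, since quantized files also carry extra data such as scale factors, which is presumably part of why the quantized file is larger than the plain arithmetic suggests.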

Examples

  • Nice example: elaborate about "Github"

  • Biased example: elaborate about "Donald Trump"

Get LLaMA Pretrained Checkpoints

Note that LLaMA cannot be used commercially. From the official announcement:

  • To maintain integrity and prevent misuse, we are releasing our model under a noncommercial license focused on research use cases. Access to the model will be granted on a case-by-case basis to academic researchers; those affiliated with organizations in government, civil society, and academia; and industry research laboratories around the world. People interested in applying for access can find the link to the application in our research paper.

Apply for Official Access. You will get a unique download link once you are approved.

How to use

Assuming you have the LLaMA checkpoints (☝️):

  1. Clone and build https://github.com/ggerganov/llama.cpp
  2. Edit the LLAMA_MODEL_PATH and LLAMA_MAIN variables in server.go.
  3. Build and run the server:
go build
./server
  4. Run the client:
python3 -m pip install requests
python3 llama.py

References

  1. https://ai.facebook.com/blog/large-language-model-llama-meta-ai/
  2. https://github.com/ggerganov/llama.cpp

Contributors

avilum

llama-saas's Issues

Huge security downside

The project is great, but interacting with LLaMA via exec is risky: the machine hosting the model could easily be compromised through remote code execution.

I would suggest implementing C bindings in Go and calling the model through its native API.
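
To make the concern concrete: when a server launches an external binary with user-supplied text, a common mitigation short of native bindings is to pass the prompt as its own argument-vector element and never route it through a shell. The sketch below shows the idea in Python for brevity; it is not the project's server.go, and the helper names and the default token count are hypothetical. The `-m`, `-p`, and `-n` flags are the model, prompt, and token-count options of the llama.cpp main binary. Go's os/exec likewise runs commands without invoking a shell, so the same pattern carries over.

```python
import subprocess
from typing import Iterator, List


def build_command(main_path: str, model_path: str, prompt: str,
                  n_predict: int = 128) -> List[str]:
    """Build an argument vector; the prompt stays one opaque element."""
    return [main_path, "-m", model_path, "-n", str(n_predict), "-p", prompt]


def stream_output(argv: List[str]) -> Iterator[str]:
    """Run argv without a shell and yield its stdout line by line."""
    with subprocess.Popen(argv, stdout=subprocess.PIPE, shell=False,
                          encoding="utf-8", errors="replace") as proc:
        assert proc.stdout is not None
        for line in proc.stdout:
            yield line
    # Popen's context manager has waited for the process by this point.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, argv)


# Usage (hypothetical paths):
# argv = build_command("./main", "./models/7B/ggml-model-q4_0.bin", user_prompt)
# for line in stream_output(argv):
#     print(line, end="")
```

Because no shell parses the command, quotes, semicolons, or backticks in the prompt reach the binary as literal text. This does not address every risk the issue raises, such as bugs in the binary itself or unbounded resource use, so treat it as a partial measure rather than a substitute for the suggested bindings.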
