Serge - LLaMA made easy 🦙

Serge is a chat interface crafted with llama.cpp for running Alpaca models. No API keys, entirely self-hosted!

  • ๐ŸŒ SvelteKit frontend
  • ๐Ÿ’พ Redis for storing chat history & parameters
  • โš™๏ธ FastAPI + LangChain for the API, wrapping calls to llama.cpp using the python bindings

🎥 Demo:

demo.webm

โšก๏ธ Quick start

๐Ÿณ Docker:

docker run -d \
    --name serge \
    -v weights:/usr/src/app/weights \
    -v datadb:/data/db/ \
    -p 8008:8008 \
    ghcr.io/serge-chat/serge:latest

๐Ÿ™ Docker Compose:

services:
  serge:
    image: ghcr.io/serge-chat/serge:latest
    container_name: serge
    restart: unless-stopped
    ports:
      - 8008:8008
    volumes:
      - weights:/usr/src/app/weights
      - datadb:/data/db/

volumes:
  weights:
  datadb:

Then visit http://localhost:8008/ in your browser. The API documentation is available at http://localhost:8008/api/docs

๐Ÿ–ฅ๏ธ Windows Setup

Ensure you have Docker Desktop installed, WSL2 configured, and enough free RAM to run models.

โ˜๏ธ Kubernetes & Docker Compose Setup

Instructions for setting up Serge on Kubernetes can be found in the wiki.

🧠 Supported Models

We currently support the following models:

  • Airoboros 🎈
    • Airoboros-7B
    • Airoboros-13B
    • Airoboros-30B
    • Airoboros-65B
  • Alpaca 🦙
    • Alpaca-LoRA-65B
    • GPT4-Alpaca-LoRA-30B
  • BigTrans 🗺
    • BigTrans-13B
  • Chronos 🌑
    • Chronos-13B
    • Chronos-33B
    • Chronos-Hermes-13B
  • GPT4All
    • GPT4All-13B
  • Guanaco 🦙
    • Guanaco-7B
    • Guanaco-13B
    • Guanaco-33B
    • Guanaco-65B
  • Koala 🐨
    • Koala-7B
    • Koala-13B
  • Llama 🦙
    • FinLlama-33B
    • Llama-Supercot-30B
  • Lazarus 💀
    • Lazarus-30B
  • Minotaur 🐃
    • Minotaur-15B
  • Nous 🧠
    • Nous-Hermes-13B
  • OpenAssistant 🎙️
    • OpenAssistant-30B
  • Robin
    • Robin-7B
    • Robin-13B
    • Robin-33B
    • Robin-65B
  • Samantha 👩
    • Samantha-7B
    • Samantha-13B
    • Samantha-33B
  • Tulu 🎚
    • Tulu-7B
    • Tulu-13B
    • Tulu-30B
  • Vicuna 🦙
    • Stable-Vicuna-13B
    • Vicuna-CoT-7B
    • Vicuna-CoT-13B
    • Vicuna-v1.1-7B
    • Vicuna-v1.1-13B
    • VicUnlocked-30B
    • VicUnlocked-65B
    • Vicuna-v1.3-7B
    • Vicuna-v1.3-13B
  • Wizard 🧙
    • Wizard-Mega-13B
    • Wizard-Vicuna-Uncensored-7B
    • Wizard-Vicuna-Uncensored-13B
    • Wizard-Vicuna-Uncensored-30B
    • WizardLM-30B
    • WizardLM-Uncensored-7B
    • WizardLM-Uncensored-13B
    • WizardLM-Uncensored-30B

Additional weights can be added to the weights volume (mounted at /usr/src/app/weights in the container) using docker cp:

docker cp ./my_weight.bin serge:/usr/src/app/weights/

โš ๏ธ Memory Usage

LLaMA will crash if you don't have enough available memory for the model:

| Model | Max RAM Required |
|---|---|
| 7B | 4.5GB |
| 7B-q2_K | 5.37GB |
| 7B-q3_K_L | 6.10GB |
| 7B-q4_1 | 6.71GB |
| 7B-q4_K_M | 6.58GB |
| 7B-q5_1 | 7.56GB |
| 7B-q5_K_M | 7.28GB |
| 7B-q6_K | 8.03GB |
| 7B-q8_0 | 9.66GB |
| 13B | 12GB |
| 13B-q2_K | 8.01GB |
| 13B-q3_K_L | 9.43GB |
| 13B-q4_1 | 10.64GB |
| 13B-q4_K_M | 10.37GB |
| 13B-q5_1 | 12.26GB |
| 13B-q5_K_M | 11.73GB |
| 13B-q6_K | 13.18GB |
| 13B-q8_0 | 16.33GB |
| 33B | 20GB |
| 33B-q2_K | 16.21GB |
| 33B-q3_K_L | 19.78GB |
| 33B-q4_1 | 22.83GB |
| 33B-q4_K_M | 22.12GB |
| 33B-q5_1 | 26.90GB |
| 33B-q5_K_M | 25.55GB |
| 33B-q6_K | 29.19GB |
| 33B-q8_0 | 37.06GB |
| 65B | 50GB |
| 65B-q2_K | 29.95GB |
| 65B-q3_K_L | 37.15GB |
| 65B-q4_1 | 43.31GB |
| 65B-q4_K_M | 41.85GB |
| 65B-q5_1 | 51.47GB |
| 65B-q5_K_M | 48.74GB |
| 65B-q6_K | 56.06GB |
| 65B-q8_0 | 71.87GB |
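
To pick a model size before downloading anything, the figures above can be compared against the memory you have free. The following is a minimal sketch, not part of Serge: the `MAX_RAM_GB` dictionary copies only the q4_K_M rows from the list above, and the `models_that_fit` helper and its 1 GB `headroom_gb` default are assumptions made for illustration.

```python
# Peak RAM figures in GB, copied from the q4_K_M rows listed above.
MAX_RAM_GB = {
    "7B-q4_K_M": 6.58,
    "13B-q4_K_M": 10.37,
    "33B-q4_K_M": 22.12,
    "65B-q4_K_M": 41.85,
}


def models_that_fit(available_gb, headroom_gb=1.0):
    """Return the models whose listed peak RAM plus headroom fits in available_gb.

    headroom_gb is a hypothetical safety margin for the rest of the system;
    it is an assumption of this sketch, not a figure from the Serge docs.
    """
    fitting = [
        name
        for name, needed_gb in MAX_RAM_GB.items()
        if needed_gb + headroom_gb <= available_gb
    ]
    # Order the result from smallest to largest memory requirement.
    return sorted(fitting, key=MAX_RAM_GB.get)


print(models_that_fit(16.0))
```

With 16 GB available this prints `['7B-q4_K_M', '13B-q4_K_M']`. Extend the dictionary with other rows if you plan to use a different quantization.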

💬 Support

Need help? Join our Discord

๐Ÿค Contributing

If you discover a bug or have a feature idea, feel free to open an issue or PR.

To run Serge in development mode:

git clone https://github.com/serge-chat/serge.git
cd serge
DOCKER_BUILDKIT=1 docker compose -f docker-compose.dev.yml up -d --build
