serge-chat / serge

A web interface for chatting with Alpaca through llama.cpp. Fully dockerized, with an easy to use API.

Home Page: https://serge.chat

License: Apache License 2.0

Serge - LLaMA made easy 🦙

Serge is a chat interface crafted with llama.cpp for running GGUF models. No API keys, entirely self-hosted!

  • 🌐 SvelteKit frontend
  • 💾 Redis for storing chat history & parameters
  • ⚙️ FastAPI + LangChain for the API, wrapping calls to llama.cpp using the Python bindings

🎥 Demo:

demo.webm

⚡️ Quick start

🐳 Docker:

docker run -d \
    --name serge \
    -v weights:/usr/src/app/weights \
    -v datadb:/data/db/ \
    -p 8008:8008 \
    ghcr.io/serge-chat/serge:latest

🐙 Docker Compose:

services:
  serge:
    image: ghcr.io/serge-chat/serge:latest
    container_name: serge
    restart: unless-stopped
    ports:
      - 8008:8008
    volumes:
      - weights:/usr/src/app/weights
      - datadb:/data/db/

volumes:
  weights:
  datadb:

Then, just visit http://localhost:8008. You can find the API documentation at http://localhost:8008/api/docs.
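
To confirm that the container is actually serving both URLs, a small script can probe them. This is a minimal sketch, assuming the default port mapping from the quick start; the helper names `docs_url` and `is_reachable` are made up for this example and are not part of Serge.

```python
import urllib.error
import urllib.request

# Base URL from the quick start; change it if you mapped a different port.
BASE_URL = "http://localhost:8008"


def docs_url(base_url: str = BASE_URL) -> str:
    """Return the API documentation URL for a given Serge base URL."""
    return base_url.rstrip("/") + "/api/docs"


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP GET to `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False


if __name__ == "__main__":
    for url in (BASE_URL, docs_url()):
        state = "reachable" if is_reachable(url) else "not reachable"
        print(f"{url}: {state}")
```

If either URL is reported as not reachable right after `docker run`, the container may still be starting; `docker logs serge` shows its progress.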

🌍 Environment Variables

The following environment variables are available:

| Variable Name | Description | Default Value |
| --- | --- | --- |
| SERGE_DATABASE_URL | Database connection string | sqlite:////data/db/sql_app.db |
| SERGE_JWT_SECRET | Key for auth token encryption. Use a random string | uF7FGN5uzfGdFiPzR |
| SERGE_SESSION_EXPIRY | Duration in minutes before a user must reauthenticate | 60 |
| NODE_ENV | Node.js running environment | production |
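
Since SERGE_JWT_SECRET should be a random string rather than the published default, one way to produce one and pass it to the container is sketched below. It assumes the variables are supplied with Docker's `-e` flag and reuses the volumes and port from the quick start; the function names are illustrative only.

```python
import secrets
import shlex


def make_jwt_secret(num_bytes: int = 32) -> str:
    """Generate a URL-safe random string suitable for SERGE_JWT_SECRET."""
    return secrets.token_urlsafe(num_bytes)


def docker_run_command(jwt_secret: str, session_expiry_minutes: int = 60) -> str:
    """Build the quick-start `docker run` command with two variables overridden."""
    args = [
        "docker", "run", "-d",
        "--name", "serge",
        "-e", f"SERGE_JWT_SECRET={jwt_secret}",
        "-e", f"SERGE_SESSION_EXPIRY={session_expiry_minutes}",
        "-v", "weights:/usr/src/app/weights",
        "-v", "datadb:/data/db/",
        "-p", "8008:8008",
        "ghcr.io/serge-chat/serge:latest",
    ]
    # shlex.join quotes any argument that would be unsafe in a POSIX shell.
    return shlex.join(args)


if __name__ == "__main__":
    print(docker_run_command(make_jwt_secret()))
```

The printed command can be pasted into a shell. With Docker Compose, the same two values would go under an `environment:` key of the `serge` service instead.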

🖥️ Windows

Ensure you have Docker Desktop installed, WSL2 configured, and enough free RAM to run models.

☁️ Kubernetes

Instructions for setting up Serge on Kubernetes can be found in the wiki.

🧠 Supported Models

| Category | Models |
| --- | --- |
| Alfred | 40B-1023 |
| BioMistral | 7B |
| Code | 13B, 33B |
| CodeLLaMA | 7B, 7B-Instruct, 7B-Python, 13B, 13B-Instruct, 13B-Python, 34B, 34B-Instruct, 34B-Python |
| Codestral | 22B v0.1 |
| Gemma | 2B, 1.1-2B-Instruct, 7B, 1.1-7B-Instruct |
| Gorilla | Falcon-7B-HF-v0, 7B-HF-v1, Openfunctions-v1, Openfunctions-v2 |
| Falcon | 7B, 7B-Instruct, 40B, 40B-Instruct |
| LLaMA 2 | 7B, 7B-Chat, 7B-Coder, 13B, 13B-Chat, 70B, 70B-Chat, 70B-OASST |
| LLaMA 3 | 11B-Instruct, 13B-Instruct, 16B-Instruct |
| LLaMA Pro | 8B, 8B-Instruct |
| Med42 | 70B |
| Medalpaca | 13B |
| Medicine | Chat, LLM |
| Meditron | 7B, 7B-Chat, 70B |
| Meta-LlaMA-3 | 8B, 8B-Instruct, 70B, 70B-Instruct |
| Mistral | 7B-V0.1, 7B-Instruct-v0.2, 7B-OpenOrca |
| MistralLite | 7B |
| Mixtral | 8x7B-v0.1, 8x7B-Dolphin-2.7, 8x7B-Instruct-v0.1 |
| Neural-Chat | 7B-v3.3 |
| Notus | 7B-v1 |
| Notux | 8x7b-v1 |
| Nous-Hermes 2 | Mistral-7B-DPO, Mixtral-8x7B-DPO, Mistral-8x7B-SFT |
| OpenChat | 7B-v3.5-1210 |
| OpenCodeInterpreter | DS-6.7B, DS-33B, CL-7B, CL-13B, CL-70B |
| OpenLLaMA | 3B-v2, 7B-v2, 13B-v2 |
| Orca 2 | 7B, 13B |
| Phi 2 | 2.7B |
| Phi 3 | mini-4k-instruct, medium-4k-instruct, medium-128k-instruct |
| Python Code | 13B, 33B |
| PsyMedRP | 13B-v1, 20B-v1 |
| Starling LM | 7B-Alpha |
| SOLAR | 10.7B-v1.0, 10.7B-instruct-v1.0 |
| TinyLlama | 1.1B |
| Vicuna | 7B-v1.5, 13B-v1.5, 33B-v1.3, 33B-Coder |
| WizardLM | 2-7B, 13B-v1.2, 70B-v1.0 |
| Zephyr | 3B, 7B-Alpha, 7B-Beta |

Additional models can be requested by opening a GitHub issue. Other models are also available at Serge Models.

⚠️ Memory Usage

LLaMA will crash if you don't have enough available memory for the model you load.
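
As a rough way to judge whether a model might fit before downloading it, the weight size can be estimated from the parameter count and the quantization width. The sketch below is back-of-the-envelope arithmetic, not something Serge provides: the 1 GiB overhead allowance is an assumed placeholder, and real usage also depends on the quantization format and context length.

```python
def estimate_weights_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GiB.

    size = parameters * bits per weight / 8 bits per byte
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits_in_memory(
    params_billions: float,
    bits_per_weight: float,
    available_gib: float,
    overhead_gib: float = 1.0,  # assumed allowance for context and runtime
) -> bool:
    """Return True if the estimated weights plus overhead fit in `available_gib`."""
    needed = estimate_weights_gib(params_billions, bits_per_weight) + overhead_gib
    return needed <= available_gib


if __name__ == "__main__":
    # Example: a 7B model at 4 bits per weight on a machine with 8 GiB free.
    print(f"{estimate_weights_gib(7, 4):.2f} GiB of weights")
    print("fits" if fits_in_memory(7, 4, available_gib=8) else "does not fit")
```

Treat the result as a lower bound: if the estimate is already close to your free RAM, pick a smaller model or a lower-bit quantization.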

💬 Support

Need help? Join our Discord

🧾 License

Nathan Sarrazin and Contributors. Serge is free and open-source software licensed under the MIT License and Apache-2.0.

🤝 Contributing

If you discover a bug or have a feature idea, feel free to open an issue or PR.

To run Serge in development mode:

git clone https://github.com/serge-chat/serge.git
cd serge/
docker compose -f docker-compose.dev.yml up --build

The development container accepts a Python debugger session on port 5678. Example launch.json for VS Code:

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Remote Debug",
            "type": "python",
            "request": "attach",
            "connect": {
                "host": "localhost",
                "port": 5678
            },
            "pathMappings": [
                {
                    "localRoot": "${workspaceFolder}/api",
                    "remoteRoot": "/usr/src/app/api/"
                }
            ],
            "justMyCode": false
        }
    ]
}
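
If attaching the debugger fails, it helps to check first whether anything is listening on port 5678. The following is a minimal sketch using only the standard library; `port_is_open` is a name invented for this example. It tests TCP connectivity only and does not speak the debug protocol.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 5678, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    if port_is_open():
        print("Port 5678 is accepting connections; try attaching the debugger.")
    else:
        print("Nothing is listening on port 5678; is the dev stack running?")
```

A closed port usually means the development stack is not up yet or the port is not published by the compose file in use.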
