Paddler

Paddler is an open-source load balancer and reverse proxy designed specifically for optimizing servers running llama.cpp.

Typical strategies such as round robin or least connections are not effective for llama.cpp servers, which serve concurrent requests through slots used for continuous batching. What matters is how many slots a server has free, not how many connections it holds.

Paddler overcomes this by maintaining a stateful load balancer that is aware of each server's available slots, ensuring efficient request distribution. Additionally, Paddler uses agents to monitor the health of individual llama.cpp instances, providing feedback to the load balancer for optimal performance. Paddler also supports the dynamic addition or removal of llama.cpp servers, enabling integration with autoscaling tools.
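To make the idea of slot-aware balancing concrete, here is a minimal sketch in Go. It is not Paddler's actual implementation: the Upstream type, its fields, and the PickUpstream function are hypothetical names chosen for illustration, and the selection rule (most idle slots wins) is an assumption about one reasonable policy.

```go
package main

import (
	"errors"
	"fmt"
)

// Upstream is a hypothetical record a balancer could keep per llama.cpp
// instance, updated from agent reports.
type Upstream struct {
	Addr      string
	SlotsIdle int
	Healthy   bool
}

// ErrNoIdleSlots is returned when no healthy upstream has a free slot.
var ErrNoIdleSlots = errors.New("no healthy upstream with an idle slot")

// PickUpstream returns the healthy upstream with the most idle slots and
// reserves one slot on it. It is not safe for concurrent use; a real
// balancer would guard the pool with a mutex and release slots when
// requests finish.
func PickUpstream(pool []*Upstream) (*Upstream, error) {
	var best *Upstream
	for _, u := range pool {
		if !u.Healthy || u.SlotsIdle <= 0 {
			continue
		}
		if best == nil || u.SlotsIdle > best.SlotsIdle {
			best = u
		}
	}
	if best == nil {
		return nil, ErrNoIdleSlots
	}
	best.SlotsIdle-- // reserve the slot for the request being routed
	return best, nil
}

func main() {
	pool := []*Upstream{
		{Addr: "10.0.0.1:8088", SlotsIdle: 1, Healthy: true},
		{Addr: "10.0.0.2:8088", SlotsIdle: 4, Healthy: true},
		{Addr: "10.0.0.3:8088", SlotsIdle: 8, Healthy: false},
	}
	u, err := PickUpstream(pool)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("routing to", u.Addr, "idle slots left:", u.SlotsIdle)
}
```

In this example the unhealthy instance is skipped even though it reports the most idle slots, which is the kind of decision round robin cannot make.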

How it Works

Registering llama.cpp Instances

Agents should be installed alongside each llama.cpp instance so they can report its health status to the load balancer. The sequence below repeats for each agent.

sequenceDiagram
    participant loadbalancer as Paddler Load Balancer
    participant agent as Paddler Agent
    participant llamacpp as llama.cpp

    agent->>llamacpp: Hey, are you alive?
    llamacpp-->>agent: Yes, this is my health status
    agent-->>loadbalancer: llama.cpp is still working
    loadbalancer->>llamacpp: I have a request for you to handle

Tutorials

Usage

Installation

You can download the latest release from the releases page.

Alternatively, you can build the project yourself. You need Go >= 1.21 and Node.js (for the dashboard's front-end code) to build the project.

$ git clone git@github.com:distantmagic/paddler.git
$ cd paddler
$ pushd ./management
$ make esbuild # dashboard front-end
$ popd
$ go build -o paddler

Running Agents

The agent should be installed on the same host as llama.cpp.

It needs a few pieces of information:

  1. external-* tells how the load balancer can connect to the llama.cpp instance
  2. local-* tells how the agent can connect to the llama.cpp instance
  3. management-* tells where the agent should report the health status

./paddler agent \
    --external-llamacpp-host 127.0.0.1 \
    --external-llamacpp-port 8088 \
    --local-llamacpp-host 127.0.0.1 \
    --local-llamacpp-port 8088 \
    --management-host 127.0.0.1 \
    --management-port 8085

Replace hosts and ports with your own server addresses when deploying.

Running Load Balancer

The load balancer collects data from agents and exposes a reverse proxy to the outside world.

It requires two sets of flags:

  1. management-* tells where the load balancer should listen for updates from agents
  2. reverseproxy-* tells how the load balancer can be reached from outside hosts

./paddler balancer \
    --management-host 127.0.0.1 \
    --management-port 8085 \
    --reverseproxy-host 192.168.2.10 \
    --reverseproxy-port 8080

management-host and management-port in agents should be the same as in the load balancer.

You can enable the dashboard to see the status of the agents with the --management-dashboard-enable=true flag. If enabled, it is available at the management server address under the /dashboard path.

Changelog

v0.1.0

Roadmap

  • llama.cpp reverse proxy
  • Basic load balancer
  • Circuit breaker
  • OpenTelemetry observer
  • Integration with AWS Auto Scaling (and other cloud providers) - out of the box endpoint with a custom metric to scale up/down
  • Queueing requests

Community

Discord: https://discord.gg/kysUzFqSCK

Contributors

malzag, mcharytoniuk

paddler's Issues

Support custom AWS CloudWatch metrics through StatsD

It would be nice if Paddler could report custom metrics out of the box. Setting up Auto Scaling rules would be simpler.

Additionally, the Agent might use some EC2 helpers, for example:

paddler agent --external-llamacpp-host aws:metadata:local-ipv4

Instead of providing the IP address manually:

paddler agent --external-llamacpp-host 10.0.0.1
