
UCall

Remote Procedure Calls Library
Up to 100x Faster than FastAPI and gRPC



RESTful API • JSON-RPC / HTTPS • HTTP • TCP
Linux • MacOS • Windows • WebAssembly
C 99 • Python 3 • JavaScript 🔜


Tired of slow REST backends or overly complex gRPC setups? UCall aims to be as approachable as FastAPI but significantly faster and more broadly applicable.

📊 FastAPI UX Comparison

FastAPI:

pip install fastapi uvicorn

from fastapi import FastAPI

server = FastAPI()

@server.get('/sum')
def sum(a: int, b: int):
    return a + b

import uvicorn
uvicorn.run(...)

UCall:

pip install ucall

from ucall.server import Server

server = Server()

@server.get('/sum')
def sum(a: int, b: int):
    return a + b

server.run()

📈 Performance Metrics

It takes over a millisecond to handle a trivial FastAPI call on a recent 8-core CPU. In that time, light could have traveled 300 km through optics, reaching a neighboring city or, in my case, a neighboring country. How does UCall compare to FastAPI and gRPC?

| Setup | 🔁 | Server | Protocol | Latency w 1 client | Throughput w 32 clients |
| --- | --- | --- | --- | --- | --- |
| FastAPI over REST | | 🐍 | REST | 1'203 μs | 3'184 rps |
| FastAPI over WebSocket | | 🐍 | REST | 86 μs | 11'356 rps ¹ |
| gRPC ² | | 🐍 | gRPC | 164 μs | 9'849 rps |
| UCall with POSIX | | C | REST or JSON-RPC | 62 μs | 79'000 rps |
| UCall with io_uring | | 🐍 | REST or JSON-RPC | 40 μs | 210'000 rps |
| UCall with io_uring | | C | REST or JSON-RPC | 22 μs | 231'000 rps |
Table legend

All benchmarks were conducted on AWS on general purpose instances with the Ubuntu 22.10 AMI. It is the first major AMI to come with Linux Kernel 5.19, featuring much wider io_uring support for networking operations. These specific numbers were obtained on beefy c7g.metal instances with Graviton 3 chips.

  • The 🔁 column marks whether the TCP/IP connection is reused across subsequent requests.
  • The "server" column gives the programming language in which the server was implemented.
  • The "latency" column reports the time between sending a request and receiving a response. μ stands for micro, so μs means microseconds.
  • The "throughput" column reports the number of requests per second when querying the same server application from multiple client processes running on the same machine.

¹ FastAPI couldn't process concurrent requests with WebSockets.

² We tried generating C++ backends with gRPC, but, suspiciously, their numbers weren't better. There is also an async gRPC option, which we did not try.
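
The ratios behind the comparison can be recomputed directly from the table. A minimal sketch, using only the numbers reported above; the `results` dictionary and the `speedup` helper are illustrative names and not part of UCall:

```python
# Latency (μs, 1 client) and throughput (rps, 32 clients), copied from the table.
results = {
    "FastAPI over REST": (1203, 3184),
    "FastAPI over WebSocket": (86, 11356),
    "gRPC": (164, 9849),
    "UCall with POSIX (C)": (62, 79000),
    "UCall with io_uring (Python)": (40, 210000),
    "UCall with io_uring (C)": (22, 231000),
}


def speedup(baseline: str, candidate: str) -> tuple[float, float]:
    """Return (latency ratio, throughput ratio) of candidate over baseline."""
    base_latency, base_rps = results[baseline]
    cand_latency, cand_rps = results[candidate]
    return base_latency / cand_latency, cand_rps / base_rps


for name in results:
    latency_x, throughput_x = speedup("FastAPI over REST", name)
    print(f"{name:30s} {latency_x:6.1f}x lower latency, {throughput_x:6.1f}x higher throughput")
```

On these numbers, the C server with io_uring comes out at roughly 55x lower latency and 73x higher throughput than FastAPI over REST.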

🛠 How Does UCall Achieve This?

How can a tiny pet-project with just a couple thousand lines of code compete with two of the most established networking libraries? UCall stands on the shoulders of Giants:

  • io_uring for interrupt-less IO.

    • io_uring_prep_read_fixed on 5.1+.
    • io_uring_prep_accept_direct on 5.19+.
    • io_uring_register_files_sparse on 5.19+.
    • IORING_SETUP_COOP_TASKRUN optional on 5.19+.
    • IORING_SETUP_SINGLE_ISSUER optional on 6.0+.
  • SIMD-accelerated parsers with manual memory control.

You have already seen the round-trip latency and the throughput in requests per second. Want to see the bandwidth? Try it yourself!

@server
def echo(data: bytes):
    return data
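
One way to try it is to time round trips of increasingly large payloads and divide the bytes moved by the elapsed time. A minimal sketch, assuming any callable that takes and returns `bytes`; the `measure_bandwidth` helper and the local `echo` stand-in are hypothetical, and a real measurement would pass a function that calls the `echo` procedure on a running server:

```python
import time
from typing import Callable


def measure_bandwidth(call: Callable[[bytes], bytes], payload_size: int, repetitions: int = 100) -> float:
    """Return the observed bandwidth in bytes per second, counting both directions."""
    payload = bytes(payload_size)
    start = time.perf_counter()
    for _ in range(repetitions):
        reply = call(payload)
        if len(reply) != payload_size:
            raise ValueError("echo returned a payload of a different size")
    # Guard against a zero reading from the timer on very fast calls.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return 2 * payload_size * repetitions / elapsed


def echo(data: bytes) -> bytes:
    # Local stand-in so the sketch runs without a server.
    return data


for size in (1_024, 65_536, 1_048_576):
    rate = measure_bandwidth(echo, size)
    print(f"{size:>9} B payloads: {rate / 1e6:,.0f} MB/s")
```

With the local stand-in the numbers only reflect function-call overhead; they become meaningful once `call` goes over the network.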

🎨 Rich Type Palette

FastAPI supports native types, while UCall also supports numpy.ndarray, PIL.Image, and other custom types. This comes in handy when you build real applications or want to deploy multi-modal AI, like we do with UForm.

from ucall.server import Server
import numpy
import PIL.Image
import uform

server = Server()
model = uform.get_model('unum-cloud/uform-vl-multilingual')

@server
def vectorize(description: str, photo: PIL.Image.Image) -> numpy.ndarray:
    image = model.preprocess_image(photo)
    tokens = model.preprocess_text(description)
    joint_embedding = model.encode_multimodal(image=image, text=tokens)

    return joint_embedding.cpu().detach().numpy()

🖥 Client Libraries

UCall offers a Python Client class and a CLI tool for easy interaction with UCall servers.

from ucall.client import Client

client = Client()
# Explicit JSON-RPC call; parameter names must match the server-side signature:
response = client({
    'method': 'vectorize',
    'params': {
        'description': description,
        'photo': image,
    },
    'jsonrpc': '2.0',
    'id': 100,
})
# Or the same with syntactic sugar:
response = client.vectorize(description=description, photo=image)
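
The sugared call is shorthand for the explicit request object. A minimal sketch of that mapping, following the JSON-RPC 2.0 request format; `build_request` is a hypothetical helper for illustration and not part of the `ucall.client` API:

```python
import itertools
import json

# Each request gets a fresh identifier so replies can be matched to calls.
_request_ids = itertools.count(1)


def build_request(method: str, **params) -> dict:
    """Assemble a JSON-RPC 2.0 request with named parameters."""
    return {
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "id": next(_request_ids),
    }


request = build_request("sum", a=2, b=3)
print(json.dumps(request))
# {"jsonrpc": "2.0", "method": "sum", "params": {"a": 2, "b": 3}, "id": 1}
```

Binary arguments such as images would need an extra encoding step before they fit into JSON, which this sketch leaves out.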

Aside from the Python Client, we provide an easy-to-use Command Line Interface, which comes with pip install ucall. It allows you to call a remote server and upload files, with direct support for images and NumPy arrays. Translating the previous example into a Bash script, to call the server on the same machine:

ucall vectorize description='Product description' -i photo=./local/path.png

To address a remote server:

ucall vectorize description='Product description' -i photo=./local/path.png --uri 0.0.0.0 --port 8545

To print the docs, use ucall -h:

usage: ucall [-h] [--uri URI] [--port PORT] [-f [FILE ...]] [-i [IMAGE ...]] [--positional [POSITIONAL ...]] method [kwargs ...]

UCall Client CLI

positional arguments:
  method                method name
  kwargs                method arguments

options:
  -h, --help            show this help message and exit
  --uri URI             server uri
  --port PORT           server port
  -f [FILE ...], --file [FILE ...]
                        method positional arguments
  -i [IMAGE ...], --image [IMAGE ...]
                        method positional arguments
  --positional [POSITIONAL ...]
                        method positional arguments

You can also annotate types explicitly to distinguish integers, floats, and strings and avoid ambiguity.

ucall auth id=256
ucall auth id:int=256
ucall auth id:str=256
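
To make the ambiguity concrete, here is a minimal sketch of how `name:type=value` arguments could be parsed. The `parse_kwarg` function and its fallback rules for untyped values are assumptions for illustration; the actual CLI may resolve untyped values differently:

```python
# Supported annotations and the Python constructors they map to.
CASTS = {"int": int, "float": float, "str": str}


def parse_kwarg(argument: str):
    """Split 'name[:type]=value' into a (name, typed value) pair."""
    key, _, raw = argument.partition("=")
    name, _, type_name = key.partition(":")
    if type_name:
        return name, CASTS[type_name](raw)
    # No annotation: guess int, then float, then fall back to str.
    for cast in (int, float):
        try:
            return name, cast(raw)
        except ValueError:
            continue
    return name, raw


print(parse_kwarg("id=256"))      # ('id', 256)
print(parse_kwarg("id:int=256"))  # ('id', 256)
print(parse_kwarg("id:str=256"))  # ('id', '256')
```

Without the annotation, a parser has to guess whether `256` is a number or a string, which is exactly the case the `:str` suffix settles.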

📊 AWS Free Tier Performance

We will leave bandwidth measurements to enthusiasts, but will share some more numbers. The conventional wisdom is that you can't squeeze high performance out of Free-Tier machines. Currently, AWS provides the following options: t2.micro and t4g.small, on older Intel and newer Graviton 2 chips, respectively. This library is so fast that it doesn't need more than 1 core, so you can run a fast server even on a tiny Free-Tier instance!

| Setup | 🔁 | Server | Clients | t2.micro | t4g.small |
| --- | --- | --- | --- | --- | --- |
| FastAPI over REST | | 🐍 | 1 | 328 rps | 424 rps |
| FastAPI over WebSocket | | 🐍 | 1 | 1'504 rps | 3'051 rps |
| gRPC | | 🐍 | 1 | 1'169 rps | 1'974 rps |
| UCall with POSIX | | C | 1 | 1'082 rps | 2'438 rps |
| UCall with io_uring | | C | 1 | - | 5'864 rps |
| UCall with POSIX | | C | 32 | 3'399 rps | 39'877 rps |
| UCall with io_uring | | C | 32 | - | 88'455 rps |

In this case, every server was bombarded by requests from 1 or a fleet of 32 other instances in the same availability zone. If you want to reproduce those benchmarks, check out the sum examples on GitHub.

📝 C API Example

UCall provides an ABI-stable C 99 interface. To use it with the CMake build system:

include(FetchContent)
FetchContent_Declare(
    ucall
    GIT_REPOSITORY https://github.com/unum-cloud/ucall
    GIT_SHALLOW TRUE
)
FetchContent_MakeAvailable(ucall)
include_directories(${ucall_SOURCE_DIR}/include)

The C usage example is a mouthful compared to Python. We wanted to make it as lightweight as possible and to allow optional arguments without dynamic allocations and named lookups. So, unlike the Python layer, we expect the user to manually extract the arguments from the call context with ucall_param_named_i64() and its siblings.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <ucall/ucall.h>

static void sum(ucall_call_t call, ucall_callback_tag_t tag) {
    (void)tag; // Unused in this example
    int64_t a = 0, b = 0;
    char printed_sum[256] = {0};
    // Extract the named arguments from the call context
    bool got_a = ucall_param_named_i64(call, "a", 0, &a);
    bool got_b = ucall_param_named_i64(call, "b", 0, &b);
    if (!got_a || !got_b) {
        ucall_call_reply_error_invalid_params(call);
        return;
    }

    // "%lld" expects a long long, so cast the int64_t sum explicitly
    int len = snprintf(printed_sum, sizeof(printed_sum), "%lld", (long long)(a + b));
    ucall_call_reply_content(call, printed_sum, len);
}

int main(int argc, char** argv) {

    // Zero-initialize both handles before passing them to the library
    ucall_server_t server = {0};
    ucall_config_t config = {0};

    ucall_init(&config, &server);
    ucall_add_procedure(server, "sum", &sum);
    ucall_take_calls(server, 0);
    ucall_free(server);
    return 0;
}
