Giter VIP home page Giter VIP logo

react-llm's Introduction

@react-llm/headless

Easy-to-use headless React Hooks to run LLMs in the browser with WebGPU. As simple as useLLM().

image

Features:

  • Supports Vicuna 7B
  • Use custom system prompts and "user:"/"assistant:" role names
  • Completion options like max tokens and stop sequences
  • No data leaves the browser. Accelerated via WebGPU.
  • Hooks built to 'Bring your own UI'
  • Persistent storage for conversations in browser storage. Hooks for loading and saving conversations.
  • Model caching for faster subsequent loads

Installation

npm install @react-llm/headless

Packages in this repository

useLLM API

Types

// Model Initialization
init: () => void;

// Model Generation
send: (msg: string, maxTokens: number, stopSequences: string[]) => void;
onMessage: (msg: GenerateTextResponse) => void;
setOnMessage: (cb: (msg: GenerateTextResponse) => void) => void;

// Model Status
loadingStatus: InitProgressReport;
isGenerating: boolean;
gpuDevice: GPUDeviceInfo;

// Model Configuration
userRoleName: string;
setUserRoleName: (roleName: string) => void;
assistantRoleName: string;
setAssistantRoleName: (roleName: string) => void;

// Conversation Management
conversation: Conversation | undefined;
allConversations: Conversation[] | undefined;
createConversation: (title?: string, prompt?: string) => void;
setConversationId: (conversationId: string) => void;
deleteConversation: (conversationId: string) => void;
deleteAllConversations: () => void;
deleteMessages: () => void;
setConversationTitle: (conversationId: string, title: string) => void;

Hooks

import useLLM from '@react-llm/headless';

const MyComponent = () => {
  const {
    conversation,
    allConversations,
    loadingStatus,
    isGenerating,
    createConversation,
    setConversationId,
    deleteConversation,
    deleteAllConversations,
    deleteMessages,
    setConversationTitle,
    onMessage,
    setOnMessage,
    userRoleName,
    setUserRoleName,
    assistantRoleName,
    setAssistantRoleName,
    gpuDevice,
    send,
    init,
  } = useLLM();

  // Component logic...

  return null;
};

Provider

import { ModelProvider } from "@react-llm/headless";

export default function Home() {
  return (
    <ModelProvider
      config={{
        kvConfig: {
          numLayers: 64,
          shape: [32, 32, 128],
          dtype: 'float32',
        },
        wasmUrl: 'https://your-custom-url.com/model.wasm',
        cacheUrl: 'https://your-custom-url.com/cache/',
        tokenizerUrl: 'https://your-custom-url.com/tokenizer.model',
        sentencePieceJsUrl: 'https://your-custom-url.com/sentencepiece.js',
        tvmRuntimeJsUrl: 'https://your-custom-url.com/tvmjs_runtime.wasi.js',
        maxWindowSize: 2048,
        persistToLocalStorage: true,
      }}
    >
      <Chat />
    </ModelProvider>
  );
}

Packages

  • @react-llm/headless - Headless React Hooks for running LLMs in the browser
  • @react-llm/retro-ui - Retro-themed UI for the hooks

How does it work?

This library is a set of React Hooks that provide a simple interface to run LLMs in the browser. It uses Vicuna 13B.

  • SentencePiece tokenizer (compiled for the browser via Emscripten)
  • Vicuna 7B (transformed to Apache TVM format)
  • Apache TVM and MLC Relax (compiled for the browser via Emscripten)
  • Off-the-main-thread WebWorker to run the model (bundled with the library)

The model, tokenizer, and TVM runtime are loaded from a CDN (huggingface). The model is cached in browser storage for faster subsequent loads.

Example

See packages/retro-ui for the full demo code. This is a simple example of how to use the hooks. To run it, after cloning the repo,

cd packages/retro-ui
pnpm install
pnpm dev

License

MIT

The code under packages/headless/worker/lib/tvm is licensed under Apache 2.0.

react-llm's People

Contributors

davidar avatar lxfater avatar matthoffner avatar r2d4 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

react-llm's Issues

Generating tokenizer.model

Hi, I'm trying to use react-llm's browser extension with a different model (Llama2 7b). I have compiled the model to wasm per the webllm instructions, but I'm having difficulty figuring out how the tokenizer.model file is created. I can export the tokenizer for Llama 2 from huggingface, resulting in three json files, but I can't find documentation about how to port that to the necessary tokenizer.model file. Could you let me know how you generated it? Thank you for your help and for this great project!

Question: Adapting react-llm to other Llama 1/2 variants

Hi,
Thanks for the amazing project!
I'm currently trying to adapt react-llm to work with a Llama2 variant I've quantized and transformed to wasm using the MLC AI LLM library (q4f32_1). I've updated background/index.js to reflect my model location / information:

    wasmUrl: "/models/Llama-2-7b-f32-q4f32_1/Llama-2-7b-f32-q4f32_1-webgpu.wasm",
    cacheUrl: "https://huggingface.co/maryxxx/Llama-2-7b-f32_q4f32_1/resolve/main/params/",
    tokenizerUrl: "/models/Llama-2-7b-f32-q4f32_1/tokenizer.model",

The extension loads the params from hugging face successfully. It registers the following error after loading, but still presents me with the prompt popup form:
Uncaught (in promise) Error: Cannot find function encoding

Then, when I submit a prompt, I receive the following error:
Uncaught (in promise) TypeError: Cannot read properties of undefined (reading 'generate')

It seems to me like something is off in the loading process, but I'm not sure where to start with debugging -- and if the problem is my wasm model or additional parameters I need to change in the codebase.

If possible, could you please provide an overview of how you prepared the example vicuna model (in case I am missing a step during compilation) and any additional hints re: other parameters that might need to be changed in the app?

Thank you!

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.