
LLMUnity

Integrate LLM models in Unity!


LLMUnity allows you to integrate, run and deploy LLMs (Large Language Models) in the Unity engine.
LLMUnity is built on top of the awesome llama.cpp and llamafile libraries.

At a glance  •  How to help  •  Setup  •  How to use  •  Examples  •  Use your own model  •  Multiple AI / Remote server setup  •  Options  •  License

At a glance

  • 💻 Cross-platform! Supports Windows, Linux and macOS (supported versions)
  • 🏠 Runs locally without internet access but also supports remote servers
  • ⚡ Fast inference on CPU and GPU
  • 🤗 Support of the major LLM models (supported models)
  • 🔧 Easy to set up, call with a single line of code
  • 💰 Free to use for both personal and commercial purposes

🧪 Tested on Unity: 2021 LTS, 2022 LTS, 2023
🚦 Upcoming Releases

How to help

  • Join us at Discord and say hi!
  • ⭐ Star the repo and spread the word about the project!
  • Submit feature requests or bugs as issues or even submit a PR and become a collaborator!

Setup

To install the package you can follow the typical asset / package process in Unity:

Method 1: Install the asset using the asset store

  • Open the LLMUnity asset page and click Add to My Assets
  • Open the Package Manager: Window > Package Manager
  • Select the Packages: My Assets option from the drop-down
  • Select the LLMUnity package, click Download and then Import

Method 2: Install the asset using the GitHub repo:

  • Open the Package Manager: Window > Package Manager
  • Click the + button and select Add package from git URL
  • Use the repository URL https://github.com/undreamai/LLMUnity.git and click Add

How to use

For a step-by-step tutorial you can have a look at our guide:

How to Use LLMs in Unity

Create a GameObject for the LLM ♟️:

  • Create an empty GameObject. In the GameObject Inspector click Add Component and select the LLM script (Scripts>LLM).
  • Download the default model with the Download Model button (this will take a while as it is ~4GB).
    You can also load your own model in .gguf format with the Load model button (see Use your own model).
  • Define the role of your AI in the Prompt. You can also define the name of the AI (AI Name) and the player (Player Name).
  • (Optional) By default the LLM script is set up to receive the reply from the model as it is produced in real-time (recommended). If you prefer to receive the full reply in one go, you can deselect the Stream option.
  • (Optional) Adjust the server or model settings to your preference (see Options).

In your script you can then use it as follows 🦄:

using UnityEngine;
using LLMUnity;

public class MyScript : MonoBehaviour {
  public LLM llm;

  void HandleReply(string reply){
    // do something with the reply from the model
    Debug.Log(reply);
  }

  void Game(){
    // your game function
    ...
    string message = "Hello bot!";
    _ = llm.Chat(message, HandleReply);
    ...
  }
}

You can also specify a function to call when the model reply has been completed.
This is useful if the Stream option is selected for continuous output from the model (default behaviour):

  void ReplyCompleted(){
    // do something when the reply from the model is complete
    Debug.Log("The AI replied");
  }
  
  void Game(){
    // your game function
    ...
    string message = "Hello bot!";
    _ = llm.Chat(message, HandleReply, ReplyCompleted);
    ...
  }
  • Finally, in the Inspector of the GameObject of your script, select the LLM GameObject created above as the llm property.

That's all ✨!

You can also:

Choose whether to add the message to the chat/prompt history

The last argument of the Chat function is a boolean that specifies whether to add the message to the history (default: true):

  void Game(){
    // your game function
    ...
    string message = "Hello bot!";
    _ = llm.Chat(message, HandleReply, ReplyCompleted, false);
    ...
  }
Wait for the reply before proceeding to the next lines of code

For this you can use the async/await functionality:

  async void Game(){
    // your game function
    ...
    string message = "Hello bot!";
    await llm.Chat(message, HandleReply, ReplyCompleted);
    ...
  }
Process the prompt at the beginning of your app for faster initial processing time
  void WarmupCompleted(){
    // do something when the warmup is complete
    Debug.Log("The AI is warm");
  }

  void Game(){
    // your game function
    ...
    _ = llm.Warmup(WarmupCompleted);
    ...
  }
Add an LLM / LLMClient component dynamically
using UnityEngine;
using LLMUnity;

public class MyScript : MonoBehaviour
{
    LLM llm;
    LLMClient llmclient;

    async void Start()
    {
        // Add and set up an LLM object
        gameObject.SetActive(false);
        llm = gameObject.AddComponent<LLM>();
        await llm.SetModel("mistral-7b-instruct-v0.1.Q4_K_M.gguf");
        llm.prompt = "A chat between a curious human and an artificial intelligence assistant.";
        gameObject.SetActive(true);
        // or an LLMClient object
        gameObject.SetActive(false);
        llmclient = gameObject.AddComponent<LLMClient>();
        llmclient.prompt = "A chat between a curious human and an artificial intelligence assistant.";
        gameObject.SetActive(true);
    }
}

Examples

The Samples~ folder contains several examples of interaction 🤖:

  • SimpleInteraction: Demonstrates simple interaction between a player and an AI
  • ServerClient: Demonstrates simple interaction between a player and multiple AIs using an LLM and an LLMClient
  • ChatBot: Demonstrates interaction between a player and an AI with a UI similar to a messaging app (see image below)

If you installed the package as an asset, the samples will already be in the Assets/Samples folder.
If you installed it with the GitHub URL instead, you can install a sample as follows:

  • Open the Package Manager: Window > Package Manager
  • Select the LLMUnity Package. From the Samples Tab, click Import next to the sample you want to install.

Each sample can be run from the Scene.unity scene inside its folder.
In the scene, select the LLM GameObject and click the Download Model button to download the default model.
You can also load your own model in .gguf format with the Load model button (see Use your own model).
Save the scene, run and enjoy!

Use your own model

Alternative models can be downloaded from HuggingFace.
The required model format is .gguf as defined by llama.cpp.
The easiest way is to download gguf models directly from TheBloke, who has converted an astonishing number of models 🌈!
Other model formats can be converted to gguf with the convert.py script of llama.cpp as described here.

❕ Before using any model make sure you check its license

Multiple AI / Remote server setup

LLMUnity allows you to run multiple AI characters efficiently.
Each character is implemented as a separate client with its own prompt (and other parameters), and all clients send their requests to a single server.
This is essential as multiple server instances would require additional compute resources.

In addition to the LLM server functionality, the LLMClient class handles the client functionality.
The LLMClient exposes a subset of the options of the LLM class described in the Options section.
To use multiple instances, you can define one LLM GameObject (as described in How to use) and then multiple LLMClient objects. See the ServerClient sample for a server-client example.

The LLMClient can be configured to connect to a remote instance by providing the IP address of the server in the host property.
The server can be either an LLMUnity server or a standard llama.cpp server.
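
For illustration only, a rough sketch of two characters sharing one server could look like the following; the character prompts, the questions and the commented-out IP address are placeholder assumptions, and the LLMClient components are assumed to be assigned in the Inspector (see the ServerClient sample for a complete example):

using UnityEngine;
using LLMUnity;

public class MultipleCharacters : MonoBehaviour
{
    // Both clients send their requests to the single LLM GameObject (set up as in How to use)
    public LLMClient knight;    // first character, assigned in the Inspector
    public LLMClient merchant;  // second character, assigned in the Inspector

    void Start()
    {
        // Each client keeps its own prompt and chat history
        knight.prompt = "A chat between an adventurer and a stoic castle guard.";
        merchant.prompt = "A chat between an adventurer and a cheerful merchant.";

        // (Optional) point a client at a remote LLMUnity or llama.cpp server
        // instead of the local LLM GameObject; the IP address below is a placeholder
        // merchant.host = "192.168.1.100";

        _ = knight.Chat("Who guards this gate?", reply => Debug.Log("Knight: " + reply));
        _ = merchant.Chat("What are you selling?", reply => Debug.Log("Merchant: " + reply));
    }
}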

Options

  • Show/Hide Advanced Options Toggle to show/hide advanced options from below

💻 Server Settings

  • Num Threads number of threads to use (default: -1 = all)
  • Num GPU Layers number of model layers to offload to the GPU. If set to 0 the GPU is not used. Use a large number, e.g. >30, to utilise the GPU as much as possible.
    If the user's GPU is not supported, the LLM will fall back to the CPU.
  • Stream select to receive the reply from the model as it is produced (recommended!).
    If it is not selected, the full reply from the model is received in one go
  • Advanced options:
    • Parallel Prompts number of prompts that can happen in parallel (default: -1 = number of LLM/LLMClient objects)
    • Debug select to log the output of the model in the Unity Editor
    • Port the port to run the server on

🤗 Model Settings

  • Download model click to download the default model (Mistral 7B Instruct)
  • Load model click to load your own model in .gguf format
  • Load lora click to load a LoRA model in .bin format
  • Model the model being used (inside the Assets/StreamingAssets folder)
  • Lora the LoRA model being used (inside the Assets/StreamingAssets folder)
  • Advanced options:
    • Context Size Size of the prompt context (0 = context size of the model)
    • Batch Size Batch size for prompt processing (default: 512)
    • Seed seed for reproducibility. For random results every time select -1
    • Temperature LLM temperature, lower values give more deterministic answers. The temperature setting adjusts how random the generated responses are: turning it up makes the generated choices more varied and unpredictable, turning it down makes the responses more predictable and focused on the most likely options.
    • Top K top-k sampling (default: 40, 0 = disabled). The top-k value limits generation to the k most probable tokens at each step. This can help fine-tune the output and make it adhere to specific patterns or constraints.
    • Top P top-p sampling (default: 0.9, 1.0 = disabled). The top-p value controls the cumulative probability of the candidate tokens: at each step, tokens are sampled from the smallest set whose cumulative probability reaches the threshold (p). Lowering this value makes the output more focused and less diverse.
    • Num Predict number of tokens to predict (default: 256, -1 = infinity, -2 = until context filled). This is the maximum number of tokens the model will generate per reply; when it is reached the model stops generating, so words or sentences may be cut off if the value is too low.
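
If you prefer setting these options from code rather than the Inspector, a minimal sketch could look like the following; the field names are assumptions inferred from the Inspector labels above and may differ from the actual fields of the LLM class:

using UnityEngine;
using LLMUnity;

public class LLMSettingsExample : MonoBehaviour
{
    public LLM llm;

    void Awake()
    {
        // Field names below are assumed from the Inspector labels; adjust as needed
        llm.numGPULayers = 32;   // offload most layers to the GPU (falls back to CPU if unsupported)
        llm.temperature = 0.4f;  // lower values give more deterministic answers
        llm.topK = 40;           // consider only the 40 most probable tokens at each step
        llm.topP = 0.9f;         // cumulative probability threshold for sampling
        llm.numPredict = 256;    // maximum tokens generated per reply
        llm.seed = -1;           // -1 = different results on every run
    }
}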

🗨️ Chat Settings

  • Player Name the name of the player
  • AI Name the name of the AI
  • Prompt a description of the AI role

License

LLMUnity is licensed under MIT (LICENSE.md) and uses third-party software under MIT and Apache licenses (Third Party Notices.md).
