
LLMUnity's Introduction

Create characters in Unity with LLMs!


LLM for Unity enables seamless integration of Large Language Models (LLMs) within the Unity engine.
It allows you to create intelligent characters that your players can interact with for an immersive experience.
LLM for Unity is built on top of the awesome llama.cpp and llamafile libraries.

At a glance  •  How to help  •  Games using LLM for Unity  •  Setup  •  How to use  •  Examples  •  LLM model management  •  Options  •  License

At a glance

  • 💻 Cross-platform! Windows, Linux, macOS and Android
  • 🏠 Runs locally without internet access. No data ever leaves the game!
  • ⚡ Blazing fast inference on CPU and GPU (Nvidia, AMD, Apple Metal)
  • 🤗 Supports all major LLM models
  • 🔧 Easy to set up, call with a single line of code
  • 💰 Free to use for both personal and commercial purposes

🧪 Tested on Unity: 2021 LTS, 2022 LTS, 2023
🚦 Upcoming Releases

How to help

  • ⭐ Star the repo, leave us a review and spread the word about the project!
  • Join us at Discord and say hi!
  • Contribute by submitting feature requests or bugs as issues, or even submit a PR and become a collaborator!

Games using LLM for Unity

Setup

Method 1: Install using the asset store

  • Open the LLM for Unity asset page and click Add to My Assets
  • Open the Package Manager in Unity: Window > Package Manager
  • Select the Packages: My Assets option from the drop-down
  • Select the LLM for Unity package, click Download and then Import

Method 2: Install using the GitHub repo

  • Open the Package Manager in Unity: Window > Package Manager
  • Click the + button and select Add package from git URL
  • Use the repository URL https://github.com/undreamai/LLMUnity.git and click Add

How to use

First you will set up the LLM for your game 🏎:

  • Create an empty GameObject.
    In the GameObject Inspector click Add Component and select the LLM script.
  • Download one of the default models with the Download Model button (~GBs).
    Or load your own .gguf model with the Load model button (see LLM model management).

Then you can setup each of your characters as follows 🙋‍♀️:

  • Create an empty GameObject for the character.
    In the GameObject Inspector click Add Component and select the LLMCharacter script.
  • Define the role of your AI in the Prompt. You can define the name of the AI (AI Name) and the player (Player Name).
  • (Optional) Select the LLM constructed above in the LLM field if you have more than one LLM GameObject.

You can also adjust the LLM and character settings according to your preference (see Options).

In your script you can then use it as follows 🦄:

using LLMUnity;
using UnityEngine;

public class MyScript : MonoBehaviour {
  public LLMCharacter llmCharacter;
  
  void HandleReply(string reply){
    // do something with the reply from the model
    Debug.Log(reply);
  }
  
  void Game(){
    // your game function
    ...
    string message = "Hello bot!";
    _ = llmCharacter.Chat(message, HandleReply);
    ...
  }
}

You can also specify a function to call when the model reply has been completed.
This is useful if the Stream option is enabled for continuous output from the model (default behaviour):

  void ReplyCompleted(){
    // do something when the reply from the model is complete
    Debug.Log("The AI replied");
  }
  
  void Game(){
    // your game function
    ...
    string message = "Hello bot!";
    _ = llmCharacter.Chat(message, HandleReply, ReplyCompleted);
    ...
  }

To stop the chat without waiting for its completion you can use:

    llmCharacter.CancelRequests();

  • Finally, in the Inspector of the GameObject of your script, select the LLMCharacter GameObject created above as the llmCharacter property.

That's all ✨!

You can also:

Build a mobile app on Android

To build an Android app you need to set the scripting backend to IL2CPP and the target architecture to ARM64 in the Player settings.
These settings can be accessed from the Edit > Project Settings menu within the Player > Other Settings section.

It is also a good idea to enable the Download on Build option in the LLM GameObject to download the model on launch in order to keep the app size small.
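If you prefer to apply these Player settings from an editor script rather than the Project Settings window, a minimal sketch using Unity's standard PlayerSettings API is shown below (the class and menu item names are illustrative; the script must be placed inside an Editor folder):

using UnityEditor;

public static class AndroidLLMSetup
{
    [MenuItem("Tools/Configure Android for LLM for Unity")]
    static void Configure()
    {
        // IL2CPP scripting backend is required for the Android build
        PlayerSettings.SetScriptingBackend(BuildTargetGroup.Android, ScriptingImplementation.IL2CPP);
        // ARM64 is the required target architecture
        PlayerSettings.Android.targetArchitectures = AndroidArchitecture.ARM64;
    }
}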

Save / Load your chat history

To automatically save / load your chat history, you can set the Save parameter of the LLMCharacter to the filename (or relative path) of your choice. The file is saved in the persistentDataPath folder of Unity. This also saves the state of the LLM, which means that the previously cached prompt does not need to be recomputed.
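The same behaviour can be enabled from code by setting the corresponding fields on the LLMCharacter component, as in this minimal sketch (the filename is a placeholder; the save and saveCache fields are the ones used in the programmatic example further below):

  // save the chat history to <persistentDataPath>/AICharacter1.json
  llmCharacter.save = "AICharacter1";
  // also save the LLM state so the cached prompt is not recomputed on load (~100 MB)
  llmCharacter.saveCache = true;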

To manually save your chat history, you can use:

    llmCharacter.Save("filename");

and to load the history:

    llmCharacter.Load("filename");

where filename is the filename or relative path of your choice.

Process the prompt at the beginning of your app for faster initial processing time
  void WarmupCompleted(){
    // do something when the warmup is complete
    Debug.Log("The AI is nice and ready");
  }

  void Game(){
    // your game function
    ...
    _ = llmCharacter.Warmup(WarmupCompleted);
    ...
  }
Decide whether or not to add the message to the chat/prompt history

The last argument of the Chat function is a boolean that specifies whether to add the message to the history (default: true):

  void Game(){
    // your game function
    ...
    string message = "Hello bot!";
    _ = llmCharacter.Chat(message, HandleReply, ReplyCompleted, false);
    ...
  }
Use pure text completion
  void Game(){
    // your game function
    ...
    string message = "The cat is away";
    _ = llmCharacter.Complete(message, HandleReply, ReplyCompleted);
    ...
  }
Wait for the reply before proceeding to the next lines of code

For this you can use the async/await functionality:

  async void Game(){
    // your game function
    ...
    string message = "Hello bot!";
    string reply = await llmCharacter.Chat(message, HandleReply, ReplyCompleted);
    Debug.Log(reply);
    ...
  }
Add an LLM / LLMCharacter component programmatically
using UnityEngine;
using LLMUnity;

public class MyScript : MonoBehaviour
{
    LLM llm;
    LLMCharacter llmCharacter;

    async void Start()
    {
        // disable the gameObject so that Awake is not called immediately
        gameObject.SetActive(false);

        // Add an LLM object
        llm = gameObject.AddComponent<LLM>();
        // set the model using the filename of the model.
        // The model needs to be added to the LLM model manager (see LLM model management) by loading or downloading it.
        // Otherwise the model file can be copied directly inside the StreamingAssets folder.
        llm.SetModel("Phi-3-mini-4k-instruct-q4.gguf");
        // optional: you can also set a lora in a similar fashion
        llm.SetLora("my-lora.bin");
        // optional: you can set the chat template of the model if it is not correctly identified
        // You can find a list of chat templates in the ChatTemplate.templates.Keys
        llm.SetTemplate("phi-3");
        // optional: set number of threads
        llm.numThreads = -1;
        // optional: enable GPU by setting the number of model layers to offload to it
        llm.numGPULayers = 10;

        // Add an LLMCharacter object
        llmCharacter = gameObject.AddComponent<LLMCharacter>();
        // set the LLM object that handles the model
        llmCharacter.llm = llm;
        // set the character prompt
        llmCharacter.SetPrompt("A chat between a curious human and an artificial intelligence assistant.");
        // set the AI and player name
        llmCharacter.AIName = "AI";
        llmCharacter.playerName = "Human";
        // optional: set streaming to false to get the complete result in one go
        // llmCharacter.stream = false;
        // optional: set a save path
        // llmCharacter.save = "AICharacter1";
        // optional: enable the save cache to avoid recomputation when loading a save file (requires ~100 MB)
        // llmCharacter.saveCache = true;
        // optional: set a grammar
        // await llmCharacter.SetGrammar("json.gbnf");

        // re-enable gameObject
        gameObject.SetActive(true);
    }
}
Use a remote server

You can use a remote server to carry out the processing and implement characters that interact with it. To do that:

  • Create a project with a GameObject using the LLM script as described above. Enable the Remote option and optionally configure the port.
  • Create a second project with the game characters using the LLMCharacter script as described above. Enable the Remote option and configure the host with the IP address (starting with "http://") and port of the server, as in the sketch below.
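As a minimal sketch, and assuming the Remote, Host and Port inspector settings map to public fields of the same (camelCase) names on the LLM and LLMCharacter components, the two sides could be configured from code as follows (the IP address and port below are placeholders):

  // server project: expose the LLM to the network
  llm.remote = true;
  llm.port = 13333;

  // client project: point each character to the server
  llmCharacter.remote = true;
  llmCharacter.host = "http://192.168.1.10"; // placeholder IP of the server machine
  llmCharacter.port = 13333;                 // must match the server port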

Detailed function-level documentation can be found here:

Examples

The Samples~ folder contains several examples of interaction 🤖:

  • SimpleInteraction: Demonstrates a simple interaction with an AI character
  • MultipleCharacters: Demonstrates a simple interaction using multiple AI characters
  • KnowledgeBaseGame: Simple detective game using a knowledge base to provide information to the LLM based on google/mysteryofthreebots
  • ChatBot: Demonstrates interaction between a player and an AI with a UI similar to a messaging app (see image below)
  • AndroidDemo: Example Android app with an initial screen with model download progress

To install a sample:

  • Open the Package Manager: Window > Package Manager
  • Select the LLM for Unity Package. From the Samples Tab, click Import next to the sample you want to install.

Each sample can be run from the Scene.unity scene contained in its folder.
In the scene, select the LLM GameObject and click the Download Model button to download a default model or Load model to load your own model (see LLM model management).
Save the scene, run and enjoy!

LLM model management

LLM for Unity implements a model manager that allows you to load or download LLMs and ship them directly in your game.
The model manager can be found as part of the LLM GameObject:

You can download models with the Download model button.
LLM for Unity includes different state-of-the-art models built in for different model sizes, quantised with the Q4_K_M method.
Alternative models can be downloaded from HuggingFace in the .gguf format.
You can download a model locally and load it with the Load model button, or copy its URL into the Download model > Custom URL field to download it directly.
If a HuggingFace model does not provide a gguf file, it can be converted to gguf with this online converter.

The chat template used for constructing the prompts is determined automatically from the model (if a relevant entry exists) or the model name.
If it is incorrectly identified, you can select another template from the chat template dropdown.

Models added in the model manager are copied to the game during the building process.
You can omit a model from being built in by deselecting the "Build" checkbox.
To remove the model (but not delete it from disk) you can click the bin button.
The path and URL (if downloaded) of each added model is displayed in the expanded view of the model manager, accessed with the >> button:

You can create lighter builds by selecting the Download on Build option.
With this option the models are downloaded the first time the game starts instead of being copied into the build.
If you have loaded a model locally you need to set its URL through the expanded view, otherwise it will be copied into the build.

❕ Before using any model make sure you check their license

Options

LLM Settings

  • Show/Hide Advanced Options Toggle to show/hide advanced options from below
  • Log Level select how verbose the log messages are

💻 Setup Settings

  • Remote select to provide remote access to the LLM

  • Port port to run the LLM server (if Remote is set)

  • Num Threads number of threads to use (default: -1 = all)

  • Num GPU Layers number of model layers to offload to the GPU. If set to 0 the GPU is not used. Use a large number, e.g. >30, to utilise the GPU as much as possible. Note that higher values of context size will use more VRAM. If the user's GPU is not supported, the LLM will fall back to the CPU

  • Debug select to log the output of the model in the Unity Editor

  • Advanced options
    • Parallel Prompts number of prompts that can happen in parallel (default: -1 = number of LLMCharacter objects)
    • Dont Destroy On Load select to not destroy the LLM GameObject when loading a new Scene

🤗 Model Settings

  • Download model click to download one of the default models

  • Load model click to load your own model in .gguf format

  • Download on Start enable to download the LLM models the first time the game starts. Alternatively the LLM models will be copied directly into the build

  • Advanced options
    • Download lora click to download a LoRA model in .bin format
    • Load lora click to load a LoRA model in .bin format
    • Context Size size of the prompt context (0 = context size of the model). This is the number of tokens the model can take as input when generating responses. Higher values use more RAM or VRAM (if using the GPU).
    • Batch Size batch size for prompt processing (default: 512)
    • Model the path of the model being used (relative to the Assets/StreamingAssets folder)
    • Chat Template the chat template being used for the LLM
    • Lora the path of the LoRA being used (relative to the Assets/StreamingAssets folder)

🗨️ Chat Settings

  • Advanced options
  • Base Prompt a common base prompt to use across all LLMCharacter objects using the LLM

LLMCharacter Settings

  • Show/Hide Advanced Options Toggle to show/hide advanced options from below
  • Log Level select how verbose the log messages are

💻 Setup Settings

  • Remote whether the LLM used is remote or local
  • LLM the LLM GameObject (if Remote is not set)
  • Host IP address of the LLM (if Remote is set)
  • Port port of the LLM (if Remote is set)
  • Save save filename or relative path. If set, the chat history and LLM state (if Save Cache is enabled) are automatically saved to the specified file.
    The chat history is saved with a json suffix and the LLM state with a cache suffix.
    Both files are saved in the persistentDataPath folder of Unity (https://docs.unity3d.com/ScriptReference/Application-persistentDataPath.html).
  • Save Cache select to save the LLM state along with the chat history. The LLM state is typically around 100MB+.
  • Debug Prompt select to log the constructed prompts in the Unity Editor

🗨️ Chat Settings

  • Player Name the name of the player
  • AI Name the name of the AI
  • Prompt description of the AI role

🤗 Model Settings

  • Stream select to receive the reply from the model as it is produced (recommended!).
    If it is not selected, the full reply from the model is received in one go

  • Advanced options
    • Load grammar click to load a grammar in .gbnf format
    • Grammar the path of the grammar being used (relative to the Assets/StreamingAssets folder)
    • Cache Prompt save the ongoing prompt from the chat (default: true) Saves the prompt while it is being created by the chat to avoid reprocessing the entire prompt every time
    • Seed seed for reproducibility. For random results every time use -1
    • Num Predict maximum number of tokens to predict (default: 256, -1 = infinity, -2 = until context filled). This is the maximum number of tokens the model will predict; when the limit is reached the model stops generating, which means words / sentences might not get finished if it is set too low.
    • Temperature LLM temperature, lower values give more deterministic answers (default: 0.2). The temperature setting adjusts how random the generated responses are. Turning it up makes the generated choices more varied and unpredictable; turning it down makes the responses more predictable and focused on the most likely options.
    • Top K top-k sampling (default: 40, 0 = disabled). The top-k value limits generation to the k most probable tokens at each step. This value can help fine-tune the output and make it adhere to specific patterns or constraints.
    • Top P top-p sampling (default: 0.9, 1.0 = disabled). The top-p value controls the cumulative probability of generated tokens; the model generates tokens until this threshold (p) is reached. By lowering this value you can shorten the output and discourage more diverse outputs.
    • Min P minimum probability for a token to be used (default: 0.05). The probability is defined relative to the probability of the most likely token.
    • Repeat Penalty control the repetition of token sequences in the generated text (default: 1.1). The penalty is applied to repeated tokens.
    • Presence Penalty repeated token presence penalty (default: 0.0, 0.0 = disabled) Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
    • Frequency Penalty repeated token frequency penalty (default: 0.0, 0.0 = disabled) Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
    • Tfs_z: enable tail free sampling with parameter z (default: 1.0, 1.0 = disabled).
    • Typical P: enable locally typical sampling with parameter p (default: 1.0, 1.0 = disabled).
    • Repeat Last N: last N tokens to consider for penalizing repetition (default: 64, 0 = disabled, -1 = ctx-size).
    • Penalize Nl: penalize newline tokens when applying the repeat penalty (default: true).
    • Penalty Prompt: prompt for the purpose of the penalty evaluation. Can be either null, a string or an array of numbers representing tokens (default: null = use original prompt).
    • Mirostat: enable Mirostat sampling, controlling perplexity during text generation (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0).
    • Mirostat Tau: set the Mirostat target entropy, parameter tau (default: 5.0).
    • Mirostat Eta: set the Mirostat learning rate, parameter eta (default: 0.1).
    • N Probs: if greater than 0, the response also contains the probabilities of top N tokens for each generated token (default: 0)
    • Ignore Eos: enable to ignore end of stream tokens and continue generating (default: false).
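The sampling options above can also be set from a script. A minimal sketch follows, assuming the inspector names map to camelCase public fields on the LLMCharacter component (the values shown are the documented defaults):

  // assumed field names mirroring the inspector options above
  llmCharacter.temperature = 0.2f;  // lower = more deterministic
  llmCharacter.topK = 40;           // 0 = disabled
  llmCharacter.topP = 0.9f;         // 1.0 = disabled
  llmCharacter.numPredict = 256;    // -1 = infinity, -2 = until context filled
  llmCharacter.seed = -1;           // -1 = random results every time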

License

The license of LLM for Unity is MIT (LICENSE.md) and uses third-party software with MIT and Apache licenses (Third Party Notices.md).

LLMUnity's People

Contributors

amakropoulos, brucekristelijn, neegool, subatomicplanets


LLMUnity's Issues

JSON error out of nowhere?

Hello! The plugin worked great for me, but suddenly, out of nowhere, it started giving me this random JSON error. I tried reimporting it a few times and made a new empty project and reimported, but nothing worked.

ArgumentException: JSON parse error: Invalid value.
UnityEngine.JsonUtility.FromJson (System.String json, System.Type type) (at <565f365d8c2845e9a37fa074f60116a1>:0)
UnityEngine.JsonUtility.FromJson[T] (System.String json) (at <565f365d8c2845e9a37fa074f60116a1>:0)
LLMUnity.LLMClient.ConvertContent[Res,Ret] (System.String response, LLMUnity.ContentCallback`2[T,T2] getContent) (at Library/PackageCache/ai.undream.llmunity@211bd16a56/Runtime/LLMClient.cs:200)
LLMUnity.LLMClient.PostRequest[Res,Ret] (System.String json, System.String endpoint, LLMUnity.ContentCallback`2[T,T2] getContent, LLMUnity.Callback`1[T] callback) (at Library/PackageCache/ai.undream.llmunity@211bd16a56/Runtime/LLMClient.cs:239)
LLMUnity.LLMClient.Chat (System.String question, LLMUnity.Callback`1[T] callback, LLMUnity.EmptyCallback completionCallback, System.Boolean addToHistory) (at Library/PackageCache/ai.undream.llmunity@211bd16a56/Runtime/LLMClient.cs:155)
InteractionManager.SendMessageToCharacter (System.String playerText) (at Assets/Scripts/Managers/InteractionManager.cs:37)
System.Runtime.CompilerServices.AsyncMethodBuilderCore+<>c.<ThrowAsync>b__7_0 (System.Object state) (at <75633565436c42f0a6426b33f0132ade>:0)
UnityEngine.UnitySynchronizationContext+WorkRequest.Invoke () (at <7b2a272e51214e2f91bbc4fb4f28eff8>:0)
UnityEngine.UnitySynchronizationContext.Exec () (at <7b2a272e51214e2f91bbc4fb4f28eff8>:0)
UnityEngine.UnitySynchronizationContext.ExecuteTasks () (at <7b2a272e51214e2f91bbc4fb4f28eff8>:0)

It's this line:

return getContent(JsonUtility.FromJson<Res>(response));

in the ConvertContent<Res, Ret>(string response, ContentCallback<Res, Ret> getContent = null) function.

Function calling

Describe the feature

Hi, I wonder if it is possible to add custom tools to the LLM to make it do something like create a new cube in the scene.

Remember optimal server settings

Describe the feature

With the introduction of the fallback to CPU and --no-mmap I was thinking: maybe start-up can be sped up on repeated runs by remembering what setup worked for this user? The first time, LLM.cs would recognise what works, and on subsequent starts the game would run with these settings.

I am unsure how to implement this yet (how / where to store this?) and it is probably something for the future, but it could help speed up launch times for users.

Server could not be started!

Describe the bug

On Mac, with the build target set to iOS:

loaded mistral

/Assets/StreamingAssets/llamafile-0.6/bin/llamafile: line 38: mkdir: No such file or directory
UnityEngine.Debug:LogError (object)
LLMUnity.LLM:DebugLog (string,bool) (at ./Library/PackageCache/ai.undream.llmunity@641ba238a4/Runtime/LLM.cs:155)
LLMUnity.LLM:ProcessError (string) (at ./Library/PackageCache/ai.undream.llmunity@641ba238a4/Runtime/LLM.cs:162)
LLMUnity.LLMUnitySetup/<>c__DisplayClass0_0:b__1 (object,System.Diagnostics.DataReceivedEventArgs) (at ./Library/PackageCache/ai.undream.llmunity@641ba238a4/Runtime/LLMUnitySetup.cs:46)
System.Threading._ThreadPoolWaitCallback:PerformWaitCallback ()

Exception: Server could not be started!
LLMUnity.LLM.StartLLMServer () (at ./Library/PackageCache/ai.undream.llmunity@641ba238a4/Runtime/LLM.cs:272)
LLMUnity.LLM.Awake () (at ./Library/PackageCache/ai.undream.llmunity@641ba238a4/Runtime/LLM.cs:117)

Steps to reproduce

No response

LLMUnity version

unity 2022

Operating System

macOs

Fix server-client domain resolution

Describe the bug

Remote server setup doesn't seem to work with the default 127.0.0.1 host ip

Steps to reproduce

No response

LLMUnity version

No response

Operating System

None

Can't offload to GPU with a 3060

Describe the bug

It just fails and uses the CPU instead; not sure what the issue is. I use oobabooga separately and can load models with GPU offloading via llama.cpp, so I don't know. Any suggestions?

Steps to reproduce

No response

LLMUnity version

81d38bb

Operating System

Windows

Add optional callback for problem handling

Describe the feature

Currently, if the server does not start, you only produce a Debug.LogError(). That's fine for development purposes, but if this happens on a customer machine, the error should optionally be reported to an object in the main game UI (using a callback or an event) so the game can display a meaningful message to the user.
Example: On one tester's machine, the McAfee virus scanner isolated the llamafile executable during startup, so it was not there anymore; running the exe failed, the error did not show up anywhere, and Unity did not produce a crash log. The problem was not visible to the user, the game just didn't work.
As a workaround, I am now just regularly checking LLM.serverListening and show a generic message as long as this is false.

Insert temporary prompt

Describe the feature

  • Default Prompt

  • Temp prompt <- No Save, But It is always above [User Prompt Input]

  • User Prompt Input

  • AI Response

The [Temp prompt] is not saved, but should affect the [AI Response] just like the [User Prompt Input].
This is required to conserve token usage.
Is there any possible way to do this?

Error when testing the LLM

Hello,
The server loads fine, and is started. However, when I do a request I get an error.

llama.cpp/ggml.c:9067: assert(!isnan(x)) failed (cosmoaddr2line /C/Users/userName/Documents/Personal Projects/Flickr/Assets/StreamingAssets/llamafile-server.exe 4c54f9 4e3982 4e7379 5c9afc 5e7b93)
UnityEngine.Debug:LogError (object)
LLMUnity.LLM:DebugLog (string,bool) (at ./Library/PackageCache/ai.undream.llmunity@bb974c6551/Runtime/LLM.cs:125)
LLMUnity.LLM:DebugLogError (string) (at ./Library/PackageCache/ai.undream.llmunity@bb974c6551/Runtime/LLM.cs:131)
LLMUnity.LLMUnitySetup/<>c__DisplayClass0_0:<CreateProcess>b__1 (object,System.Diagnostics.DataReceivedEventArgs) (at ./Library/PackageCache/ai.undream.llmunity@bb974c6551/Runtime/LLMUnitySetup.cs:41)
System.Threading._ThreadPoolWaitCallback:PerformWaitCallback ()


Do you know what the issue is? I tested with the samples.

Text Completions

Describe the feature

Please for the love of god add text completion support, chat completions are ass for anything non-chat related.

llamafile issue on mac M2

Describe the bug

Hi! I can't run it on Mac. When running the Unity chat scene it looks like it can't connect to the server.

Steps to reproduce

No response

LLMUnity version

No response

Operating System

None

How to make it faster?

Hello!

What settings should I tweak to make it faster? I don't know if it's even possible.

I know that smaller models answer faster, but I'm using a Mistral 7B Instruct model and it takes around 10 seconds to answer. Is there anything I can tweak to make it answer faster?

Integrate chat templates

HuggingFace has enabled chat templates alongside the model definitions.
"They specify how to convert conversations, represented as lists of messages, into a single tokenizable string in the format that the model expects."
This is part of a WIP PR in llama.cpp.
Once this is merged and integrated into llamafile it will also be incorporated here.

Opening the component inspector results in a performance hit

Describe the bug

When the inspector for both LLM components (LLM & LLMClient) is open, the game takes a considerable performance hit. This can be traced to Editor.LLMEditor.OnInspectorGUI.

Steps to reproduce

Enter playmode and open the editor in the inspector panel.

LLMUnity version

1.2.0

Operating System

Windows

Integrate ChatGPT support

The purpose of this feature is to integrate the ChatGPT API.
This can be used as an alternative option to local LLMs if the developer desires.

LLMClient Initialization Issue with OnEnable

Hi there. I have tried implementing the project and it works very well so far. Thanks and props to you!
However, in our project we create LLMClients dynamically using AddComponent, and the usage of OnEnable causes some issues with this: OnEnable is called immediately when gameObject.AddComponent is called, so setting the prompt has some issues.

I had this issue with Unity's own AudioSources as well, and a workaround I use is turning the game object off, adding the component and turning the game object on again.

gameObject.SetActive(false);
llmclient = gameObject.AddComponent<LLMClient>();
llmclient.prompt = suspectInstance.Suspect.GeneratePersona();
gameObject.SetActive(true);

I would suggest using the Awake or Start methods instead of OnEnable, as there is no behaviour like OnDisable.

McAfee llamafile issue

Describe the bug

There is an issue with connecting to localhost. The reason is probably the antivirus blocking "llamafile-0.6.2.exe", stating it contains a virus. Due to this, the server is unable to start and so it doesn't connect. Turning off the antivirus works, but is there a way to solve this without turning it off?

Steps to reproduce

Run the sample scene in Unity.

LLMUnity version

1.2.4

Operating System

Windows

Some mac need to run the command manually to work

Describe the bug

For some reason the llamafile command doesn't work within Unity unless the command is first run manually.
Need to investigate how to fix that and whether it applies to builds as well.

Steps to reproduce

No response

LLMUnity version

No response

Operating System

macOs

Show / hide advanced options

The purpose of this feature is to have checkbox fields in the Unity Editor that allow showing and hiding advanced options in a drop-down manner.

Image Input

Describe the feature

Hello.
llamafile seems to support image input formats such as jpg/png/gif/bmp.

Example:

llamafile -ngl 9999 --temp 0 \
  --image ~/Pictures/lemurs.jpg \
  -m llava-v1.5-7b-Q4_K.gguf \
  --mmproj llava-v1.5-7b-mmproj-Q4_0.gguf \
  -e -p '### User: What do you see?\n### Assistant: ' \
  --no-display-prompt 2>/dev/null

Is it possible to implement this feature in the future?
Or is there some problem that makes it impossible?

ChatBot Sample Scene flow stuck on "Loading..."

Description

When Play Mode begins in the Unity Editor and after the LLM Server is started, the following two NullReferenceExceptions are thrown and the chat bubble in the scene is stuck on "Loading..."

NRE 1, which does not repeat:

NullReferenceException: Object reference not set to an instance of an object
LLMUnitySamples.InputBubble.FixCaretSorting (UnityEngine.UI.InputField inputField) (at Assets/Samples/LLMUnity/1.0.1/ChatBot/Bubble.cs:183)
LLMUnitySamples.InputBubble..ctor (UnityEngine.Transform parent, LLMUnitySamples.BubbleUI ui, System.String name, System.String message, System.Int32 lineHeight) (at Assets/Samples/LLMUnity/1.0.1/ChatBot/Bubble.cs:144)
LLMUnitySamples.ChatBot.Start () (at Assets/Samples/LLMUnity/1.0.1/ChatBot/ChatBot.cs:49)

NRE 2, which repeats forever:

NullReferenceException: Object reference not set to an instance of an object
LLMUnitySamples.ChatBot.Update () (at Assets/Samples/LLMUnity/1.0.1/ChatBot/ChatBot.cs:136)

To Reproduce

  • Load a model
  • Enter Play mode in the Unity Editor

Expected Behavior

  • NREs do not appear
  • User is able to interact with the scene

Screenshots/Video

Unity_Xs0YnNRMYJ

Configuration

  • Windows 10
  • Unity 2021.3.34 LTS
  • LLMUnity v1.0.1
  • Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-GGUF

Notes

Thanks for creating this project. Great stuff!

Add Android support

The purpose of this feature is to add support for Android in LLMUnity.

Issues:

  • llamafile does not support Android, so the llama.cpp server binary will need to be compiled separately
  • Android needs a separate implementation for starting and possibly communicating with the server

An experimental WIP branch is feature/android_support

llama.cpp integration with DLL

Describe the feature

LLM for Unity uses the llamafile server for the LLM functionality.
This approach can't be used for mobile integrations due to security limitations.
The purpose of this feature is to instead integrate llama.cpp directly as a DLL.
This feature doesn't necessarily mean replacing llamafile for PCs, as this will need quite some testing and optimisation on different OS+CPU combinations.

Kill llamafile processes on Unity crash

Describe the bug

llamafile processes are sometimes kept alive when Unity crashes.
Need to check if it is possible to kill them.

Steps to reproduce

No response

LLMUnity version

No response

Operating System

None

Weird performance on RTX 2060 super

Hi there. I was very surprised by the speed of this implementation. However, it doesn't work on my colleague's PC. The server starts when GPU layers is set to 35, but inference takes a massive amount of time and, after sending a prompt, the model goes into some sort of loop, repeating the same output until the limit is reached. We have tried different settings but to no avail.

Specs:

  • Windows 11

  • Nvidia RTX 2060 Super GPU (With the latest game ready drivers)
    log.txt

  • Settings: All settings are default out of the box

  • Model: mistral-7b-instruct-v0.2.Q4_K_M.gguf

I will include the logs. We have compared the logs and output on both machines, but these do not seem all that different either.

Check how the stopwords work for phi2

Describe the feature

Phi-2 is not really an instruct-based model.
We should check what stopwords make sense and how to expose stopwords in general.
