tensoropsai / llmstudio
Framework to bring LLM applications to production
Home Page: https://tensorops.ai
License: Mozilla Public License 2.0
If you find any issues with the current architecture roadmap documentation, please specify the exact file you are referring to so we can resolve it more quickly.
We're always looking to improve and expand our documentation. Please describe as clearly as possible any topics, sections, or details you think are missing from the current architecture roadmap or would be beneficial to include. Your insights are valuable to us.
The client should no longer use .get_models() to retrieve a model.
Instead, a model should be instantiated directly, like this: model = OpenAIModel()
The client should only handle higher-level concerns: creation of APIs, creation of projects, creation of users, max_quotas, etc.
The client should be called only once, at the beginning, instead of every time a model is created.
Models should be instantiated directly without going through the client, which makes for cleaner and more intuitive code.
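The proposed split could look roughly like this. This is an illustrative sketch only: the class and method names (Client, create_project, OpenAIModel) are hypothetical, not the actual LLMstudio API.

```python
# Hypothetical sketch of the proposed interface; names are illustrative.

class Client:
    """Handles account-level concerns only: projects, users, API keys, quotas."""

    def __init__(self, api_key: str):
        self.api_key = api_key  # called once, at the beginning

    def create_project(self, name: str) -> dict:
        # Placeholder: a real client would call the LLMstudio backend here.
        return {"project": name}


class OpenAIModel:
    """Instantiated directly, with no round-trip through the client."""

    def __init__(self, model_name: str = "gpt-3.5-turbo"):
        self.model_name = model_name


model = OpenAIModel()  # no client.get_models() needed
```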
Discussion
not relevant
Docs page available
In the init function, set the server for the session and validate it with an HTTP request. This should happen at the Python session level, not locally.
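A minimal sketch of that behaviour, assuming a /health endpoint (the endpoint name and the Session class are assumptions for illustration, not documented LLMstudio routes):

```python
from urllib.request import urlopen


def normalize_server_url(url: str) -> str:
    """Strip trailing slashes so endpoint paths join cleanly."""
    return url.rstrip("/")


class Session:
    """Sketch: validate the server once at session init, not per model."""

    def __init__(self, server_url: str, validate: bool = True):
        self.server_url = normalize_server_url(server_url)
        if validate:
            # Single HTTP round-trip at session start; /health is assumed.
            with urlopen(f"{self.server_url}/health", timeout=5) as resp:
                if resp.status != 200:
                    raise ConnectionError(f"server at {self.server_url} is unhealthy")
```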
Macbook
No response
❯ pip install LLMstudio
Collecting LLMstudio
Using cached llmstudio-0.2.19-py3-none-any.whl.metadata (946 bytes)
Requirement already satisfied: pydantic in ./Library/Caches/pypoetry/virtualenvs/poetry-demo-YQizyC9W-py3.12/lib/python3.12/site-packages (from LLMstudio) (2.5.2)
Requirement already satisfied: requests in ./Library/Caches/pypoetry/virtualenvs/poetry-demo-YQizyC9W-py3.12/lib/python3.12/site-packages (from LLMstudio) (2.31.0)
Collecting numpy (from LLMstudio)
Using cached numpy-1.26.2-cp312-cp312-macosx_11_0_arm64.whl.metadata (61 kB)
INFO: pip is looking at multiple versions of llmstudio to determine which version is compatible with other requirements. This could take a while.
Collecting LLMstudio
Using cached llmstudio-0.2.18-py3-none-any.whl.metadata (946 bytes)
Using cached llmstudio-0.2.17-py3-none-any.whl.metadata (967 bytes)
Using cached llmstudio-0.2.16-py3-none-any.whl.metadata (967 bytes)
Using cached llmstudio-0.2.15-py3-none-any.whl.metadata (967 bytes)
Using cached llmstudio-0.2.14-py3-none-any.whl.metadata (967 bytes)
Using cached llmstudio-0.2.13-py3-none-any.whl.metadata (967 bytes)
Using cached llmstudio-0.2.12-py3-none-any.whl.metadata (967 bytes)
INFO: pip is still looking at multiple versions of llmstudio to determine which version is compatible with other requirements. This could take a while.
Using cached llmstudio-0.2.11-py3-none-any.whl.metadata (934 bytes)
Using cached llmstudio-0.2.10-py3-none-any.whl.metadata (928 bytes)
Using cached llmstudio-0.2.9-py3-none-any.whl.metadata (927 bytes)
Using cached llmstudio-0.2.8-py3-none-any.whl.metadata (851 bytes)
Using cached llmstudio-0.2.7-py3-none-any.whl.metadata (826 bytes)
INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. See https://pip.pypa.io/warnings/backtracking for guidance. If you want to abort this run, press Ctrl + C.
Using cached llmstudio-0.2.6-py3-none-any.whl.metadata (826 bytes)
Using cached llmstudio-0.2.5-py3-none-any.whl.metadata (826 bytes)
Using cached llmstudio-0.2.4-py3-none-any.whl.metadata (826 bytes)
Using cached llmstudio-0.2.3-py3-none-any.whl.metadata (826 bytes)
Using cached llmstudio-0.2.2-py3-none-any.whl.metadata (826 bytes)
Using cached llmstudio-0.2.1-py3-none-any.whl.metadata (826 bytes)
Using cached llmstudio-0.2.0-py3-none-any.whl.metadata (771 bytes)
Using cached llmstudio-0.1.6-py3-none-any.whl.metadata (497 bytes)
Requirement already satisfied: annotated-types>=0.4.0 in ./Library/Caches/pypoetry/virtualenvs/poetry-demo-YQizyC9W-py3.12/lib/python3.12/site-packages (from pydantic->LLMstudio) (0.6.0)
Requirement already satisfied: pydantic-core==2.14.5 in ./Library/Caches/pypoetry/virtualenvs/poetry-demo-YQizyC9W-py3.12/lib/python3.12/site-packages (from pydantic->LLMstudio) (2.14.5)
Requirement already satisfied: typing-extensions>=4.6.1 in ./Library/Caches/pypoetry/virtualenvs/poetry-demo-YQizyC9W-py3.12/lib/python3.12/site-packages (from pydantic->LLMstudio) (4.8.0)
Requirement already satisfied: charset-normalizer<4,>=2 in ./Library/Caches/pypoetry/virtualenvs/poetry-demo-YQizyC9W-py3.12/lib/python3.12/site-packages (from requests->LLMstudio) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in ./Library/Caches/pypoetry/virtualenvs/poetry-demo-YQizyC9W-py3.12/lib/python3.12/site-packages (from requests->LLMstudio) (3.6)
Requirement already satisfied: urllib3<3,>=1.21.1 in ./Library/Caches/pypoetry/virtualenvs/poetry-demo-YQizyC9W-py3.12/lib/python3.12/site-packages (from requests->LLMstudio) (2.1.0)
Requirement already satisfied: certifi>=2017.4.17 in ./Library/Caches/pypoetry/virtualenvs/poetry-demo-YQizyC9W-py3.12/lib/python3.12/site-packages (from requests->LLMstudio) (2023.11.17)
Using cached llmstudio-0.1.6-py3-none-any.whl (12 kB)
Installing collected packages: LLMstudio
Successfully installed LLMstudio-0.1.6
LLMstudio server --ui
zsh: command not found: LLMstudio
We'd like to see LLMstudio run after pip install; however, it fails with: zsh: command not found: LLMstudio (note in the log above that pip backtracked all the way to llmstudio-0.1.6).
Some newly added features are still not in the README.
The image of the UI is outdated.
There needs to be better documentation of how to use LLMstudio.
A short video showing how to use the UI could be helpful.
Browser: Brave v 1.59.117
OS: macOS Ventura 13.5.1
The browser tab for the application currently displays the default "React Application" text and lacks an icon.
Open the LLMStudio web application.
Observe the browser tab title and icon.
It should instead be branded with the LLMStudio logo and name for better user experience and branding consistency.
Models should also accept parameters when created, in addition to receiving them per chat call.
Chat parameters should override existing model parameters. If no parameters are defined (neither chat nor model), defaults will be used.
This allows for more intuitive code and enables integration with chains in LangChain.
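The precedence rule could be sketched as a simple merge (the default values shown are illustrative only, not LLMstudio's actual defaults):

```python
# Illustrative defaults; the real defaults live with each provider.
DEFAULTS = {"temperature": 1.0, "top_p": 1.0, "max_tokens": 256}


def resolve_parameters(model_params=None, chat_params=None):
    """Chat parameters override model parameters; anything left unset
    falls back to the defaults."""
    merged = dict(DEFAULTS)
    merged.update(model_params or {})
    merged.update(chat_params or {})
    return merged
```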
Will create a branch and submit a PR
Have the concept of a project in the backend.
Each project aggregates logs, api-keys and other metrics such as total cost and cost per provider.
Makes it more organized to use LLMstudio for different projects instead of the logs becoming a mess between two projects.
Gives key indicators about the project.
Having this would be the first step into also having the same feature on the UI where it might have the most value.
Discussion
The README doesn't explain how to build the container.
Tests are run in a batch, but they are not evaluated against the ground-truth (GT) answer.
The evaluation could be done using similarity metrics, an LLM judge, or even simple mathematics.
This would help with quickly testing models and prompts and seeing the result quantified immediately.
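As a sketch of the similarity-metric option, using a crude lexical ratio from the standard library (a real setup might use embeddings or an LLM judge instead; the function names are illustrative):

```python
from difflib import SequenceMatcher


def similarity(answer: str, ground_truth: str) -> float:
    """Crude lexical similarity in [0, 1]."""
    return SequenceMatcher(None, answer.lower(), ground_truth.lower()).ratio()


def evaluate_batch(outputs, ground_truths, threshold=0.8):
    """Score each output against its GT answer and flag passes."""
    results = []
    for out, gt in zip(outputs, ground_truths):
        score = similarity(out, gt)
        results.append({"score": score, "passed": score >= threshold})
    return results
```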
Discussion
llmstudio-0.2.9
No response
The API key should be updated server-side instead of the server ignoring the latest API key sent by the client for the same provider.
client.get_model crashes if run in a notebook immediately after setting up the client.
If we wait 1-2 seconds and then run it, it works perfectly.
This behaviour is not seen when running the same code in a script instead of a Jupyter notebook.
No response
Put this in a Jupyter notebook cell and click "Run All". (Check that an OpenAI key has previously been set up.)
from llmstudio.models import OpenAIClient
client = OpenAIClient()
model = client.get_model(model_name="gpt-3.5-turbo")
model would be created without error
Motivation
Allows for easy use of assistants in our backend and, in the future, the UI.
Your contribution
Discussion
As of now you need to both pull a Docker container and pip install; perhaps the pip install should create the Docker container and run it, with pip deciding which components to install.
All requests made to the LLMstudio server should be authenticated and authorized through a centralized RBAC system.
The current design assumes a single user: all permissions are managed on the vendor side, with no centralized authorization. This blocks LLMstudio from being used by organizations in a centralized way.
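A minimal sketch of what a centralized check could look like (the role and permission names are purely illustrative):

```python
# Hypothetical role-to-permission mapping for the RBAC sketch.
ROLE_PERMISSIONS = {
    "admin": {"chat", "manage_keys", "manage_users"},
    "member": {"chat"},
}


def authorize(role: str, permission: str) -> bool:
    """Central check that every server request would pass through."""
    return permission in ROLE_PERMISSIONS.get(role, set())


def handle_request(role: str, permission: str) -> str:
    if not authorize(role, permission):
        raise PermissionError(f"role {role!r} lacks {permission!r}")
    return "ok"
```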
Raising the feature
There are currently 3 security concerns raised automatically by GitHub's bot. We need to address them as soon as possible.
Allow for defining the functions for Assistants in the UI.
Motivation
On the OpenAI playground, users cannot define functions easily. This would be a good upgrade and a reason for users to use our UI instead.
Your contribution
Discussion
The server should be able to pick between models to optimize costs and performance. For example, GPT-3.5-16k should only be used if the context window is large enough to justify it; otherwise, simply use the base GPT-3.5 version, as it is cheaper.
This behaviour should be customisable by the user, who may want the router to choose between GPT-4 and GPT-3.5-16k instead, for example.
This would allow the user to save costs, optimize performance, and define custom behaviour. It is mostly useful in production rather than in development.
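The routing rule could be sketched like this. The token estimate and the model/window table are rough assumptions for illustration, not a real tokenizer or pricing table:

```python
# Cheapest-first route table; window sizes are illustrative.
ROUTES = [
    ("gpt-3.5-turbo", 4_096),       # cheaper, smaller context
    ("gpt-3.5-turbo-16k", 16_384),  # pricier, used only when justified
]


def pick_model(prompt: str, reserved_for_output: int = 512) -> str:
    """Pick the cheapest model whose context window fits the prompt."""
    approx_tokens = len(prompt.split()) * 4 // 3  # rough heuristic, not a tokenizer
    for model, window in ROUTES:
        if approx_tokens + reserved_for_output <= window:
            return model
    raise ValueError("prompt exceeds every configured context window")
```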
Discussion about it.
A way for LLM Studio to wrap LangChain and thus leverage all its capabilities. LangChain already provides a way to wrap its own LLM base class.
Currently there is no backend support for chains, templates, agents, and other complex techniques for prompting and building LLM applications.
Discussion and sharing of current code for this implementation. The code is still suboptimal, since it relies on try/except with a timeout, which delays the response. The code might need several modifications:
import os
from typing import Any, List, Mapping, Optional

import openai
import requests
from langchain.llms.base import LLM
from langchain.callbacks.manager import CallbackManagerForLLMRun


class LLMStudioLLM(LLM):
    temperature: float
    top_p: float
    model_name: str
    organization: str
    isStream: bool

    @property
    def _llm_type(self) -> str:
        return "custom"

    def _call(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
    ) -> str:
        if stop is not None:
            raise ValueError("stop kwargs are not permitted.")
        try:
            # Preferred path: route the request through the local LLMstudio server.
            response = requests.post(
                "http://localhost:8000/api/chat/openai",
                json={
                    "model_name": self.model_name,
                    "api_key": os.environ["OPENAI_API_KEY"],
                    "organization": self.organization,
                    "chat_input": prompt,
                    "parameters": {
                        "temperature": self.temperature,
                        "max_tokens": 2048,
                        "top_p": self.top_p,
                    },
                },
                headers={"Content-Type": "application/json"},
                timeout=30,
            )
            response.raise_for_status()
            return response.json()["chatOutput"]
        except (requests.RequestException, KeyError):
            # Fallback: call OpenAI directly if the LLMstudio server is unavailable.
            openai.api_key = os.environ["OPENAI_API_KEY"]
            openai.organization = self.organization
            response = openai.ChatCompletion.create(
                model=self.model_name,
                messages=[{"role": "user", "content": prompt}],
                # stream=self.isStream,
                temperature=self.temperature,
                top_p=self.top_p,
                max_tokens=2048,
            )
            return response["choices"][0]["message"]["content"]

    @property
    def _identifying_params(self) -> Mapping[str, Any]:
        """Get the identifying parameters."""
        return {}
Support templates like the one below, where the variables can be defined elsewhere so as not to make the text too messy.
"""
I have to understand what is being done here.
In the file API.txt you see all the components that are impacted directly and indirectly from account.
You will see there a flow named UpdateOpportunityStage which is directly impacted.
The source of the flow is in file VS_UpdateOpportunityStage.txt.
file API.txt: {api}
file VS_UpdateOpportunityStage.txt: {text_file}
Questions for example around this flow may be the following, please answer this:
Question: {question}
When actually using the UI for more complex prompt engineering, templating is vital to keep the text boxes clean and readable.
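A minimal sketch of keeping the variables out of the template body (plain str.format here; the actual templating mechanism is an open question, and the render helper is hypothetical):

```python
# Template text stays clean; variables are supplied separately.
TEMPLATE = """\
In the file API.txt you see all the components impacted by the account.
file API.txt: {api}
Question: {question}
"""


def render(template: str, **variables) -> str:
    """Fill a template with variables defined elsewhere."""
    return template.format(**variables)
```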
Discussion
installed bun
No response
llmstudio server --ui
Running UI on localhost:8000
Running Engine on localhost:8000
No package.json, so nothing to update
Running Tracking on http://localhost:8080
Error running the UI app: Command '['bun', 'update']' returned non-zero exit status 1.
INFO: Started server process [3755]
INFO: Waiting for application startup.
INFO: Started server process [3755]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Application startup complete.
INFO: Uvicorn running on http://localhost:8000 (Press CTRL+C to quit)
INFO: Uvicorn running on http://localhost:8080 (Press CTRL+C to quit)
the UI should load
Add the ability to run a preset batch of tests/prompts with a given model and certain parameters.
The batch of tests/prompts should be parallelized.
This would allow for easy comparison of different models and parameters.
It would also allow for easy comparison of a given model's behaviour across several prompts, which could help speed up prompt engineering.
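The parallel batch run could be sketched as follows, where call_model stands in for whatever function sends a single prompt to the model:

```python
from concurrent.futures import ThreadPoolExecutor


def run_batch(call_model, prompts, max_workers=8):
    """Run a preset batch of prompts in parallel, preserving order.

    `call_model` is any callable taking one prompt and returning the output.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(call_model, prompts))
```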
Will create a branch and submit a PR. branch: feat/run_tests
Dynamic routing, as MLflow has with their Gateway API.
Hackathon and scalability
Hackathon
Motivation
Ease of use, so that people do not need to go to the OpenAI playground.
Your contribution
Discussion
I can't find how to configure LLMstudio for Azure OpenAI.
Please provide docs for this. Thank you.
Providers like OpenAI have rate limits (e.g., a limit on the number of requests per minute).
This feature would allow LLM Studio to wait it out (or keep retrying) when necessary, so that the response does not error even if it takes longer.
Advanced feature:
By being aware of the user's exact rate limit (which depends on their tier in OpenAI, for example), it could also decide which prompts to send at what time in order to make maximal use of the rate limit without overstepping it (in cases where requests are parallel).
This gives more robustness to LLM calls. The user does not need to worry about their application breaking when making too many requests per minute.
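The wait-it-out behaviour could be sketched with exponential backoff. This is an assumption-laden sketch: a real version would catch only the provider's specific rate-limit error rather than any exception.

```python
import time


def call_with_backoff(fn, retries=5, base_delay=1.0, sleep=time.sleep):
    """Retry `fn` with exponential backoff instead of surfacing an error."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:  # illustrative; narrow to the rate-limit error in practice
            if attempt == retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```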
Discussion
Logs of chained prompts should be grouped so that they are more organized.
When chaining prompts, the logs get very confusing as to which log belongs to which chain of prompts. Having these more organized would be better for tracking experiments.
This feature would be particularly relevant for the UI, to allow an easy overview of experiments, similarly to LangSmith. It also paves the way for a future feature.
Discussion
When pip installing, the API keys for OpenAI are not recognized by the SDK or the UI. This could be related to a missing package that crashes the API, or to the changes in the openai package for 1.0.
OpenAI changed a lot of model names and moved classes around in their package. We need to update LLMstudio to be compatible with those changes.
Streaming is currently not working on the backend. This has only been checked for OpenAI.
Unable to launch LLMStudio in pycharm environment on Windows server.
$ LLMStudio server
Traceback (most recent call last):
File "C:\Program Files\Python38\lib\runpy.py", line 193, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Program Files\Python38\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "C:\pythonProject-LLMStudio\venv\Scripts\LLMStudio.exe\__main__.py", line 4, in <module>
File "C:\pythonProject-LLMStudio\venv\lib\site-packages\llmstudio\cli.py", line 3, in <module>
from llmstudio.engine.config import EngineConfig
ModuleNotFoundError: No module named 'llmstudio.engine'
No response
The LLMStudio server should start and should be able to connect on localhost.
A description of the project needs to be added to the PyPI following standard practices in other packages.
https://pypi.org/project/llmstudio/
No response
The LLMstudio UI is a wrapper around a logging system and should be inspired by Snowflake, BigQuery, or similar systems. Jobs are searchable and indexed, but not mutable.
Feature request raising
Since we have the backend to compare the outputs of several prompts for the same model, the same should be supported in the UI
Make prompt engineering a quicker process by allowing parallelized testing of prompts.
Discussion
Support setting parameters for the app in a config screen
Google Colab
using:
!pip install LLMstudio
No response
Encountered a bug while running the 01_intro_to_llmstudio.ipynb notebook from the LLMstudio repository on Colab.
Relevant notebook URL: Intro to LLMstudio Notebook
Issue:
An AttributeError is thrown in the validate_parameters method of the OpenAIParameters class.
The error message indicates a missing model_dump method: AttributeError: 'OpenAIParameters' object has no attribute 'model_dump'.
Code References:
File Path: LLMstudio/llmstudio/models/openai.py
Method: validate_parameters
File Path: LLMstudio/llmstudio/validators/openai.py
Class: OpenAIParameters
Steps to Reproduce:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input> in <cell line: 2>()
1 # OpenAI models
----> 2 gpt_3 = openai_client.get_model("gpt-3.5-turbo")
3 gpt_4 = openai_client.get_model("gpt-4")
4 frames
/usr/local/lib/python3.10/dist-packages/llmstudio/models/openai.py in validate_parameters(self, parameters)
81 """
82 parameters = parameters or {}
---> 83 return OpenAIParameters(**parameters).model_dump()
84
85 class GPT3_5(OpenAIModel):
AttributeError: 'OpenAIParameters' object has no attribute 'model_dump'
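The error is consistent with a pydantic v1/v2 mismatch on Colab: model_dump exists only in pydantic v2, while v1 models expose dict() instead. A version-agnostic sketch of the dump call (the helper name is hypothetical, not part of the LLMstudio codebase):

```python
def dump_parameters(params):
    """Serialize a pydantic model under either major version."""
    if hasattr(params, "model_dump"):  # pydantic v2
        return params.model_dump()
    return params.dict()  # pydantic v1 fallback
```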
Correctly loading the models
Browser: Brave v 1.59.117
OS: macOS Ventura 13.5.1
<END_TOKEN> appears in answer along with some numbers.
No response
Open the LLMStudio web application.
Use the prompt, model and parameters in the image above.
Try multiple times (results are not consistent, but the issue should occur more often than not).
The output should end without printing the <END_TOKEN>.
Request for the addition of Google Colab support for the LLMstudio Web UI.
I know that it's technically possible, as other projects use gradio for spawning web UI out of colab instance (e.g. https://github.com/lllyasviel/Fooocus)
This feature would greatly enhance the usability and accessibility of LLMstudio for a broader range of users and remove any entry barriers to trying the tool. If the UI could be hosted in colab I could even recommend and explain how to use it in 5min to any "non-tech colleagues" for prompt engineering or prototyping an LLM idea for their department's use case.
Having the LLMstudio UI to track and evaluate OpenAI API calls would be a great asset for all LLM projects being developed in Colab (from personal use hobby projects to university student project colabs or rapid prototyping efforts inside of organizations).
Unfortunately, I have no gradio experience or knowledge how to technically solve this.