jozu-ai / kitops


Tools for easing the handoff between AI/ML and App/SRE teams.

Home Page: https://KitOps.ml

License: Apache License 2.0

Go 68.49% Shell 1.58% Dockerfile 0.28% JavaScript 7.23% HTML 0.45% CSS 0.33% Vue 16.89% TypeScript 4.74%
ai code datasets devops devops-tools gguf kubernetes kubernetes-deployment ml mlops mlops-tools model-interpretability model-packer model-serving models pytorch sklearn tensorflow

kitops's People

Contributors

amisevsk, bmicklea, dependabot[bot], gorkem, javisperez, jwilliamsr, mayacostantini, nida-hasan


kitops's Issues

UX: Design for dev command

Design the interface for the demo dev command.

  • KitOps branded
  • Add info flyouts
  • 768px min. screen size (Desktop only)
  • What happens on even smaller screen sizes?
  • Error messages
  • Content for (i) buttons
  • 404

missing UI elements:

  • Radio buttons
  • Slider component

Support for proxies

Describe the problem you're trying to solve
It should be possible to use a corporate proxy with the kit CLI.

Describe the solution you'd like
The kit CLI should respect the HTTP_PROXY* environment variables. It should also be possible to set proxies through CLI flags.

Add way to clean up untagged modelkits/all local modelkits

Describe the problem you're trying to solve
Once I have a large-ish number of ModelKits it can become annoying to have to clean them out one-by-one.

Describe the solution you'd like
Some kind of CLI command to do bulk removes based on tags perhaps?

`init` command to generate a kitfile

Implement an init command to initialize a kitfile in a given repository. The command should be able to introspect the folder and suggest a kitfile. It should also be able to use artifacts from frameworks such as MLflow to initialize the information.

The init command is interactive. It should discover artifacts in a folder, such as Jupyter notebooks, CSV and JSON files, and well-known serialized model formats, and then interactively complete the generation of the kitfile.

init should also recognize the output of MLflow's save_model and generate a kitfile from it.

Include attestation for ModelKits

Describe the problem you're trying to solve
ModelKits and the assets they contain can come from any location and be built by anyone. None of the existing model or dataset packaging mechanisms offers inherent guarantees of provenance or safety. Users want a way to know where the package they are using has come from so they can make their own decision about whether to trust it.

Describe the solution you'd like
ModelKits should be able to include attestations for the package and its contents. We could use something like SLSA's verification summary and include it with the ModelKit as an option. This would make ModelKits the first packaging for AI/ML that provides provenance attestations.

Allow referencing remote assets

Describe the problem your feature would solve
When using very large datasets (e.g., image and video data especially), it's painful to have to clone them to my local machine in order to package up a ModelKit.

Describe the solution you'd like
It would be nice if you could provide a URI to the dataset in the Kitfile. I understand that this would trade my download time for wait time while the ModelKit fetches the data in order to be built... as long as it doesn't happen on my machine 😄

Describe alternatives you've considered
There isn't really an alternative beyond cloning everything locally.

WHEN CLOSED UPDATE THE NEXT STEPS DOC (NEXT-STEPS.MD) WHICH REFERENCES THIS ISSUE.

Improve how artifacts are stored locally to avoid duplicating data

Describe the problem your feature would solve

Currently, ModelKits are stored using one OCI spec index per repository, using the folder structure:

<storage-root>
└── <registry>
    └── <organization>
        ├── <repository1>
        │   ├── blobs
        │   ├── index.json
        │   └── oci-layout
        └── <repository2>
            ├── blobs
            ├── index.json
            └── oci-layout

As the OCI image index spec does not leave easy room for multiple repositories within one index, tagging the same image into two separate repositories currently uses double the storage. In other words, executing

kit tag my-image:mytag my-other-image:mytag

results in the blobs for my-image being copied to another directory.

Note this issue isn't present for ModelKits within the same repository -- i.e. my-image:tag1 and my-image:tag2 will share storage as expected.

Describe the solution you'd like

Since blobs are content-addressable and there are no auth concerns with locally-stored modelkits, it makes sense to store each blob only once, and reference them from multiple different indexes. This would cut down on storage requirements for ModelKits while keeping a relatively pure OCI image index structure.

Describe alternatives you've considered

Alternatively, we could abandon using the image index structure for local storage and instead implement an alternate way of tracking references to ModelKits in local storage. This would avoid the need for potentially awkward workarounds to manage accessing and removing blobs locally.

Additional context

Docs: improve rendering for lists

Describe the problem you're trying to solve
Bulleted and numbered lists in the docs are a little hard to read - they're not very indented and generally just kind of run together with the other text. See the use-cases.md as a good example.

Describe the solution you'd like
A little deeper indent on lists would help. Perhaps if there was more space between regular paragraphs and a little less at the top of the first item in a list that would help too. The docs need room to breathe! 😄

Describe alternatives you've considered
Not using lists, but that seems extreme.

Additional context
n/a

Add `--untag` flag to `remove` command

Describe the problem you're trying to solve
There's no way to remove a tag from a ModelKit if it only has one tag (you end up removing the ModelKit).

Describe the solution you'd like
A flag to remove a tag (even the last tag) would help.

Describe alternatives you've considered
Adding the new tag and then removing the old. This might be better if we think we always want a tag for every ModelKit...

Tutorial: Fine-tuning LLMs using KitOps

Describe the problem you're trying to solve
We need reference material that shows how to use KitOps for fine-tuning LLMs.

Describe the solution you'd like

  • Implement a fine-tuning solution that uses base models from ModelKits and repackages them. Use the dataset from the KitOps repository (documents and code)
  • Prepare a test set for validation
  • Demonstrate LoRA or QLoRA
  • Show that tuning parameters can be stored in the ModelKit config for efficiency

Additional context
We could also run this solution as a permanent service on the kitops.ml site.

FAQ section for KitOps Site

Please design an FAQ section for the home page. This section should display the questions with collapsed answers. Let's start with 10 questions.

Create KitOps Python libraries

Describe the problem you're trying to solve
To reduce adoption friction among data scientists we need a Python library that makes it trivial for someone to create a kitfile or build a ModelKit as part of their code.


Enhance unpack Command for Deployment-Specific Structures

Describe the problem you're trying to solve
Currently, the unpack command does not fully cater to the specific needs and expectations of different target environments, such as inference engines. There's a gap in functionality where the command fails to dynamically adjust and resolve the unpacked files based on the unique structure and metadata requirements of these environments. This lack of flexibility can lead to inefficiencies and potential mismatches when integrating unpacked files into various runtime, test, development, or deployment settings.

Describe the solution you'd like

To address this issue, I propose enhancing the kit unpack command to offer a more adaptable and environment-aware unpacking mechanism. Although ModelKits are designed to be abstracted away from the runtime, test, development, or deployment environments, it's essential for the kit unpack functionality to cater to the unique needs of these environments when unpacking artifacts.

A feasible approach to achieving this is through the introduction of a plugin mechanism. This system would allow for the development of environment-specific plugins that kit can leverage during the unpacking process. Each plugin would instruct the kit unpack command on how to properly structure the unpacked files to meet the requirements of a target environment.

This would allow us to customize the kit unpack process to meet the diverse requirements of different target environments efficiently, which in turn streamlines deployment and integration workflows, improving user experience across various environments.

Tutorial: RAG using KitOps

Describe the problem you're trying to solve
Create reference material on how to use KitOps for creating RAG pipelines.

Describe the solution you'd like

  • Implement a RAG pipeline solution that uses the data from the KitOps repository (docs and code) to answer KitOps-related questions
  • The solution should use one of the pre-packaged ModelKits and show it is possible to change the base model.
  • Tutorial should be repeatable locally. (local vector DB)
  • Create a validation set of tests

Additional context
We could also deploy this solution as a permanent service on the kitops.ml website.

Add `readme.md` into the Kitfile / ModelKit

Describe the problem your feature would solve
It can sometimes be difficult to understand what a model is meant to be used for and what datasets are / why they were included.

Describe the solution you'd like
A readme included in the ModelKit would make it easy for the Kitfile producer and consumer to understand the context for the ModelKit and the assets it is packaging.

Describe alternatives you've considered
Descriptions in the Kitfile are okay, but are either very short and keep the Kitfile readable, or are very long and wreck the readability.

Additional context
[none]

Revamp Quick Start document

We should simplify the quick start and make it more step-by-step so that users understand the power of Kit

  • Link to the ModelKit overview for folks who want to learn what a ModelKit is
  • Then link to install guide
  • kit version
  • kit login (docker, azure cr, aws ecr, google cr, github in that order)
  • kit unpack the sample ModelKit
  • [view the directory]
  • kit list
  • kit pack -t
  • kit list
  • kit push
  • link to command reference
  • link to Kitfile overview

We can then follow with helping them build their own ModelKit

  • git clone sample repo
  • author a kitfile
  • kit pack
  • kit push
  • kit list

Logo marquee issue in iOS Safari

Describe the bug
In iOS Safari, the Marquee of logos in the kitops site has a weird timing and spacing issue.

To Reproduce
Steps to reproduce the behavior:

  1. On your mobile device, using iOS Safari, go to https://kitops.ml
  2. Scroll to the "What’s supported?" section
  3. Wait a few seconds and you should notice the logos acting weird.

Support shell completions for commands

Description

Add completions for operations that work on local storage:

  • kit push <modelkit> -- complete with ModelKits (registry/repository:tag) stored locally
  • kit unpack <modelkit> -- same as above
  • kit remove <modelkit> -- same as above
  • kit tag <source> <dest> -- complete for source ModelKit; maybe for destination as well?
  • kit list -- complete for local repositories (i.e. where a registry is not included)
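The core of such a completion is a prefix filter over the references held in local storage. The Go sketch below shows only that filter: `completeModelKits` is a hypothetical helper, the sample references are made up, and hooking the result into a shell completion framework is left out.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// completeModelKits returns the locally stored ModelKit references
// (registry/repository:tag) that start with the text typed so far.
func completeModelKits(local []string, toComplete string) []string {
	var matches []string
	for _, ref := range local {
		if strings.HasPrefix(ref, toComplete) {
			matches = append(matches, ref)
		}
	}
	sort.Strings(matches) // stable, predictable order for the shell
	return matches
}

func main() {
	// Made-up references standing in for the contents of local storage.
	local := []string{
		"ghcr.io/acme/llama:latest",
		"ghcr.io/acme/bert:v1",
		"docker.io/acme/bert:v1",
	}
	for _, ref := range completeModelKits(local, "ghcr.io/acme/") {
		fmt.Println(ref)
	}
}
```

The same helper could serve kit push, kit unpack, kit remove and the source argument of kit tag, since all of them complete against the same list.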

Update default storage paths

Description

Currently, kit defaults to storing everything within $HOME/.kitops. Since the .kitops directory is not intended for direct user access (and instead serves as general storage for OCI artifacts, credentials, etc.), it makes more sense to use more standard directories for each operating system:

  • Linux/macOS: $HOME/.local/share/kitops
  • Windows: %LOCALAPPDATA%/kitops

In addition, it would be convenient to respect an environment variable -- e.g. $KITOPS_HOME -- for overriding this default (in addition to the --config flag). This would allow using an alternate storage path for multiple commands (or setting a default value system wide) without needing to add the --config flag to each command.

Status indicator for `pack` and `unpack`

Describe the problem you're trying to solve
It's unclear when you do a pack or unpack how long it will take or whether it's even still working. For large models and datasets there could be quite a long delay in building the ModelKit.

Describe the solution you'd like
Some kind of status update including a description of what it's doing at various stages / steps.

Describe alternatives you've considered
n/a

Additional context
none

Implement ignore file

Describe the problem you're trying to solve
We need a mechanism to exclude files or folders from a ModelKit during packaging.

Describe the solution you'd like
Implement a .kitignore file that lists the files and folders to exclude from ModelKit packaging.
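As a rough illustration of what .kitignore matching might look like, the Go sketch below parses one pattern per line and checks relative paths against them. The file format is an assumption (blank lines and `#` comments skipped, a trailing `/` meaning a whole directory), and the matching is far simpler than .gitignore semantics: no negation and no `**`.

```go
package main

import (
	"bufio"
	"fmt"
	"path"
	"strings"
)

// parseIgnore reads one pattern per line, skipping blank lines and # comments.
func parseIgnore(content string) []string {
	var patterns []string
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		patterns = append(patterns, line)
	}
	return patterns
}

// ignored reports whether a slash-separated relative path matches any pattern.
// Patterns ending in "/" exclude everything under that directory. Other
// patterns are glob-matched against the full path and against the base name.
func ignored(relPath string, patterns []string) bool {
	for _, p := range patterns {
		if strings.HasSuffix(p, "/") {
			if strings.HasPrefix(relPath, p) {
				return true
			}
			continue
		}
		if ok, _ := path.Match(p, relPath); ok {
			return true
		}
		if ok, _ := path.Match(p, path.Base(relPath)); ok {
			return true
		}
	}
	return false
}

func main() {
	patterns := parseIgnore("# scratch files\n*.tmp\n\ncheckpoints/\n")
	for _, f := range []string{"model.gguf", "data/a.tmp", "checkpoints/step1.bin"} {
		fmt.Println(f, ignored(f, patterns))
	}
}
```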

Describe alternatives you've considered
Use .gitignore

CLI UX: Sorting commands & Flags

The commands & flags should be sorted by the most common usage. The order is currently alphabetical.

Tasks:

  • Create a useful order
  • execute

dev command

Describe the problem you're trying to solve
Run a model in a ModelKit locally for development/testing purposes

Describe the solution you'd like
Develop a test harness that runs on a local web server, exposes REST APIs and a chat interface for LLM development, and runs inference against the given model.

Describe alternatives you've considered
Modifying similar projects to use ModelKits

Issue list:

kit pack error

Describe the bug
When trying to run kit pack with the attached Kitfile I get an error:

Failed to pack model kit: error resolving /Users/bmicklea/Documents/GitHub: lstat /Users/bmicklea/Documents/GitHub/Users: no such file or directory

It looks like it's appending a second Users to the path, although that's weird because the path actually has more to it...
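The root cause is not stated in this report. One common way to get this symptom in Go, offered here only as a hypothesis, is joining an absolute path onto a base directory: `filepath.Join` does not treat an absolute second argument specially, so the result nests it under the base. The sketch below uses made-up paths and hypothetical helper names to show the effect and a guard against it.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolveNaive always joins, so an absolute path ends up nested under the
// context directory.
func resolveNaive(contextDir, p string) string {
	return filepath.Join(contextDir, p)
}

// resolve leaves absolute paths untouched and only joins relative ones.
func resolve(contextDir, p string) string {
	if filepath.IsAbs(p) {
		return filepath.Clean(p)
	}
	return filepath.Join(contextDir, p)
}

func main() {
	// Made-up paths (Unix-style), not the ones from the report.
	ctx := "/home/dev/project"
	abs := "/home/dev/project/models"

	fmt.Println(resolveNaive(ctx, abs)) // nested: the base appears twice
	fmt.Println(resolve(ctx, abs))      // absolute path kept as-is
	fmt.Println(resolve(ctx, "models")) // relative path joined to the base
}
```

A failing lstat on a path that begins with the context directory followed by the first segment of an absolute path would be consistent with this pattern, though the attached Kitfile would be needed to confirm it.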

kitfile.txt

To Reproduce
Steps to reproduce the behavior:

  1. Run kit pack ./ with the attached kitfile (it'll need to have the .txt extension removed)
  2. See error

Version
Next

Extend distribution channels

Describe the problem you're trying to solve
Extend the distribution of kit CLI releases beyond GH releases.

Describe the solution you'd like
Support distribution via

CLI only distributions

CLI and Python API

version number for the cli reference

Describe the problem you're trying to solve
The CLI reference published on the documentation site should state the version of the CLI it was generated with.

JSON Schema for kitfile

Describe the problem you're trying to solve
When editing kitfiles, it should be possible to get validation and code assist.

Describe the solution you'd like
Provide a JSON Schema that can be referenced by editors that support JSON Schema-based validation.
The schema should eventually be moved to the schema store once the kitfile API is more stable.

Generate Dockerfile from `kitfile`

Describe the problem you're trying to solve
Although I can build nearly any deployable artifact from the serialized model that is in the ModelKit, it's still work...

Describe the solution you'd like
Generate a Dockerfile based on my ModelKit and some basic criteria I provide.

Describe alternatives you've considered
Making Dockerfiles on my own from the unpacked model in a ModelKit.

Additional context
none

Write a Next Steps with Kit doc

Describe the problem you're trying to solve
The new Quick Start does a good job of getting someone familiar with the basic Kit commands, but it doesn't go into Kitfile authorship or other critical commands.

Describe the solution you'd like
A Next Steps with Kit doc that follows from the Quick Start would help new users to build their own ModelKits and learn how to take advantage of tagging and registry management.

Sign Windows binaries

Describe the problem you're trying to solve
As on macOS, we should also sign the Windows binaries.

Describe the solution you'd like

  • Use code signing certificates to sign Windows binaries.
  • Automate the release process

Design the `kit dev` LLM prompt interaction

Describe the problem you're trying to solve
The current LLM prompt configuration and interaction UX / UI is ugly and somewhat confusing.

Describe the solution you'd like
We should rethink what aspects are consistently vs rarely changed and redesign the interface to make it both smoother and more joyful to use.

Add progress bars for upload/download actions

Description

Currently, CLI operations that require uploading or downloading data from a remote registry work silently, which can be confusing if the operation takes a relatively long time.

The CLI should support progress bars for kit push, kit pull, and kit export (when ModelKit isn't present locally). We should only print these progress bars when the CLI is being used interactively (i.e. don't print progress bars in CI environments, etc.).
