qubitium Goto Github PK

followers: 36 following: 50 repos: 56 gists: 0

Name: Qubitium-ModelCloud

Type: User

Company: ModelCloud.ai

Bio: Golang, Python, Kotlin, Swift. I prefer strongly typed languages and I do not worship PEP. @ModelCloudAi

Twitter: qubitium

Location: Earth/Epoch 2.0

Blog: https://modelcloud.ai

Qubitium-ModelCloud's Projects

accelerate

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

alpaca-lora

Instruct-tune LLaMA on consumer hardware

android-rteditor

The Android RTEditor is a rich text editor component for Android that can be used as a drop in for EditText

auto-round

SOTA Weight-only Quantization Algorithm for LLMs

autoawq

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:

autogptq

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

bitblas

BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.

boxwood

Boxwood is a PHP extension for fast replacement of multiple words in a piece of text. It supports case-sensitive and case-insensitive matching. It requires that the text it operates on be encoded as UTF-8.

c4_200m-synthetic-dataset-for-grammatical-error-correction

This dataset contains synthetic training data for grammatical error correction. The corpus is generated by corrupting clean sentences from C4 using a tagged corruption model. The approach and the dataset are described in more detail by Stahlberg and Kumar (2021) (https://www.aclweb.org/anthology/2021.bea-1.4/)

checkmk

Checkmk - Best-in-class infrastructure & application monitoring

fastchat

The release repo for "Vicuna: An Open Chatbot Impressing GPT-4"

femtozip

flash-attention

Fast and memory-efficient exact attention

flashinfer

FlashInfer: Kernel Library for LLM Serving

gemma_pytorch

The official PyTorch implementation of Google's Gemma models

gosocket

gpt-4-llm

gpt4all

gpt4all: a chatbot trained on a massive collection of clean assistant data including code, stories and dialogue

gptq-for-llama

4-bit quantization of LLaMA using GPTQ

gptq-triton

GPTQ inference Triton kernel

graphdat-plugin-juniper-ex-switch

hqq

Official implementation of Half-Quadratic Quantization (HQQ)

hyperdb

A hyper-fast local vector database for use with LLM Agents. Now accepting SAFEs at $35M cap.

ios-app

Official ProtonVPN iOS app

libheif

libheif is an HEIF and AVIF file format decoder and encoder.

list

Do you want a 9 KB cross-browser native JavaScript that makes your plain HTML lists super flexible, searchable, sortable and filterable? Yeah! Do you also want the possibility to add, edit and remove items by dead simple templating? Hell yeah!

llama-dl

llama.cpp

Port of Facebook's LLaMA model in C/C++
