Neural Magic's Projects
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Hackathon 2022
A framework for the evaluation of autoregressive code generation language models.
Causal depthwise conv1d in CUDA, with a PyTorch interface
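To illustrate what the entry above computes: in a causal depthwise conv1d, each channel is convolved with its own kernel, and the input is left-padded so that output `t` depends only on inputs at times ≤ `t`. This is a minimal pure-Python sketch of the operation, not the CUDA implementation the repo provides:

```python
def causal_depthwise_conv1d(x, weights):
    """Causal depthwise 1-D convolution (pure-Python sketch).

    x: list of channels, each a list of T values.
    weights: one kernel (a list of K taps) per channel.
    Left-padding with K-1 zeros enforces causality: out[t] only
    sees inputs at positions t-(K-1) .. t.
    """
    out = []
    for channel, kernel in zip(x, weights):
        k = len(kernel)
        padded = [0.0] * (k - 1) + channel  # left-pad for causality
        out.append([
            sum(kernel[j] * padded[t + j] for j in range(k))
            for t in range(len(channel))
        ])
    return out
```

The real kernel fuses this loop across channels on the GPU; the sketch only shows the indexing convention.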
CLIP-like model evaluation
A safetensors extension to efficiently store sparse quantized tensors on disk
CUDA Templates for Linear Algebra Subroutines
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Sparsity-aware deep learning inference runtime for CPUs
Repo for building and packaging a 1-click app for DigitalOcean
Top-level directory for documentation and general content
NeuralMagic fork of EvalPlus (Rigorous evaluation of LLM-synthesized code - NeurIPS 2023)
Notebooks using the Neural Magic libraries
woop wooop
Helm charts for deploying NM VLLM
Reference implementations of MLPerf™ inference benchmarks
⚡ Building applications with LLMs through composability ⚡
NM fork of LLM foundry for compatibility with SparseAutoModel.
A framework for few-shot evaluation of language models.
A framework for few-shot evaluation of autoregressive language models.
Mamba SSM architecture
NM fork of MixEval compatible with SparseAutoModel.
Neural Magic GitHub Actions (GHA)
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
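For context on the quantization entry above, this is a hedged sketch of the simplest form of weight quantization: symmetric round-to-nearest onto a 4-bit integer grid. GPTQ itself is more sophisticated (it uses second-order information to compensate rounding error), but the storage idea is the same: low-bit integers plus a per-row scale.

```python
def quantize_rtn(weights, bits=4):
    """Symmetric round-to-nearest quantization (illustrative only;
    assumes at least one nonzero weight)."""
    qmax = 2 ** (bits - 1) - 1            # e.g. 7 for 4-bit symmetric
    scale = max(abs(w) for w in weights) / qmax
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from integers and a scale."""
    return [v * scale for v in q]
```

A 4-bit row stores one integer per weight and a single float scale, which is where the memory savings come from.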
Neural Magic Docker