IST Austria Distributed Algorithms and Systems Lab's Projects

acdc

Code for reproducing "AC/DC: Alternating Compressed/DeCompressed Training of Deep Neural Networks" (NeurIPS 2021)

cap

Source and experimental code for "CAP: Correlation-Aware Pruning" (NeurIPS 2023)

cram

Code for reproducing the results from "CrAM: A Compression-Aware Minimizer" accepted at ICLR 2023

distiller

Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://nervanasystems.github.io/distiller

efcp

Code to reproduce the experiments from our paper "Error Feedback Can Accurately Compress Preconditioners"

gptq

Code for the ICLR 2023 paper "GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers"
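
For orientation, here is the naive round-to-nearest 4-bit weight quantization baseline that GPTQ improves on, as a minimal NumPy sketch. This is a hypothetical illustration of symmetric int4 quantization, not the repository's algorithm: GPTQ additionally uses second-order (Hessian) information to choose rounding targets with lower output error.

```python
import numpy as np

def quantize_rtn_4bit(w):
    """Round-to-nearest symmetric 4-bit quantization of a weight vector.

    Baseline only -- NOT the GPTQ algorithm, which corrects each
    rounding decision using approximate second-order information.
    """
    scale = np.max(np.abs(w)) / 7.0          # symmetric int4 code range [-7, 7]
    q = np.clip(np.round(w / scale), -7, 7)  # integer codes
    return q.astype(np.int8), scale

def dequantize(q, scale):
    """Map integer codes back to approximate float weights."""
    return q.astype(np.float32) * scale

w = np.array([0.31, -0.02, 0.75, -0.44], dtype=np.float32)
q, s = quantize_rtn_4bit(w)
w_hat = dequantize(q, s)  # lossy reconstruction of w
```

Storing only the int4 codes plus one scale per group is what yields the roughly 4x memory reduction over FP16 weights.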

horovod

Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

kdvr

Code for the experiments in "Knowledge Distillation Performs Partial Variance Reduction" (NeurIPS 2023)

m-fac

Efficient reference implementations of the static & dynamic M-FAC algorithms (for pruning and optimization)

marlin

FP16xINT4 LLM inference kernel that achieves near-ideal ~4x speedups at batch sizes of up to 16-32 tokens.
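
The ~4x figure follows from memory-bandwidth arithmetic: at small batch sizes, LLM inference is dominated by streaming the weights from memory once per forward pass, so shrinking weights from 16 bits to 4 bits cuts the bytes moved by a factor of four. A back-of-the-envelope sketch (hypothetical model size, not Marlin code):

```python
def weight_bytes(n_params, bits):
    """Bytes needed to stream the weights once (one forward pass)."""
    return n_params * bits / 8

n = 7_000_000_000            # a hypothetical 7B-parameter model
fp16 = weight_bytes(n, 16)   # 14 GB streamed per forward pass
int4 = weight_bytes(n, 4)    # 3.5 GB streamed per forward pass
speedup = fp16 / int4        # 4x in the bandwidth-bound regime
```

Once the batch grows large enough that the kernel becomes compute-bound rather than bandwidth-bound, this advantage shrinks, which is why the near-ideal speedup holds only up to moderate batch sizes.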

obc

Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning".

peft-rosa

A fork of the PEFT library, supporting Robust Adaptation (RoSA)

qigen

Repository for CPU Kernel Generation for LLM Inference

qmoe

Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".

qrgd

Repository for the implementation of "Distributed Principal Component Analysis with Limited Communication" (Alimisis et al., NeurIPS 2021). Parts of this code were originally based on code from "Communication-Efficient Distributed PCA by Riemannian Optimization" (Huang and Pan, ICML 2020).
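
To illustrate the distributed PCA setting (not the paper's communication-limited Riemannian algorithm), here is a plain distributed power-iteration sketch in NumPy: two hypothetical workers each hold a shard of the data, and a server averages their local covariance-vector products to recover the top principal component of the pooled data.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical two-worker data split (equal shard sizes)
X1 = rng.standard_normal((100, 5))
X2 = rng.standard_normal((100, 5))

def local_cov_apply(X, v):
    """Each worker applies its local covariance (X^T X / n) to v."""
    return X.T @ (X @ v) / X.shape[0]

v = rng.standard_normal(5)
v /= np.linalg.norm(v)
for _ in range(500):
    # Server averages the workers' local matrix-vector products;
    # with equal shards this equals applying the pooled covariance.
    v = 0.5 * (local_cov_apply(X1, v) + local_cov_apply(X2, v))
    v /= np.linalg.norm(v)

# Reference: top eigenvector of the pooled covariance
C = (X1.T @ X1 + X2.T @ X2) / 200
top = np.linalg.eigh(C)[1][:, -1]
```

The point of the QRGD paper is to reach the same goal while sending far fewer bits per round than the full vectors exchanged here.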

quik

Repository for the QUIK project, enabling the use of 4-bit kernels for generative inference

smart-quantizer

Repository for Vitaly's implementation of the distribution-adaptive quantizer

sparse-imagenet-transfer

Code for reproducing the results in "How Well do Sparse Imagenet Models Transfer?", presented at CVPR 2022
