
amd-ai-guide's Introduction

AMD / Radeon 7900XTX 6900XT GPU ROCm install / setup / config

Ubuntu 22.04 / 23.04

AMD ROCm 5.6.1

Automatic1111 Stable Diffusion + ComfyUI ( venv )

Oobabooga - Text Generation WebUI ( Conda / PyTorch (ROCm) / BitsAndBytes-ROCm ( 0.41.1 ) / ExLlama + ExLlamav2 )

Install notes / instructions

2023-07 - I put this collection of instructions together from my own notes, which I use to set up my own Linux system with AMD parts. I've gone over them across many re-installs to get them all right. This is what I had hoped to find when I searched for install instructions, so I'm sharing them in the hope that they save other people time. There may be extra parts in here that aren't needed, but this works for me. It was originally plain text with comments, like a shell script, that I cut and paste from.

2023-09-09 - I had a report that this doesn't work in virtual machines (virtualbox) as the system there cannot see the hardware, it can't load drivers, etc. Windows users may find it more helpful to try DirectML - /

2023-09-30 - Added new version for ROCm 5.7 ->
2023-11-30 - Updated for ROCm 5.7.2 and PyTorch from Nightlies, and simplification of process ->

2023-11-30 - Rewrite of this document to include updates... ROCm 5.6.1, Stable Diffusion and ComfyUI sections rewritten for venv, Textgen webui section updated to use more of the project requirements as designed. Overall greatly simplified.

2023-12-13 - Added supplement for those who want to use Mixtral models ( uses llama.cpp ) -

2023-12-18 - ROCm 6.0 is out, so there's an updated guide for that here -

2023-12-18 - Updated this document ( for ROCm 5.6.1 ) as it is the current 'stable' version supported by PyTorch. Testing showed it was able to load the modest Mixtral variant TheBloke_mixtralnt-4x7b-test-GPTQ using ExLlamav2 - it takes ~18GB of VRAM to hold this model.

2023-12-23 - Update to use miniconda instead of anaconda. Exllamav2 improved details. Looks like FA2 works. Minor revisions.

Ubuntu 22.04 / 23.04 - Base system install

Ubuntu 22.04 works great with Radeon 6900 XT video cards, but does not support 7900 XTX cards, as they came out later. Ubuntu 23.04 is newer but has issues with some of the tools. The notes below should work on either system unless a comment says otherwise.
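Since some steps below only apply to one release, it can help to check which one you are on first. This is a minimal sketch, not part of the original notes: the function names ubuntu_codename and needs_jammy_sources are made up for illustration, and it assumes the standard VERSION_CODENAME field in /etc/os-release.

```shell
# Print the Ubuntu codename (e.g. "jammy" or "lunar") from an os-release file.
# The path is a parameter so the function can be tried on a copy of the file.
ubuntu_codename() {
    osfile="${1:-/etc/os-release}"
    [ -r "$osfile" ] || { echo "unknown"; return 1; }
    # Source the file in a subshell so its variables do not leak out.
    ( . "$osfile" && printf '%s\n' "${VERSION_CODENAME:-unknown}" )
}

# Succeeds only on lunar (23.04), where the extra jammy sources step applies.
needs_jammy_sources() {
    [ "$(ubuntu_codename "$1")" = "lunar" ]
}
```

For example, "needs_jammy_sources && echo 'do the lunar-only step'" prints the reminder on 23.04 and nothing on 22.04.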

At this point we assume you've done the system install and know your way around it: you have a user account, root access, etc.

# update system packages 
sudo apt update -y && sudo apt upgrade -y 
#turn on devel and sources.
sudo apt-add-repository -y -s -s
sudo apt install -y "linux-headers-$(uname -r)" \
	"linux-modules-extra-$(uname -r)"

[ for Ubuntu 23.04 - lunar ]

Some things may require older versions of Python, so on lunar systems we need to add the jammy package sources so that those versions can be installed.

sudo add-apt-repository -y -s deb jammy main universe

Add AMD GPU package sources

Make the directory if it doesn't exist yet. This location is recommended by the distribution maintainers.

sudo mkdir --parents --mode=0755 /etc/apt/keyrings

Download the key, convert the signing-key to a full Keyring required by apt and store in the keyring directory

wget -O - | \
    gpg --dearmor | sudo tee /etc/apt/keyrings/rocm.gpg > /dev/null

amdgpu repository for jammy

echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] jammy main' \
    | sudo tee /etc/apt/sources.list.d/amdgpu.list
sudo apt update -y 


sudo apt install -y amdgpu-dkms

Note : This commonly produces warning messages about 'Possible missing firmware'. These are just warnings and things work anyway, so they can be ignored.

ROCm repositories for jammy

echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] jammy main" \
    | sudo tee --append /etc/apt/sources.list.d/rocm.list
echo -e 'Package: *\nPin: release\nPin-Priority: 600' \
    | sudo tee /etc/apt/preferences.d/rocm-pin-600
sudo apt update -y

More AMD ROCm related packages

This is a lot of packages, but they are comparatively small, so they are worth including: some things installed later may want them as dependencies without much notice.

# ROCm...
sudo apt install -y rocm-dev rocm-libs rocm-hip-sdk rocm-dkms
# update 
sudo tee --append /etc/ <<EOF
sudo ldconfig
# update path
echo "PATH=/opt/rocm/bin:/opt/rocm/opencl/bin:$PATH" >> ~/.profile
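Because these notes get re-run across re-installs, a plain ">>" append can leave duplicate lines in ~/.profile. Here is a small sketch of an append that only happens once. The helper name append_once is made up for this example. Note also that it is shown with single quotes, so $PATH is written literally and expanded at login, which differs from the double-quoted command above that writes out the current value of PATH.

```shell
# Append a line to a file only if that exact line is not already present.
append_once() {
    line="$1"
    target="$2"
    touch "$target"
    # -x: match the whole line, -F: fixed string (no regex), -q: quiet
    grep -qxF -- "$line" "$target" || printf '%s\n' "$line" >> "$target"
}

# Example (commented out so that pasting this block changes nothing):
# append_once 'PATH=/opt/rocm/bin:/opt/rocm/opencl/bin:$PATH' ~/.profile
```

Running the example twice leaves a single PATH line in the file.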

Find graphics device

sudo /opt/rocm/bin/rocminfo | grep gfx

Found : gfx1030 [ Radeon 6900 ]
Found : gfx1100 [ Radeon 7900 ]
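The grep above prints every matching line, so the same target usually shows up several times. A minimal sketch for reducing that to one line per target follows. The function name list_gfx_targets is made up here, and the pattern assumes target names look like "gfx" followed by at least three hex digits.

```shell
# Read rocminfo-style text on stdin and print each distinct gfx target once.
list_gfx_targets() {
    # -o prints only the matched part of each line; sort -u removes repeats.
    grep -o 'gfx[0-9a-f]\{3,\}' | sort -u
}

# Example (needs ROCm installed, so it is left commented out):
# sudo /opt/rocm/bin/rocminfo | list_gfx_targets
```

Keep in mind that an integrated GPU, if the system has one, may appear in the list alongside the discrete cards.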

Add user to groups

The commands below use whoami, so they add the current user. To add a different account, put that user name in its place.

sudo adduser `whoami` video
sudo adduser `whoami` render
# git and git-lfs (large file support)
sudo apt install -y git git-lfs
# development tools may be required later...
sudo apt install -y libstdc++-12-dev
# stable diffusion likes TCMalloc...
sudo apt install -y libtcmalloc-minimal4
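To confirm that the two adduser commands above took effect, here is a small sketch that lists any groups a user is still missing from. The function name missing_groups is made up for illustration; it relies only on the standard "id -nG" command.

```shell
# Print each named group that the given user does NOT belong to.
# Pass an empty user name to check the current session instead.
missing_groups() {
    user="$1"
    shift
    # Pad with spaces so whole group names can be matched.
    current=" $(id -nG ${user:+"$user"}) "
    for group in "$@"; do
        case "$current" in
            *" $group "*) ;;                # already a member
            *) printf '%s\n' "$group" ;;    # not a member (yet)
        esac
    done
}

# Example: missing_groups "$(whoami)" video render
```

No output means both groups are in place. With a user name, this reads the group database. With an empty name, it reports the groups of the running session, which only pick up new memberships after logging in again (the reboot later in this section takes care of that).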

Performance Tuning

This section is optional, and as such has been moved to performance-tuning

Top for video memory and usage

nvtop Note : I have had issues with the distro version crashing with 2 GPUs; installing a newer version from source works fine. Project website :

sudo apt install -y nvtop 

Radeon specific tools...

sudo apt install -y radeontop rovclock

and now we reboot...

sudo reboot
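After the reboot, a quick check that the kernel driver exposed the GPU can save time later; this is also where a virtual machine without hardware access would show up. The sketch below is an assumption-laden helper, not part of the original notes: the function name check_gpu_nodes is made up, and it only looks for the /dev/kfd compute node and a /dev/dri/renderD* render node.

```shell
# Report whether the device nodes ROCm relies on are present.
# An optional root prefix lets the check be tried against a test directory.
check_gpu_nodes() {
    root="${1:-}"
    rc=0
    if [ -e "$root/dev/kfd" ]; then
        echo "ok: /dev/kfd present"
    else
        echo "missing: /dev/kfd"
        rc=1
    fi
    # At least one render node should exist, e.g. /dev/dri/renderD128.
    if ls "$root"/dev/dri/renderD* >/dev/null 2>&1; then
        echo "ok: render node present"
    else
        echo "missing: /dev/dri/renderD*"
        rc=1
    fi
    return "$rc"
}

# Example: check_gpu_nodes && echo "GPU device nodes look fine"
```

If either node is missing, the driver install step is the first place to look.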

End of OS / base setup

Stable Diffusion (Automatic1111)

This system is built to use its own venv ( rather than Conda )...

Download Stable Diffusion ( Automatic1111 webui ) Get the files...

git clone
cd stable-diffusion-webui

Requisites :

sudo apt install -y wget git python3 python3-venv libgl1 libglib2.0-0

Edit environment settings...

tee --append <<EOF
 ## Torch for ROCm
# generic import...
export TORCH_COMMAND="pip install torch torchvision --index-url"
 ## And if you want to call this from other programs...
 export COMMANDLINE_ARGS="--api"
 ## crashes with 2 cards, so to get it to run on the second card (only), unremark the following 

If you keep models for SD somewhere, this is where you'd link them up...

If you don't do this, it will install a default model to get you going. Note that the default models folder includes files that the program needs, so you'll want to copy them into the folder where you keep your other models ( to avoid issues ).

#mv models models.1
#ln -s /path/to/models models 
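The two commented commands above can be wrapped so that they are safe to repeat. This is a minimal sketch under a few assumptions: the function name link_models is made up, the backup name follows the models.1 convention used above, and it refuses to run if a backup already exists rather than guessing what to do.

```shell
# Replace DEST with a symlink to SRC, keeping any existing DEST as DEST.1.
link_models() {
    src="$1"
    dest="$2"
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    if [ -L "$dest" ]; then
        # Already a symlink from an earlier run: just replace it.
        rm "$dest"
    elif [ -e "$dest" ]; then
        # Never overwrite an earlier backup.
        [ -e "$dest.1" ] && { echo "backup exists: $dest.1" >&2; return 1; }
        mv "$dest" "$dest.1"
    fi
    ln -s "$src" "$dest"
}

# Example: link_models /path/to/models models
```

The files that the program shipped in the original folder end up in models.1, ready to be copied across.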

Run SD...

Note that the first time it starts, it may take a while to go and get things, and it's not always good about saying what it's up to.


end Stable Diffusion

ComfyUI install script

Same install of packages here as for Stable Diffusion ( included here in case you haven't installed SD and just want ComfyUI... )

sudo apt install -y wget git python3 python3-venv libgl1 libglib2.0-0
git clone
cd ComfyUI/custom_nodes
git clone
cd ..
python3 -m venv venv
source venv/bin/activate
# pre-install torch and torchvision from nightlies - note you may want to update versions...
python3 -m pip install --pre torch torchvision --index-url
python3 -m pip install -r requirements.txt  --extra-index-url
python3 -m pip install -r custom_nodes/ComfyUI-Manager/requirements.txt --extra-index-url

# end venv if needed...
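Before going further, it may be worth checking that the torch build inside the venv actually sees the GPU. The sketch below is only a suggestion: torch_gpu_report is a made-up name, and it assumes a ROCm build of PyTorch, which reports the GPU through the usual torch.cuda calls and sets torch.version.hip. Run it with the venv activated.

```shell
# Print what the PyTorch in the current environment can see.
torch_gpu_report() {
    if ! command -v python3 >/dev/null 2>&1; then
        echo "python3 not found"
        return 1
    fi
    python3 - <<'PY'
try:
    import torch
except ImportError:
    print("torch is not installed in this environment")
    raise SystemExit(1)

# torch.version.hip is None on builds without ROCm support.
print("torch", torch.__version__, "hip", torch.version.hip)
print("GPU available:", torch.cuda.is_available())
for i in range(torch.cuda.device_count()):
    print(i, torch.cuda.get_device_name(i))
PY
}

# Example: source venv/bin/activate && torch_gpu_report
```

If "GPU available" comes back False or hip shows None, the wrong torch wheel was probably installed.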

Scripts for running the program...

tee --append <<EOF
source venv/bin/activate
python3 --preview-method auto
chmod +x
tee --append <<EOF
source venv/bin/activate
python3 --preview-method auto --cpu
chmod +x

Update the config file to point to Stable Diffusion (presuming it's installed...)

# config file - connect to stable-diffusion-webui 
cp extra_model_paths.yaml.example extra_model_paths.yaml
sed -i "s@path/to@`echo ~`@g" extra_model_paths.yaml
# edit config file to point to your checkpoints etc 
#vi extra_model_paths.yaml
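The sed line above uses "@" as its delimiter because the replacement text (your home directory) contains slashes. Here is a small sketch of the same substitution as a function, so it can be tried on a scratch file first. The name point_paths_at is made up, the sample line in the comment is only illustrative, and it assumes GNU sed ( as on Ubuntu ) and a directory name without "@" or "&" in it.

```shell
# Replace the literal text "path/to" with a directory, editing the file in place.
point_paths_at() {
    dir="$1"
    file="$2"
    # "@" is the delimiter, so the slashes in $dir need no escaping.
    sed -i "s@path/to@$dir@g" "$file"
}

# Illustration on a scratch file:
#   before:  base_path: path/to/stable-diffusion-webui/
#   after :  base_path: /home/user/stable-diffusion-webui/
# point_paths_at "$HOME" extra_model_paths.yaml
```

If Stable Diffusion lives somewhere other than your home directory, pass that directory instead.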

Now you can call ComfyUI through the script created above.

End ComfyUI install

Oobabooga - Text Generation WebUI - ROCm

Project Website :


First we'll need Conda, which is required for PyTorch. Conda provides virtual environments for Python, so that programs with different dependencies can have different environments. Here is more info on managing conda : Other notes : Download info :

Anaconda ( if you prefer this to miniconda below )

#cd ~/Downloads/
#bash -b
#cd ~
#ln -s anaconda3 conda

Miniconda ( if you prefer this to Anaconda above... ) [ ]

cd ~/Downloads/
bash -b
cd ~
ln -s miniconda3 conda
echo "PATH=~/conda/bin:$PATH" >> ~/.profile
source ~/.profile
conda update -y -n base -c defaults conda
conda install -y cmake ninja
conda init
source ~/.profile

conda is now active...

install pip

sudo apt install -y pip
pip3 install --upgrade pip

useful pip stuff to know ...

## show outdated packages...
#pip list --outdated
## check dependencies 
#pip check
## install specified version 
#pip install <packagename>==<version>

End conda and pip setup.

Oobabooga / Textgen webui

conda create -n textgen python=3.11 -y
conda activate textgen

PyTorch install...

# pre-install 
pip install --pre cmake colorama filelock lit numpy Pillow Jinja2 \
	mpmath fsspec MarkupSafe certifi filelock networkx \
	sympy packaging requests \
# install
pip install torch torchvision torchtext torchaudio torchdata  \

bitsandbytes rocm

2023-09-11 New version of BitsAndBytes(0.41 !) made for 5.6 Project website :

git clone
cd bitsandbytes-rocm-5.6/
BUILD_CUDA_EXT=0 pip install -r requirements.txt --extra-index-url
# 7900XTX
#make hip ROCM_TARGET=gfx1100 ROCM_HOME=/opt/rocm-5.6.1/
# 6900XT
#make hip ROCM_TARGET=gfx1030 ROCM_HOME=/opt/rocm-5.6.1/
# both...
make hip ROCM_TARGET=gfx1100,gfx1030 ROCM_HOME=/opt/rocm-5.6.1/
pip install . --extra-index-url
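Rather than typing the ROCM_TARGET list by hand, it can be built from what rocminfo reports. This is a sketch only: rocm_target_list is a made-up helper, the targets come out sorted (so gfx1030 before gfx1100), and I'm assuming the order in the list doesn't matter to the build. Review the result before using it, since an integrated GPU would be included too.

```shell
# Turn rocminfo-style text on stdin into a comma-separated list of gfx targets.
rocm_target_list() {
    # Extract the target names, drop repeats, then join the lines with commas.
    grep -o 'gfx[0-9a-f]\{3,\}' | sort -u | paste -sd, -
}

# Example (needs ROCm installed, so it is left commented out):
# targets="$(/opt/rocm/bin/rocminfo | rocm_target_list)"
# make hip ROCM_TARGET="$targets" ROCM_HOME=/opt/rocm-5.6.1/
```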

Flash-Attention 2 :

Install may take a few mins ( takes author close to 5 minutes at time of writing )...

git clone
cd flash-attention
pip install .

2023-11-30 - Note it appears PyTorch for ROCm doesn't include FA support at this time, as there's a warning : "UserWarning: 1Torch was not compiled with memory efficient attention. " Further, this issue is noted here : pytorch/pytorch#112997 - So while the above runs, it isn't operating at the present time.

2023-12-23 - FA2 appears to be working. YMMV.

Oobabooga / Text-generation-webui - Install webui...

git clone
cd text-generation-webui

Oobabooga's 'requirements'

The default bitsandbytes for AMD is out of date and doesn't support GPU, so we installed a newer one earlier ( it may be unsupported... ). We'll run sed first to adjust that line of the requirements so the newer version is accepted...

sed -i "s@bitsandbytes==@bitsandbytes>=@g" requirements_amd.txt 
pip install -r requirements_amd.txt 

Exllama and Exllamav2 loaders ... exllama isn't being maintained, but exllamav2 is.

2023-12-23 - After many tests, it appears that the exllamav2 that's installed above gives an error, so we're compiling and reinstalling exllamav2 here, as it does work when we do that.

2024-01-19 - Something has broken and exllamav2 won't compile directly from checkout. The package version for 5.6 may be functional. However, for those who want to compile the latest, or compile for their hardware, I've added a line to reset the checkout to the last known good / compiling version, 0.0.11.

2024-01-20 - Thanks to TurboDerp for fixing the exllamav2 code to play nice with HIP. The workaround is no longer needed, so I've remarked it out in case it's helpful in the future.

## install exllama
##git clone repositories/exllama
# install exllamav2
git clone repositories/exllamav2
cd repositories/exllamav2
# Force the checkout back to version 0.0.11 ( workaround, no longer needed )
# git reset --hard a4ecea6
pip install .   
cd ../..

Let's create a script ( to run the program...

tee --append <<EOF
## activate conda
conda activate textgen
## command to run server...
# python
# preferred configuration... note that --listen makes it accessible on the local network.
python --listen --extensions sd_api_pictures send_pictures gallery
conda deactivate
chmod u+x

Models

If you're new to this - new models can be downloaded from the shell via a python script, or from a form in the interface. There are lots of them. Generally, the GPTQ models by TheBloke are likely to load. The 30B/33B models will load on 24GB of VRAM, but may error or run out of memory depending on usage and parameters.

To get new models note the ~/text-generation-webui directory has a program " " that is made for downloading models from HuggingFace's collection.

If you have old models, link the pre-stored models into the models directory

# cd ~/text-generation-webui
# mv models models.1
# ln -s /path/to/models models

And here's the command to start the interface... note that it takes some time to download components the first time it runs.


The exllamav2 loader works with most GPTQ models. This is the best choice as it is fast.
Some models that won't load that way will load with AutoGPTQ, but without Triton ( Triton seems to break things ). Also worth noting: I've had things work on one card or the other, but not on both. Loading on both cards causes LLMs to spit out gibberish.
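One way to keep a program on a single card is to hide the other one from it. The sketch below assumes the HIP_VISIBLE_DEVICES environment variable from ROCm, which I have not listed anywhere else in these notes, and the wrapper name run_on_gpu is made up. Device numbering starts at 0.

```shell
# Run a command with only one GPU visible to HIP/ROCm programs.
run_on_gpu() {
    index="$1"
    shift
    # env sets the variable for this one command only.
    env HIP_VISIBLE_DEVICES="$index" "$@"
}

# Example: run the same program on the second card only.
# run_on_gpu 1 some_command --with --its --arguments
```

The variable is set only for the wrapped command, so other programs started from the same shell still see both cards.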

End - Oobabooga - Text-Generation-WebUI

nvtop from source

Project website : Optional - a tool for displaying GPU / memory usage info. The packaged version crashes with 2 GPUs, while this newer version built from source works fine.

sudo apt install -y libdrm-dev libsystemd-dev libudev-dev cmake
git clone
mkdir -p nvtop/build && cd nvtop/build
# configure and compile before installing
cmake ..
make
sudo make install

end nvtop
