rocm-gfx803's Issues
How to set AMD GPU targets when compiling tensorflow-rocm?
Hi there,
I have an RX 590 (gfx803). I'm trying to build tensorflow-rocm (r2.12-rocm-enhanced) with ROCm 5.4.3 on Ubuntu 22.04 so that I can use a newer version of TensorFlow.
My question is: how do you change the AMD GPU targets when compiling tensorflow-rocm so that they include gfx803? I tried setting the following environment variables, but none of them seems to take effect when I test the resulting whl file.
export AMDGPU_TARGETS=gfx803
export TF_ROCM_AMDGPU_TARGETS=gfx803
export GPU_DEVICE_TARGETS=gfx803
printf 'gfx803\n' | tee -a "/opt/rocm/bin/target.lst"
./build_rocm_python3
When I test the resulting whl file, I get the following message:
2023-07-24 11:06:52.592658: I tensorflow/core/common_runtime/gpu/gpu_device.cc:2011] Ignoring visible gpu device (device: 0, name: Radeon RX 590 Series, pci bus id: 0000:07:00.0) with AMDGPU version : gfx803. The supported AMDGPU versions are gfx1030, gfx900, gfx906, gfx908, gfx90a.
Thanks
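One way to confirm which targets a built wheel actually contains is to parse the "Ignoring visible gpu device" message. The following is a minimal Python sketch, assuming the log format shown above. `supported_targets` is a hypothetical helper written for this illustration and is not part of TensorFlow.

```python
import re

def supported_targets(log_line):
    """Return the AMDGPU targets listed in an 'Ignoring visible gpu device' log line."""
    # Assumption: the list follows the phrase "supported AMDGPU versions are"
    # and runs to the end of the line, as in the message quoted above.
    match = re.search(r"supported AMDGPU versions are (.+?)\.?\s*$", log_line)
    if match is None:
        return []
    return [t.strip() for t in match.group(1).split(",") if t.strip()]

line = ("Ignoring visible gpu device (device: 0, name: Radeon RX 590 Series, "
        "pci bus id: 0000:07:00.0) with AMDGPU version : gfx803. "
        "The supported AMDGPU versions are gfx1030, gfx900, gfx906, gfx908, gfx90a.")
targets = supported_targets(line)
print("gfx803" in targets)  # False for the wheel in this report
```

If gfx803 is missing from the list, the target setting did not reach the build, whichever variable was used.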
OSError: libmpi_cxx.so.40: cannot open shared object file: No such file or directory
Hi! I'm trying to install ROCm 5 with gfx803. Thanks a lot for all your work!
I have Ubuntu 20.04.3
Kernel 5.11.0-27-generic
Python 3.8.10
- I installed ROCm using the amdgpu tools, following this guide: https://docs.amd.com/bundle/ROCm_Installation_Guide-v5.0/page/How_To_Install_ROCm.html#_Installation_Methods. I got no errors or warnings.
- I followed your procedure for installing rocBLAS, PyTorch and torchvision.
- At the end of the rocBLAS installation I get this warning:
503a092-cp38-cp38-linux_x86_64.whl
Collecting typing-extensions
Downloading typing_extensions-4.1.1-py3-none-any.whl (26 kB)
Installing collected packages: typing-extensions, torch
WARNING: The scripts convert-caffe2-to-onnx, convert-onnx-to-caffe2 and torchrun are installed in '/home/fiss/.local/bin' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed torch-1.11.0a0+git503a092 typing-extensions-4.1.1
So I added the line
export PATH="$PATH:/home/fiss/.local/bin"
to .bashrc to put that directory on PATH.
After finishing the installation procedure, if I try to import torch in a terminal I get:
import torch
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/fiss/.local/lib/python3.8/site-packages/torch/__init__.py", line 198, in <module>
_load_global_deps()
File "/home/fiss/.local/lib/python3.8/site-packages/torch/__init__.py", line 151, in _load_global_deps
ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
File "/usr/lib/python3.8/ctypes/__init__.py", line 373, in __init__
self._handle = _dlopen(self._name, mode)
OSError: libmpi_cxx.so.40: cannot open shared object file: No such file or directory
What can I do? I'm very new to Linux, but I'm sure that with your help we will overcome this issue.
Thanks for your attention.
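The traceback shows that torch fails while loading a shared library through ctypes. The same check can be run outside torch to see which libraries the dynamic loader cannot find. This is a minimal sketch. `missing_libraries` is a hypothetical helper, and `libopenblas.so.0` is an assumed library name added only as a second example.

```python
import ctypes

def missing_libraries(names):
    """Return the shared-library names that the dynamic loader cannot open."""
    missing = []
    for name in names:
        try:
            ctypes.CDLL(name)  # same call that fails inside torch's _load_global_deps
        except OSError:
            missing.append(name)
    return missing

# libmpi_cxx.so.40 comes from the error message; libopenblas.so.0 is an assumption.
print(missing_libraries(["libmpi_cxx.so.40", "libopenblas.so.0"]))
```

Another report in this tracker resolved the same error by installing libopenblas-base and libopenmpi-dev with apt-get.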
Could we update the Torch package here?
Hello,
Thanks to your incredible work, I'm able to run the torch build from here with ROCm on an RX 580 on Arch.
Your PyTorch build targets Python 3.8 and Torch 1.11.
I was hoping for Python 3.9 and Torch 1.12+.
How do we build a newer version of PyTorch? I'm willing to help.
torchaudio issue
First of all, I want to thank you. Without all your effort I would never have been able to use PyTorch on my GPU, and I'm very grateful.
I need to use torchaudio for an exam, but when I try to install it, the installer first removes the gfx803-compatible PyTorch that I installed earlier, then reinstalls a PyTorch that is not gfx803 compatible, and finally installs torchaudio, so I lose GPU compatibility.
Do you know how I can resolve this issue?
Thanks in advance.
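To see whether pip has swapped the torch build, the installed versions can be compared before and after installing torchaudio. This is a minimal sketch using the standard library. `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata

def installed_versions(packages):
    """Map each package name to its installed version, or None if it is not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

print(installed_versions(["torch", "torchvision", "torchaudio"]))
```

The gfx803 wheel quoted elsewhere in these reports installs as torch-1.11.0a0+git503a092, so a different torch version afterwards suggests the build was replaced. pip's `--no-deps` option stops it from touching torch, but a torchaudio wheel built against another torch may still fail to load.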
Pytorch binaries not working on arch4edu ROCm
Hello, I installed the ROCm stack from arch4edu and it seems to be working (rocminfo detects my RX 580). However, upon installing and testing torch (installed from the wheels provided here), this error pops up.
Traceback (most recent call last):
File "pytest.py", line 4, in <module>
import torch
File "/home/fran/.local/lib/python3.8/site-packages/torch/__init__.py", line 199, in <module>
from torch._C import * # noqa: F403
ImportError: /home/fran/.local/lib/python3.8/site-packages/torch/lib/libtorch_cpu.so: undefined symbol: zgetrs_
I tried building torch myself but it didn't go so well, haha
I attempted to follow this, but for instance I can't find the Arch equivalents of the packages installed by apt. Proceeding to build PyTorch anyway results in a bunch of errors, and I couldn't really tell what the problem was.
Is it me or we got no image textures?
Hi! First of all, thank you a load of a lot for making this project! You were a saviour of a gpu right here.
So everything seems to be working except when I add image textures to my shaders. Was it something I did wrong when installing rocm/blender or is this a known issue? Blender just crashes when I try to run Cycles with any image attached to the file, no matter how big or small.
Maybe the terminal output can help? Here it is anyway:
$ ./blender
Read prefs: /home/aori/.config/blender/3.4/config/userpref.blend
[ALSOFT] (EE) Failed to set real-time priority for thread: Operation not permitted (1)
[ALSOFT] (EE) Failed to set real-time priority for thread: Operation not permitted (1)
Reloading external rigs...
Reloading external metarigs...
Pillow is not installed, therefore:
- BIP images load without scaling.
- Other images load slowly (Blender standard).
Pillow is not installed, therefore:
- BIP images load without scaling.
- Other images load slowly (Blender standard).
Pillow is not installed, therefore:
- BIP images load without scaling.
- Other images load slowly (Blender standard).
Unhandled SIGBUS caught
Aborted (core dumped)
Thanks for the response,
Aori.
We need wheels for python 3.9
Hello
In Slicer we have a new module using torch https://discourse.slicer.org/t/new-extension-fully-automatic-whole-body-ct-segmentation-in-2-minutes-using-totalsegmentator/26710/12
The problem is that Slicer uses an embedded python 3.9 by default (it can install wheels).
Is there a possibility to support python 3.9 here?
How to build patched tensorflow package
Environment
| Hardware | Description |
|---|---|
| GPU | RX 570 |
| CPU | Ryzen 5 2600 |

| Software | Version |
|---|---|
| OS | Ubuntu 20.04.5 |
| ROCm | 5.3.0 gfx803 (from this repo) |
| Python | 3.8 |
Hi, for my application I need tensorflow 2.7, so I'd like to build it. From the available resources it is not clear to me how the provided tensorflow package is patched or if it is even patched at all to run on gfx803. Could you provide an insight on how you build the tensorflow package please?
Does ROCm support Polaris 21 Family ?
Hello, I want to install ROCm on my Arch system because it is said to help with model training in my machine learning program, which uses TensorFlow. However, my laptop graphics card is an RX 560 (Polaris 21). Does ROCm support it, and which version do you suggest? Thanks.
SD_WebUI_V1.6.0 does not support python3.8
When I use the PyTorch built by this repository, I have to run Stable Diffusion with Python 3.8. In my tests, the official Stable Diffusion WebUI v1.6.0 reports an error under Python 3.8, which suggests that v1.6.0 does not support Python 3.8. Which version of Stable Diffusion WebUI can run on Python 3.8?
Python 3.8.10 (default, Jun 4 2021, 15:09:15)
[GCC 7.5.0]
Version: v1.6.0
Commit hash: 5ef669de080814067961f28357256e8fe27544f4
Installing clip
Installing requirements for CodeFormer
Installing requirements
Launching Web UI with arguments: --listen --enable-insecure-extension-access --opt-sdp-attention --skip-torch-cuda-test
/home/sliman/miniconda3/envs/py38/lib/python3.8/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: /home/sliman/miniconda3/envs/py38/lib/python3.8/site-packages/torchvision/image.so: undefined symbol: _ZN5torch3jit17parseSchemaOrNameERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
warn(f"Failed to load image Python extension: {e}")
no module 'xformers'. Processing without...
no module 'xformers'. Processing without...
Traceback (most recent call last):
File "launch.py", line 48, in <module>
main()
File "launch.py", line 44, in main
start()
File "/home/sliman/stable-diffusion-webui/modules/launch_utils.py", line 432, in start
import webui
File "/home/sliman/stable-diffusion-webui/webui.py", line 13, in <module>
initialize.imports()
File "/home/sliman/stable-diffusion-webui/modules/initialize.py", line 33, in imports
from modules import shared_init
File "/home/sliman/stable-diffusion-webui/modules/shared_init.py", line 5, in <module>
from modules import shared
File "/home/sliman/stable-diffusion-webui/modules/shared.py", line 5, in <module>
from modules import shared_cmd_options, shared_gradio_themes, options, shared_items, sd_models_types
File "/home/sliman/stable-diffusion-webui/modules/options.py", line 74, in <module>
class Options:
File "/home/sliman/stable-diffusion-webui/modules/options.py", line 77, in Options
def __init__(self, data_labels: dict[str, OptionInfo], restricted_opts):
TypeError: 'type' object is not subscriptable
My environment
Ubuntu 22.04.3 LTS x86_64
RX580 8G(gfx803)
Python3.8.10
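The TypeError comes from the annotation `dict[str, OptionInfo]`. Subscripting the built-in `dict` only works at runtime from Python 3.9 onward (PEP 585), while `typing.Dict` also works on 3.8. The sketch below shows the 3.8-compatible spelling. The `OptionInfo` class here is a stand-in written for illustration, not the webui's real class.

```python
from typing import Dict

class OptionInfo:
    """Stand-in for the webui's OptionInfo class (hypothetical, for illustration only)."""

class Options:
    # typing.Dict is subscriptable on Python 3.8; dict[str, OptionInfo] is not.
    def __init__(self, data_labels: Dict[str, OptionInfo], restricted_opts):
        self.data_labels = data_labels
        self.restricted_opts = restricted_opts

opts = Options({"example": OptionInfo()}, restricted_opts=set())
print(sorted(opts.data_labels))  # ['example']
```

Adding `from __future__ import annotations` at the top of a module also avoids this particular error on 3.8, because annotations are then no longer evaluated when the function is defined. Whether the rest of v1.6.0 runs on 3.8 after such a change is not something this sketch can confirm.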
Please update to 5.1
HSA_STATUS_ERROR_OUT_OF_RESOURCES In rocminfo and no devices in clinfo
I get the HSA_STATUS_ERROR_OUT_OF_RESOURCES error when I run rocminfo (ROCm 4.5.2) on my computer. A previous install on another drive worked (ROCm 4.5.0) on the same kernel (5.11). When I run clinfo, 0 devices show up under “AMD Accelerated Parallel Processing”.
I already have libopenblas and libopenmpi installed, and my PCIe slot supports atomics (no kfd errors). I have the patched rocBLAS and your torch and torchvision. Torch says that there are no CUDA devices (torch treats HIP as a CUDA device).
OS: Ubuntu 20.04
Kernel: 5.11.0-44-generic
ROCm version: 4.5.2(I originally said 5.2 which does not exist, sorry)
Using the patched miopen .deb throws warning xnack "off" in gfx803
I followed the steps in this repo's README.md to install rocm-dev, rocm-lib and the other .deb packages provided in the links. I'm on Ubuntu 20.04 LTS with kernel 5.11.0-25-generic, my GPU is an RX 550 (gfx804) Polaris 11, and I use ROCm 4.2. After installation I ran a benchmark using resnet18 and cifar10, and was greeted by this warning: xnack 'Off' was requested for a processor that does not support it!
Fortunately my training proceeded without any errors or problems. I just wanted to report that this warning pops up each time. I have attached a screenshot of my training when the warning appears.
Does this work for RX 550 and on arch linux ?
Docker
Hi, I am trying to use this inside the Docker images provided by AMD for use with TensorFlow. I have tried 4.5.2 and 4.3.1; unfortunately AMD doesn't provide 4.5.0 or 4.3.0 images with TensorFlow. Once I install the rocBLAS package, the default TensorFlow installation stops working, which I think is expected, but I also get an error when trying to install the wheel provided here:
ERROR: tensorflow-2.6.0-cp38-cp38-linux_x86_64.whl is not a supported wheel on this platform.
I have successfully used your packages inside other Docker images with ROCm, so first of all, thanks for your efforts.
The error comes down to the Python version: the images with TensorFlow preinstalled ship Python 3.6, while the other image has 3.8. I also wanted to ask whether I should add a Docker option. I tried to compile TensorFlow for ROCm 3.10 as well, but I was not able to. I would like to contribute those Docker images and build wheels for at least a few extra versions; any information you could give me to help with this would be appreciated.
P.S. I didn't know where to put this, as it is more of a comment than an issue. My problem was compatibility with the Docker images, which I was able to solve, and I want to leave this here in case someone else has the same problem. I think it would be a good idea to add this to the README.
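The "not a supported wheel on this platform" error can be anticipated by comparing the Python tag in the wheel filename with the running interpreter. This is a minimal sketch, assuming a CPython interpreter and a wheel name that follows the standard naming scheme. Both helper names are hypothetical.

```python
import sys

def wheel_python_tag(filename):
    """Extract the Python tag (e.g. 'cp38') from a wheel filename."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    # Wheel names end with {python tag}-{abi tag}-{platform tag}.
    return stem.split("-")[-3]

def interpreter_tag():
    """Tag of the running interpreter, assuming CPython."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"

wheel = "tensorflow-2.6.0-cp38-cp38-linux_x86_64.whl"
print(wheel_python_tag(wheel), interpreter_tag())
```

If the two tags differ, as with a cp38 wheel inside an image that ships Python 3.6, pip refuses the wheel.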
unhandled SGPR spill to memory - Blender(HIP)
Is this fatbin file version sensitive? That is, will one fatbin work with other Blender versions, such as Blender 3.6 or 4.0? I compiled Blender 4.0 and used the 3.4.1 fatbin that you provided, but it does not work.
Pytorch GPU returns false
edit: solved was missing a couple of packages:
sudo apt-get install libopenblas-base libopenmpi-dev
Hi, thank you for releasing these patches to keep gfx803 working with ROCm.
I'm trying out PyTorch. Shouldn't this return True?
sudo PYTORCH_TEST_WITH_ROCM=1 python3 -c 'import torch;print("GPU:",torch.cuda.is_available())'
GPU: False
Both clinfo and rocm-smi look good, although I need to run clinfo with sudo, otherwise it won't show the GPU.
This might be related: if I run python3 without sudo, I get this error while loading torch:
import torch
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/user/pytorchrocm/lib/python3.8/site-packages/torch/__init__.py", line 196, in <module>
_load_global_deps()
File "/home/user/pytorchrocm/lib/python3.8/site-packages/torch/__init__.py", line 149, in _load_global_deps
ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
File "/usr/lib/python3.8/ctypes/__init__.py", line 373, in __init__
self._handle = _dlopen(self._name, mode)
OSError: libmpi_cxx.so.40: cannot open shared object file: No such file or directory
I have a virtual environment for the ROCm version of PyTorch. If I run pip3 freeze, it shows the correct version of PyTorch, but if I run sudo pip3 freeze, it shows the non-ROCm version.
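The difference between pip3 and sudo pip3 suggests that the two commands use different interpreters. The sketch below prints which interpreter is running and whether it belongs to a venv-style virtual environment. `in_virtualenv` is a hypothetical helper name.

```python
import sys

def in_virtualenv():
    """True when the running interpreter belongs to a venv-style virtual environment."""
    return sys.prefix != sys.base_prefix

print("interpreter:", sys.executable)
print("virtual environment:", in_virtualenv())
```

Running this once with python3 and once with sudo python3 shows whether sudo falls back to the system interpreter, which would explain why the non-ROCm torch is picked up.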
Is there any instructions for build torchvision
Thank you for your hard work, it helps me a lot.
I'm trying to run a Stable Diffusion project (AUTOMATIC1111/stable-diffusion-webui) on my gfx803 (RX 580), using your builds of rocBLAS, PyTorch and torchvision. It worked fine at first, but after the project was updated it forces me to use PyTorch 1.12+. If I keep using PyTorch 1.11, it fails when loading models.
So I followed your guide for navi10 (https://github.com/xuhuisheng/rocm-build/blob/master/navi10/README.md) to build PyTorch 1.12.1 for my gfx803. But there seem to be some problems with torchvision. If I keep using your torchvision build, this error occurs:
/..../lib/python3.8/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: /..../lib/python3.8/site-packages/torchvision/image.so: undefined symbol: _ZNK3c1010TensorImpl36is_contiguous_nondefault_policy_implENS_12MemoryFormatE
If I change torchvision to torchvision-0.13.1+rocm5.1.1-cp39-cp39-linux_x86_64, the undefined symbol is:
_ZN3c106detail23torchInternalAssertFailEPKcS2_js2_RKSs
If I use pip install torchvision==0.13.1 to install torchvision, the error is:
Failed to load image Python extension: libc10_cuda.so: cannot open shared object file: No such file or directory.
All three of these torchvision versions also have problems when generating images: there is a much greater chance of colored lines appearing in the generated image, similar to screen tearing. It is most serious in img2img mode.
So I wonder whether I need to build torchvision for my environment, and whether special parameters need to be set when compiling or I can just follow the official guide.
My environment is:
Ubuntu-20.04.5(5.13.0-35-generic)
RX580
ROCm 5.2.0
I'll upload screenshots of the error later.
Pytorch2.0.1 Rocm5.5 support
Hi
Will you also release this version?
Torch wheel and libmpi_cxx.so.40
rocminfo generates error HSA_STATUS_ERROR_OUT_OF_RESOURCES
After installing ROCm 4.5.0, I followed the method described here, added rocm-dkms and rocm-libs, and installed the rocBLAS package for ROCm 4.5.0 downloaded from here.
When I run rocm-smi
I get this:
======================= ROCm System Management Interface =======================
================================= Concise Info =================================
GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU%
0 45.0c 14.127W 760Mhz 1750Mhz 0% auto 48.0W 11% 0%
================================================================================
============================= End of ROCm SMI Log ==============================
But when I run rocminfo
I get this:
ROCk module is loaded
hsa api call failure at: /long_pathname_so_that_rpms_can_package_the_debug_info/src/rocminfo/rocminfo.cc:1143
Call returned HSA_STATUS_ERROR_OUT_OF_RESOURCES: The runtime failed to allocate the necessary resources. This error may also occur when the core runtime library needs to spawn threads or create internal OS-specific events.
I am already part of render and video groups
salik@salik-pc:~$ groups
salik adm cdrom sudo dip video plugdev kvm render lpadmin lxd sambashare libvirt docker
Any help would be appreciated.
Kernel: Linux 5.11.0-43-generic #47
OS: 20.04.2-Ubuntu
ROCM: 4.5
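Group membership can be double-checked by testing whether the current user can actually open the device nodes. This is a minimal sketch, assuming the ROCm runtime uses `/dev/kfd` and the `/dev/dri/renderD*` nodes. `accessible` is a hypothetical helper, and passing this check only rules out a permission problem, not other causes of HSA_STATUS_ERROR_OUT_OF_RESOURCES.

```python
import glob
import os

def accessible(path):
    """True if the path exists and the current user may read and write it."""
    return os.path.exists(path) and os.access(path, os.R_OK | os.W_OK)

# Assumed device nodes: the compute interface plus any render nodes present.
nodes = ["/dev/kfd"] + sorted(glob.glob("/dev/dri/renderD*"))
for node in nodes:
    print(node, "ok" if accessible(node) else "not accessible")
```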
Error 101: hipErrorInvalidDevice (Triggered internally at ../c10/hip/HIPFunctions.cpp:113.)
I have Ubuntu 20.04.3
Kernel 5.11.0-27-generic
Python 3.8.10
GPU: Radeon FirePro f9300x2 (equivalent to 2 Radeon Nano)
Hi, I can now import PyTorch successfully, but when I run torch.cuda.is_available() I get this error:
torch.cuda.is_available()
/home/fiss/.local/lib/python3.8/site-packages/torch/cuda/__init__.py:82: UserWarning: HIP initialization: Unexpected error from hipGetDeviceCount(). Did you run some cuda functions before calling NumHipDevices() that might have already set an error? Error 101: hipErrorInvalidDevice (Triggered internally at ../c10/hip/HIPFunctions.cpp:113.)
return torch._C._cuda_getDeviceCount() > 0
do you have any idea?
thanks a lot for your support!
Strange issue, images generates flawlessly but...
This is a strange issue. I am using a rx 580 4gb on ubuntu and I followed this guide here https://rentry.org/sable-sdw-ubuntu-amd-gfx8.
Images generate flawlessly.. that is until the very final step. Right before it outputs the completed image, my PC turns off and the fans go max speed like the computer is overheating. But it is not overheating. Even if I interrupt a generation right after it starts, the PC will shut down with fans blazing. I have tested different resolutions. 256 x 256 works but it produces somewhat recognizable but distorted images. I tried 320 x 320 and again the PC shuts down right before the final image outputs.
Using Windows 10, the card works perfectly fine at 512 x 512, but I do not want to use Windows... might have to.
I am at a loss.
How to install for FreeBSD 13.1?
Possible to update PyTorch build to support Torch 1.13.1 Rocm5.2?
Not sure how difficult it is, but is there a chance we might be able to get an updated build of PyTorch for Rocm5.2 with GFX803 enabled?
Currently the ROCm 5.2 PyTorch build leaves gfx803 out, and attempting to use xuhuisheng's build results in compatibility errors, as other libraries expect torch 1.13.1 and torchvision 0.14.1.
Xuhuisheng's version is built on Torch 1.11.1 and Torchvision 0.12.0.
I'm certainly willing to try to build it myself if anyone has a good guide on how to compile both Torch with ROCm (so far I have only found guides for CUDA) and torchvision.
OSError: libc10_cuda.so: cannot open shared object file: No such file or directory
Hello, I am trying to run diff-svc on my gfx803 gpu. When I try to run inference, I get
Traceback (most recent call last):
File "inference.py", line 8, in <module>
from infer import *
File "/projects/diff-svc/diff-svc/infer.py", line 10, in <module>
from infer_tools import slicer
File "/projects/diff-svc/diff-svc/infer_tools/slicer.py", line 5, in <module>
import torchaudio
File "/usr/local/lib/python3.8/dist-packages/torchaudio/__init__.py", line 1, in <module>
from torchaudio import _extension # noqa: F401
File "/usr/local/lib/python3.8/dist-packages/torchaudio/_extension.py", line 67, in <module>
_init_extension()
File "/usr/local/lib/python3.8/dist-packages/torchaudio/_extension.py", line 61, in _init_extension
_load_lib("libtorchaudio")
File "/usr/local/lib/python3.8/dist-packages/torchaudio/_extension.py", line 51, in _load_lib
torch.ops.load_library(path)
File "/usr/local/lib/python3.8/dist-packages/torch/_ops.py", line 220, in load_library
ctypes.CDLL(path)
File "/usr/lib/python3.8/ctypes/__init__.py", line 373, in __init__
self._handle = _dlopen(self._name, mode)
OSError: libc10_cuda.so: cannot open shared object file: No such file or directory
Is it a problem with the current pytorch version? I know it is working because when I run
import torch
if torch.cuda.is_available():
    device = torch.device("cuda:0")
    print("Running on the GPU")
else:
    device = torch.device("cpu")
    print("Running on the CPU")
It prints "Running on the GPU". Thanks!
Installing rocblas 5.1.1 deb might break system updates or gets uninstalled
Hi, AMD seems to have updated ROCm 5.1.1 to build 50101 a while ago, so the dirty version is no longer considered newer and gets uninstalled with every system update. The manual installation therefore has to be repeated each time.
It is also possible to pin the dirty version, but then all system updates are effectively blocked because of unmet dependencies:
Building dependency tree... Done
Reading state information... Done
You might want to run 'apt --fix-broken install' to correct these.
The following packages have unmet dependencies:
rocblas-dev : Depends: rocblas (>= 2.43.0.50101) but 2.43.0-490c4140~dirty is installed
E: Unmet dependencies. Try 'apt --fix-broken install' with no packages (or specify a solution).
Following the suggested fix removes ROCm entirely.
I am not sure how APT handles the dependencies, but the easiest way is probably to raise the rocblas version to something like 2.43.0.99999.
By the way, the latest ROCm is now 5.1.3. Would it be possible to bump the version to that one? Maybe that would work as well.
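The unmet dependency follows from how Debian version strings are ordered. In `2.43.0-490c4140~dirty` everything after the last hyphen is the package revision, so the upstream part is only `2.43.0`, which sorts below `2.43.0.50101`. The sketch below re-implements that ordering in Python for illustration. It is a simplified reading of the dpkg rules with hypothetical function names, not dpkg's own code.

```python
import re

def _char_order(c):
    # Non-digit ordering: '~' sorts before everything, including the end of
    # the string; letters sort before other characters.
    if c == "~":
        return -1
    if c == "":
        return 0
    return ord(c) if c.isalpha() else ord(c) + 256

def _compare_fragment(a, b):
    while a or b:
        # Compare the leading non-digit runs character by character.
        a_text, b_text = re.match(r"\D*", a).group(), re.match(r"\D*", b).group()
        a, b = a[len(a_text):], b[len(b_text):]
        for i in range(max(len(a_text), len(b_text))):
            ca = _char_order(a_text[i] if i < len(a_text) else "")
            cb = _char_order(b_text[i] if i < len(b_text) else "")
            if ca != cb:
                return -1 if ca < cb else 1
        # Compare the leading digit runs numerically.
        a_num, b_num = re.match(r"\d*", a).group(), re.match(r"\d*", b).group()
        a, b = a[len(a_num):], b[len(b_num):]
        if int(a_num or 0) != int(b_num or 0):
            return -1 if int(a_num or 0) < int(b_num or 0) else 1
    return 0

def _split(version):
    # Split into (epoch, upstream version, revision).
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return epoch, upstream, revision

def compare_debian_versions(v1, v2):
    """Return -1, 0 or 1 when v1 sorts before, equal to or after v2."""
    for x, y in zip(_split(v1), _split(v2)):
        result = _compare_fragment(x, y)
        if result:
            return result
    return 0

print(compare_debian_versions("2.43.0-490c4140~dirty", "2.43.0.50101"))  # -1
print(compare_debian_versions("2.43.0.99999", "2.43.0.50101"))           # 1
```

Under these rules a package versioned 2.43.0.99999 would satisfy the `rocblas (>= 2.43.0.50101)` dependency, which matches the suggestion above.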
ROCm 5.3.0 on Ubuntu 22.04.1 LTS with RX580
Sorry if this is the wrong location for this post:
I am trying to install pytorch, but it seems that rocm is not successfully installed (following the steps in this repo's README but on the newer Ubuntu version). Basically fresh install of Ubuntu 22.04.1 LTS, kernel v. 5.15.0-50. The amdgpu-install script installs without errors, but rocminfo and clinfo do not seem to show the right output.
me@astra:~$ sudo /opt/rocm-5.3.0/bin/rocminfo
ROCk module is loaded
hsa api call failure at: /long_pathname_so_that_rpms_can_package_the_debug_info/src/rocminfo/rocminfo.cc:1148
Call returned HSA_STATUS_ERROR_OUT_OF_RESOURCES: The runtime failed to allocate the necessary resources. This error may also occur when the core runtime library needs to spawn threads or create internal OS-specific events.
me@astra:~$ /opt/rocm-5.3.0/opencl/bin/clinfo
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 2.1 AMD-APP (3486.0)
Platform Name: AMD Accelerated Parallel Processing
Platform Vendor: Advanced Micro Devices, Inc.
Platform Extensions: cl_khr_icd cl_amd_event_callback
Platform Name: AMD Accelerated Parallel Processing
Number of devices: 0
I understand that the gfx800s series is not supported by newer ROCm releases, so I'm not sure what output I should expect from rocminfo or clinfo, or whether this is expected and it's otherwise functional. Has anyone else tried? Is the output from those utilities broken on Ubuntu 20.x as well?
OSError: libmpi_cxx.so.40: cannot open shared object file: No such file or directory
Hello, thanks for providing binaries for gfx803! I am somewhat new to using virtual environments/PyTorch so any help is appreciated:
I'm trying to use them alongside AUTOMATIC1111's Stable Diffusion web UI.
Setup:
OS (fresh install): Ubuntu 20.04.5
Kernel: 5.15.0-56-generic
Python: 3.8.10
ROCm: 5.3.0
I followed only step 1 of this guide to install ROCm: https://github.com/RadeonOpenCompute/ROCm-docker/blob/master/quick-start.md
Within a venv, I installed the 2 binaries as well as the 3 wheel packages and ran:
Command line arguments
export HSA_OVERRIDE_GFX_VERSION=10.3.0 ROC_ENABLE_PRE_VEGA=1 python launch.py --precision full --no-half
Python 3.8.10 (default, Jun 22 2022, 20:18:18)
[GCC 9.4.0]
Commit hash: 44c46f0ed395967cd3830dd481a2db759fda5b3b
Traceback (most recent call last):
File "launch.py", line 294, in <module>
prepare_enviroment()
File "launch.py", line 209, in prepare_enviroment
run_python("import torch; assert torch.cuda.is_available(), 'Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check'")
File "launch.py", line 73, in run_python
return run(f'"{python}" -c "{code}"', desc, errdesc)
File "launch.py", line 49, in run
raise RuntimeError(message)
RuntimeError: Error running command.
Command: "/home/<username>/Desktop/stable-diffusion-webui/venv/bin/python" -c "import torch; assert torch.cuda.is_available(), 'Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check'"
Error code: 1
stdout: <empty>
stderr: Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/<username>/Desktop/stable-diffusion-webui/venv/lib/python3.8/site-packages/torch/__init__.py", line 198, in <module>
_load_global_deps()
File "/home/<username>/Desktop/stable-diffusion-webui/venv/lib/python3.8/site-packages/torch/__init__.py", line 151, in _load_global_deps
ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
File "/usr/lib/python3.8/ctypes/__init__.py", line 373, in __init__
self._handle = _dlopen(self._name, mode)
OSError: libmpi_cxx.so.40: cannot open shared object file: No such file or directory
Please provide how to build pytorch from source for ROCm-3.5.1
I'm using CentOS 8, for which the latest ROCm that works with gfx803 is 3.5.1.
I can't install your PyTorch wheel because it was built against a different version of glibc than the one available on CentOS 8, so I have to build it from source.
cuDNN version incompatibility: PyTorch was compiled against (2, 15, 0) but linked against (2, 16, 0)
Hi. I upgraded to ROCm 5.1.1 with a fresh installation. When I try to run a PyTorch project, I get the error in the title.
Currently running torch 2.3.0, torch vision 17.2 ROCM 6.1 successfully
Hello,
Thank you for sharing this.
Your various scripts gave me the idea that it is technically possible to support this target, so I decided to build the necessary ROCm components manually, and I got the versions mentioned above running just fine with Stable Diffusion and some other llama agents.
I have several of these gfx803 cards and I had to make sure they don't lose their value.
What I built manually is rocBLAS, rocSOLVER and rocMLIR, along with defining some environment variables for GPU_TARGETS and all the basic ones, of course.
So this is possible, and this is what people want to hear.
here is one image I generated using the stuff :)
Thanks again.
Failed to load image Python extension:/python3.8/site-packages/torchvision/image.so: undefined symbol:
Hi
First of all, thanks for all the support that you give us!
I'm trying to train a torchvision project on multiple GPUs, and I'm getting this warning:
/home/fi/.local/lib/python3.8/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: /home/fi/.local/lib/python3.8/site-packages/torchvision/image.so: undefined symbol: _ZNK3c1010TensorImpl36is_contiguous_nondefault_policy_implENS_12MemoryFormatE
This leads to GPU memory errors, with the computer frequently freezing and shutting down.
Maybe rebuilding torchvision from source would be worth a try?
Update blender?
Hello,
Could you release an updated blender build please?
Question: where is the source for tensorflow-rocm?
Thanks for the helpful guide.
By the way, according to the readme.md, to install tensorflow-rocm we need to install the wheel provided by this repo. I assume that some changes were made to the original tensorflow-rocm. Could we get the sources used to build this wheel? Thanks.
How To Get This Working On An 18.04-based System?
What would I need to do to get this working on my 18.04LTS-based system?
I'm running elementaryOS 5.1.7, and the only upgrade path to a 20.04-based system is a clean install. I can't afford to have my system down for any length of time, or lose any data, if something goes wrong with a full install.
My video card is an RX590.
blender 4.0 update ?
I can't thank you enough, this is really the best and you are a genius. Thank you so much, sir. I'm just wondering whether the Blender binaries will be updated to Blender 4. Thank you so much.