eomii / rules_ll
An Upstream Clang/LLVM-based toolchain for contemporary C++ and heterogeneous programming
Home Page: https://ll.eomii.org
License: Other
Draft #98 adds experimental support for the C++23 std module. Getting things to work required some customizations to the internal file inputs and to the way we handle toolchain.cpp_stdlib. This is not pretty. We should rework things in a way that doesn't require hacky list indexing and .to_list()ing depsets.
It may be desirable for non-rules_ll users to get bzlmod support for the original Clang/LLVM Bazel overlay. The files whose contents we may be able to upstream are ll/extensions.bzl, MODULE.bzl and .bazelrc. Ideally, bzlmod users should be able to import llvm-project via the bazel-central-registry.

rules_ll-specific extensions should remain in this repository and the bazel-eomii-registry.
Since zlib still hasn't addressed madler/zlib#633 after almost a year, we should consider it unsupported and deprecated.
I've already sent https://reviews.llvm.org/D143320, but that will take some time to land in LLVM main because the official overlay doesn't use bzlmod by default yet. However, we can already use the patch in rules_ll.
We should probably also aim to upstream our zlib-ng buildfile to the BCR.
New users might not know how to use Bazel's caching effectively across projects.
Our setup should make users aware that local caching exists and how to enable it.
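As a concrete pointer for such docs: Bazel has a built-in local disk cache that can be shared across projects with a couple of .bazelrc lines. The paths below are illustrative, not a rules_ll default:

```
# Illustrative ~/.bazelrc snippet: reuse build outputs across projects.
build --disk_cache=~/.cache/bazel-disk-cache
# Also cache downloaded external repositories.
build --repository_cache=~/.cache/bazel-repo-cache
```

Since both flags point at plain directories, the same cache is picked up by every workspace on the machine that uses this .bazelrc.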
The docs for ll.defs are messed up. Somehow we have some super long lines in there.
Vale pre-commit hooks don't work properly at the moment. Tracking progress in errata-ai/vale#575.
When attempting to adjust linking paths for libraries such as OpenSSL or libcrypto under WSL Ubuntu 22.04 with the -L flag, the build uses the system libraries instead of the ones provided by the development environment.
This is already possible, but we should probably document the workflow. I suspect that this is especially relevant for WSL2 users because the WSL CUDA driver tends to differ from the one we package in rules_ll#unfree.
Maybe we should add explicit checks that set certain rpath values for WSL as well?
Using the compilation_database rule is too clunky otherwise.
We already use extensive toolchain transitions to handle our various compilation_modes. It looks like this is not enough anymore.
Our current approach is limited in the following ways: we can't use ll_binary tools in genrules, since that requires the ll_binary to be in the exec configuration. We need some way to transition from the compilation_mode-specific target configurations to an exec configuration. This is not supported at the moment. We need to be careful that opening up the toolchains to handle such cases leads to excessive rebuilds only when absolutely necessary. Otherwise users may end up building LLVM several times just to get a trivial ll_binary working in a genrule. We may also need better platform support to tackle this elegantly.
Things work at the moment because we can fall back to rules_cc for exec tools. This is a very undesirable limitation of the current implementation.
Attempts to run the tests in CI via remote execution currently don't work because Bazel doesn't like to run in a nix-built container. build and run work, but test doesn't, most likely due to bazelbuild/bazel#12579.
Technically it's already decent coverage if just the builds pass, but many issues arise from dynamic linking behavior and are only visible at runtime. So at the moment we'd either have to run all examples manually without the ll_test wrappers, or only run bazel build cpp without executing anything.
Another option would be to build a custom Bazel which we distribute as part of rules_ll. Building a custom Bazel against an LLVM toolchain and statically linking libc++ could be an option that keeps things portable between CI and regular usage, but it might lead to issues for non-nix workflows.
@JannisFengler @SpamDoodler @jaroeichler What do you think? Statically linking Bazel with libc++ would add a few MB to all images, caches, the devenv etc., because we'd have duplicate libc++ functions in every subbinary, and we'd have to think about infrastructure for staying in sync with the upstream Bazel sources. That would make it easier to get remote execution to work though. Do we want to go down that path, or should we try to find another solution?
This tool currently requires users to manually run the pre-commit hooks to reliably check whether generated configs have changed. This should be integrated into the rbegen invocation.
We should also add a release attribute to the tool that tags the image with a release version and pushes it to a remote registry. We need this to release the next version of rules_ll.
It can be tricky to write CUDA/HIP code that at least remotely looks like C++. At the moment the examples are littered with // NOLINT directives so that clang-tidy doesn't completely ragequit.
Let's try to find better ways to write these examples.
Hi Aaron, do you have time to meet in the city center now, so I can lend you an Nvidia GPU to fix the CUDA toolchain errors?
Best regards,
Jannis
While precompilations correctly cannot see each other if they are specified in the same interfaces attribute, the same is not true for the implicit BMI-to-object compilation. This is a bug. Only files in srcs should be able to see BMIs from interfaces.
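To make the intended visibility concrete, here is a hypothetical BUILD sketch; the rule name ll_library and the file names are illustrative, not taken from the repository:

```python
# Hypothetical target (names illustrative):
ll_library(
    name = "my_module",
    # Files in srcs may import the BMIs produced from interfaces.
    srcs = ["impl.cpp"],
    # The precompilations of a.cppm and b.cppm must not see each
    # other's BMIs, and neither should the implicit BMI-to-object
    # compilation step for either of them.
    interfaces = [
        "a.cppm",
        "b.cppm",
    ],
)
```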
NixOS/nixpkgs#224646 (comment) mentioned that the way we currently import CUDA from nix is outdated. We should change imports from the outdated pkgs.cudaPackages.cudatoolkit to cudaPackages.{lib,cuda_foo}.
@JannisFengler @SpamDoodler This might make WSL compatibility work.
As part of the transition to the flake-based workflow we removed Vale.
Getting it to run again is slightly tricky, as we need an additional config step before we can run the vale binary. Let's try to make things work again in a reproducible manner, i.e. ideally without having to rely on Vale's irreproducible autoinstaller.
I started testing Steam Deck support for GPU (and CPU) code execution. My excuse is that Teslas run on a similar APU architecture, and I think the performance gained here is worth looking into.
The only remote execution image currently provided is the default image which we use for the tests and pin in rbe/default/config/BUILD.
The default image includes openssl because the examples require it. This is not ideal. Since all the toolchain and container auto generation can be difficult to grasp we should provide a straightforward, documented way to customize it.
This is a nuanced topic requiring dedicated docs and architecture explanations.
Heterogeneous code takes forever to check. This is likely caused by all the CUDA and HIP headers we have to include. There should be some builtin default setting to exclude these headers from the checks.
We should handle the include search flags (-iquote, -isystem, -I, -idirafter, -isystem-after) and support strip_prefix as in rules_cc. Otherwise we have to manually specify includes for external repositories. There were reasons for not implementing strip_prefix like in rules_cc; I will post an update when I remember the details. Currently, the compiler is invoked at the top level of the action sandbox. If we were to move it into the build subdirectory within that sandbox, we would need to change the way inclusions of external headers are handled (maybe prefix with ../../ or something like that).
Running the default CUDA example in WSL fails to detect the GPU.
Setting LD_LIBRARY_PATH resolves the issue:
export LD_LIBRARY_PATH="/usr/lib/wsl/lib:$LD_LIBRARY_PATH"
rules_ll should automatically append /usr/lib/wsl/lib to LD_LIBRARY_PATH when running in WSL.
(Not sure if this belongs in the nix flake or in the bazel rules.)
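If we go the automatic route, the decision could live in a small helper along these lines. This is only a sketch; detecting WSL via the "microsoft" marker in the kernel version string is an assumption, and the function name is hypothetical:

```shell
# Pure helper: given the current LD_LIBRARY_PATH value and the kernel
# version string, return the value rules_ll should use. WSL kernels
# advertise "microsoft" in their version string.
wsl_ld_library_path() {
    current="$1"
    kernel="$2"
    case "$kernel" in
        *[Mm]icrosoft*) printf '%s' "/usr/lib/wsl/lib:${current}" ;;
        *) printf '%s' "${current}" ;;
    esac
}

# A caller would use it roughly like this:
# export LD_LIBRARY_PATH="$(wsl_ld_library_path "$LD_LIBRARY_PATH" "$(cat /proc/version)")"
```

Keeping the decision in a pure function like this makes it trivial to test on non-WSL machines.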
Building Clang from upstream depends on libxml2-dev. It should be added as an external dependency in the bazel-eomii-registry and to the dependencies of rules_ll.
We can't use go_deps.module_override in upstream dependencies. This means that importers of rules_ll will break. Circl is required for our cluster setup and needs to be usable from downstream repos.
While the new remote execution workflows are very efficient, we are still running gigantic builds compared to most other projects. This means that we quickly fall out of the "free" or "open source" tiers of remote execution services. Self-hosting might be inevitable.
For a full setup there is buildfarm. Regarding remote caching there is the pretty good bazel-remote, but it might also be fun to try wrapping dragonflydb with the remote-api gRPC calls and use that as cache.
The remote-apis are fairly straightforward, so we could also build an entire stack ourselves.
We are missing #include <sanitizer/msan_interface.h>. Probably caused by drift from upstream. Should be easy to fix.
To make sure that we don't accidentally destroy people's workspaces, the current ll init only appends some contents to files. If one runs the command more than once, this leads to duplicate code in those files, which can look somewhat buggy.
We should probably factor the command out into a separate shell script and add more flexibility/checking/whatever to improve its user experience. This should be an actual shell script instead of a nix string template so that we can properly run linters on it. This likely requires changing the structure of the command in a way that every variable is passed as an external argument. I'm thinking something along the lines of invoking it like this in the flake:
''${ll} \
--bazelversion=${./.bazelversion} \
--module=${./examples/MODULE.bazel} \
--bazelrc=${./examples/.bazelrc}''
Blocks #4.
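Whatever shape the final script takes, the duplicate-append problem itself is easy to guard against with a marker comment. A sketch, where the marker text and the example content are assumptions:

```shell
# Append a managed block to a file only if its marker line isn't
# present yet, so repeated `ll init` runs stay idempotent.
append_once() {
    file="$1"
    marker="$2"
    content="$3"
    if ! grep -qF "$marker" "$file" 2>/dev/null; then
        printf '%s\n%s\n' "$marker" "$content" >> "$file"
    fi
}
```

The marker doubles as documentation in the user's file, showing which lines ll init owns.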
After porting the pre-commit hooks to nix it seems that shellcheck is always skipped.
bazelbuild/bazel#14659 prevents us from leveraging our registry.
We can probably work around this by moving the third-party-overlays/*.BUILD.bazel files to static variables.
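Sketched out, the workaround might look like this; the file name, constant name, and the heavily abbreviated zlib-ng build file are all hypothetical:

```python
# third_party_overlays.bzl (hypothetical name): keep overlay BUILD file
# contents as Starlark string constants instead of *.BUILD.bazel files,
# so repository rules can consume them via build_file_content.
ZLIB_NG_BUILD_FILE = """\
cc_library(
    name = "zlib-ng",
    srcs = glob(["*.c"]),
    hdrs = glob(["*.h"]),
    visibility = ["//visibility:public"],
)
"""
```

A repository rule could then pass the constant to build_file_content instead of referencing a checked-in *.BUILD.bazel file, sidestepping the registry limitation.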
We currently do not support loading shared objects at runtime. An ll_pkg rule may be an option.
ld.lld is unable to find libraries when building with Ubuntu 22.04 and gcc 11.2.0.
ERROR: /home/ubuntu/rules_ll/examples/format_example/BUILD.bazel:3:10: LlLinkExecutable format_example/format_example failed: (Exit 1): ld.lld failed: error executing command (from target //format_example:format_example) bazel-out/k8-fastbuild/bin/external/@rules_ll.override/ll/ld.lld --color-diagnostics '-dynamic-linker=/lib64/ld-linux-x86-64.so.2' --lto-O3 --pie --nostdlib -L/usr/lib64 -lm -ldl -lpthread -lc ... (remaining 11 arguments skipped)
Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
ld.lld: error: unable to find library -lm
ld.lld: error: unable to find library -ldl
ld.lld: error: unable to find library -lpthread
ld.lld: error: unable to find library -lc
Target //format_example:format_example failed to build
I set the symlinks for crt*.o and Scrt*.o manually, but all the libraries are in /usr/lib/x86_64-linux-gnu and not in /usr/lib64.
Maybe it makes sense to set /usr/lib/x86_64-linux-gnu as the default?
After adding support for shared libraries our heterogeneous toolchains broke. The new shared library linking causes us to blindly link all of CUDA's shared libraries which is of course not what we want. Instead of rewriting the linking logic, we may want to consider rewriting the CUDA-related build files and/or making the toolchains finer-grained in the sense that static and shared libraries are more clearly separated.
We need support for creating dynamic shared objects. Clang plugins such as the hipsycl plugin require this.
The best way to implement this is probably by reworking the aggregate attribute. This will require support for position-independent code.
Tests are missing shared objects like libamdhip64.so. Sometimes these tests can flakily pass by chance if the corresponding library path has been populated before.
Probably needs some runfile/symlink tweaking.
Playing around with clippy in rules_rust made me notice how incredibly convenient it would be to have clang-tidy run as a plugin that just prints warnings like "regular" compiler warnings. I'm not sure whether this is possible, but if it is, it could be a significant improvement for our user experience and would obsolete the ll_compilation_database targets in many cases.
Let's see whether it's possible to copy the rules_rust/clippy behavior to rules_ll/clang-tidy.
One of the main goals of rules_ll is to build a Clang/LLVM based toolchain from upstream. This should work with Clang and GCC.
One error occurs when running the examples with GCC as default compiler:
error: zlib.h: no such file or directory
This can be fixed by installing the libz-dev package.
After installing the missing headers, the build fails with the following error message:
ERROR: /root/.cache/bazel/_bazel_root/79c7c71f78facf0e35780b9a06528730/external/@rules_ll.override.llvm_project_overlay.llvm-project/llvm/BUILD.bazel:164:11: Compiling llvm/lib/Support/Process.cpp [for tool] failed: (Exit 1): gcc failed: error executing command (from target @@rules_ll.override.llvm_project_overlay.llvm-project//llvm:Support) /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG -ffunction-sections ... (remaining 70 arguments skipped)
Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
In file included from external/@rules_ll.override.llvm_project_overlay.llvm-project/llvm/lib/Support/Process.cpp:107:
external/@rules_ll.override.llvm_project_overlay.llvm-project/llvm/lib/Support/Unix/Process.inc: In static member function 'static size_t llvm::sys::Process::GetMallocUsage()':
external/@rules_ll.override.llvm_project_overlay.llvm-project/llvm/lib/Support/Unix/Process.inc:93:20: error: aggregate 'llvm::sys::Process::GetMallocUsage()::mallinfo2 mi' has incomplete type and cannot be defined
93 | struct mallinfo2 mi;
| ^~
external/@rules_ll.override.llvm_project_overlay.llvm-project/llvm/lib/Support/Unix/Process.inc:94:10: error: '::mallinfo2' has not been declared
94 | mi = ::mallinfo2();
| ^~~~~~~~~
Target //format_example:format_example failed to build
The system is Ubuntu 20.04.4 LTS; the GCC version is 9.4.0.
This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.
These problems occurred while renovating this repository.
Warning
Renovate failed to look up the following dependencies: Could not determine new digest for update (github-tags package eomii/rules_ll).
Files affected: templates/default/MODULE.bazel
These updates have all been created already. Click a checkbox below to force a retry/rebase of any.
MODULE.bazel
platforms 0.0.10
rules_cc 0.0.9
bazel_skylib 1.7.0
rules_java 7.6.1
stardoc 0.6.2
llvm-project-overlay 17-init-bcr.3
templates/default/MODULE.bazel
rules_ll <TODO: USE THE COMMIT FROM THE FLAKE HERE>
.bazelversion
bazel 8.0.0-pre.20240516.1
templates/default/.bazelversion
bazel 8.0.0-pre.20240516.1
.github/workflows/docs.yml
ubuntu 22.04
.github/workflows/pre-commit.yml
ubuntu 22.04
.github/workflows/scorecard.yml
ubuntu 22.04
templates/default/.github/workflows/pre-commit.yml
ubuntu 22.04
Surely there is some workaround.
examples/flake.lock causes direnv/devenv to break. ../ is not reproducible across machines (for different users, the absolute path to the directory is different). One could use git update-index --skip-worktree for this, but that doesn't work with devenv. What a dilemma lmao.
For now I've sent #67, but that's hardly a satisfactory solution.