Comments (5)
This isn't a Zig problem, since it happens in equivalent C code (i.e. using stdatomic.h), at least on my end:
#include <pico/sync.h>
#include <stdatomic.h>
static _Atomic(int) whatever = 0;
int change(void) {
return atomic_exchange(&whatever, 1);
}
Will produce the same error if change
is called. Zig can't provide support for this because it does not recognize the rp2040's synchronization primitives, which you should regardless be using through pico_sync
or pico_multicore
. Even if Zig could do that, it would have to keep track of changes to the Pico SDK, since it would need to allocate a hardware spinlock without failing, for example: the only atomic read/write operations that exist are for peripheral pages.
from zig.
Thankyou for your answer.
I understand that e.g. atomic_exchange
has this problem because cortex M0 has no special instructions for atomics.
However, I think @atomicLoad
and @atomicStore
do not share this problem.
As far as I understand, the normal Store and Load assembly instructions for cortex m0 are all atomic in the sense that the thread on CPU 1 will either see the old value or the new value when a Store is executed by CPU 0. There is also no memory cache which could hide some of the stores but not others.
So my naive assumption was that @atomicStore(u32, &variable, 42, .Monotonic)
compiles down to a single STR instruction.
This would mean that the difference between variable = 42;
and @atomicStore(u32, &variable, 42, .Monotonic);
is that the latter guarantees that the Store instruction is not optimized away (e.g. because the variable ist held in a register).
(Maybe this board is not the right place for the following question. If so please let me know.)
My question is now, how do I work around the missing atomic builtins.
One of my use cases is a shared ring buffer. CPU 0 writes into it. The writing function must guarantee that the Store of a new value happens before the Store of the advanced write pointer. The reader on CPU 1 can then poll by executing a Load on the write pointer and compare it to its stored copy. I they differ, the buffer contains new values, which can then by read by Loads.
I have 2 solution ideas:
-
make all variables (all being 4 bytes or smaller) which are shared between both cores volatile, i.e. in my example the write pointer and the array for the values. I know that this is a hack because volatile is meant for memory mapped IO only.
I think this should work and be equivalent to accessing the variables using@atomicStore(T, &variable, value, .SeqCst)
and@atomicLoad(T, &variable, .SeqCst)
. -
Implement the Store and Load in assembly (one instruction each). Make the optimizer not optimize them away (I don't know how to do that right now.)
from zig.
I understand that e.g. atomic_exchange has this problem because cortex M0 has no special instructions for atomics.
However, I think @atomicLoad and @AtomicStore do not share this problem.
Actually, I tried to use atomic_load and it linked and produced "reasonable" assembly:
#include <pico/sync.h>
#include <stdatomic.h>
static _Atomic(int) whatever = 0;
int change(void) {
return atomic_load(&whatever);
}
(Note this is an object file, so the compiler isn't even telling the linker that it's using an atomic builtin)
However, it would only compile if I used the GCC arm-none-eabi toolchain. If I used Clang with the same sysroot, it produced the same undefined call to libatomic. I also tried and got mostly the same result by manually using each possible macro expansion in stdatomic. GCC is probably substituting the builtin at the callsite, while LLVM wants to use libatomic. I assume the (GCC) toolchain isn't expecting this, so they just aren't provided.
The equivalent Zig code would be something like: (function is noinline
to make it easier to spot)
var whatever: usize = undefined;
noinline fn change() usize {
@fence(.SeqCst); // fence before
const result = @as(*volatile usize, @ptrCast(&whatever)).*; // grab result
@fence(.SeqCst); // fence after
return result;
}
Which becomes
For some reason the option to DMB is different. Reading the v6-m ARM, I think the GCC example would actually throw an illegal instruction, since opt
has to be SY
, but is ISH
(1111 vs 1011). This makes me think that none of this is intended to be supported and the GCC output is a miscompilation, but the documentation may be incomplete. Even stranger is that some atomic builtins would use fences while others would not.
You probably only want a compiler fence, anyways, so you could get away with something like this:
var whatever: usize = undefined;
noinline fn change() usize {
asm volatile ("" ::: "memory"); // fence before
const result = @as(*volatile usize, @ptrCast(&whatever)).*; // grab result
asm volatile ("" ::: "memory"); // fence after
return result;
}
The pointer is marked as volatile in this case to prevent LLVM optimizing it out entirely since it's only ever undefined.. it might be better practice to just define a function called compilerFence
(which the std lib removed, for some reason) and stick that where you don't want anything to be reordered.
from zig.
Thank you @drew-gpf !
You are very helpful.
Combining what you have written, here is my suggestion for the implementation of @atomicLoad
and @atomicStore
.
const AtomicOrder = @import("std").builtin.AtomicOrder;
pub fn atomicLoad(comptime T: type, ptr: *const T, comptime order: AtomicOrder) T {
if (@sizeOf(T) > 4) @compileError("Only supported for types with 4 or less bytes.");
if (order == .SeqCst) @compileError("AtomicOrder.SeqCst not supported.");
if (order == .Release or order == .AcqRel) asm volatile ("" ::: "memory"); // compiler fence
const result = @as(*const volatile T, @ptrCast(ptr)).*;
if (order == .Acquire or order == .AcqRel) asm volatile ("" ::: "memory"); // compiler fence
return result;
}
pub fn atomicStore(comptime T: type, ptr: *T, value: T, comptime order: AtomicOrder) void {
if (@sizeOf(T) > 4) @compileError("Only supported for types with 4 or less bytes.");
if (order == .SeqCst) @compileError("AtomicOrder.SeqCst not supported.");
if (order == .Release or order == .AcqRel) asm volatile ("" ::: "memory"); // compiler fence
@as(*volatile T, @ptrCast(ptr)).* = value;
if (order == .Acquire or order == .AcqRel) asm volatile ("" ::: "memory"); // compiler fence
}
Does that make sense?
Maybe AtomicOrder.SeqCst
could be implemented using the ISB, DMB, and DSB instructions?
I don't feel comfortable implementing it. These concurrency problems are usually more tricky than they appear at first.
from zig.
What I'm moreso getting at is that it's unclear if any compiler should be expected to support atomic rmw for arbitrary addresses if Clang generates the same linker error and GCC generates a potentially bad fence instruction. That is, if you want atomic stores and loads, you need to keep in mind the specific SOC you're using and especially the address you're writing to and optimize from there--that's why I suggested to use compiler fences for all memory ordering types, since I'm pretty sure a DMB is a 3-cycle no-op in this case. You're better off asking on the raspberry pi forums, though. I suspect they will tell you to just use a critical section.
from zig.
Related Issues (20)
- Can't link on PPC64LE HOT 1
- ErrorUnion typeInfo debug print causes GenericPoison HOT 3
- @cInclude("stdio.h") doesn't work for -Dtarget=wasm32-emscripten
- Unsure how to handle duplicate symbol linker error
- SIGTRAP When attempting to write to a runtime index in comptime variable HOT 1
- Undefined struct fields are initialized to 0 instead of 0xAA in debug mode (with llvm backend)
- zig fetch -h should include examples
- Zig build system incorrectly parses some depfiles generated by Rust on Windows
- implement std.Progress for single-threaded mode
- std.Progress: add byte unit flag and resource scope; support throughput calculation
- implement std.Progress IPC for WASI
- implement std.Progress IPC for Windows
- add support for targeting glibc 2.39
- Progress display is cut off horizontally on Windows HOT 2
- `zig fmt` outputs invalid syntax when formatting an error set with doc comments onto one line
- Latest master tarball is unable to build a stage4 zig HOT 3
- upperBound and lowerBound has inconsistent callback types HOT 1
- "offsetof" from C header does not handle union within struct HOT 2
- x86_64 backend miscomp
- When compiling for android, and linking libc, you will always hit a @compileError HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from zig.