fishinabarrel / linux-kernel-module-rust

Framework for writing Linux kernel modules in safe Rust

License: GNU General Public License v2.0

Rust 94.62% Makefile 2.09% C 1.21% Python 2.07%

linux-kernel-module-rust's Issues

Architecture support (tracking issue)

Rust supported architectures: https://forge.rust-lang.org/platform-support.html For our purposes, we don't care much about Tier 1 vs. Tier 2, and even Tier 3 support is probably fine.

Another useful reference is Debian's arch status as of February 2018: https://alioth-lists.debian.net/pipermail/pkg-rust-maintainers/2018-February/001215.html (doesn't include all the arches the kernel supports)

arch	description	LLVM status	Rust status
alpha	DEC Alpha	removed from LLVM in 2011
arc	Synopsys ARC	present	needs port
arm	ARM 32-bit	present	ARMv5 and up is Tier 2
arm64	ARM 64-bit aka AARCH64	present	Tier 2
c6x	TI C6000-series VLIW DSPs	no LLVM port
csky	C-Sky	no LLVM port
h8300	Hitachi H8	no LLVM port
hexagon	Qualcomm Hexagon DSP, part of Snapdragon SoCs	present	added July 2019
ia64	Intel Itanium	removed from LLVM in 2009
m68k	Motorola 680x0	port exists, proposed for merge	work in progress
microblaze	Xilinx MicroBlaze for FPGAs	removed from LLVM
mips	MIPS (R3000 / MIPS I and up)	present for R4000 / MIPS II and up	Tier 2
nds32	Andes AndeStar 32-bit	proposed for merge in 2017	needs port
nios2	Altera Nios II for FPGAs	removed from LLVM in January 2019
openrisc	OpenRISC	out-of-tree backend	needs port
parisc	Hewlett-Packard PA-RISC aka hppa	no LLVM port
powerpc	PowerPC / POWER 32-bit, 64-bit LE, 64-bit BE	all present	all Tier 2
riscv	RISC-V	present	Tier 2
s390	z/Architecture 64-bit aka s390x	present	Tier 2
sh	Hitachi SuperH / J2	experimental out-of-tree backend	needs port
sparc	Sun SPARC 32- and 64-bit	present	Tier 2 for 64-bit (SPARC v9), needs 32-bit port
um	User-Mode Linux (on any userspace arch)	N/A	might just work?
x86	Intel x86 aka i386 and x86-64 aka amd64	present	Tier 1
xtensa	Tensilica Xtensa, used on ESP microcontrollers	proposed for merge in March 2019	work in progress

Re ARMv4 - the linked issue implies ARMv4 is only a microcontroller for OS-less use, but arch/arm/mach-gemini/ looks like an actual Linux target for ARMv4. There are also a few ARMv4T boards (i.e., with Thumb support). It looks like LLVM wants to emit BLX instructions which switch to Thumb mode, so unclear if plain ARMv4 will work. (Though forcing it to emit non-Thumb code only doesn't sound like it should be too difficult....)

Re MIPS I - see simias/psx-sdk-rs#1. MIPS I has load delay slots, so LLVM's codegen would need to be taught about them. https://github.com/impiaaa/llvm-project seems to have a patch.

Some possible approaches for merging into mainline and dealing with architectures with no LLVM support and no realistic plans (e.g., processors no longer manufactured):

Convince Linux to drop them <_< >_>
Only support certain architectures via Kconfig - it's probably okay to only support some modules on the major architectures
Use mrustc to compile via C https://github.com/thepowersgang/mrustc
Use the LLVM C backend to compile via C #111
Write a GCC frontend for Rust, see https://github.com/sapir/gcc-rust/tree/rust (which glues actual rustc into GCC) or https://github.com/redbrain/gccrs, both of which seem to be actively developed
Something involving Cranelift, maybe? I don't have a good sense of what this would look like / whether it would work.

Remove malloc from `println!()`

CI should verify that everything is nice and cargo fmt'd

Build fails if rustfmt is not installed

bindgen doesn't fail if rustfmt is unavailable, which seems probably reasonable:
https://github.com/rust-lang/rust-bindgen/blob/v0.51.0/src/lib.rs#L1890-L1898
but the build fails because we can't scrape LINUX_VERSION_CODE if it's not on a line by itself. We should do something clearer:

scrape the code more robustly (regex crate?)
in a subprocess, compile and run fn main() {println!("{}", LINUX_VERSION_CODE);}
figure out how to get the variable directly, using some libclang bindings, instead of scraping it
scrape it from the C sources
throw an error if rustfmt isn't installed
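
As a rough illustration of the "scrape the code more robustly" option, here is a minimal sketch that does not care about line breaks. It assumes the generated bindings contain a declaration of the form `pub const LINUX_VERSION_CODE: u32 = <number>;` (possibly with extra whitespace when rustfmt has not run); the function name `scrape_version_code` is made up for this example, and it does not guard against longer identifiers that merely start with the same name.

```rust
/// Hypothetical helper: find the value of LINUX_VERSION_CODE in bindgen
/// output, whether or not the declaration sits on a line by itself.
/// Assumes a declaration shaped like `pub const LINUX_VERSION_CODE: u32 = N;`.
pub fn scrape_version_code(bindings: &str) -> Option<u32> {
    const NAME: &str = "LINUX_VERSION_CODE";
    // Skip to just past the identifier.
    let start = bindings.find(NAME)? + NAME.len();
    let rest = &bindings[start..];
    // The declaration ends at the next semicolon.
    let decl = &rest[..rest.find(';')?];
    // Everything after the `=` is the value, modulo whitespace.
    let value = decl.split('=').nth(1)?;
    value.trim().parse().ok()
}
```

This only needs the standard library, so it would avoid pulling the regex crate into the build script.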

Add RCU bindings

There are some APIs like for_each_process that want you to hold an RCU read lock when calling them. We should support this, at minimum. We probably want to also have full support for RCU-protected data, but maybe we can put that on hold.

Upstream RCU docs: https://www.kernel.org/doc/Documentation/RCU/whatisRCU.txt

The rough idea of RCU is to implement a reader-writer lock-ish pattern where readers are low-overhead, under the assumption that all readers are short-lived - a valid assumption for kernels, where a "reader" is the execution of a single syscall/interrupt; anything that persists state beyond this will count as a "writer". In particular, readers should not be slowed down when a writer is trying to do an update, so ideally reads should happen concurrently. The way this is done is by having writers allocate and fill in a new structure, atomically update the pointer to it, synchronize memory and wait for existing readers to complete, and then deallocate the old structure. (This pointer can either be a single pointer to some single structure, or one of the pointers in a linked list, or something.) That way, existing readers see a consistent view - either the old or new version - and you're guaranteed that once you're finished synchronizing, any readers in progress are only seeing the new version.

(Multiple writers are not protected by RCU in any way. They should use a conventional mutex, or something.)

The basic RCU API in Rusty pseudocode is

extern {
    fn rcu_read_lock();
    fn rcu_read_unlock();
    fn synchronize_rcu();
    fn rcu_assign_pointer<T: Sized>(p: &*mut T, v: *mut T);
    fn rcu_dereference<T: Sized>(p: &*T) -> *T;
}

(Note the latter two in C are macros, into which p is syntactically passed by reference. If we want to bind them directly, we probably want to expose C helpers that take void ** and handle the type safety in the Rust wrapper.)

rcu_read_lock and rcu_read_unlock take no context info; they open and close a read-side critical section that covers any RCU use in the kernel. Between those you may call rcu_dereference on anything. (That said, there are two other RCU "flavors" in the kernel, rcu_read_lock_bh etc. and rcu_read_lock_sched etc. They appear to have been consolidated last year but I think from the API they should still be treated as separate? In any case, the other flavors are rarely used and we can probably get away with ignoring them for now.)

The rules as I understand them are:

You can only read an RCU pointer between rcu_read_lock and rcu_read_unlock (aka "within a read-side critical section"), and you can only read it with rcu_dereference (though it compiles out on all architectures besides Alpha). rcu_dereference doesn't actually do the dereferencing itself; it simply gives you a pointer you can safely dereference until rcu_read_unlock.
You cannot block in a read-side critical section.
You can only write an RCU pointer with rcu_assign_pointer, though you can do so at any point without advance notice.
You must keep the old object pointed to by an rcu_assign_pointer valid until you've called synchronize_rcu.

You may nest / overlap read-side critical sections, though. It's a reader lock, you can have more than one of those and writers can't do anything to invalidate the data you're reading until all reader locks are dropped. The only difference with RCU is that a writer in synchronize_rcu waits only for reader locks that already exist; once it starts waiting, new reader locks don't affect it (new reads are now guaranteed to be ordered after any previous rcu_assign_pointers).

I think the rough way to handle this is to

create an Rcu<T> pointer type
have an empty RcuRead object whose constructor calls rcu_read_lock(), whose destructor calls rcu_read_unlock(), and which is required to read an Rcu<T>
have ... some sort of operation to assign to an Rcu<T> that keeps the old value alive. Perhaps it returns an RcuDropGuard<T> object that holds on to a pointer to the old object, and that object's destructor calls synchronize_rcu() and then frees it?

I think it's memory-safe that we're using RAII / destructors here. The worst that can happen is that you deadlock (if you forget an RcuRead) or you keep the old value alive forever (if you forget RcuDropGuard<T>). The unsafety in std::thread::scoped::JoinGuard was that there were operations that were unsafe to do before the JoinGuard was dropped and safe after (like, deallocate data used by the thread). There are no operations that are enabled by either an RcuRead or an RcuDropGuard<T> going out of scope. (The RcuDropGuard<T> itself needs to handle deallocating the old object, for this reason, to guarantee that synchronize_rcu() was in fact called.)
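
To make the drop-guard idea concrete, here is a userspace sketch. Everything in it is an assumption for illustration: `synchronize_rcu` is a counting stand-in rather than the kernel function, the layout of `Rcu<T>` and `RcuDropGuard<T>` is invented, and `assign` takes `&mut self` as a placeholder for whatever mutex ends up serializing writers.

```rust
use std::sync::atomic::{AtomicPtr, AtomicUsize, Ordering};

static SYNCHRONIZE_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Userspace stand-in for the kernel's synchronize_rcu(); it only counts calls.
fn synchronize_rcu() {
    SYNCHRONIZE_CALLS.fetch_add(1, Ordering::SeqCst);
}

/// How many times the stand-in has been called so far.
pub fn synchronize_calls() -> usize {
    SYNCHRONIZE_CALLS.load(Ordering::SeqCst)
}

/// Hypothetical RCU-protected pointer owning a heap-allocated T.
pub struct Rcu<T> {
    ptr: AtomicPtr<T>,
}

/// Hypothetical guard keeping the previous value alive until it is dropped.
pub struct RcuDropGuard<T> {
    old: *mut T,
}

impl<T> Rcu<T> {
    pub fn new(value: T) -> Self {
        Rcu { ptr: AtomicPtr::new(Box::into_raw(Box::new(value))) }
    }

    /// Publish a new value (standing in for rcu_assign_pointer). `&mut self`
    /// serializes writers here; a real design needs a mutex instead.
    pub fn assign(&mut self, value: T) -> RcuDropGuard<T> {
        let new = Box::into_raw(Box::new(value));
        let old = self.ptr.swap(new, Ordering::AcqRel);
        RcuDropGuard { old }
    }
}

impl<T> Drop for RcuDropGuard<T> {
    fn drop(&mut self) {
        // Wait for pre-existing readers before freeing the old value.
        synchronize_rcu();
        // SAFETY: `old` came from Box::into_raw and is no longer published.
        unsafe { drop(Box::from_raw(self.old)) };
    }
}

impl<T> Drop for Rcu<T> {
    fn drop(&mut self) {
        // SAFETY: the current pointer always comes from Box::into_raw.
        // A real implementation would also have to wait for readers here.
        unsafe { drop(Box::from_raw(*self.ptr.get_mut())) };
    }
}
```

Forgetting the guard with `mem::forget` only leaks the old allocation, which matches the "keep the old value alive forever" outcome described above.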

I'm a little less sure about making sure you don't hold the result of reading an Rcu<T> after the originating RcuRead goes out of scope / after you call rcu_read_unlock. Can we use lifetimes here, by giving you a pointer that's constrained to the lifetime of RcuRead? Or do we need to insist on making you pass a callback/lambda?

For the short term, I'd like to at least introduce RcuRead and have it be a required parameter for binding things like for_each_process, and we can handle doing RCU pointer reads and writes in Rust later. (Which also lets us defer figuring out how to integrate the mutex needed for avoiding concurrent writes.)
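
One possible shape for the short-term plan, sketched in userspace. The lock functions are counting stand-ins, `for_each_process` just yields made-up PIDs, and `Rcu<T>::read` is a guess at how a lifetime could tie the returned reference to the `RcuRead` guard; whether that is enough in the real kernel is exactly the open question above.

```rust
use std::marker::PhantomData;
use std::sync::atomic::{AtomicIsize, AtomicPtr, Ordering};

static READ_DEPTH: AtomicIsize = AtomicIsize::new(0);

// Userspace stand-ins for the kernel functions; they only track nesting depth.
fn rcu_read_lock() { READ_DEPTH.fetch_add(1, Ordering::SeqCst); }
fn rcu_read_unlock() { READ_DEPTH.fetch_sub(1, Ordering::SeqCst); }

/// Current nesting depth of read-side critical sections (for observation).
pub fn read_depth() -> isize { READ_DEPTH.load(Ordering::SeqCst) }

/// Empty guard object: locks on construction, unlocks on drop.
/// The raw-pointer PhantomData keeps it from being sent to another thread.
pub struct RcuRead { _not_send: PhantomData<*mut ()> }

impl RcuRead {
    pub fn new() -> Self {
        rcu_read_lock();
        RcuRead { _not_send: PhantomData }
    }
}

impl Drop for RcuRead {
    fn drop(&mut self) { rcu_read_unlock(); }
}

/// Hypothetical RCU-protected pointer (read side only in this sketch).
pub struct Rcu<T> { ptr: AtomicPtr<T> }

impl<T> Rcu<T> {
    pub fn new(value: T) -> Self {
        Rcu { ptr: AtomicPtr::new(Box::into_raw(Box::new(value))) }
    }

    /// The result borrows the guard, so it cannot outlive the critical section.
    pub fn read<'a>(&'a self, _guard: &'a RcuRead) -> &'a T {
        // SAFETY: the pointer comes from Box::into_raw and is never freed
        // while `self` is borrowed (this sketch has no writers).
        unsafe { &*self.ptr.load(Ordering::Acquire) }
    }
}

impl<T> Drop for Rcu<T> {
    fn drop(&mut self) {
        // SAFETY: the pointer comes from Box::into_raw in `new`.
        unsafe { drop(Box::from_raw(*self.ptr.get_mut())) };
    }
}

/// Stand-in binding: requiring `&RcuRead` proves the caller holds the lock.
pub fn for_each_process(_guard: &RcuRead, mut f: impl FnMut(u32)) {
    for pid in 1..=3u32 { f(pid); }
}
```

With this shape, code that tries to keep the result of `read` after the guard is dropped should be rejected by the borrow checker, without needing a callback.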

More interesting RCU docs:

More description of things that RCU does and does not guarantee: https://www.kernel.org/doc/Documentation/RCU/Design/Requirements/Requirements.html

An RCU home page, more or less: http://www.rdrop.com/~paulmck/RCU/

"RCU's first-ever CVE, and how I lived to tell the tale": https://youtu.be/hZX1aokdNiY http://www.rdrop.com/~paulmck/RCU/cve.2019.01.23e.pdf (tl;dr: use-after-free because they locked the wrong RCU flavor, but the solution is a bit more complicated than that)

Add another sysctl test

We lost test coverage for Sysctl.get() in b7b3ebf. Also, add test coverage for #131. This can be done as a single test I think.

Add tests for filesystem::register

Update to 2018 edition

There's some work on this in jason-ni/linux-kernel-module-rust@e7e4044 - it looks straightforward.

Support proc_mkdir + proc_create

Make it possible to write modules in stable Rust

Even though you'll need the unstable compiler for a while for this crate, modules themselves should be possible to write in stable Rust, in the same way that libstd itself requires unstable but can be used from stable Rust.

Right now hello-world uses #![feature(alloc)] to use alloc::borrow::ToOwned and alloc::String. Both alloc::borrow and alloc::string are stably re-exported in libstd. So we should do the same thing. Strictly speaking, this permits those modules to change as long as libstd makes an API-compatible facade, but practically that's unlikely to happen and if it does we can just steal whatever facade libstd comes up with (or in the worst case, take a semver hit).

Potentially the way to do this is to re-export a module named std that contains a subset of what's in actual libstd, with the intention of consumers doing

#[macro_use]
extern crate linux_kernel_module;
use linux_kernel_module::std;
use linux_kernel_module::std::prelude::v1::*;

That way, those things from std that do/can exist in kernelspace can just be used like normal.
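
A minimal sketch of what such a facade could look like on the crate side. The module is called `kstd` here only so the sketch compiles next to the real `std` in userspace; the exact set of re-exports is a guess, and the block is meant to sit at the crate root.

```rust
// Assumes this is the crate root of the (hypothetical) support crate.
extern crate alloc;

/// Hypothetical std-like facade built only from liballoc re-exports.
pub mod kstd {
    pub use alloc::{borrow, boxed, collections, string, vec};

    pub mod prelude {
        pub mod v1 {
            pub use alloc::borrow::ToOwned;
            pub use alloc::boxed::Box;
            pub use alloc::string::{String, ToString};
            pub use alloc::vec::Vec;
        }
    }
}

/// Example consumer code that only touches the facade.
pub fn greeting() -> kstd::string::String {
    use kstd::prelude::v1::*;
    let mut s: String = "hello".to_owned();
    s.push_str(" world");
    s
}
```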

See about having KernelModule::init() take &mut self so less moves are required

Do something reasonable in panic_fmt

Currently we just hang, which is not a world record holder for best UX.

In the short term using BUG is probably reasonable, but I'm not positive about how the stack will look. I think because of panic = abort it'll just be invoked in the frame where we panic'd, so it'll all be good?

CI is unreliable

https://travis-ci.org/alex/linux-kernel-module-rust/builds/530517474 and https://travis-ci.org/alex/linux-kernel-module-rust/builds/528534959 are for the exact same commit, but one passes and the other fails.

i386 support

While looking at something else I remembered that the i386 kernel builds with -mregparm=3 -freg-struct-return, i.e., pass the first three arguments in registers (instead of on the stack, the normal ABI) and try to return structures in a register if they fit. I'm not sure if Rust and/or bindgen know how to deal with this.

If they don't, put a note in README.md and move on. If they do, let's add i386 to CI at some point....

Remove explicit list of symbols in build.rs?

It's a pain to have to constantly update it. Is there a reason not to just bind everything, other than the compile time penalty (which should theoretically be paid less often, since we'll be editing build.rs less)?

Must use fallible allocation and not panic

The code seems to use Box::new at times, which panics on allocation failure.

This is obviously not acceptable, since the kernel must not panic on allocation failure, so this crate must NOT use Box, Vec, etc. and instead use fallible alternatives.
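
As one illustration of a fallible alternative, `Vec::try_reserve_exact` reports allocation failure as an error instead of panicking. The helper name `try_zeroed` is made up for this sketch, and it uses `std` paths so it runs in userspace; in the crate itself the same types would come from liballoc.

```rust
use std::collections::TryReserveError;

/// Hypothetical helper: allocate a zero-filled buffer, returning an error
/// instead of panicking if the allocation cannot be satisfied.
pub fn try_zeroed(len: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    // Fails with Err(..) on allocation failure or capacity overflow.
    buf.try_reserve_exact(len)?;
    // Capacity is already reserved, so this does not allocate again.
    buf.resize(len, 0);
    Ok(buf)
}
```

The caller can then map the error to something like ENOMEM and propagate it with `?`.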

Automated conversion from C to Rust

Idea is to provide the framework for automated conversion using c2rust, as they do for the user-space programs.

Corresponding issue in c2rust itself: immunant/c2rust#150

Add support for registering sysctls

Rough cut concept for the API:

struct MySysctlExample {
    sysctl_table: SysctlTableRegistration,
}

impl KernelModule for MySysctlExample {
    fn init() -> KernelResult<Self> {
        let table = SysctlTable::new();
        let sub_table = SysctlTable::new();
        sub_table.register(INTEGER_VALUE, "name", some_parameter_about_valid_values_and_type?);
        table.register_subtable(INTEGER_VALUE, "name", sub_table);

        // Keep the registration alive for as long as the module is loaded.
        Ok(MySysctlExample {
            sysctl_table: table.register(),
        })
    }
}

Feedback welcome, going to spike on this tomorrow

`hello-world` fails to build

Trying to build hello-world as described here, but I'm getting this error:

   Compiling linux-kernel-module v0.1.0 (/home/vnz/work/playground/linux-kernel-module-rust)
error: failed to run custom build command for `linux-kernel-module v0.1.0 (/home/vnz/work/playground/linux-kernel-module-rust)`

Caused by:
  process didn't exit successfully: `/home/vnz/work/playground/linux-kernel-module-rust/hello-world/target/debug/build/linux-kernel-module-b8153acc5bce9d04/build-script-build` (exit code: 1)
--- stdout
cargo:rerun-if-env-changed=KDIR
cargo:rerun-if-env-changed=CLANG
cargo:rerun-if-changed=kernel-cflags-finder/Makefile

--- stderr
kernel-cflags-finder did not succeed
stdout: -nostdinc -isystem /usr/lib/clang/8.0.1/include -I/usr/lib/modules/5.2.8-1-MANJARO/build/./arch/x86/include -I/usr/lib/modules/5.2.8-1-MANJARO/build/./arch/x86/include/generated -I/usr/lib/modules/5.2.8-1-MANJARO/build/./include -I/usr/lib/modules/5.2.8-1-MANJARO/build/./arch/x86/include/uapi -I/usr/lib/modules/5.2.8-1-MANJARO/build/./arch/x86/include/generated/uapi -I/usr/lib/modules/5.2.8-1-MANJARO/build/./include/uapi -I/usr/lib/modules/5.2.8-1-MANJARO/build/./include/generated/uapi -include /usr/lib/modules/5.2.8-1-MANJARO/build/include/linux/kconfig.h -D__KERNEL__ -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE -Werror=implicit-function-declaration -Werror=implicit-int -Wno-format-security -std=gnu89 -no-integrated-as -Werror=unknown-warning-option -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -m64 -mno-80387 -mstack-alignment=8 -mtune=generic -mno-red-zone -mcmodel=kernel -DCONFIG_X86_X32_ABI -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_SSSE3=1 -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -DCONFIG_AS_AVX512=1 -DCONFIG_AS_SHA1_NI=1 -DCONFIG_AS_SHA256_NI=1 -Wno-sign-compare -fno-asynchronous-unwind-tables -mretpoline-external-thunk -fno-jump-tables -fno-delete-null-pointer-checks -Wno-address-of-packed-member -O2 -fplugin=./scripts/gcc-plugins/structleak_plugin.so -fplugin-arg-structleak_plugin-byref-all -DSTRUCTLEAK_PLUGIN -Wframe-larger-than=2048 -fstack-protector-strong -Wno-unused-but-set-variable -pg -Wdeclaration-after-statement -Wvla -Wno-pointer-sign -Wno-unknown-warning-option -DMODULE

stderr: clang-8: error: unknown argument: '-fplugin-arg-structleak_plugin-byref-all'
make[2]: *** [scripts/Makefile.build:285: /home/vnz/work/playground/linux-kernel-module-rust/kernel-cflags-finder/dummy.o] Error 1
make[1]: *** [Makefile:1597: _module_/home/vnz/work/playground/linux-kernel-module-rust/kernel-cflags-finder] Error 2
make: *** [Makefile:33: all] Error 2

I'm on Manjaro Linux 18.0.4, with Linux kernel 5.2.8-1.

Improve the quality of output on panic

Currently we don't display the panic error message. We get a decent stack, but getting the actual error message would be a great improvement.

Add chrdev!

https://lwn.net/Articles/195805/ -- starts at "To that end, here".

There seem to be two essential pieces to this API:

allocating/obtaining device numbers
file ops structure

File Ops is probably an interface that people can implement, which we convert into the explicit vtable, does that sound right? From there it's probably just a matter of going through all the function pointers in it and turning them into interface methods. We'll be responsible for wrapping/unwrapping the private data and managing self's lifetime.
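
To check that the trait-to-vtable idea hangs together, here is a userspace sketch. The `FileOpsVtable` struct is a heavily simplified stand-in for the kernel's struct file_operations, the trait and its method signatures are invented, `read` pretends the buffer is ordinary kernel memory (no usercopy handling), and `Box::new` is used for brevity even though a real version would need a fallible allocation.

```rust
use std::os::raw::{c_int, c_void};

/// Hypothetical trait a module author would implement.
/// Errors are positive errno values; the trampolines negate them.
pub trait FileOperations: Sized {
    fn open() -> Result<Self, c_int>;
    fn read(&self, buf: &mut [u8]) -> Result<usize, c_int>;
}

/// Simplified stand-in for the kernel's explicit vtable.
#[repr(C)]
pub struct FileOpsVtable {
    pub open: unsafe extern "C" fn(private: *mut *mut c_void) -> c_int,
    pub read: unsafe extern "C" fn(private: *mut c_void, buf: *mut u8, len: usize) -> isize,
    pub release: unsafe extern "C" fn(private: *mut c_void),
}

unsafe extern "C" fn open_cb<T: FileOperations>(private: *mut *mut c_void) -> c_int {
    match T::open() {
        Ok(state) => {
            // Wrap `self` into the private-data slot.
            unsafe { *private = Box::into_raw(Box::new(state)) as *mut c_void };
            0
        }
        Err(errno) => -errno,
    }
}

unsafe extern "C" fn read_cb<T: FileOperations>(
    private: *mut c_void,
    buf: *mut u8,
    len: usize,
) -> isize {
    // Unwrap the private data back into `&T`.
    let state = unsafe { &*(private as *const T) };
    let buf = unsafe { std::slice::from_raw_parts_mut(buf, len) };
    match state.read(buf) {
        Ok(n) => n as isize,
        Err(errno) => -(errno as isize),
    }
}

unsafe extern "C" fn release_cb<T: FileOperations>(private: *mut c_void) {
    // End of `self`'s lifetime: reclaim and drop the box.
    drop(unsafe { Box::from_raw(private as *mut T) });
}

/// Build the vtable for any implementor of the trait.
pub fn vtable<T: FileOperations>() -> FileOpsVtable {
    FileOpsVtable { open: open_cb::<T>, read: read_cb::<T>, release: release_cb::<T> }
}

/// Example implementor: a device that fills reads with one byte value.
pub struct Fill { byte: u8 }

impl FileOperations for Fill {
    fn open() -> Result<Self, c_int> { Ok(Fill { byte: b'x' }) }
    fn read(&self, buf: &mut [u8]) -> Result<usize, c_int> {
        buf.fill(self.byte);
        Ok(buf.len())
    }
}
```

The generic trampolines are what let us own the wrapping and unwrapping of private data, so implementors never see a raw pointer.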

The device numbers stuff is probably just some types with imperative methods, although I haven't dug in deeply

Switch back to panic=unwind

I think we need to unwind when we panic so we appropriately drop things, poison mutexes, etc. I don't really know how to integrate with the kernel's unwinder: the challenge is we need to unwind every Rust function on the stack, even if there's C code in between, but not unwind past the end of the stack. (I'm also not sure the kernel has a meaningful unwinder.) We'll need to define an appropriate / meaningful eh_personality lang item, see comments in src/libpanic_unwind/gcc.rs for details and an example.

Or, I can be convinced that not dropping things and leaving mutexes locked is sound. (Seems suboptimal though)

Run hello-world through tests

It uses String which none of the others do, and it has its own makefile. We should have coverage for it since it's the user-facing example.

Switch filesystem vtable approach to look more like the file operations vtable

make an example that uses lazy_static!

https://github.com/rust-lang-nursery/lazy-static.rs

I'd expect it to work, I just want to test it (and keep CI'ing it)

Upstream support for building rust modules to mainline kernel

Kees thinks this is a good next step -- what we can do is upstream support for building rust with an arbitrary target file + xbuild, but not include any of our code (including our build.rs).

We think we want to support something that looks roughly like:

obj-m += helloworld.o
helloworld-crates := Cargo.toml

all:
        $(MAKE) -C $(KDIR) M=$(CURDIR)

clean:
        $(MAKE) -C $(KDIR) M=$(CURDIR) clean

Come up with a strategy for writing automated tests

Are there things in libstd that we want?

While most of std::collections is reexported from liballoc (see #37), std::collections::HashMap and std::collections::HashSet are not. As far as I can tell, this is because they use std::hashmap_random_keys(), which returns a 128-bit random value. We can do that in kernelspace pretty easily.

Can we manage to export a HashMap and HashSet that people can use? (Will this involve an automated filter-branch of upstream libstd, or is there a better way to do that?)

Also in libstd is std::sync::RwLock and other things in std::sync, which uses OS native sync functionality (e.g. libc::pthread_rwlock_rdlock on UNIX). Can we make those work? (That's probably a whole subproject on its own re kernel sync primitives, but once we have that, should we try to wire up libstd's RwLock implementation?)

Anything else? std::time? The types (but not functions) from std::net?

bindgen derives Debug on a packed struct

ubuntu@ip-172-16-0-28:~/linux-kernel-module-rust/hello-world$ RUST_TARGET_PATH=$(pwd)/.. cargo xbuild --target x86_64-linux-kernel-module
   Compiling linux-kernel-module v0.1.0 (file:///home/ubuntu/linux-kernel-module-rust)
warning: #[derive] can't be used on a non-Copy #[repr(packed)] struct (error E0133)
    --> /home/ubuntu/linux-kernel-module-rust/hello-world/target/x86_64-linux-kernel-module/debug/build/linux-kernel-module-ba51b8057957ede2/out/bindings.rs:2455:10
     |
2455 | #[derive(Debug, Default)]
     |          ^^^^^
     |
     = note: #[warn(safe_packed_borrows)] on by default
     = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
     = note: for more information, see issue #46043 <https://github.com/rust-lang/rust/issues/46043>

tl;dr: if packing a struct causes the field to get misaligned, accesses are unsafe, and the derived implementation of Debug involves accessing these fields.

The struct in question is desc_struct (which is the type of a member of thread_struct, which we need).

Do we care? Also I think this is a bindgen bug and we can't do anything about it?

Set up CI, even if all it does is build

Investigate LLVM C backend

https://github.com/JuliaComputing/llvm-cbe

Looks like it's being actively developed. If it works enough that we can load a module on x86_64 by generating C and having gcc compile that, then that could be our answer for architectures LLVM doesn't support. If so we should get it into CI to make sure it doesn't regress (and document how to do it).

modinfo needs a const char [], not a const char *

We use this (from src/lib.rs) to set modinfo:

#[link_section = ".modinfo"]
#[allow(non_upper_case_globals)]
// TODO: Generate a name the same way the kernel's `__MODULE_INFO` does.
// TODO: This needs to be a `[u8; _]`, since the kernel defines this as a  `const char []`.
// See https://github.com/rust-lang/rfcs/pull/2545
pub static $name: &'static [u8] = concat!(stringify!($name), "=", $value, '\0').as_bytes();

This (likely) stores a pointer to a string in the modinfo section instead of storing the string directly.
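
The difference is easy to see in userspace by comparing what each kind of static actually occupies. This is only a sketch: the key/value string is made up, the array length is written by hand (which is the part a macro cannot easily compute), and the `#[link_section = ".modinfo"]` attribute is left out so the example builds anywhere.

```rust
use std::mem::size_of_val;

/// What the current macro effectively produces: the static holds a
/// (pointer, length) pair, and the bytes live somewhere else.
pub static AUTHOR_AS_SLICE: &[u8] = b"author=Example\0";

/// What the kernel expects: the static is the bytes themselves.
/// In a module this would also carry #[link_section = ".modinfo"].
pub static AUTHOR_AS_ARRAY: [u8; 15] = *b"author=Example\0";

/// Sizes of the two statics as stored: (slice reference, array).
pub fn stored_sizes() -> (usize, usize) {
    (size_of_val(&AUTHOR_AS_SLICE), size_of_val(&AUTHOR_AS_ARRAY))
}
```

On a 64-bit target the first static is 16 bytes of pointer and length, while the second is the 15 bytes of the string.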

Write and publish docs

We should have docs for at least all public APIs, and published rustdocs linked from the README.

It's not super straightforward to use this repo with crates.io (you need a local checkout for getting to the target JSON file, see #1). docs.rs automatically publishes crates on crates.io; I'm not sure if we want to publish the crate for the sake of using that (+ getting into things like crater runs?), or just self-host docs. (I run enough websites that I'm not going to be sad about self-hosting docs somewhere....)

Come up with a shorter idiom for handling errors in C calls

Right now error handling for C functions is fairly verbose:

let res = unsafe { bindings::call() };
if res != 0 {
    return Err(error::Error::from_kernel_errno(res));
}

I think we want a try!()-style macro for checking the return value and propagating it upwards. The one challenge is that not all kernel functions share a return value convention: for some != 0 means error, for others < 0 means error, and I'm sure there's some third convention.
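
One way this could look without a macro is a small helper per convention, combined with the `?` operator. The `Error` type below is a stand-in for the crate's error::Error, `fake_call` stands in for a bindings call, and the helper names are made up.

```rust
use std::os::raw::c_int;

/// Stand-in for the crate's error type.
#[derive(Debug, PartialEq, Eq)]
pub struct Error(c_int);

impl Error {
    pub fn from_kernel_errno(errno: c_int) -> Self { Error(errno) }
}

pub type KernelResult<T> = Result<T, Error>;

/// Convention 1: any nonzero return value is an error.
pub fn check_nonzero(res: c_int) -> KernelResult<()> {
    if res != 0 { Err(Error::from_kernel_errno(res)) } else { Ok(()) }
}

/// Convention 2: negative is an error, anything else is a useful value.
pub fn check_negative(res: c_int) -> KernelResult<c_int> {
    if res < 0 { Err(Error::from_kernel_errno(res)) } else { Ok(res) }
}

/// Stand-in for an FFI call; a real one would be `unsafe { bindings::call() }`.
fn fake_call(ret: c_int) -> c_int { ret }

/// Example caller: each C call collapses to one line.
pub fn init(ret: c_int) -> KernelResult<c_int> {
    check_nonzero(fake_call(0))?;
    let n = check_negative(fake_call(ret))?;
    Ok(n)
}
```

Naming the convention at each call site is a little more typing than a single macro, but it makes the choice visible to reviewers.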

File a bug with Rust/Cargo requesting the ability to ship a target.json in a crate

Move run_tests.py to some sort of real build system

It is awful and only getting worse.

[Suggestion] An organization for the Rust-in-Linux effort

@alex asked about the requirements for eventually merging something into the kernel in alex/linux#3 (comment)

While there are advantages of developing within the kernel (e.g. nowadays we have staging), this would be a treewide effort, not to mention multi-repo. To submit any changes to the kernel eventually, ideally everything required should be in stable Rust and a clean patch series is prepared for submission where everyone can discuss it in the LKML.

Therefore, in order to prepare everything, it would be nice to have something like the ClangBuiltLinux organization for all the Rust-in-Linux (Rust-for-Linux?) related things:

The Linux fork with the framework and any other changes required (e.g. like those @alex started), plus a branch with an example driver, etc.
A rustc fork (and any others) with all the features needed to build the Linux fork until they are in the official repo(s) (e.g. cargo xbuild as @joshtriplett mentioned)
The CI setup
etc.

Figure out how to make std work

There are a couple of annoying things about #![no_std], including:

Cargo doesn't handle having different flags on a crate that's used as both a build-dependency and a runtime dependency (rust-lang/cargo#5730, rust-osdev/cargo-xbuild#10). This means that none of the dependencies of bindgen that are optionally-no-std are usable. Notably, this includes byteorder, which is both useful in its own right and an indirect dependency of serde-json-core.
Most no-std crates interpret that to mean no-alloc either, although allocation works totally fine for us. Regular serde-json is capable of handling dynamic structures (e.g., deserializing a Vec); serde-json-core is not.
While in theory you could add a feature flag serde-json for no-std but yes-alloc (serde-rs/json#362), in practice it seems all of its serialization uses the std::io::Write trait and the impl Write for Vec<u8> implementation, so it would be a sizable rewrite to make the crate functional without std.
As noted in #38 and #16, there are other things we likely want from libstd, including hash maps, hash sets, and std::ffi.

I think we should try to port over libstd, by making all the filesystem, subprocess, etc. operations just unconditionally return Err. (Bonus points if we can raise compile-time errors with cfg but that seems like a bigger delta against upstream libstd than I'd like.)

I'm not sure how to do that, though - xargo would have been the obvious answer in the past. I asked over in the rust-osdev Gitter if anyone has recommendations. And as #38 mentions, we'll probably want a bot to automatically rebase our version of it, until it gets upstreamed. (But I suspect this is a reasonable thing to want to upstream and other projects will want it?)

usercopy handling and KERNEL_DS

This project currently seems to assume that a pointer checked with access_ok() always points to userspace; however, thanks to the set_fs(KERNEL_DS) mechanism, that's not necessarily true.

access_ok() doesn't actually always check that the given pointer points to userspace. If you want real kernel memory safety around access_ok(), you'll have to split userspace access into two types:

One that passes userspace pointers wrapped in opaque structs that can't be directly accessed from safe code, with helpers for safe slicing and such, just like for kernel buffers. This variant would also need to have object lifetime checking such that a handle to a "userspace" memory region can't be persisted beyond the end of the current function. (I don't know much about rust, but this seems like something rust probably provides?)
One that accepts raw userspace pointers, but does a stronger check than access_ok() (comparison to TASK_SIZE_MAX or so). (Of course this would still permit corrupting userspace memory, but I assume that's acceptable.)
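
A rough sketch of how the two ideas could combine: an opaque handle whose constructor does a fixed-limit check that does not depend on the current address limit. The `TASK_SIZE_MAX` value is an assumed x86-64-style constant chosen for illustration, the `Fault` type and the `split_at` helper are invented, and no actual copying is shown.

```rust
/// Assumed upper bound of userspace addresses, for illustration only;
/// the real limit is architecture-specific.
pub const TASK_SIZE_MAX: u64 = 0x0000_7fff_ffff_f000;

/// Stand-in error type for a rejected user pointer.
#[derive(Debug, PartialEq, Eq)]
pub struct Fault;

/// Opaque handle to a userspace range; safe code never sees the raw address.
pub struct UserSlicePtr {
    addr: u64,
    len: u64,
}

impl UserSlicePtr {
    /// Check against a fixed limit, so the result does not change when
    /// set_fs(KERNEL_DS) has relaxed access_ok().
    pub fn new(addr: u64, len: u64) -> Result<Self, Fault> {
        match addr.checked_add(len) {
            Some(end) if end <= TASK_SIZE_MAX => Ok(UserSlicePtr { addr, len }),
            _ => Err(Fault),
        }
    }

    pub fn len(&self) -> u64 { self.len }

    /// Safe slicing helper: both halves stay inside the validated range.
    pub fn split_at(self, mid: u64) -> Option<(UserSlicePtr, UserSlicePtr)> {
        if mid > self.len {
            return None;
        }
        Some((
            UserSlicePtr { addr: self.addr, len: mid },
            UserSlicePtr { addr: self.addr + mid, len: self.len - mid },
        ))
    }
}
```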

Longer explanation of the problem:

Linux has various pieces of kernel code that normally operate on userspace pointers, and are optimized for this case - for example, filesystem read and write handlers (->f_op->read(), ->f_op->write(), ...). These handlers take userspace pointers (annotated with __user and, somewhat redundantly, accessed through helpers that check for access_ok()). However, sometimes the kernel needs to directly call into such APIs, operating on kernel buffers. In current kernel versions, this is in particular true for the sys_splice() syscall, which can move data directly between a pipe and a file without going through userspace memory. (There have been discussions about cleaning this up to make it less hazardous, but that hasn't happened yet.)

Normally, the access_ok() checks performed by filesystem code would prevent this from working - after all, these checks are explicitly designed to block the use of kernel pointers. To work around this, the kernel has a function set_fs() that can be used to temporarily override the address limit used for access_ok() checks so that all pointers are accepted. Therefore, if your code can run in a context where address limit checks have been disabled, UserSlicePtr could AFAICS be used to access arbitrary kernel pointers.

Kernel code that does this looks like this (example from fs/read_write.c):

ssize_t kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
{
	mm_segment_t old_fs;
	ssize_t result;

	old_fs = get_fs();
	set_fs(get_ds());
	/* The cast to a user pointer is valid due to the set_fs() */
	result = vfs_read(file, (void __user *)buf, count, pos);
	set_fs(old_fs);
	return result;
}

There were two recent LWN articles related to this issue:

Examples of kernel bugs caused by this behavior:

ARM/ARM64 arbitrary memory read via access_ok() in a perf subsystem interrupt, which could run in KERNEL_DS context: https://bugs.chromium.org/p/project-zero/issues/detail?id=822
Infiniband subsystem arbitrary kernel write via copy_to_user() in f_op->write() context: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e6bd18f57aad1a2d1ef40e646d03ed0f2515c9e3
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=31e33a5b41bb158f27c30e13b12d6e5e6513ea05
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=60e6627f12a78203a093ca05b7bca15627747d81
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=15279df6f26cf2013d713904b4a0c957ae8abb96
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a0341fc1981a950c1e902ab901e98f60e0e243f3
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f1e255d60ae66a9f672ff9a207ee6cd8e33d2679
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=26b5b874aff5659a7e26e5b1997e3df2c41fa7fd

[info] ClangBuiltLinux

Hello! Neat project, a lot of folks in Android are keeping an eye on it. If you run into any issues with Clang or LLVM for the Linux kernel (or need help with anything else roughly related), please let us know: https://github.com/ClangBuiltLinux/linux/issues.

Use stable Rust (tracking ticket for unstable features we use)

More ambitious than #37, and won't be possible for a long while yet. Here's what we use currently:

#![feature(alloc)]: to be able to use liballoc (Box, String, Vec, etc.). rust-lang/rust#27783 I think, though I don't see any recent progress about liballoc specifically. (But as noted in #37 most things we need are re-exported through libstd, so maybe we can advocate for pushing the unstable attribute down to the things that aren't re-exported?)
#![feature(global_allocator)]: to be able to define an allocator backed by kmalloc. PR rust-lang/rust#51241 from today stabilizes global_allocator and I think all the things under allocator_api that we use. (Also, as part of its work, it pushes down some of liballoc's wide instability to particular modules; we might want to advocate to do more of that to stabilize the re-exported parts of liballoc to solve the previous one.)
#![feature(allocator_api)]: See rust-lang/rust#51540
#![feature(lang_items)] for #[lang = "oom"]: See rust-lang/rust#51540
#![feature(use_extern_macros)]: I think we don't need it, #40.
#![feature(panic_implementation)]: for #[panic_implementation]
#![feature(alloc_error_handler)]: See either rust-lang/rust#51540 or rust-lang/rfcs#2492.
#![feature(const_str_as_bytes)]: See rust-lang/rust#63770

make C strings a little more ergonomic

See if we can get std::ffi::CString and std::ffi::CStr to work; they're actually in libstd, but I think CString only uses alloc/containers and CStr doesn't even need that. (And decide if we want it?)
Find something better than b"foo\0" for compile-time static C strings. Options include https://github.com/mzabaluev/rust-c-str and https://github.com/abonander/const-cstr (both of which use std::ffi::CStr) or just coding it ourselves as suggested in rust-lang/rfcs#400 (comment). If we write it ourselves I might argue for calling it c! to keep it easy to read/write.
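
For the "code it ourselves" option, a minimal sketch of what a c! macro could look like is below. It assumes std::ffi::CStr is usable (which is exactly what the first item asks about), so it is written against std here rather than against this crate. Only the NUL terminator is appended at compile time; the interior-NUL check in this version still runs when the expression is evaluated.

```rust
use std::ffi::CStr;

// Hypothetical c! macro: appends the NUL terminator with concat!, so callers
// write c!("foo") instead of b"foo\0".
macro_rules! c {
    ($s:expr) => {
        CStr::from_bytes_with_nul(concat!($s, "\0").as_bytes())
            .expect("C string literal must not contain interior NUL bytes")
    };
}

fn main() {
    let name: &CStr = c!("foo");
    // The resulting CStr carries exactly one trailing NUL.
    assert_eq!(name.to_bytes_with_nul(), &b"foo\0"[..]);
    assert_eq!(name.to_bytes(), &b"foo"[..]);
    // name.as_ptr() is what would be handed to a C API expecting `const char *`.
    let _ptr = name.as_ptr();
}
```

Because the argument has to be a string literal for concat! to accept it, a forgotten terminator is no longer possible, which is the main ergonomic win over b"foo\0".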

Build fails on v5.0+ against kernels built with gcc

Since torvalds/linux@e9666d1, detection for asm goto (which gcc supports and clang does not) got moved from a check at each build to a check in the kernel config, so a kernel built with gcc will have CONFIG_CC_HAS_ASM_GOTO, which breaks the build with clang / breaks bindgen's ability to parse the code with clang. (I haven't tested it, but I expect a kernel built with clang to work fine.)

See discussion in lizhuohua/linux-kernel-module-rust#1.

Support things that would be required for binderfs

Allocate chrdev region with specific number of minors but not register devices (https://github.com/torvalds/linux/blob/v5.0/drivers/android/binderfs.c#L558-L559)
Expand filesystem support (https://github.com/torvalds/linux/blob/v5.0/drivers/android/binderfs.c#L546-L551)
Support inode_operations (https://github.com/torvalds/linux/blob/v5.0/drivers/android/binderfs.c#L459-L463)
Support super_operations (https://github.com/torvalds/linux/blob/v5.0/drivers/android/binderfs.c#L341-L346)
Support nonseekable_open on FileOperations (https://github.com/torvalds/linux/blob/v5.0/drivers/android/binderfs.c#L375)
Support ioctl on FileOperations (https://github.com/torvalds/linux/blob/v5.0/drivers/android/binderfs.c#L376-L377)
Support llseek as a noop on FileOperations (https://github.com/torvalds/linux/blob/v5.0/drivers/android/binderfs.c#L378)

Build on the latest working nightly, not the latest nightly

From the comments in #115 - clippy and rustfmt don't always exist on the latest nightly for whatever reason, which breaks our CI. We can use the API described at https://github.com/rust-lang/rustup-components-history to programmatically find the last date that cargo + clippy + rust + rust-src + rustfmt all existed on our arch(es), and build against that particular nightly.
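
The selection step could be sketched roughly as follows. The data shape is an assumption: it supposes the availability information has already been fetched and turned into a map from nightly date to the set of components present that day, which is not necessarily the format the rustup-components-history API returns. `latest_working_nightly` and `components` are hypothetical helper names.

```rust
use std::collections::{BTreeMap, HashSet};

// Returns the most recent date on which every required component was present.
// Dates are ISO-8601 strings, so BTreeMap's ordering is chronological.
fn latest_working_nightly<'a>(
    availability: &'a BTreeMap<String, HashSet<String>>,
    required: &[&str],
) -> Option<&'a str> {
    availability
        .iter()
        .rev() // newest date first
        .find(|(_, comps)| required.iter().all(|c| comps.contains(*c)))
        .map(|(date, _)| date.as_str())
}

// Small helper to build a component set for the example data.
fn components(names: &[&str]) -> HashSet<String> {
    names.iter().map(|n| n.to_string()).collect()
}

fn main() {
    let required = ["cargo", "clippy", "rust", "rust-src", "rustfmt"];

    // Made-up availability data for illustration only.
    let mut availability = BTreeMap::new();
    availability.insert("2019-06-01".to_string(), components(&required));
    availability.insert(
        "2019-06-02".to_string(),
        components(&["cargo", "rust", "rust-src"]), // clippy and rustfmt missing
    );

    // The newest nightly is skipped because it lacks two components.
    assert_eq!(
        latest_working_nightly(&availability, &required),
        Some("2019-06-01")
    );
}
```

CI would then install that dated toolchain (for example nightly-2019-06-01) instead of plain nightly.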

Replace most (if not all) of the author fields with FiaB

This way it accurately reflects all the contributors we have/will have.

Print error message in #[panic_implementation]

It should be

#[lang = "panic_fmt"]
extern fn panic_fmt(msg: fmt::Arguments, file: &'static str, line: u32, col: u32) -> !;

and possibly also an #[unwind] or #[unwind(allowed)] in future definitions of Rust. See libcore's definition and longer discussion in #19 (comment)
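
As a rough illustration of what the handler could print once it receives those four arguments, here is a plain std sketch of the formatting step only. `render_panic` is a hypothetical helper, and the message layout is an assumption; in the module the resulting text would go to the kernel log rather than into a String, and the handler itself would never return.

```rust
use std::fmt;

// Builds the text a panic handler with the signature above could emit.
// Takes the same pieces of information: message, file, line and column.
fn render_panic(msg: fmt::Arguments, file: &str, line: u32, col: u32) -> String {
    format!("Rust panic at {}:{}:{}: {}", file, line, col, msg)
}

fn main() {
    let text = render_panic(
        format_args!("index {} out of range", 7),
        "src/lib.rs",
        10,
        5,
    );
    assert_eq!(text, "Rust panic at src/lib.rs:10:5: index 7 out of range");
    println!("{}", text);
}
```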

Decide what to do about `malloc()` in filesystem::register

Upside: it works cleanly and has a nice API

Downside: an extra malloc, and having the file_system_type in heap memory means it's writable, which means the function pointers in it can be targets for control-flow hijacking.

Possible solutions are: using a macro to convert the trait implementer to a static, mapping the memory read-only, or deciding it's all fine since our rust modules won't be full of exploitable vulnerabilities (<--- probably not this).
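
To make the "static instead of heap" option concrete, here is a userspace sketch. `FileSystemType`, `example_mount`, `EXAMPLE_FS` and `register` are all hypothetical stand-ins, not the real bindings, and the sketch ignores any fields of the real file_system_type that the kernel itself may need to write after registration.

```rust
// Hypothetical, simplified stand-in for a table of filesystem callbacks.
struct FileSystemType {
    name: &'static str,
    mount: fn(&str) -> i32,
}

fn example_mount(_dev_name: &str) -> i32 {
    0 // pretend the mount succeeded
}

// An immutable static: no allocation, and the function pointer is not in
// writable heap memory. A macro could generate this from a trait implementer.
static EXAMPLE_FS: FileSystemType = FileSystemType {
    name: "examplefs",
    mount: example_mount,
};

// Registration borrows the static for 'static instead of boxing a new value.
fn register(fs: &'static FileSystemType) -> i32 {
    (fs.mount)("none")
}

fn main() {
    assert_eq!(EXAMPLE_FS.name, "examplefs");
    assert_eq!(register(&EXAMPLE_FS), 0);
}
```

The cost is a less pleasant API: the caller has to name a static (or invoke a macro that declares one) rather than simply calling register with a type parameter.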

usercopy uses u64 on 4.15.0-1007-aws

On the Ubuntu 18.04 EC2 box (which I still haven't shut down oops):

error[E0308]: mismatched types
   --> /home/ubuntu/t2/src/user_ptr.rs:103:17
    |
103 |                 data.len() as u32,   
    |                 ^^^^^^^^^^^^^^^^^ expected u64, found u32
help: you can cast an `u32` to `u64`, which will zero-extend the source value
    |
103 |                 (data.len() as u32).into(),
    |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^

error[E0308]: mismatched types
   --> /home/ubuntu/t2/src/user_ptr.rs:129:17
    |
129 |                 data.len() as u32,   
    |                 ^^^^^^^^^^^^^^^^^ expected u64, found u32
help: you can cast an `u32` to `u64`, which will zero-extend the source value
    |
129 |                 (data.len() as u32).into(),
    |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^

error: aborting due to 2 previous errors

Changing it to u64 fixes it.

Probably something switched to `size_t` or whatever?
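
If the parameter type really does differ between kernel versions, one way to avoid hard-coding u32 or u64 is to let the cast infer its target from the binding. The sketch below assumes the generated signature uses a C integer type such as c_ulong; `fake_copy_to_user` and `write_out` are hypothetical stand-ins for the bindgen-generated function and the code in user_ptr.rs, and the actual cause of the mismatch has not been confirmed.

```rust
use std::os::raw::c_ulong;

// Hypothetical stand-in for a bindgen-generated binding. Its length type is
// whatever bindgen emits for the running kernel's headers.
unsafe fn fake_copy_to_user(to: *mut u8, from: *const u8, n: c_ulong) -> c_ulong {
    unsafe { std::ptr::copy_nonoverlapping(from, to, n as usize) };
    0 // number of bytes that could NOT be copied
}

fn write_out(dest: &mut [u8], data: &[u8]) -> Result<(), ()> {
    if data.len() > dest.len() {
        return Err(());
    }
    // `as _` lets the compiler pick the parameter's type, so this line does
    // not change whether the binding takes u32 or u64.
    let not_copied =
        unsafe { fake_copy_to_user(dest.as_mut_ptr(), data.as_ptr(), data.len() as _) };
    if not_copied == 0 {
        Ok(())
    } else {
        Err(())
    }
}

fn main() {
    let mut dest = [0u8; 8];
    assert_eq!(write_out(&mut dest, b"hello"), Ok(()));
    assert_eq!(&dest[..5], &b"hello"[..]);
    // A source larger than the destination is rejected before any copy.
    assert_eq!(write_out(&mut dest, &[0u8; 16]), Err(()));
}
```

An inferred cast can silently truncate on a narrower type, so a checked conversion with TryFrom may be preferable wherever lengths can exceed the 32-bit range.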

Build fails on v5.2+

Since torvalds/linux@cdd750b, the __c_flags variable used by kernel-cflags-helper is gone, and we should just use _c_flags. Probably the right thing to do is use __c_flags if it exists and fall back to _c_flags, for correctness with older kernels. (You can also do version comparisons with like, $(shell expr) or something but that sounds like a pain.)

Maybe it's worth seeing if there's a cleaner way of doing all this.

Originally reported in lizhuohua/linux-kernel-module-rust#1.