libkrun's Introduction

libkrun

libkrun is a dynamic library that allows programs to easily acquire the ability to run processes in a partially isolated environment using KVM Virtualization.

It integrates a VMM (Virtual Machine Monitor, the userspace side of a hypervisor) with the minimum number of emulated devices required for its purpose, abstracting away most of the complexity of virtual machine management and offering users a simple C API.

Possible use cases

  • Adding VM-isolation capabilities to an OCI runtime.
  • Implementing a lightweight jailer for serverless workloads.
  • Bringing additional self-isolation capabilities to conventional services (think of something as simple as chroot, but more powerful).

Goals and non-goals

Goals

  • Enable other projects to easily gain KVM-based process isolation capabilities.
  • Be self-sufficient (no need to call an external VMM) and very simple to use.
  • Be as small as possible, implementing only the features required to achieve its goals.
  • Have the smallest possible footprint in every aspect (RAM consumption, CPU usage and boot time).
  • Be compatible with a reasonable number of workloads.

Non-goals

  • Become a generic VMM.
  • Be compatible with all kinds of workloads.

Variants

This project provides two different variants of the library:

  • libkrun: Generic variant compatible with all Virtualization-capable systems.
  • libkrun-sev: Variant including support for AMD SEV (bare SEV and SEV-ES) memory encryption and remote attestation. Requires an SEV-capable CPU.

Each variant generates a dynamic library with a different name (and soname), so both can be installed at the same time in the same system.

Virtio device support

All variants

  • virtio-console
  • virtio-vsock (specialized for TSI, Transparent Socket Impersonation)

libkrun

  • virtio-fs
  • virtio-balloon (only free-page reporting)
  • virtio-rng

libkrun-sev

  • virtio-block

Networking

In libkrun, networking is implemented using a novel technique called Transparent Socket Impersonation, or TSI. This allows the VM to have network connectivity without a virtual interface (hence, virtio-net is not among the list of supported devices).

This technique supports both outgoing and incoming connections. Userspace applications running in the VM can transparently connect to endpoints outside the VM, and can also receive connections from the outside on ports listening inside the VM.

Limitations

TSI only supports impersonating AF_INET SOCK_DGRAM and SOCK_STREAM sockets. This implies it's not possible to communicate outside the VM with raw sockets.
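To make the limitation concrete, here is a minimal sketch of a check for whether a socket's domain and type fall within the combinations listed above. The helper name tsi_can_impersonate is hypothetical and not part of libkrun's API; it only encodes the rule as stated in this section.

```c
#include <stdbool.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of libkrun): returns true if a socket with
 * this domain and type matches what the text above says TSI can impersonate,
 * i.e. AF_INET combined with SOCK_STREAM or SOCK_DGRAM.
 */
bool tsi_can_impersonate(int domain, int type)
{
    if (domain != AF_INET)
        return false;
    return type == SOCK_STREAM || type == SOCK_DGRAM;
}
```

With this rule, tsi_can_impersonate(AF_INET, SOCK_RAW) is false, which is the raw-socket case mentioned above.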

Building and installing

Linux (generic variant)

Requirements

  • libkrunfw
  • A working Rust toolchain
  • C Library static libraries, as the init binary is statically linked (package glibc-static in Fedora)
  • patchelf

Compiling

make

Installing

sudo make install

Linux (SEV variant)

Requirements

  • The SEV variant of libkrunfw, which provides a libkrunfw-sev.so library.
  • A working Rust toolchain
  • C Library static libraries, as the init binary is statically linked (package glibc-static in Fedora)
  • patchelf
  • OpenSSL headers and libraries (package openssl-devel in Fedora).

Compiling

make SEV=1

Installing

sudo make SEV=1 install

macOS

Requirements

As part of the libkrun build process, it's necessary to produce a Linux ELF binary from init/init.c. The easiest way to do this is by using a binary version of krunvm and its dependencies (libkrunfw, and libkrun itself), such as the one available in the krunvm Homebrew repo, and then executing the build_on_krunvm.sh script found in this repository.

This will create a lightweight Linux VM using krunvm with the current working directory mapped inside it, and produce the Linux ELF binary from init/init.c.

Building the library using krunvm

./build_on_krunvm.sh
make

Using the library

Despite being written in Rust, this library provides a simple C API, defined in include/libkrun.h.

Examples

chroot_vm

This is a simple example providing chroot-like functionality using libkrun.

Building chroot_vm

cd examples
make

Running chroot_vm

To be able to run chroot_vm, you first need a directory to act as the root filesystem for your isolated program.

Use the rootfs target to get a rootfs prepared from the Fedora container image (note: you must have podman installed):

make rootfs

Now you can use chroot_vm to run a process within this new root filesystem:

./chroot_vm ./rootfs_fedora /bin/sh

If the libkrun and/or libkrunfw libraries were installed on a path that's not included in your /etc/ld.so.conf configuration, you may get an error like this one:

./chroot_vm: error while loading shared libraries: libkrun.so: cannot open shared object file: No such file or directory

To avoid this problem, use the LD_LIBRARY_PATH environment variable to point to the location where the libraries were installed. For example, if the libraries were installed in /usr/local/lib64, use something like this:

LD_LIBRARY_PATH=/usr/local/lib64 ./chroot_vm ./rootfs_fedora /bin/sh

Status

libkrun has achieved maturity and, starting with version 1.0.0, the public API is guaranteed to be stable, following SemVer.

Known users

  • crun: An OCI runtime that can make use of libkrun to run containers with Virtualization-based isolation.
  • krunvm: A CLI tool for creating and running microVMs based on OCI images.

Getting in contact

The main communication channel is the VirTEE Matrix channel.

Acknowledgments

libkrun incorporates code from Firecracker, rust-vmm and Cloud-Hypervisor.

libkrun's People

Contributors

arkkors, asahilina, blenessy, djs55, dm0-, flouthoc, germag, giuseppe, mtjhrc, rwmjones, sandrobonazzola, slp, stefano-garzarella, teohhanhui, tylerfanelli, wainersm

libkrun's Issues

The VM crashes if the root directory doesn't exist

If I configure krun with an invalid root directory (krun_set_root()) and start the VM, it simply crashes. Shouldn't it fail gracefully instead?

It can be reproduced with the chroot_vm program:

$ ./chroot_vm rootfs_not_exist_dir /bin/sh
[    0.038923] Kernel panic - not syncing: Requested init /init.krun failed (error -2).
[    0.039014] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.10 #1
[    0.039044] Call Trace:
[    0.039062]  show_stack+0x3d/0x3f
[    0.039124]  dump_stack+0x5e/0x74
[    0.039140]  ? rest_init+0xa0/0xa5
[    0.039159]  panic+0xf6/0x2a4
[    0.039178]  ? kernel_execve+0x145/0x1b0
[    0.039198]  ? rest_init+0xa5/0xa5
[    0.039220]  kernel_init+0xa5/0xfb
[    0.039238]  ret_from_fork+0x1f/0x30
[    0.039273] Kernel Offset: disabled
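Until the library rejects such a path itself, a caller can guard against this case before handing the directory to krun_set_root(). The following is a minimal sketch of such a pre-flight check; root_dir_exists is a hypothetical helper name, and it only verifies that the path is an existing directory, not that its contents are usable as a root filesystem.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>

/*
 * Hypothetical pre-flight check (not part of libkrun): returns true only if
 * path names an existing directory. On failure of stat(), errno is left set
 * so the caller can report why the path was rejected.
 */
bool root_dir_exists(const char *path)
{
    struct stat st;

    if (path == NULL || stat(path, &st) != 0)
        return false;
    return S_ISDIR(st.st_mode);
}
```

A program like chroot_vm could call this on its first argument and exit with an error message instead of starting the VM.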

krun_set_exec: No changes when using API

In examples/launch-tee.c, I add the following:

char *envp[3];
char *x ="KRUN_ATTESTATION_URL=test_url";
char *y = "KRUN_WORKLOAD_ID=test_wid";

envp[0] = x;
envp[1] = y;
envp[2] = NULL;

/*
 * exec path = "/init"
 * argv = NULL
 * envp = envp;
 */
if (err = krun_set_exec(ctx_id, "/init", NULL, envp)) {
    errno = -err;
    perror("Error setting exec(2) parameters");
    return -1;
}

In src/libkrun/src/lib.rs, we can print our envp to be used:

envp to be used: "KRUN_ATTESTATION_URL=url" "KRUN_WORKLOAD_ID=test"

Yet, when initializing a guest VM and reading each environment variable, only these two are found:


envp[0]: HOME=/
envp[1]: TERM=linux

Neither KRUN_ATTESTATION_URL nor KRUN_WORKLOAD_ID is available.

Updating vendored dependencies breaks the build

Hi @slp! :-) Just as a foreword: I'm not that much into Rust myself yet, so please forgive any inaccuracies in my explanation of the problem.

That said, in openSUSE we automatically update the vendored dependencies of Rust packages (I think!). Doing that for libkrun causes build errors.

I believe you should be able to see what happens via this link:
https://build.opensuse.org/build/home:firstyear:branches:Virtualization/openSUSE_Tumbleweed/x86_64/libkrun/_log

In case you can't, or the log disappears, here is what I think is the most relevant part:

[   36s] + cd libkrun-0.1.7
[   36s] + /usr/bin/xz -dc /home/abuild/rpmbuild/SOURCES/vendor.tar.xz
[   36s] + /usr/bin/tar -xof -
[   39s] + STATUS=0
[   39s] + '[' 0 -ne 0 ']'
[   39s] + /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w .
[   39s] + mkdir .cargo
[   39s] + cp /home/abuild/rpmbuild/SOURCES/cargo_config .cargo/config
[   39s] + RPM_EC=0
[   39s] ++ jobs -p
[   39s] + exit 0
[   39s] Executing(%build): /usr/bin/bash -e /var/tmp/rpm-tmp.V4UMJQ
[   39s] + umask 022
[   39s] + cd /home/abuild/rpmbuild/BUILD
[   39s] + /usr/bin/rm -rf /home/abuild/rpmbuild/BUILDROOT/libkrun-0.1.7-14.1.x86_64
[   39s] ++ dirname /home/abuild/rpmbuild/BUILDROOT/libkrun-0.1.7-14.1.x86_64
[   39s] + /usr/bin/mkdir -p /home/abuild/rpmbuild/BUILDROOT
[   39s] + /usr/bin/mkdir /home/abuild/rpmbuild/BUILDROOT/libkrun-0.1.7-14.1.x86_64
[   39s] + cd libkrun-0.1.7
[   39s] + export RUSTFLAGS=-Clink-arg=-Wl,-z,relro,-z,now
[   39s] + RUSTFLAGS=-Clink-arg=-Wl,-z,relro,-z,now
[   39s] + /usr/bin/make -O -j8 V=1 VERBOSE=1
[   40s] gcc -O2 -static -Wall -o init/init init/init.c
[   53s] cargo build --release
[   53s]    Compiling libc v0.2.126
[   53s]    Compiling bitflags v1.3.2
[   53s]    Compiling cfg-if v1.0.0
[   53s]    Compiling version_check v0.9.4
[   53s]    Compiling kvm-ioctls v0.11.0
[   53s]    Compiling log v0.4.17
[   53s]    Compiling once_cell v1.12.0
[   53s]    Compiling lazy_static v1.4.0
[   53s]    Compiling arch_gen v0.1.0 (/home/abuild/rpmbuild/BUILD/libkrun-0.1.7/src/arch_gen)
[   53s]    Compiling cc v1.0.73
[   53s]    Compiling virtio_gen v0.1.0 (/home/abuild/rpmbuild/BUILD/libkrun-0.1.7/src/virtio_gen)
[   53s]    Compiling ahash v0.7.6
[   53s]    Compiling libkrun v0.1.7 (/home/abuild/rpmbuild/BUILD/libkrun-0.1.7/src/libkrun)
[   53s]    Compiling vmm-sys-util v0.9.0
[   53s]    Compiling getrandom v0.2.6
[   53s]    Compiling vm-memory v0.8.0 (https://github.com/rust-vmm/vm-memory#781d300d)
[   53s]    Compiling hashbrown v0.11.2
[   53s]    Compiling utils v0.1.0 (/home/abuild/rpmbuild/BUILD/libkrun-0.1.7/src/utils)
[   53s]    Compiling kvm-bindings v0.5.0
[   53s]    Compiling logger v0.1.0 (/home/abuild/rpmbuild/BUILD/libkrun-0.1.7/src/logger)
[   53s]    Compiling polly v0.0.1 (/home/abuild/rpmbuild/BUILD/libkrun-0.1.7/src/polly)
[   53s] warning: use of deprecated associated function `std::sync::atomic::AtomicUsize::compare_and_swap`: Use `compare_exchange` or `compare_exchange_weak` instead
[   53s]    --> src/logger/src/logger.rs:318:21
[   53s]     |
[   53s] 318 |         match STATE.compare_and_swap(UNINITIALIZED, locked_state, Ordering::SeqCst) {
[   53s]     |                     ^^^^^^^^^^^^^^^^
[   53s]     |
[   53s]     = note: `#[warn(deprecated)]` on by default
[   53s] 
[   53s]    Compiling kernel v0.1.0 (/home/abuild/rpmbuild/BUILD/libkrun-0.1.7/src/kernel)
[   53s]    Compiling hvf v0.1.0 (/home/abuild/rpmbuild/BUILD/libkrun-0.1.7/src/hvf)
[   53s]    Compiling lru v0.6.6
[   53s] warning: `logger` (lib) generated 1 warning
[   53s]    Compiling arch v0.1.0 (/home/abuild/rpmbuild/BUILD/libkrun-0.1.7/src/arch)
[   53s]    Compiling cpuid v0.1.0 (/home/abuild/rpmbuild/BUILD/libkrun-0.1.7/src/cpuid)
[   53s] warning: unnecessary trailing semicolon
[   53s]    --> src/cpuid/src/brand_string.rs:225:10
[   53s]     |
[   53s] 225 |         };
[   53s]     |          ^ help: remove this semicolon
[   53s]     |
[   53s]     = note: `#[warn(redundant_semicolons)]` on by default
[   53s] 
[   53s] error[E0308]: mismatched types
[   53s]    --> src/arch/src/x86_64/msr.rs:221:19
[   53s]     |
[   53s] 221 |     vcpu.set_msrs(&msrs)
[   53s]     |                   ^^^^^ expected struct `vmm_sys_util::fam::FamStructWrapper`, found enum `std::result::Result`
[   53s]     |
[   53s]     = note: expected reference `&vmm_sys_util::fam::FamStructWrapper<kvm_msrs>`
[   53s]                found reference `&std::result::Result<vmm_sys_util::fam::FamStructWrapper<kvm_msrs>, vmm_sys_util::fam::Error>`
[   53s] 
[   53s] error[E0599]: no method named `as_fam_struct_ref` found for enum `std::result::Result` in the current scope
[   53s]    --> src/arch/src/x86_64/msr.rs:224:44
[   53s]     |
[   53s] 224 |             if msrs_written as u32 != msrs.as_fam_struct_ref().nmsrs {
[   53s]     |                                            ^^^^^^^^^^^^^^^^^ method not found in `std::result::Result<vmm_sys_util::fam::FamStructWrapper<kvm_msrs>, vmm_sys_util::fam::Error>`
[   53s] 
[   53s] Some errors have detailed explanations: E0308, E0599.
[   53s] For more information about an error, try `rustc --explain E0308`.
[   53s] error: could not compile `arch` due to 2 previous errors

And our Rust maintainer (hello @Firstyear :-D) believes this is better dealt with here, upstream. What do you think?

Example error InvalidHostAddress

So far so good: I successfully built libkrunfw and libkrun (Asahi Fedora Remix 38).

It is most likely a host configuration problem on my side, but I cannot put my finger on it.
When I try the example, I run into this error:

thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: InvalidHostAddress', src/libkrun/src/lib.rs:228:50
stack backtrace:
0: 0xffff53d83804 -
1: 0xffff53c842c0 -
2: 0xffff53d5bfa4 -
3: 0xffff53d84a50 -
4: 0xffff53d84648 -
5: 0xffff53d855b8 -
6: 0xffff53d850bc -
7: 0xffff53d85030 -
8: 0xffff53d85024 -
9: 0xffff53c52234 -
10: 0xffff53c524fc -
11: 0xffff53c629b4 - krun_create_ctx
12: 0x41022c - main
at /home/bertrand/repo/libkrun/examples/chroot_vm.c:62:14
13: 0xffff53a90598 - __libc_start_call_main
14: 0xffff53a90670 - __libc_start_main_impl
15: 0x410570 - _start
16: 0x0 -
fatal runtime error: failed to initiate panic, error 630731648
Abandon (core dumped)

Non-ASCII/Non-printable environment variables support

Currently libkrun only supports the printable ASCII character range for environment variables. I believe this limitation comes from variables being set as kernel args and passed down to the init process.

While this approach is really simple, it also makes it impossible to run containers with more complex configurations.

I've been blocked by this limitation when trying to run a GitLab Runner with the Docker (or rather Podman) executor. The runner sets multiple environment variables, and some of them fall outside the supported range even if no custom secrets are configured (I believe it might be about newlines in CA files).

While #93 and #94 block me from verifying, it seems likely the issue is purely on the validation side: I've built a version with the valid_char check disabled and successfully started containers with env variables set to random bytes generated using openssl, or to newlines.
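For illustration, the kind of validation described here can be sketched as a per-byte check like the one below. The function name and the exact accepted range (0x20 to 0x7e) are assumptions made for this sketch; the actual valid_char check in libkrun may accept a different set of bytes.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical validator (not libkrun's code): returns true if every byte of
 * s is printable ASCII. The 0x20..0x7e range is an assumption for this sketch.
 */
bool env_is_printable_ascii(const char *s)
{
    if (s == NULL)
        return false;
    for (const unsigned char *p = (const unsigned char *)s; *p != '\0'; p++) {
        if (*p < 0x20 || *p > 0x7e)
            return false;   /* control character or non-ASCII byte */
    }
    return true;
}
```

Under this rule a value containing a newline, such as a PEM-encoded CA certificate, is rejected even though the rest of its bytes are plain ASCII.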

Relay SIGTERM from the VMM to the isolated process

Some signals, such as SIGTERM, should be relayed from libkrun to the isolated process. The biggest question is how to notify the guest kernel. A dedicated device just for this looks like overkill to me. Perhaps we could use an NMI or some virtual exception?
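The host half of such a relay is the easier part. As a rough sketch, assuming the VMM would poll a flag from its event loop, SIGTERM can be captured with sigaction() as shown below. All names here are hypothetical, and the step that actually notifies the guest is deliberately left out, since that is the open question.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>

/* Set from the signal handler; sig_atomic_t keeps the write async-signal-safe. */
static volatile sig_atomic_t sigterm_pending = 0;

static void on_sigterm(int signo)
{
    (void)signo;
    sigterm_pending = 1;
}

/* Installs the SIGTERM handler. Returns 0 on success, -1 on error. */
int install_sigterm_relay(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigterm;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGTERM, &sa, NULL);
}

/* Returns nonzero once SIGTERM has been received and should be forwarded. */
int sigterm_was_requested(void)
{
    return sigterm_pending != 0;
}
```

Whatever mechanism ends up notifying the guest would be triggered where sigterm_was_requested() first returns nonzero.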

ld Unable to find krun_set_data_disk

When running the latest example of examples/launch-tee.c, I receive the following message:

libkrun/examples/launch-tee.c:81: undefined reference to `krun_set_data_disk'

This API was introduced in #102

Will investigate.

EFI variant not building correctly

On aarch64 macOS:

$ make EFI=1
cargo build --release --features efi
    Finished release [optimized] target(s) in 0.03s
cp target/release/libkrun-efi.dylib target/release/libkrun-efi.1.7.2.dylib
cp: target/release/libkrun-efi.dylib: No such file or directory
make: *** [target/release/libkrun-efi.1.7.2.dylib] Error 1

$ ls -l target/release
...
libkrun.dylib
...

It seems that the generated dylib is not being named properly.

When manually copying the file over:

$ cp target/release/libkrun.dylib target/release/libkrun-efi.dylib
$ make
gcc -O2 -static -Wall  -o init/init init/init.c 
init/init.c:23:10: fatal error: 'linux/vm_sockets.h' file not found
#include <linux/vm_sockets.h>
         ^~~~~~~~~~~~~~~~~~~~
1 error generated.
make: *** [init/init] Error 1

The init binary is being compiled, which should not happen, as the EFI feature explicitly sets BUILD_INIT = 0. This is strange: the EFI feature is clearly detected, since cargo build is invoked with --features efi, yet some other settings (like BUILD_INIT) may be getting overwritten?

hvf fails to build

Building on CI seems to fail with errors such as this:

error[E0793]: reference to packed field is unaligned
   --> src/hvf/src/bindings.rs:937:18
    |
937 |         unsafe { &(*(::std::ptr::null::<_OSUnalignedU16>())).__val as *const _ as usize },
    |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: fields of packed structs are not properly aligned, and creating a misaligned reference is undefined behavior (even if that reference is never dereferenced)
    = help: copy the field contents to a local variable, or replace the reference with a raw pointer and use `read_unaligned`/`write_unaligned` (loads and stores via `*p` must be properly aligned even when using raw pointers)

AFAIU this seems to be related to the bindings to the macOS hypervisor.

--user flag not respected under (rootless) podman

Regardless of the value of --user, pods started with (rootless) podman + krun have a UID/GID of 0 within the container.

krun:

> podman --runtime=krun run --user=1000:1000 --rm -it registry.fedoraproject.org/fedora sh -c 'id -u; id -g'
0
0

Another runtime (crun):

> podman --runtime=crun run --user=1000:1000 --rm -it registry.fedoraproject.org/fedora sh -c 'id -u; id -g'
1000
1000

Attach to interactive containers doesn't work without TTY

In short, this does work:

# podman run -d -i --runtime /usr/bin/crun alpine
63634bd214c84293bbd00b75d4d64da06bfe1f92fce1a777e2e4331da5611de9
# echo "echo 'test'" | podman attach 6 
test

while this does not:

# podman run -d -i --runtime /usr/bin/krun alpine
45b4014d2c6098c4a7295f23a14892668503fa3b6ebe5e569699e9c99aca5f1f
# echo "echo 'test'" | podman attach 45

For some reason, trying to attach to a container without a TTY ends in a freeze (regardless of whether --init is used).
This breaks the use of containers in scripts.

Running example fails

I tried to run this on CentOS 8 and got the following error.

Any idea what is causing this?

$ LD_LIBRARY_PATH=/usr/local/lib64 RUST_BACKTRACE=full ./chroot_vm rootfs/ /bin/sh
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: CreateVsockBackend(UnixBind(Os { code: 98, kind: AddrInUse, message: "Address already in use" }))', src/libkrun/src/lib.rs:156:55
stack backtrace:
   0:     0x7fcbd5ab1d8a - std::backtrace_rs::backtrace::libunwind::trace::h04d12fdcddff82aa
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/../../backtrace/src/backtrace/libunwind.rs:100:5
   1:     0x7fcbd5ab1d8a - std::backtrace_rs::backtrace::trace_unsynchronized::h1459b974b6fbe5e1
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x7fcbd5ab1d8a - std::sys_common::backtrace::_print_fmt::h9b8396a669123d95
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:67:5
   3:     0x7fcbd5ab1d8a - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::he009dcaaa75eed60
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:46:22
   4:     0x7fcbd5a4dc3c - core::fmt::write::h77b4746b0dea1dd3
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/fmt/mod.rs:1078:17
   5:     0x7fcbd5ab1571 - std::io::Write::write_fmt::heb7e50902e98831c
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/io/mod.rs:1518:15
   6:     0x7fcbd5ab0f25 - std::sys_common::backtrace::_print::h2d880c9e69a21be9
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:49:5
   7:     0x7fcbd5ab0f25 - std::sys_common::backtrace::print::h5f02b1bb49f36879
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:36:9
   8:     0x7fcbd5ab0f25 - std::panicking::default_hook::{{closure}}::h658e288a7a809b29
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:208:50
   9:     0x7fcbd5ab04e5 - std::panicking::default_hook::hb52d73f0da9a4bb8
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:227:9
  10:     0x7fcbd5ab04e5 - std::panicking::rust_panic_with_hook::hfe7e1c684e3e6462
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:593:17
  11:     0x7fcbd5acdb68 - std::panicking::begin_panic_handler::{{closure}}::h42939e004b32765c
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:499:13
  12:     0x7fcbd5acdadc - std::sys_common::backtrace::__rust_end_short_backtrace::h9d2070f7bf9fd56c
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:141:18
  13:     0x7fcbd5acda8d - rust_begin_unwind
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:495:5
  14:     0x7fcbd5a4c680 - core::panicking::panic_fmt::ha0bb065d9a260792
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/panicking.rs:92:14
  15:     0x7fcbd5a4e302 - core::option::expect_none_failed::h7e1dd0a94971eb61
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/option.rs:1268:5
  16:     0x7fcbd5a9c3ea - krun_create_ctx
  17:           0x40088d - main
                               at /home/vagrant/libkrun/examples/chroot_vm.c:41:14
  18:     0x7fcbd569d7b3 - __libc_start_main
  19:           0x400a2e - _start
  20:                0x0 - <unknown>
fatal runtime error: failed to initiate panic, error 5
Aborted (core dumped)

Collect prlimits and enforce them inside the VM

Some containers request a particular set of resource limits. Collect them from the context where the VM is being created, and enforce them inside it. We can use environment variables to have init.krun do the work before starting the isolated process.
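As a minimal sketch of the collection side, the limits of the creating context can be read with getrlimit() and encoded into a string suitable for an environment variable. The variable name passed by the caller and the "<resource>:<soft>:<hard>" format are assumptions made for this sketch, not an existing libkrun convention.

```c
#include <stdio.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: formats one resource limit of the calling process as
 * "NAME=<resource>:<soft>:<hard>". Returns the snprintf() result (the number
 * of characters needed), or -1 if getrlimit() fails. RLIM_INFINITY is
 * written as its raw numeric value.
 */
int format_rlimit_env(const char *name, int resource, char *buf, size_t len)
{
    struct rlimit rl;

    if (getrlimit(resource, &rl) != 0)
        return -1;
    return snprintf(buf, len, "%s=%d:%llu:%llu", name, resource,
                    (unsigned long long)rl.rlim_cur,
                    (unsigned long long)rl.rlim_max);
}
```

The resulting string could be appended to the envp array given to krun_set_exec(), with the init process inside the guest parsing it and calling setrlimit() before starting the isolated process.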

Remove legacy AMD SEV module

With legacy SEV largely being replaced by SEV-SNP, is it worthwhile to remove the libkrun-sev (containing both SEV and SEV-SNP modules) in favor of strictly libkrun-snp? This is motivated by attestation changes to be merged soon that largely have post attestation in mind.

@slp

passt socket set via `krun_set_passt_fd` is never closed

Because the socket is passed as a raw fd, we should assume ownership of it and close it, so that the passt process can exit.

The calling program (e.g. chroot_vm) cannot do this, because once krun_start_enter is called, control never returns to the caller.

chroot_vm example and STDIN question

I'm trying to understand the handling of STDIN and am a little lost. From the comments in libkrun.h where it says

The VMM will attempt to take over stdin/stdout to manage them on behalf of the process running inside the isolated environment, simulating that the latter has direct control of the terminal.

I'm assuming that something is being done with the STDIN FD of chroot_vm, and therefore with the target process's STDIN. Running a process with chroot_vm as a subprocess from Python, it looks like STDIN is being closed and reopened somehow.

In these lsof listings I can see the parent (75590) has opened PIPEs to the subprocess and STDOUT/STDERR remain open and connected to the chroot_vm child (75622) but the child has FD 0 as a PIPE with no connection back to the parent.

Is this expected? If so, what is the recommended way to drive a subprocess when you need to feed input to its STDIN?

➜ lsof -p75590
COMMAND     PID USER   FD   TYPE             DEVICE SIZE/OFF    NODE NAME
python3.1 75590  pfw  cwd    DIR               1,17      192 6234961 .../libkrun/examples/prog
python3.1 75590  pfw  txt    REG               1,17    33815 4268189 .../.pyenv/versions/3.11.2/bin/python3.11
python3.1 75590  pfw  txt    REG               1,17    53528 4276793 .../.pyenv/versions/3.11.2/lib/python3.11/lib-dynload/fcntl.cpython-311-darwin.so
python3.1 75590  pfw  txt    REG               1,17    54867 4276767 .../.pyenv/versions/3.11.2/lib/python3.11/lib-dynload/_posixsubprocess.cpython-311-darwin.so
python3.1 75590  pfw  txt    REG               1,17   110624 2572457 /opt/homebrew/Cellar/gettext/0.21.1/lib/libintl.8.dylib
python3.1 75590  pfw  txt    REG               1,17    77033 4276771 .../.pyenv/versions/3.11.2/lib/python3.11/lib-dynload/select.cpython-311-darwin.so
python3.1 75590  pfw  txt    REG               1,17   101511 4276748 .../.pyenv/versions/3.11.2/lib/python3.11/lib-dynload/math.cpython-311-darwin.so
python3.1 75590  pfw  txt    REG               1,17  5407488 4268190 .../.pyenv/versions/3.11.2/lib/libpython3.11.dylib
python3.1 75590  pfw    0u   CHR               16,6  0t83467    1671 /dev/ttys006
python3.1 75590  pfw    1u   CHR               16,6  0t83467    1671 /dev/ttys006
python3.1 75590  pfw    2u   CHR               16,6  0t83467    1671 /dev/ttys006
python3.1 75590  pfw    5   PIPE 0xaa73e3d0de8532fb    16384         ->0xa6b137dd3fcc053a
python3.1 75590  pfw    7   PIPE 0xf89f4fafff9312f5    16384         ->0x6802587995e8e280

➜ lsof -p75622
COMMAND     PID USER   FD     TYPE             DEVICE SIZE/OFF    NODE NAME
chroot_vm 75622  pfw  cwd      DIR               1,17      192 6234961 .../libkrun/examples/prog
chroot_vm 75622  pfw  txt      REG               1,17    55248 6119277 .../libkrun/examples/chroot_vm
chroot_vm 75622  pfw  txt      REG               1,17    74064 5872731 /opt/homebrew/Cellar/dtc/1.7.0/lib/libfdt-1.7.0.dylib
chroot_vm 75622  pfw  txt      REG               1,17  3714304 5878207 /opt/homebrew/Cellar/libkrun/1.5.1/lib/libkrun.1.5.1.dylib
chroot_vm 75622  pfw  txt      REG               1,17 17702848 5872578 /opt/homebrew/Cellar/libkrunfw/3.10.0/lib/libkrunfw.3.dylib
chroot_vm 75622  pfw    0     PIPE 0x5a9168da4848861a    16384         
chroot_vm 75622  pfw    1     PIPE 0xa6b137dd3fcc053a    16384         ->0xaa73e3d0de8532fb
chroot_vm 75622  pfw    2     PIPE 0x6802587995e8e280    16384         ->0xf89f4fafff9312f5
chroot_vm 75622  pfw    3u  KQUEUE                                     count=1, state=0x8
chroot_vm 75622  pfw    4     PIPE 0x27c70e5691c6b5da    16384         ->0x561a22f187d72b2

build failure on aarch64

Since version 1.4.7 (still present in 1.4.9), libkrun doesn't build anymore on aarch64 on openSUSE Tumbleweed:

[   84s]    Compiling serde v1.0.149
[   84s]    Compiling arch_gen v0.1.0 (/home/abuild/rpmbuild/BUILD/libkrun-1.4.8/src/arch_gen)
[   84s]    Compiling arch v0.1.0 (/home/abuild/rpmbuild/BUILD/libkrun-1.4.8/src/arch)
[   84s] error[E0308]: mismatched types
[   84s]     --> src/arch/src/aarch64/linux/regs.rs:128:47
[   84s]      |
[   84s] 128  |     vcpu.set_one_reg(arm64_core_reg!(pstate), PSTATE_FAULT_BITS_64)
[   84s]      |          -----------                          ^^^^^^^^^^^^^^^^^^^^ expected `u128`, found `u64`
[   84s]      |          |
[   84s]      |          arguments to this function are incorrect
[   84s]      |
[   84s] note: associated function defined here
[   84s]     --> /home/abuild/rpmbuild/BUILD/libkrun-1.4.8/vendor/kvm-ioctls/src/ioctls/vcpu.rs:1199:12
[   84s]      |
[   84s] 1199 |     pub fn set_one_reg(&self, reg_id: u64, data: u128) -> Result<()> {
[   84s]      |            ^^^^^^^^^^^
[   84s] help: you can convert a `u64` to a `u128`
[   84s]      |
[   84s] 128  |     vcpu.set_one_reg(arm64_core_reg!(pstate), PSTATE_FAULT_BITS_64.into())
[   84s]      |                                                                   +++++++
[   84s] 
[   84s] error[E0308]: mismatched types
[   84s]     --> src/arch/src/aarch64/linux/regs.rs:135:47
[   84s]      |
[   84s] 135  |         vcpu.set_one_reg(arm64_core_reg!(pc), boot_ip)
[   84s]      |              -----------                      ^^^^^^^ expected `u128`, found `u64`
[   84s]      |              |
[   84s]      |              arguments to this function are incorrect
[   84s]      |
[   84s] note: associated function defined here
[   84s]     --> /home/abuild/rpmbuild/BUILD/libkrun-1.4.8/vendor/kvm-ioctls/src/ioctls/vcpu.rs:1199:12
[   84s]      |
[   84s] 1199 |     pub fn set_one_reg(&self, reg_id: u64, data: u128) -> Result<()> {
[   84s]      |            ^^^^^^^^^^^
[   84s] help: you can convert a `u64` to a `u128`
[   84s]      |
[   84s] 135  |         vcpu.set_one_reg(arm64_core_reg!(pc), boot_ip.into())
[   84s]      |                                                      +++++++
[   84s] 
[   84s] error[E0308]: mismatched types
[   84s]     --> src/arch/src/aarch64/linux/regs.rs:143:49
[   84s]      |
[   84s] 143  |         vcpu.set_one_reg(arm64_core_reg!(regs), get_fdt_addr(mem) as u64)
[   84s]      |              -----------                        ^^^^^^^^^^^^^^^^^^^^^^^^ expected `u128`, found `u64`
[   84s]      |              |
[   84s]      |              arguments to this function are incorrect
[   84s]      |
[   84s] note: associated function defined here
[   84s]     --> /home/abuild/rpmbuild/BUILD/libkrun-1.4.8/vendor/kvm-ioctls/src/ioctls/vcpu.rs:1199:12
[   84s]      |
[   84s] 1199 |     pub fn set_one_reg(&self, reg_id: u64, data: u128) -> Result<()> {
[   84s]      |            ^^^^^^^^^^^
[   84s] help: you can convert a `u64` to a `u128`
[   84s]      |
[   84s] 143  |         vcpu.set_one_reg(arm64_core_reg!(regs), (get_fdt_addr(mem) as u64).into())
[   84s]      |                                                 +                        ++++++++
[   84s] 
[   84s] error[E0308]: mismatched types
[   84s]    --> src/arch/src/aarch64/linux/regs.rs:155:5
[   84s]     |
[   84s] 154 | pub fn read_mpidr(vcpu: &VcpuFd) -> Result<u64> {
[   84s]     |                                     ----------- expected `std::result::Result<u64, regs::Error>` because of return type
[   84s] 155 |     vcpu.get_one_reg(MPIDR_EL1).map_err(Error::GetSysRegister)
[   84s]     |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `u64`, found `u128`
[   84s]     |
[   84s]     = note: expected enum `std::result::Result<u64, _>`
[   84s]                found enum `std::result::Result<u128, _>`
[   84s] 
[   84s] For more information about this error, try `rustc --explain E0308`.

Container stops without error when many env variables specified

When running a container through podman/crun in krun mode, it's not possible to specify more than 26 environment variables.

If more variables are specified, the container exits without any visible error.

Environment:
Debian 11
libkrun built from #92
libkrunfw 3.8.1
Both self-built without any special flags.

Build examples failed

When building the examples, I got this failure:
gcc -o chroot_vm chroot_vm.c -O2 -g -lkrun
//usr/local/lib64/libkrun.so: undefined reference to `copy_file_range'
collect2: error: ld returned 1 exit status
Makefile:22: recipe for target 'chroot_vm' failed
make: *** [chroot_vm] Error 1

My build steps:

  1. Build libkrunfw
git clone https://github.com/containers/libkrunfw.git
cd libkrunfw
make -j8
sudo make install
  2. Build libkrun
git clone https://github.com/containers/libkrun.git
cd libkrun
make -j8
sudo make install
  3. Build the examples
cd examples
export LD_LIBRARY_PATH=/usr/local/lib64
make

Weird permission destruction

Using krunvm 0.1.5 from brew on an M1 MBP, I get this very weird behaviour:

$ echo foo >testfile
$ ls -l testfile
-rw-r--r-- 1 ross ross 4 Jun 28 14:40 testfile
$ sed -i -e 's/foo/bar/' testfile
$ ls -l testfile
----------+ 1 ross ross 4 Jun 28 14:40 testfile

Container freezes when long env variable specified

Similar story to #93. If you provide a very long environment variable (even a single one), the container fails to start and freezes indefinitely (or at least for a long time: I waited for an hour before killing it). It uses 100% of a single core when frozen.

Example env variable long enough to freeze it:

GITLAB_FEATURES=audit_events,blocked_issues,board_iteration_lists,code_owners,code_review_analytics,contribution_analytics,description_diffs,elastic_search,full_codequality_report,group_activity_analytics,group_bulk_edit,group_webhooks,issuable_default_templates,issue_weights,iterations,ldap_group_sync,member_lock,merge_request_approvers,milestone_charts,multiple_issue_assignees,multiple_ldap_servers,multiple_merge_request_assignees,multiple_merge_request_reviewers,project_merge_request_analytics,protected_refs_for_users,push_rules,repository_mirrors,resource_access_token,seat_link,scoped_issue_board,usage_quotas,visual_review_app,wip_limits,send_emails_from_admin_area,repository_size_limit,adjourned_deletion_for_projects_and_groups,admin_audit_log,auditor_user,blocking_merge_requests,board_assignee_lists,board_milestone_lists,ci_cd_projects,ci_secrets_management,cluster_agents_ci_impersonation,cluster_deployments,code_owner_approval_required,commit_committer_check,compliance_framework,custom_compliance_frameworks,cross_project_pipelines,custom_file_templates,custom_file_templates_for_namespace,custom_project_templates,cycle_analytics_for_groups,cycle_analytics_for_projects,db_load_balancing,default_branch_protection_restriction_in_groups,default_project_deletion_protection,disable_name_update_for_users,domain_verification,email_additional_text,epics,extended_audit_events,external_authorization_service_api_management,feature_flags_related_issues,feature_flags_code_references,file_locks,fips_disable_personal_access_tokens,geo,generic_alert_fingerprinting,git_two_factor_enforcement,github_integration,group_allowed_email_domains,group_coverage_reports,group_forking_protection,group_merge_request_analytics,group_milestone_project_releases,group_project_templates,group_repository_analytics,group_saml,group_scoped_ci_variables,group_wikis,incident_sla,incident_metric_upload,ide_schema_config,issues_analytics,jira_issues_integration,ldap_group_sync_filter,merge_pipelines,merge_request_performance_metrics,admin_merge_request_approvers_rules,merge_trains,metrics_reports,multiple_alert_http_integrations,multiple_approval_rules,multiple_group_issue_boards,object_storage,operations_dashboard,package_forwarding,pages_size_limit,password_complexity,productivity_analytics,project_aliases,protected_environments,reject_non_dco_commits,reject_unsigned_commits,saml_group_sync,scoped_labels,smartcard_auth,swimlanes,type_of_work_analytics,minimal_access_role,unprotection_restrictions,ci_project_subscriptions,incident_timeline_view,oncall_schedules,escalation_policies,export_user_permissions,zentao_issues_integration,coverage_check_approval_rule,issuable_resource_links,group_ip_restriction

I didn't check how long the value needs to be to trigger the freeze.

Environment:
Debian 11
libkrun built from #92
libkrunfw 3.8.1
Both self-built without any special flags.
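
Since the exact length that triggers the freeze was not measured, one way to narrow it down is a bisection over the value length. The sketch below is only an illustration: `env_pair` and `smallest_failing_len` are hypothetical helpers, and the `fails` callback (which would launch the container with a timeout) is left to the caller. It assumes that once a given length fails, every longer value fails too.

```rust
/// Builds a `NAME=value` pair whose value is exactly `len` bytes long.
fn env_pair(name: &str, len: usize) -> String {
    format!("{}={}", name, "x".repeat(len))
}

/// Returns the smallest length in `lo..=hi` for which `fails` reports a
/// failure, or None if even `hi` works. Assumes failures are monotonic.
fn smallest_failing_len<F>(mut lo: usize, mut hi: usize, mut fails: F) -> Option<usize>
where
    F: FnMut(usize) -> bool,
{
    if !fails(hi) {
        return None;
    }
    // Invariant: `hi` is known to fail; the answer lies in lo..=hi.
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        if fails(mid) {
            hi = mid;
        } else {
            lo = mid + 1;
        }
    }
    Some(lo)
}
```

In practice `fails` would pass `env_pair("TEST", n)` to `podman run --runtime krun --env ...` and treat a timeout as a failure.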

libkrun-tee: Force the usage of a TeeConfig

cc @slp

This really applies to libkrun and reference-kbs. I feel it would be much easier if we forced users to supply a TeeConfig when they want to run a confidential workload. I have a commit for libkrun that does exactly that, but I'd like some thoughts before I commit. This way, we wouldn't have to continually unwrap() Option<TeeConfig>s, since running a confidential workload without supplying a TeeConfig doesn't make sense.
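
To illustrate the idea, here is a minimal sketch with hypothetical, heavily simplified types (the real TeeConfig has different fields, and `ConfidentialVm` does not exist in libkrun). The missing-config case is rejected once, at construction, so later code can use the config without unwrapping an Option:

```rust
/// Hypothetical, simplified stand-in for the real TeeConfig.
#[derive(Debug, Clone, PartialEq)]
struct TeeConfig {
    workload_id: String,
}

#[derive(Debug, PartialEq)]
enum BuildError {
    MissingTeeConfig,
}

/// Hypothetical VM description that cannot exist without a TeeConfig.
struct ConfidentialVm {
    tee_config: TeeConfig,
}

impl ConfidentialVm {
    /// The single place where a missing config is turned into an error.
    fn new(tee_config: Option<TeeConfig>) -> Result<ConfidentialVm, BuildError> {
        match tee_config {
            Some(cfg) => Ok(ConfidentialVm { tee_config: cfg }),
            None => Err(BuildError::MissingTeeConfig),
        }
    }
}
```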

Support non-TTY use cases

In its current form, the way in which stdin/stdout are tied to virtio-console only plays nice with interactive TTY-like sessions. This means that, in many scenarios, libkrun-based VMMs can't be used from scripts.

We need to revamp the way in which stdin/stdout, virtio-console and init.c interact between them to also support non-TTY use cases. This means we need to:

  • Stop tying the VM lifetime to the availability of stdin.
  • Implement a handshake between virtio-console and init.c to ensure no bytes are lost while the VM is booting.
  • Extend init.c so that, when a non-TTY use case is detected, it is kept alive acting as a proxy between the app running in the VM and virtio-console.

This will fix #130 and #97.

build issue `VcpuExit::FailEntry`

The build fails with:

 error[E0532]: expected unit struct, unit variant or constant, found tuple variant `VcpuExit::FailEntry`
     --> src/vmm/src/linux/vstate.rs:1169:17
      |
 1169 |                 VcpuExit::FailEntry => {
      |                 ^^^^^^^^^^^^^^^^^^^ help: use the tuple variant pattern syntax instead: `VcpuExit::FailEntry(/* fields */)`
      |
     ::: /home/abuild/rpmbuild/BUILD/libkrun-1.4.6/vendor/kvm-ioctls/src/ioctls/vcpu.rs:59:5
      |
 59 |     FailEntry(
      |     --------- `VcpuExit::FailEntry` defined here

It appears this patch resolves the issue:

--- a/src/vmm/src/linux/vstate.rs
+++ b/src/vmm/src/linux/vstate.rs
@@ -1166,7 +1166,7 @@ impl Vcpu {
                 }
                 // Documentation specifies that below kvm exits are considered
                 // errors.
-                VcpuExit::FailEntry => {
+                VcpuExit::FailEntry(..) => {
                     error!("Received KVM_EXIT_FAIL_ENTRY signal");
                     Err(Error::VcpuUnhandledKvmExit)
                 }
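
The patch works because `FailEntry` became a tuple variant in the vendored kvm-ioctls, and a tuple variant cannot be matched with a bare path. A self-contained sketch with a hypothetical, reduced enum (the field types here are assumptions):

```rust
/// Hypothetical, reduced exit enum: `FailEntry` carries data, like the
/// variant reported in the compiler error above.
enum Exit {
    Hlt,
    FailEntry(u64, u32),
}

fn describe(exit: &Exit) -> &'static str {
    match *exit {
        Exit::Hlt => "halt",
        // `(..)` matches the tuple variant while ignoring all of its fields.
        Exit::FailEntry(..) => "fail entry",
    }
}
```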

Failed to make libkrun-sev

Hi, I am trying to build the SEV variant of libkrun, but the build fails. The error message is:

$ make SEV=1
gcc -O2 -static -Wall -DSEV=1 -lcurl -lidn2 -lssl -lcrypto -lzstd -lz -lbrotlidec-static -lbrotlicommon-static -o init/init init/init.c init/tee/snp_attest.c init/tee/snp_attest.h init/tee/kbs/kbs.h init/tee/kbs/kbs_util.c init/tee/kbs/kbs_types.c init/tee/kbs/kbs_curl.c init/tee/kbs/kbs_crypto.c  -DSEV=1 -lcurl -lidn2 -lssl -lcrypto -lzstd -lz -lbrotlidec-static -lbrotlicommon-static
init/init.c: In function ‘main’:
init/init.c:818:17: warning: ignoring return value of ‘sethostname’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
  818 |                 sethostname(hostname, strlen(hostname));
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
init/init.c:820:17: warning: ignoring return value of ‘sethostname’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
  820 |                 sethostname(&localhost[0], strlen(localhost));
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
init/init.c:830:17: warning: ignoring return value of ‘chdir’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
  830 |                 chdir(env_workdir);
      |                 ^~~~~~~~~~~~~~~~~~
init/init.c:832:17: warning: ignoring return value of ‘chdir’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
  832 |                 chdir(config_workdir);
      |                 ^~~~~~~~~~~~~~~~~~~~~
init/init.c: In function ‘chroot_luks’:
init/init.c:352:9: warning: ignoring return value of ‘pipe’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
  352 |         pipe(pipefd);
      |         ^~~~~~~~~~~~
init/init.c:365:17: warning: ignoring return value of ‘write’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
  365 |                 write(pipefd[1], pass, strnlen(pass, pass_len));
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
init/init.c:379:9: warning: ignoring return value of ‘chdir’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
  379 |         chdir("/luksroot");
      |         ^~~~~~~~~~~~~~~~~~
init/init.c:385:9: warning: ignoring return value of ‘chroot’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
  385 |         chroot(".");
      |         ^~~~~~~~~~~
init/init.c: In function ‘mount_filesystems’:
init/init.c:448:9: warning: ignoring return value of ‘symlink’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
  448 |         symlink("/proc/self/fd", "/dev/fd");
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/11/../../../x86_64-linux-gnu/libcrypto.a(libcrypto-lib-dso_dlfcn.o): in function `dlfcn_globallookup':
(.text+0x17): warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/bin/ld: /usr/local/lib/libcurl.a(libcurl_la-netrc.o): in function `Curl_parsenetrc':
netrc.c:(.text+0x6e4): warning: Using 'getpwuid_r' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/bin/ld: /usr/local/lib/libcurl.a(libcurl_la-curl_addrinfo.o): in function `Curl_getaddrinfo_ex':
curl_addrinfo.c:(.text+0x1f4): warning: Using 'getaddrinfo' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/11/../../../x86_64-linux-gnu/libcrypto.a(libcrypto-lib-bio_sock.o): in function `BIO_gethostbyname':
(.text+0x75): warning: Using 'gethostbyname' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/bin/ld: /usr/local/lib/libcurl.a(libcurl_la-psl.o): in function `Curl_psl_destroy':
psl.c:(.text+0x29): undefined reference to `psl_free'
/usr/bin/ld: /usr/local/lib/libcurl.a(libcurl_la-psl.o): in function `Curl_psl_use':
psl.c:(.text+0xbb): undefined reference to `psl_latest'
/usr/bin/ld: psl.c:(.text+0x187): undefined reference to `psl_builtin'
/usr/bin/ld: psl.c:(.text+0x1a1): undefined reference to `psl_free'
/usr/bin/ld: /usr/local/lib/libcurl.a(libcurl_la-cookie.o): in function `Curl_cookie_add':
cookie.c:(.text+0x176f): undefined reference to `psl_is_cookie_domain_acceptable'
/usr/bin/ld: /usr/local/lib/libcurl.a(libcurl_la-http2.o): in function `cf_h2_query':
http2.c:(.text+0x5a): undefined reference to `nghttp2_session_check_request_allowed'
... (the remaining messages are similar, i.e., undefined reference to 'xxx')

The output above raises two questions. The first is about this warning:

Using 'xxx' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking

Does this warning mean that the binary is not actually being linked against static libraries? The question may sound naive; I'm not very familiar with static linking.

The second question is about the undefined-reference errors. To address them, I modified SEV_LD_FLAGS in the Makefile as follows:

SEV_LD_FLAGS =	-lcurl -lnghttp2 -lpsl -lidn2 -lunistring -lssl -lcrypto -lzstd -lz -lbrotlidec-static -lbrotlicommon-static

With this change the build succeeds, although the "Using 'xxx' in statically linked applications ...." warnings are still there. I then installed libkrun-sev.so to /usr/local/lib64.

However, when I run "make launch-tee" in the examples folder, it fails with these errors:

gcc -o launch-tee launch-tee.c -O2 -g -lkrun-sev
/usr/bin/ld: /usr/local/lib64/libkrun-sev.so: undefined reference to `_kBrotliPrefixCodeRanges'
/usr/bin/ld: /usr/local/lib64/libkrun-sev.so: undefined reference to `BrotliDefaultAllocFunc'
/usr/bin/ld: /usr/local/lib64/libkrun-sev.so: undefined reference to `BrotliTransformDictionaryWord'
/usr/bin/ld: /usr/local/lib64/libkrun-sev.so: undefined reference to `_kBrotliContextLookupTable'
/usr/bin/ld: /usr/local/lib64/libkrun-sev.so: undefined reference to `BrotliSharedDictionaryAttach'
/usr/bin/ld: /usr/local/lib64/libkrun-sev.so: undefined reference to `BrotliDefaultFreeFunc'
/usr/bin/ld: /usr/local/lib64/libkrun-sev.so: undefined reference to `BrotliSharedDictionaryDestroyInstance'
/usr/bin/ld: /usr/local/lib64/libkrun-sev.so: undefined reference to `BrotliSharedDictionaryCreateInstance'
collect2: error: ld returned 1 exit status
make: *** [Makefile:29: launch-tee] Error 1

The error seems to be caused by the brotli libraries, but both libbrotlidec-static.a and libbrotlicommon-static.a should already be installed on my computer. Do you have any idea what causes this problem?

fails to build with `EFI=1`

cargo build --release --features efi,gpu
   Compiling vmm v0.1.0 (/var/srv/walters/src/github/containers/libkrun/src/vmm)
error[E0425]: cannot find value `kernel_bundle` in this scope
   --> src/vmm/src/builder.rs:475:46
    |
475 |     let boot_ip: GuestAddress = GuestAddress(kernel_bundle.entry_addr);
    |                                              ^^^^^^^^^^^^^ not found in this scope

error[E0412]: cannot find type `MmapRegion` in this scope
   --> src/vmm/src/builder.rs:678:20
    |
678 |     kernel_region: MmapRegion,
    |                    ^^^^^^^^^^ not found in this scope
    |
help: consider importing this struct
    |
8   + use vm_memory::MmapRegion;
    |

error[E0061]: this function takes 4 arguments but 1 argument was supplied
   --> src/vmm/src/builder.rs:331:44
    |
331 |       let (guest_memory, arch_memory_info) = create_guest_memory(
    |  ____________________________________________^^^^^^^^^^^^^^^^^^^-
332 | |         vm_resources
333 | |             .vm_config()
334 | |             .mem_size_mib
...   |
345 | |         initrd_bundle,
346 | |     )?;
    | |_____- three arguments are missing
    |
note: function defined here
   --> src/vmm/src/builder.rs:676:8
    |
676 | pub fn create_guest_memory(
    |        ^^^^^^^^^^^^^^^^^^^
677 |     mem_size_mib: usize,
    |     -------------------
678 |     kernel_region: MmapRegion,
    |     -------------------------
679 |     kernel_load_addr: u64,
    |     ---------------------
680 |     kernel_size: usize,
    |     ------------------
help: provide the arguments
    |
331 ~     let (guest_memory, arch_memory_info) = create_guest_memory(vm_resources
332 +             .vm_config()
333 +             .mem_size_mib
334 ~             .ok_or(StartMicrovmError::MissingMemSizeConfig)?, /* kernel_region */, /* u64 */, /* usize */)?;
    |

Some errors have detailed explanations: E0061, E0412, E0425.
For more information about an error, try `rustc --explain E0061`.
error: could not compile `vmm` (lib) due to 3 previous errors

Some files in a glusterfs mount do not appear.

Unsure of the cause yet but will try and find a reproducible example to update the issue with.

I have a glusterfs disk (mounted with mount -t glusterfs node:/gv0 /mnt) which is then mounted into a container with krun and podman: podman --runtime=/usr/local/bin/krun -v /mnt:/mnt. For some reason, some files are not displayed: of roughly 30,000 files, about 10% simply do not appear in the VM. All permissions are correct, and running with crun or podman machine does not have this problem.

This doesn't seem to be a per-file problem, though; rather, no files at all appear in certain folders. For example, if /mnt/foo is one such affected folder, I can touch /mnt/foo/test.txt and the file will appear on the host but still not show up in the VM. Any files placed or created in such a folder, whether by the VM or the host, are invisible to the container.

Clock skew

I have very significant clock skew with libkrun:

make[1]: Warning: File 'recorder' has modification time 69 s in the future
make[1]: warning:  Clock skew detected.  Your build may be incomplete.

The skew seems to increase and be somewhat random. I noticed skew values with 5 digits.

How is time managed in libkrun?

podman run with krun not honoring env from container image config

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

Running a container with podman --runtime krun results in the container running without the environment specified in the container image metadata.

Steps to reproduce the issue:

  1. Run an image with crun, and print out env variables.
$ podman --runtime crun run --rm -it docker.io/alpine:3.17.0 env
TERM=xterm
container=podman
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOME=/root
HOSTNAME=af84bff9c00a
  2. Run the same image with krun, and print out env variables.
$ podman --runtime krun run --rm -it docker.io/alpine:3.17.0 env
HOME=/
TERM=linux
KRUN_INIT=/usr/bin/env
KRUN_WORKDIR=/

Describe the results you received:

With krun, the env variables from the image config are missing. The image config specifies:

$ podman inspect -f '{{.Config.Env}}'  docker.io/alpine:3.17.0
[PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin]

Describe the results you expected:

I expected krun to set the env vars specified in the image config, as crun does.

Output of podman version:

$ podman version
Client:       Podman Engine
Version:      4.3.1
API Version:  4.3.1
Go Version:   go1.18.8
Built:        Wed Dec 31 19:00:00 1969
OS/Arch:      linux/amd64

Output of podman info:

$ podman info
host:
  arch: amd64
  buildahVersion: 1.28.0
  cgroupControllers:
  - memory
  - pids
  cgroupManager: cgroupfs
  cgroupVersion: v2
  conmon:
    package: Unknown
    path: /usr/local/lib/podman/conmon
    version: 'conmon version 2.1.5, commit: c9f7f19eb82d5b8151fc3ba7fbbccf03fdcd0325'
  cpuUtilization:
    idlePercent: 98.2
    systemPercent: 0.68
    userPercent: 1.12
  cpus: 8
  distribution:
    codename: bookworm
    distribution: debian
    version: unknown
  eventLogger: file
  hostname: beast3
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1002
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.0.0-5-amd64
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 8759042048
  memTotal: 33635688448
  networkBackend: cni
  ociRuntime:
    name: crun
    package: Unknown
    path: /usr/local/bin/crun
    version: |-
      crun version 1.7.2
      commit: 0356bf4aff9a133d655dc13b1d9ac9424706cac4
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +LIBKRUN +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_AUDIT_WRITE,CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_MKNOD,CAP_NET_BIND_SERVICE,CAP_NET_RAW,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: ""
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/local/bin/slirp4netns
    package: Unknown
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.2
  swapFree: 34324082688
  swapTotal: 34324082688
  uptime: 166h 46m 46.00s (Approximately 6.92 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /data/home/andy/.config/containers/storage.conf
  containerStore:
    number: 5
    paused: 0
    running: 4
    stopped: 1
  graphDriverName: overlay
  graphOptions:
    overlay.ignore_chown_errors: "true"
    overlay.mount_program:
      Executable: /usr/local/bin/fuse-overlayfs
      Package: Unknown
      Version: |-
        fuse-overlayfs: version 1.9
        fusermount3 version: 3.12.0
        FUSE library version 3.12.0
        using FUSE kernel interface version 7.31
    overlay.mountopt: nodev,fsync=0
  graphRoot: /data/home/andy/.local/share/containers/storage
  graphRootAllocated: 4000787030016
  graphRootUsed: 1148042207232
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 7
  runRoot: /run/user/1000/containers
  volumePath: /data/home/andy/.local/share/containers/storage/volumes
version:
  APIVersion: 4.3.1
  Built: 0
  BuiltTime: Wed Dec 31 19:00:00 1969
  GitCommit: ""
  GoVersion: go1.18.8
  Os: linux
  OsArch: linux/amd64
  Version: 4.3.1

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):

crun and krun are both 1.7.2

$ ls -l /usr/local/bin/{crun,krun}
-rwxr-xr-x 1 root root  1955200 Dec  7 00:18 /usr/local/bin/crun
lrwxrwxrwx 1 root staff      19 Dec 15 01:18 /usr/local/bin/krun -> /usr/local/bin/crun
$ krun --version
crun version 1.7.2
commit: 0356bf4aff9a133d655dc13b1d9ac9424706cac4
rundir: /run/user/1000/crun
spec: 1.0.0
+SELINUX +APPARMOR +CAP +SECCOMP +EBPF +LIBKRUN +YAJL

$ crun --version
crun version 1.7.2
commit: 0356bf4aff9a133d655dc13b1d9ac9424706cac4
rundir: /run/user/1000/crun
spec: 1.0.0
+SELINUX +APPARMOR +CAP +SECCOMP +EBPF +LIBKRUN +YAJL

Out of range slice panic for `init` virtio reads.

I'm testing out libkrun with podman and ran into the following error:

$ podman run --runtime=/usr/local/bin/krun --rm docker.io/hello-world
panicked at 'range end index 733184 out of range for slice of length 732776', src/devices/src/virtio/fs/linux/passthrough.rs:987:29

I noticed that 732776 is the size of my init binary. Indeed, this seems to happen specifically when the read is detected against the init file's inode.

I believe this may be related to this check in setupmapping, which I'm guessing would normally lead to a SIGBUS error when trying to read beyond the end of a partially filled page. I worked around this by padding my init with zeros:

truncate -s 733184 init/init

This does work, but it is a bit of a nasty hack. I'm guessing the main cause is an off-by-one page mapping bug of some kind. I am not at all familiar with virtio/fuse, so these are wild guesses, but it happens with the standard suggested build process.

EDIT:

Just spotted #132; I'm assuming this is related. For context, this is happening on master for me. My page size, for reference:

$ getconf PAGESIZE
4096
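
The numbers in the panic are consistent with a page-aligned read: 732776 rounded up to a 4096-byte boundary is 733184. The sketch below only reproduces that arithmetic and shows one way to clamp a slice range to the buffer length; `clamped_read` is a hypothetical helper, not the code in passthrough.rs.

```rust
const PAGE_SIZE: usize = 4096;

/// Rounds `len` up to the next multiple of `PAGE_SIZE`.
fn page_align_up(len: usize) -> usize {
    (len + PAGE_SIZE - 1) / PAGE_SIZE * PAGE_SIZE
}

/// Returns the part of `data` covered by `offset..offset + len`, clamped to
/// the end of the buffer so a page-aligned request cannot slice out of range.
fn clamped_read(data: &[u8], offset: usize, len: usize) -> &[u8] {
    let start = offset.min(data.len());
    let end = offset.saturating_add(len).min(data.len());
    &data[start..end]
}
```

With this clamping, the last page of a 732776-byte file yields the remaining 3688 bytes instead of a panic.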

init: consider mkdir the mount directories

The init executable is going to mount some filesystems on the guest (proc, sysfs, and so on). If the rootfs doesn't have, say, `/proc` then the operation fails.

It can be reproduced with the chroot_vm program and the rootfs made from a busybox container.

Should some (or all) of those mount point directories be created?
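
As an illustration of what creating the mount points could look like, here is a minimal sketch. It is written in Rust for consistency with the rest of the project, whereas the real init is C; `ensure_mount_points` is a hypothetical helper and the directory list passed to it is an assumption.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Makes sure each mount point exists under `root` and returns the
/// resulting paths. Directories that already exist are left untouched.
fn ensure_mount_points(root: &Path, dirs: &[&str]) -> io::Result<Vec<PathBuf>> {
    let mut paths = Vec::new();
    for dir in dirs {
        let path = root.join(dir.trim_start_matches('/'));
        // create_dir_all succeeds if the directory is already present.
        fs::create_dir_all(&path)?;
        paths.push(path);
    }
    Ok(paths)
}
```

A call such as `ensure_mount_points(Path::new("/"), &["/proc", "/sys", "/dev"])` would run before the mount operations.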

Args with '--' don't parse correctly

Given a test program, /test-args, with the following content:

#!/bin/sh

echo $@

When I use chroot_vm.c to call /test-args with the arguments 1 2 3 4 5 6, I get back 1 2 3 4 5 6, as expected.

However, when I use chroot_vm.c to call /test-args with the arguments -- 1 2 3 4 5 6, I get back no arguments.

It seems that libkrun is stripping out arguments entirely if they contain --. I ran into this when I was trying to get tini to cooperate and realized that it wasn't seeing any parameters if the args contained a --.

Note that this doesn't happen if I don't use krun_set_exec at all and instead pivot to using /.krun_config.json to set my command, which works fine.

Readme.md and virtio-net

Hello,

I discovered this amazing library just recently, so I was looking through PRs and commits, and found out that virtio-net was introduced last month in #142.

The networking part of the README.md has not been updated, or at least it's not clear to me whether TSI is still being used or whether virtio-net is available, as the README specifically says that it's not.

Since I'm not so familiar yet, I'll not create a PR with a corrected README, but just wanted you to notice :)

Thanks for this amazing piece of software!

[Question] Upstreaming TSI patches to Linux and support for snapshotting

I find the use of TSI (Transparent Socket Impersonation) in the guest VM to communicate with the host very interesting. It simplifies the whole networking story in the VM which is always a huge hassle.

I notice that TSI is still implemented as a series of patches to the Linux kernel. My understanding is that TSI is planned to be upstreamed into Linux proper.

However, it's been some time since these patches were published, and I wanted to know when they might be merged into mainline.

So currently we have the convenience of TSI via a patched kernel and on the other hand we have user space networking like slirp. Why was the use of something like slirp not considered as an option in krunvm?

I am also interested in the possibility of snapshotting the VM like firecracker does. Is that a feature that is considered interesting/useful to the libkrun project?

WARN devices::virtio::vsock::muxer] stream: unhandled op=3

This is regarding the same scenario as in #112.

When running libkrun with warnings enabled you can see these lines frequently:

WARN devices::virtio::vsock::muxer] stream: unhandled op=3

When this warning is shown, we are also leaking a Proxy object bound to a listening socket (see proof in subsequent comment).

op=3 is VSOCK_OP_RST. This means that the TcpProxy in tcp.rs does not get updated with VSOCK reset events.

Please advise; if the fix is trivial, I'll gladly implement and test it.
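
A rough sketch of the kind of handling described above: on VSOCK_OP_RST the proxy for that connection is dropped so its host socket can be closed. The op codes come from the virtio-vsock specification; `ProxyTable` and its key layout are hypothetical and do not mirror the actual muxer or TcpProxy code.

```rust
use std::collections::HashMap;

// Operation codes as defined by the virtio-vsock specification.
const VSOCK_OP_REQUEST: u16 = 1;
const VSOCK_OP_RST: u16 = 3;

/// Hypothetical proxy table keyed by (local port, peer port).
struct ProxyTable {
    proxies: HashMap<(u32, u32), String>,
}

impl ProxyTable {
    fn new() -> ProxyTable {
        ProxyTable { proxies: HashMap::new() }
    }

    /// Handles one packet and returns true if the op was recognized.
    fn handle(&mut self, op: u16, key: (u32, u32)) -> bool {
        match op {
            VSOCK_OP_REQUEST => {
                self.proxies.insert(key, "tcp".to_string());
                true
            }
            // Dropping the entry on reset releases the proxy instead of
            // leaving it (and its host socket) behind.
            VSOCK_OP_RST => {
                self.proxies.remove(&key);
                true
            }
            _ => false,
        }
    }
}
```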

Building the microVM failed: SecureVirtPrepare(SnpSecVirtPrepare

I am testing the launch-tee example on a Dell R6515 equipped with AMD 7313P.

I managed to run the sev-config-no-attest.json example up to the point where LUKS tries to unlock the protected partition. However, the snp-config-no-attest.json example does not get that far and fails with:

# RUST_LOG=debug ~/libkrun/examples/launch-tee ~/disk-fedora.raw snp-config-noattest.json 
[2022-11-27T20:21:32Z INFO  vmm::linux::vstate] Guest memory starts at 0x7f24e7400000
[2022-11-27T20:21:32Z INFO  vmm::linux::vstate] Guest memory starts at 0x7f2569b0f000
[2022-11-27T20:21:32Z ERROR krun] Building the microVM failed: SecureVirtPrepare(SnpSecVirtPrepare(CreateLauncher(Custom { kind: Other, error: IoError(Os { code: 22, kind: InvalidInput, message: "Invalid argument" }) })))
Error creating the microVM: Invalid argument

Some more context

  • Updated to Latest BIOS (2.8.5)
  • Ubuntu 22.10 running Linux 5.19
  • sevctl is happy (everything is PASS)
  • sev=1 is enabled in the kvm_amd driver

These are the relevant lines from dmesg:

[    4.531915] ccp 0000:46:00.1: no command queues available
[    4.532609] ccp 0000:46:00.1: sev enabled
[    4.532611] ccp 0000:46:00.1: psp enabled
[    4.582819] ccp 0000:46:00.1: SEV API:1.52 build:4
[    4.633884] kvm: Nested Virtualization enabled
[    4.633885] SVM: kvm: Nested Paging enabled
[    4.633888] SEV supported: 410 ASIDs
[    4.633889] SEV-ES supported: 99 ASIDs

iperf3 causes virtio device to panic on macOS

I'm trying to run iperf3 to benchmark TSI on an M1 host running macOS 13. A panic occurs in a virtio device after sending about 1.3 GB:

/ # iperf3 -c 10.1.1.3
Connecting to host 10.1.1.3, port 5201
[  5] local 0.0.0.0 port 0 connected to 10.1.1.3 port 5201
thread 'fc_vcpu 1' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 35, kind: WouldBlock, message: "Resource temporarily unavailable" }', src/devices/src/virtio/mmio.rs:322:53
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'fc_vcpu 0' panicked at 'Failed to acquire device lock: PoisonError { .. }', src/devices/src/bus.rs:150:18

The panic seems to be here:

eventfd.write(v as u64).unwrap();
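
One way to avoid the panic is to treat `WouldBlock` as a recoverable condition instead of unwrapping. The sketch below uses a closure as a stand-in for the `eventfd.write(...)` call; the `Notify` type is hypothetical, and whether retrying later is the right policy for this device is an open question for the maintainers.

```rust
use std::io;

/// Outcome of a non-blocking notification attempt.
#[derive(Debug, PartialEq)]
enum Notify {
    Sent,
    /// The write would have blocked; the caller should retry later.
    Deferred,
}

/// Calls `write` (a stand-in for the eventfd write) and maps `WouldBlock`
/// to `Notify::Deferred` instead of panicking. Other errors are returned.
fn notify<F>(mut write: F) -> io::Result<Notify>
where
    F: FnMut() -> io::Result<()>,
{
    match write() {
        Ok(()) => Ok(Notify::Sent),
        Err(ref e) if e.kind() == io::ErrorKind::WouldBlock => Ok(Notify::Deferred),
        Err(e) => Err(e),
    }
}
```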

Steps to reproduce:

  • iperf3 -s on host
  • Run a container with krunvm (e.g. Alpine) and install iperf3
  • iperf3 -c <host external LAN IP> in VM

Windows host support?

How crazy would it be to try to support Windows as a host OS, via the Windows Hypervisor Platform? How pervasive is the assumption of a Unix-like host OS? I suppose the most challenging part would be filesystem support.

Running podman with krun on Asahi Fedora breaks

This could potentially be a 16k vs 4k page size thing, not that I have any clues that suggest that. It just often is with Asahi bugs.

$ sudo podman run --runtime krun -it debian bash
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: InvalidHostAddress', src/libkrun/src/lib.rs:228:50
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
fatal runtime error: failed to initiate panic, error 4215608256

Why not mono-repo ?

Hi!

Just want to raise an open question to understand why you chose to distribute qboot, initrd, libkrun.so, and libkrunfw.so over the following three repositories as they are tightly dependent:

From my point of view it only has disadvantages over maintaining everything in one repo:

  • additional build complexity
  • more administration
  • harder to follow (understand where code comes from)
  • binary blobs in libkrunfw

I'm probably missing some key point which outweighs the disadvantages above?

having trouble compiling on Alpine 3.17

I'm getting the following with rust 1.66.1, and musl 1.2.3-r4:

error[E0425]: cannot find function `copy_file_range` in crate `libc`
    --> src/devices/src/virtio/fs/linux/passthrough.rs:1721:19
     |
1721 |             libc::copy_file_range(
     |                   ^^^^^^^^^^^^^^^
     |
    ::: /usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/libc-0.2.126/src/unix/linux_like/linux/musl/b64/x86_64/mod.rs:576:1
     |
576  | pub const SYS_copy_file_range: ::c_long = 326;
     | --------------------------------------- similarly named constant `SYS_copy_file_range` defined here
     |
help: a constant with a similar name exists
     |
1721 |             libc::SYS_copy_file_range(
     |                   ~~~~~~~~~~~~~~~~~~~
help: consider importing this function
     |
5    | use nix::fcntl::copy_file_range;
     |
help: if you import `copy_file_range`, refer to it directly
     |
1721 -             libc::copy_file_range(
1721 +             copy_file_range(
     |
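
The error shows that this version of the libc crate does not expose a `copy_file_range` wrapper for musl targets. As an illustration only, a portable fallback can copy a byte range with positional reads and writes from the standard library. This is a different technique from the `copy_file_range` syscall (no in-kernel copy), and `copy_range` is a hypothetical helper, not what libkrun does.

```rust
use std::fs::File;
use std::io;
use std::os::unix::fs::FileExt;

/// Copies up to `len` bytes from `src` at `src_off` to `dst` at `dst_off`
/// using positional reads and writes; returns the number of bytes copied.
fn copy_range(src: &File, src_off: u64, dst: &File, dst_off: u64, len: usize) -> io::Result<usize> {
    let mut buf = vec![0u8; 64 * 1024];
    let mut copied = 0usize;
    while copied < len {
        let want = buf.len().min(len - copied);
        let n = src.read_at(&mut buf[..want], src_off + copied as u64)?;
        if n == 0 {
            break; // reached the end of the source file
        }
        dst.write_all_at(&buf[..n], dst_off + copied as u64)?;
        copied += n;
    }
    Ok(copied)
}
```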

libkrun is leaking TCP sockets

I am trying to run an avalanchego 1.9.7 node in an AMD SEV-ES TEE.
The app in question is a non-validating node in the Avalanche (P2P) network. Anybody can run this without investing in Avalanche. Avalanche nodes "discover" each other over TCP port 9650 (in my case 9750). A node typically connects to 1000+ other nodes (I had to bump the NOFILE limit to 32768, both for the VMM and in the guest TEE).

My node seems to be working well for about 24 hours - the only thing off is:

  1. every 20-30 the HTTP request (e.g. the health api) is delayed 10-45 seconds
  2. there are frequent (seconds) deferring proxy removal warnings in libkrun logs
  3. occasional (minutes) stream: unhandled op warnings with VSOCK_OP_RST

After approx. 24h, I can no longer reach my node over HTTP:

  • I can see thousands of TCP sockets in state CLOSE-WAIT.
  • In libkrun logs this shows up frequently now: ERROR devices::virtio::vsock::muxer] couldn't push pkt to queue, adding it to rxq
  • I can no longer connect to my TEE app over e.g. HTTP (not even SYN goes through to libkrun)

The libkrun host system

  • AMD EPYC 7302P (SEV FW: 0.24 build 17)
  • (up-to-date) debian bullseye backports kernel: 6.0.12-1~bpo11+1

The guest system

  • libkrun 1.4.9
  • libkrunfw 3.9.0

Socket status on libkrun host:

ss -sp --tcp state CLOSE-WAIT
Total: 8294
TCP:   11475 (estab 1122, closed 5650, orphaned 0, timewait 5)

Transport Total     IP        IPv6
RAW       0         0         0        
UDP       1         1         0        
TCP       5825      5824      1        
INET      5826      5825      1        
FRAG      0         0         0        

Recv-Q Send-Q Local Address:Port     Peer Address:Port Process                                   
293    0          127.0.0.1:9750        127.0.0.1:40636 users:(("libkrun VM",pid=245580,fd=5493))
293    0          127.0.0.1:9750        127.0.0.1:33738 users:(("libkrun VM",pid=245580,fd=5756))
293    0          127.0.0.1:9750        127.0.0.1:36756 users:(("libkrun VM",pid=245580,fd=2101))
293    0          127.0.0.1:9750        127.0.0.1:56618 users:(("libkrun VM",pid=245580,fd=1872))
293    0          127.0.0.1:9750        127.0.0.1:51638 users:(("libkrun VM",pid=245580,fd=3849))
293    0          127.0.0.1:9750        127.0.0.1:42248 users:(("libkrun VM",pid=245580,fd=6794))
...
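
For tracking the number of leaked sockets over time without `ss`, the same information can be read from `/proc/net/tcp` on Linux, where the `st` column holds the connection state in hex and `08` is CLOSE_WAIT. The following is a minimal sketch, not part of libkrun; the function name `count_close_wait` is hypothetical, and it handles only the IPv4 table (IPv6 sockets are listed in `/proc/net/tcp6`).

```rust
/// Hypothetical helper: counts sockets in CLOSE_WAIT on `local_port`,
/// given text in the format of /proc/net/tcp.
pub fn count_close_wait(proc_net_tcp: &str, local_port: u16) -> usize {
    proc_net_tcp
        .lines()
        .skip(1) // first line is the column header
        .filter(|line| {
            // Columns: sl, local_address, rem_address, st, ...
            let fields: Vec<&str> = line.split_whitespace().collect();
            if fields.len() < 4 {
                return false;
            }
            // local_address is "HEXADDR:HEXPORT"; the port is plain hex.
            let port = fields[1]
                .rsplit(':')
                .next()
                .and_then(|p| u16::from_str_radix(p, 16).ok());
            // State 08 is TCP_CLOSE_WAIT.
            port == Some(local_port) && fields[3] == "08"
        })
        .count()
}
```

A caller could pass in the result of `std::fs::read_to_string("/proc/net/tcp")` together with the listening port (9750 in the report above) and log the count at a fixed interval.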

libkrun logs when avalanchego is no longer reachable on HTTP:

[2023-01-28T13:40:15Z ERROR devices::virtio::vsock::muxer] couldn't push pkt to queue, adding it to rxq
[2023-01-28T13:40:16Z ERROR devices::virtio::vsock::muxer] couldn't push pkt to queue, adding it to rxq
[2023-01-28T13:40:17Z ERROR devices::virtio::vsock::muxer] couldn't push pkt to queue, adding it to rxq
[2023-01-28T13:40:18Z ERROR devices::virtio::vsock::muxer] couldn't push pkt to queue, adding it to rxq
[2023-01-28T13:40:18Z ERROR devices::virtio::vsock::muxer] couldn't push pkt to queue, adding it to rxq

[Image: socket leak rate over time]

Example binary "chroot_vm" can't find libkrun.so on Fedora 32

I compiled libkrunfw and libkrun following the instructions in the README file. However, running the example fails:

$ ./chroot_vm rootfs/ /bin/sh
./chroot_vm: error while loading shared libraries: libkrun.so: cannot open shared object file: No such file or directory
