
rust-hypervisor-firmware's Introduction

1. What is Cloud Hypervisor?

Cloud Hypervisor is an open source Virtual Machine Monitor (VMM) that runs on top of the KVM hypervisor and the Microsoft Hypervisor (MSHV).

The project focuses on running modern Cloud Workloads on specific, common hardware architectures. In this case Cloud Workloads refers to those that are run by customers inside a Cloud Service Provider. This means modern operating systems with most I/O handled by paravirtualised devices (e.g. virtio), no requirement for legacy devices, and 64-bit CPUs.

Cloud Hypervisor is implemented in Rust and is based on the Rust VMM crates.

Objectives

High Level

  • Runs on KVM or MSHV
  • Minimal emulation
  • Low latency
  • Low memory footprint
  • Low complexity
  • High performance
  • Small attack surface
  • 64-bit support only
  • CPU, memory, PCI hotplug
  • Machine to machine migration

Architectures

Cloud Hypervisor supports the x86-64 and AArch64 architectures. There are minor differences in functionality between the two architectures (see #1125).

Guest OS

Cloud Hypervisor supports 64-bit Linux and Windows 10/Windows Server 2019.

2. Getting Started

The following sections describe how to build and run Cloud Hypervisor.

Prerequisites for AArch64

  • AArch64 servers (recommended) or development boards equipped with the GICv3 interrupt controller.

Host OS

For required KVM functionality and adequate performance the recommended host kernel version is 5.13. The majority of the CI currently tests with kernel version 5.15.

Use Pre-built Binaries

The recommended approach to getting started with Cloud Hypervisor is by using a pre-built binary. Binaries are available for the latest release. Use cloud-hypervisor-static for x86-64 or cloud-hypervisor-static-aarch64 for the AArch64 platform.
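As a rough sketch only (the release asset URL below follows GitHub's latest-release download pattern and is an assumption; the project's releases page is the authoritative source), fetching and preparing the x86-64 binary might look like:

$ wget https://github.com/cloud-hypervisor/cloud-hypervisor/releases/latest/download/cloud-hypervisor-static
$ chmod +x cloud-hypervisor-static
$ ./cloud-hypervisor-static --version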

Packages

For convenience, packages are also available targeting some popular Linux distributions. This is thanks to the Open Build Service. The OBS README explains how to enable the repository in a supported Linux distribution and install Cloud Hypervisor and accompanying packages. Please report any packaging issues in the obs-packaging repository.

Building from Source

Please see the instructions for building from source if you do not wish to use the pre-built binaries.
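For reference, a minimal from-source sketch is shown below, assuming a Rust toolchain installed via rustup; the official build instructions remain authoritative and cover additional variants:

$ git clone https://github.com/cloud-hypervisor/cloud-hypervisor.git
$ cd cloud-hypervisor
$ cargo build --release
$ ./target/release/cloud-hypervisor --version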

Booting Linux

Cloud Hypervisor supports direct kernel boot (on x86-64 the kernel must be built with PVH support or be a bzImage) or booting via firmware (either Rust Hypervisor Firmware or an edk2 UEFI firmware called CLOUDHV / CLOUDHV_EFI).

Binary builds of the firmware files are available for the latest release of Rust Hypervisor Firmware and our edk2 repository.

The choice of firmware depends on your guest OS choice; some experimentation may be required.

Firmware Booting

Cloud Hypervisor supports booting disk images containing all needed components to run cloud workloads, a.k.a. cloud images.

The following sample commands download an Ubuntu cloud image, convert it into a format that Cloud Hypervisor can use, and fetch a firmware to boot the image with.

$ wget https://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64.img
$ qemu-img convert -p -f qcow2 -O raw focal-server-cloudimg-amd64.img focal-server-cloudimg-amd64.raw
$ wget https://github.com/cloud-hypervisor/rust-hypervisor-firmware/releases/download/0.4.2/hypervisor-fw

The Ubuntu cloud images do not ship with a default password, so it is necessary to use a cloud-init disk image to customise the image on the first boot. A basic cloud-init image is generated by this script. This seeds the image with a default username/password of cloud/cloud123. It is only necessary to add this disk image on the first boot. The script also assigns a default IP address using the test_data/cloud-init/ubuntu/local/network-config details when used with the --net "mac=12:34:56:78:90:ab,tap=" option; the interface with the matching MAC address is then configured according to the network-config details.

$ sudo setcap cap_net_admin+ep ./cloud-hypervisor
$ ./create-cloud-init.sh
$ ./cloud-hypervisor \
	--kernel ./hypervisor-fw \
	--disk path=focal-server-cloudimg-amd64.raw path=/tmp/ubuntu-cloudinit.img \
	--cpus boot=4 \
	--memory size=1024M \
	--net "tap=,mac=,ip=,mask="

If access to the firmware messages or interaction with the boot loader (e.g. GRUB) is required, then it is necessary to switch to the serial console instead of virtio-console.

$ ./cloud-hypervisor \
	--kernel ./hypervisor-fw \
	--disk path=focal-server-cloudimg-amd64.raw path=/tmp/ubuntu-cloudinit.img \
	--cpus boot=4 \
	--memory size=1024M \
	--net "tap=,mac=,ip=,mask=" \
	--serial tty \
	--console off

Custom Kernel and Disk Image

Building your Kernel

Cloud Hypervisor also supports direct kernel boot. For x86-64, a vmlinux ELF kernel (compiled with PVH support) or a regular bzImage is supported. In order to support development there is a custom branch; however, provided the required options are enabled, any recent kernel will suffice.

To build the kernel:

# Clone the Cloud Hypervisor Linux branch
$ git clone --depth 1 https://github.com/cloud-hypervisor/linux.git -b ch-6.2 linux-cloud-hypervisor
$ pushd linux-cloud-hypervisor
# Use the x86-64 cloud-hypervisor kernel config to build your kernel for x86-64
$ wget https://raw.githubusercontent.com/cloud-hypervisor/cloud-hypervisor/main/resources/linux-config-x86_64
# Use the AArch64 cloud-hypervisor kernel config to build your kernel for AArch64
$ wget https://raw.githubusercontent.com/cloud-hypervisor/cloud-hypervisor/main/resources/linux-config-aarch64
$ cp linux-config-x86_64 .config  # x86-64
$ cp linux-config-aarch64 .config # AArch64
# Do native build of the x86-64 kernel
$ KCFLAGS="-Wa,-mx86-used-note=no" make bzImage -j `nproc`
# Do native build of the AArch64 kernel
$ make -j `nproc`
$ popd

For x86-64, the vmlinux kernel image will then be located at linux-cloud-hypervisor/arch/x86/boot/compressed/vmlinux.bin. For AArch64, the Image kernel image will then be located at linux-cloud-hypervisor/arch/arm64/boot/Image.

Disk image

For the disk image the same Ubuntu image as before can be used. This contains an ext4 root filesystem.

$ wget https://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64.img # x86-64
$ wget https://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-arm64.img # AArch64
$ qemu-img convert -p -f qcow2 -O raw focal-server-cloudimg-amd64.img focal-server-cloudimg-amd64.raw # x86-64
$ qemu-img convert -p -f qcow2 -O raw focal-server-cloudimg-arm64.img focal-server-cloudimg-arm64.raw # AArch64

Booting the guest VM

These sample commands boot the disk image using the custom kernel whilst also supplying the desired kernel command line.

  • x86-64
$ sudo setcap cap_net_admin+ep ./cloud-hypervisor
$ ./create-cloud-init.sh
$ ./cloud-hypervisor \
	--kernel ./linux-cloud-hypervisor/arch/x86/boot/compressed/vmlinux.bin \
	--disk path=focal-server-cloudimg-amd64.raw path=/tmp/ubuntu-cloudinit.img \
	--cmdline "console=hvc0 root=/dev/vda1 rw" \
	--cpus boot=4 \
	--memory size=1024M \
	--net "tap=,mac=,ip=,mask="
  • AArch64
$ sudo setcap cap_net_admin+ep ./cloud-hypervisor
$ ./create-cloud-init.sh
$ ./cloud-hypervisor \
	--kernel ./linux-cloud-hypervisor/arch/arm64/boot/Image \
	--disk path=focal-server-cloudimg-arm64.raw path=/tmp/ubuntu-cloudinit.img \
	--cmdline "console=hvc0 root=/dev/vda1 rw" \
	--cpus boot=4 \
	--memory size=1024M \
	--net "tap=,mac=,ip=,mask="

If earlier kernel messages are required the serial console should be used instead of virtio-console.

  • x86-64
$ ./cloud-hypervisor \
	--kernel ./linux-cloud-hypervisor/arch/x86/boot/compressed/vmlinux.bin \
	--console off \
	--serial tty \
	--disk path=focal-server-cloudimg-amd64.raw \
	--cmdline "console=ttyS0 root=/dev/vda1 rw" \
	--cpus boot=4 \
	--memory size=1024M \
	--net "tap=,mac=,ip=,mask="
  • AArch64
$ ./cloud-hypervisor \
	--kernel ./linux-cloud-hypervisor/arch/arm64/boot/Image \
	--console off \
	--serial tty \
	--disk path=focal-server-cloudimg-arm64.raw \
	--cmdline "console=ttyAMA0 root=/dev/vda1 rw" \
	--cpus boot=4 \
	--memory size=1024M \
	--net "tap=,mac=,ip=,mask="

3. Status

Cloud Hypervisor is under active development. The following stability guarantees are currently made:

  • The API (including command line options) will not be removed or changed in a breaking way without a minimum of two major releases' notice. Where possible, warnings will be given about the use of deprecated functionality, and the deprecations will be documented in the release notes.

  • Point releases will be made between individual releases where there are substantial bug fixes or security issues that need to be fixed. These point releases will only include bug fixes.

Currently the following items are not guaranteed across updates:

  • Snapshot/restore is not supported across different versions
  • Live migration is not supported across different versions
  • The following features are considered experimental and may change substantially between releases: TDX, vfio-user, vDPA.

Further details can be found in the release documentation.

As of 2023-01-03, the following cloud images are supported:

Direct kernel boot to userspace should work with a rootfs from most distributions although you may need to enable exotic filesystem types in the reference kernel configuration (e.g. XFS or btrfs.)
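As an illustration only (CONFIG_XFS_FS and CONFIG_BTRFS_FS are example symbols; the options your rootfs actually needs may differ), the reference configuration from the kernel build steps above could be extended before rebuilding like this:

$ cd linux-cloud-hypervisor
$ ./scripts/config --enable CONFIG_XFS_FS --enable CONFIG_BTRFS_FS
$ make olddefconfig
$ KCFLAGS="-Wa,-mx86-used-note=no" make bzImage -j `nproc`  # x86-64, as above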

Hot Plug

Cloud Hypervisor supports hotplug of CPUs, passthrough devices (VFIO), virtio-{net,block,pmem,fs,vsock} and memory resizing. This document details how to add devices to a running VM.
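As a hedged sketch (this assumes the VM was started with an API socket, e.g. --api-socket /tmp/ch.sock, and uses the ch-remote tool built alongside cloud-hypervisor; the hotplug document referenced above remains the authoritative reference), resizing a running VM looks roughly like:

$ ./ch-remote --api-socket=/tmp/ch.sock resize --cpus 8
$ ./ch-remote --api-socket=/tmp/ch.sock resize --memory 4G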

Device Model

Details of the device model can be found in this documentation.

Roadmap

The project roadmap is tracked through a GitHub project.

4. Relationship with Rust VMM Project

In order to satisfy the design goal of having a high-performance, security-focused hypervisor the decision was made to use the Rust programming language. The language's strong focus on memory and thread safety makes it an ideal candidate for implementing VMMs.

Instead of implementing the VMM components from scratch, Cloud Hypervisor imports the Rust VMM crates and shares code and architecture with other VMMs such as Amazon's Firecracker and Google's crosvm.

Cloud Hypervisor embraces the Rust VMM project's goal, which is to be able to share and re-use as many virtualization crates as possible.

Differences with Firecracker and crosvm

A large part of the Cloud Hypervisor code is based on either the Firecracker or the crosvm project's implementations. Both of these are VMMs written in Rust with a focus on safety and security, like Cloud Hypervisor.

The goal of the Cloud Hypervisor project differs from the aforementioned projects in that it aims to be a general purpose VMM for Cloud Workloads and not limited to container/serverless or client workloads.

The Cloud Hypervisor community thanks the communities of both the Firecracker and crosvm projects for their excellent work.

5. Community

The Cloud Hypervisor project follows the governance and community guidelines described in the Community repository.

Contribute

The project strongly believes in building a global, diverse and collaborative community around the Cloud Hypervisor project. Anyone who is interested in contributing to the project is welcome to participate.

Contributing to an open source project like Cloud Hypervisor covers a lot more than just sending code. Testing, documentation, pull request reviews, bug reports, feature requests, project improvement suggestions, etc., are all equal and welcome means of contribution. See the CONTRIBUTING document for more details.

Slack

Get an invite to our Slack channel, join us on Slack, and participate in our community activities.

Mailing list

Please report bugs using the GitHub issue tracker, but for broader community discussions you may use our mailing list.

Security issues

Please contact the maintainers listed in the MAINTAINERS.md file with security issues.

rust-hypervisor-firmware's People

Contributors

benmaddison, dependabot-preview[bot], dependabot[bot], edigaryev, fdr, henryksloan, jongwu, josephlr, lencerf, mrxinwang, ning-yang, rbradford, retrage, rveerama1, shpark, thenewwazoo, yuuzi41


rust-hypervisor-firmware's Issues

Virtio over MMIO no longer works with cloud-hypervisor

It looks like Virtio-over-MMIO was initially added to the firmware (in 411449d) to get something simple working without having to implement PCI enumeration. However, now that the firmware supports PCI (see 7b90e41), it seems that the MMIO code broke without anyone noticing.

Specifically, building cloud-hypervisor with --no-default-features --features=acpi,cmos,mmio and running the firmware results in "Error configuring block device" (even if the PCI code is commented out). When virtio-mmio was added to cloud-hypervisor, it seems like it was designed to work with direct kernel booting, not with this firmware.

The big problem is that Virtio-over-MMIO doesn't support device discovery so a hard-coded physical address must be used. So if the device is allocated at a different address, the firmware won't boot. It also makes it hard for OSes booted by the firmware to find their block devices (for similar reasons).

I think it would be fine to just remove the MMIO code (instead of trying to fix it); it doesn't really seem necessary.

Hang due to infinite loop when using EFI Stub

Hi. When I use this image, hypervisor-fw hangs at:
https://github.com/cloud-hypervisor/rust-hypervisor-firmware/blob/0.3.0/src/main.rs#L132

I tried to debug, and realized that this loop does not exit (section_bytes_remaining=32, page_rva=0, block_size=0):
https://github.com/cloud-hypervisor/rust-hypervisor-firmware/blob/0.3.0/src/pe.rs#L193

I think it is related to FAT. Can you take a look?

(cloud-hypervisor v0.14.1 & rust-hypervisor-firmware 0.3.0)

Thank you.

Unable to build on intel machines.

After getting the latest nightly build I am still unable to build on linux machines.

I am getting the following:

note: instantiated into assembly here
--> :53:10
|
53 | movw %ax, %ss
| ^

error: unknown token in expression

on every line of the assembly file. Is there something else I need to set up besides what I can see in the build scripts or the main page?

Support building as flat BIOS image

Right now the firmware is built into a normal ELF binary, and then booted using Firecracker. However, it would be nice if it were possible to build the firmware into a flat BIOS binary that can be directly executed by the processor on reset.

Specifically, the firmware would be loaded at Guest Physical Address 4 GiB - sizeof(binary) and then execution would begin at the standard x86 reset vector 0xFFFFFFF0 in Real Mode. This is what SeaBIOS's bios.bin and EDK2's OVMF_CODE.fd/OVMF_VARS.fd builds do.

This would then allow for the binary to be used with any VMM that uses the normal BIOS loading process. This means automatic support for QEMU and XEN. Specifically, using QEMU would then be possible with:

qemu-system-x86_64  -drive if=pflash,format=raw,readonly,file=hypervisor-fw

Design Ideas

I would be happy to start exploring this. My idea of the execution flow would be:

  1. One instruction at the reset vector: jump to code in (2)
  2. Assembly code that:
    b) Deals with A20
    c) Sets up stub IDT/GDT/Paging
    d) Switches to long mode
    e) Jump to _start (i.e. the normal ELF entry point)
  3. Our normal Rust entry point: _start.

This could be done by having two similar target.json files that only differ in referring to different layout.ld files. The layout.fd file for our flat BIOS build would just have to make sure that the code for (1) was properly aligned and at the end of the file. This also means that our flat BIOS build would still be an ELF file, which has its advantages.

aarch64 Support

This issue tracks aarch64 support.

I am working on an experimental implementation of aarch64 support for Rust Hypervisor Firmware.
The code is available at:

Current Status

The changes in this branch have been tested to work using QEMU aarch64 virt.
Similar to x86_64, you can boot the Linux image by specifying the firmware with -kernel. The major changes are as follows:

  • Add target and linker scripts for aarch64.
  • Add BootInfo for FDT.
  • Support for PCI configuration space access using MMIO on aarch64
  • Add virtio-mmio driver (only for QEMU)

I will propose these changes in multiple PRs.

To-Do

  • [ ] Implement remapping of EFI runtime services
  • Add initial CH support
  • Split architecture dependent code for multiple architecture support (#203)
  • Improve EFI compatibility for aarch64/Linux boot (#204)
  • Add initial aarch64 support (#205)
  • Add testing (#205)
  • Add CI builds (#205)
  • Support vanilla Ubuntu cloud images (#262)
  • Add integration testing (#267)
  • Update documents

Known Issues

This aarch64 support has the following issues.

Cloud Hypervisor is not yet supported.

This aarch64 support does not yet support Cloud Hypervisor. This is because I do not have an aarch64 machine that supports GICv3 or later and cannot test it. Please let me know if there is a good test environment available.

Update: It works with a custom Ubuntu bionic cloud image.

objcopy is required as post-processing after build.

QEMU aarch64 virt has a loader that behaves differently depending on the type of binary passed with -kernel. When the binary is not ELF, QEMU executes the binary with the first address of the FDT passed in the x0 register. You need to run objcopy as a post-processing step to convert the firmware to a raw binary (see the sketch below). Update: this post-processing is not needed when the raw binary is generated at link time; see eaed071.
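For reference, the post-processing that used to be required looked roughly like this (assuming a native aarch64 debug build; a cross build would need the matching cross objcopy or llvm-objcopy, and the hypervisor-fw.bin output name is arbitrary):

$ objcopy -O binary target/aarch64-unknown-none/debug/hypervisor-fw hypervisor-fw.bin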

Run QEMU with the binary as follows:

qemu-system-aarch64 \
  -machine virt \
  -cpu cortex-a53 \
  -m 8G \
  -nographic \
  -serial mon:stdio \
  -drive id=disk,file=$(AARCH64_IMG),if=none \
  -device virtio-blk-pci,drive=disk,disable-legacy=on \
  -kernel target/aarch64-unknown-none/debug/hypervisor-fw

feature request for DevicePath parse to support sub type 0x03

Currently the DevicePath parse function only supports TYPE_MEDIA sub type 0x04 (MEDIA_FILEPATH_DP) [1]; however, booting with the systemd boot stub requires sub type 0x03 (MEDIA_VENDOR_DP) with LoadImage [2].

At the moment this causes boot with a UKI using the systemd boot stub to fail with:

[ERROR] Unexpected end of device path
Error loading kernel image: Unsupported
PANIC: panicked at src/main.rs:290:5:
Unable to boot from any virtio-blk device

It would be great if this sub type could be supported to allow UKI booting.

Footnotes

  1. https://github.com/cloud-hypervisor/rust-hypervisor-firmware/blob/bd9d7dc0e7f78b9627988a22f08f086424d54226/src/efi/device_path.rs#L27

  2. https://github.com/systemd/systemd/blob/11d5e2b5fbf9f6bfa5763fd45b56829ad4f0777f/src/boot/efi/linux.c#L58

Unable to build on arm64 devices.

It is not clear if the firmware is supported on arm64 or not; I have not found an official arm build. However, I attempted to build it on an arm machine and, as I feared, it failed on the assembly code in ram32.s. I am building on arm64 using the nightly toolchain. I keep getting the following error for all commands:

error: instruction requires: Not 64-bit mode
ljmpl $0x08, $rust64_start

Guest os can't boot from spdk bdev

We tried to start the guest OS from an spdk bdev with this firmware, but it failed because the EFI partition could not be found. If we change the queue length of the vhost-user block device to 16, we can find the EFI partition and read the bootloader, but during OS startup, resetting the vhost owner causes the spdk backend to crash.

Windows guest bluescreen with hypervisor-fw

A Windows guest on CH using hypervisor-fw instead of OVMF doesn't shut down correctly and encounters a bluescreen:

SAC>                                                                            
The SAC will become unavailable soon.  The computer is shutting down.           
                                                                                
SAC><?xml><BP>                                                                  
<INSTANCE CLASSNAME="BLUESCREEN">                                               
<PROPERTY NAME="STOPCODE" TYPE="string"><VALUE>"0x7E"</VALUE></PROPERTY><machine-info>
<name>WIN-L3C8M6IQS0Q</name>                                                    
<guid>00000000-0000-0000-0000-000000000000</guid>                               
<processor-architecture>AMD64</processor-architecture>                          
<os-version>10.0</os-version>                                                   
<os-build-number>17763</os-build-number>                                        
<os-product>Windows Server 2019</os-product>                                    
<os-service-pack>None</os-service-pack>                                         
</machine-info>                                                                 

</INSTANCE>
</BP>
!SAC>
SYSTEM_THREAD_EXCEPTION_NOT_HANDLED


0xFFFFFFFF80000003
0xFFFFF80137EB3B7B
0xFFFFDB8C5996F388
0xFFFFDB8C5996EBD0

The Cloud Hypervisor process keeps hanging and doesn't terminate. To reproduce, it's just about booting the guest and then hitting the shutdown button. This issue doesn't happen with OVMF.

As OVMF is currently used for the tests and seems to be the most stable option, we should first clarify the priority of switching to hypervisor-fw.

The guest will need to be debugged the usual way, in the first place to identify the issue. Any hints for debugging on the firmware side might be helpful, too.

EFI memory map compatibility

@retrage After your last commit I have to admit I didn't retest Windows. Did it work for you? I get the following error:

<INSTANCE CLASSNAME="BLUESCREEN">                                               
<PROPERTY NAME="STOPCODE" TYPE="string"><VALUE>"0x1E"</VALUE></PROPERTY><machine-info>
<name>WIN-L3C8M6IQS0Q</name>                                                    
<guid>00000000-0000-0000-0000-000000000000</guid>                               
<processor-architecture>AMD64</processor-architecture>
<os-version>10.0</os-version>
<os-build-number>17763</os-build-number>
<os-product>Windows Server 2019</os-product>
<os-service-pack>None</os-service-pack>
</machine-info>

</INSTANCE>
</BP>
!SAC>
KMODE_EXCEPTION_NOT_HANDLED


0xFFFFFFFFC0000005
0xFFFFF8024AEB51F1
0x0000000000000000
0x000000000011A0D0

Going back to 282ebc0 makes Windows boot fine. So I think there needs to be a little bit more refinement on the memory map.

Originally posted by @rbradford in #114 (comment)

NixOS with systemd-boot broken in v0.4

This issue appears to have been introduced by 2e7e87b.

NixOS, and probably other distros, place the full name of the default boot entry in the default option of loader.conf.

The above commit forcefully adds a further .conf to the name returned by loader::default_entry_file(), which results in loader::load_default_entry() attempting to open a bogus path such as /loader/entries/foo.conf.conf.

As far as I am aware, /loader/loader.conf is only a systemd-boot thing - i.e. it's not part of the Boot Loader Spec, and is only defined in the systemd manual.

According to that documentation, the default option should be handled as a glob pattern, to be matched against (presumably) the filenames in /loader/entries/.

I am happy to attempt a fix if others agree that this is a bug.

Cannot open a file in FAT in specific cases

The firmware couldn't load the EFI binary "/EFI/BOOT/BOOTX64.EFI" if there is a directory entry whose name is like "/EFI-SYSTEM".
This is a real issue; at the least, I found the entry "/EFI-SYSTEM" in the EFI system partition of Fedora CoreOS 35.20211215.3.0 stable.

The root cause of this issue is that the compare_name and compare_short_name functions in fat.rs don't check whether the lengths of the names are the same.

I'm preparing a Pull request to fix it now. After I have created the pull request, I'll link it here.

Page table creation clobbers provided command line

Both cloud-hypervisor and firecracker support passing a command line argument to the image they are launching. The firmware code in bzimage.rs tries to create a command line that consists of:

  • The hypervisor-provided command line, followed by
  • The command line from the on-disk data

However, the provided command line is often at address 0x20000 (cloud-hypervisor example and firecracker example). This causes a problem when running setup_pagetables(), which writes a single PML3 table at 0xa000 but also writes 64 PML2 tables starting at 0xb000; at 4 KiB per table these span 0xb000 through 0x4b000, which covers 0x20000 and overwrites the command line with garbage, causing either:

  • Nothing to be prepended to the command line when it should be
  • Junk to be prepended to the command line

Cargo build command fails with Cargo's stable channel

Is it possible to build the project with Cargo's stable channel?

$ cargo --version
cargo 1.52.0
$ cargo build --release --target target.json -Zbuild-std=core,alloc -Zbuild-std-features=compiler-builtins-mem
error: the `-Z` flag is only accepted on the nightly channel of Cargo, but this is the `stable` channel
See https://doc.rust-lang.org/book/appendix-07-nightly-rust.html for more information about Rust release channels.
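As the error itself says, the -Z flags are nightly-only, so the build currently needs a nightly toolchain. A hedged workaround sketch, assuming rustup manages the toolchains:

$ rustup toolchain install nightly
$ rustup component add rust-src --toolchain nightly
$ cargo +nightly build --release --target target.json -Zbuild-std=core,alloc -Zbuild-std-features=compiler-builtins-mem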

Failed to Boot Mariner Cloud Image

While I was trying to boot a Mariner VM image on the Microsoft Hypervisor, the boot failed. I was able to collect the following log from the emulated serial console.

Setting up 4 GiB identity mapping
Page tables setup

Booting with PVH Boot Protocol
Found PCI device vendor=8086 device=d57 in slot=0
Found PCI device vendor=1af4 device=1043 in slot=1
Found PCI device vendor=1af4 device=1042 in slot=2
Found PCI device vendor=1af4 device=1041 in slot=3
Found PCI device vendor=1af4 device=1044 in slot=4
PCI Device: 0:2.0 1af4:1042
Bar: type=MemorySpace32 address=e7f80000
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=Unused address=0
Virtio block device configured. Capacity: 204800000 sectors
Found EFI partition
Filesystem ready
Error loading default entry: FileError(NotFound)
Using EFI boot.
Found bootloader (BOOTX64.EFI)
Executable loaded
TPM logging failed: Unsupported
Could not create MokListRT: Unsupported
TPM logging failed: Unsupported
Could not create MokListXRT: Unsupported
Something has gone seriously wrong: import_mok_state() failed: Unsupported
TPM logging failed: Unsupported
Reloc 0 block size 0 is invalid
Relocation failed: Unsupported
Failed to load image: Unsupported
start_image() returned Unsupported
PANIC: panicked at 'Unable to boot from any virtio-blk device', src/main.rs:188:5

Image with systemd-boot (bootctl) fails to boot if memory is set to any value bigger than 2G

I created an Arch Linux image that uses the bootctl (systemd-boot) boot loader. The boot entries and defaults are set correctly, and the VM boots perfectly well if the cloud-hypervisor --memory size is anything less than or equal to 2G.

If the memory is higher than 2G, the machine fails with the following logs:

Booting with PVH Boot Protocol
Found PCI device vendor=8086 device=d57 in slot=0
Found PCI device vendor=1af4 device=1042 in slot=1
Found PCI device vendor=1af4 device=1044 in slot=2
PCI Device: 0:1.0 1af4:1042
Bar: type=MemorySpace32 address=e7f80000
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Virtio block device configured. Capacity: 4194304 sectors
Found EFI partition
Filesystem ready
Error loading default entry: BzImageError(NoInitrdMemory)
Using EFI boot.
Found bootloader (BOOTX64.EFI)
Executable loaded
Setting up 4 GiB identity mapping
Page tables setup

The error is Error loading default entry: BzImageError(NoInitrdMemory), and this block of logs keeps repeating forever until I kill the cloud-hypervisor process.

Bootloader

bootctl install
cat /boot/loader/entries/arch.conf 

shows

title Arch Linux
linux /vmlinuz-linux
initrd /initramfs-linux.img
options root="LABEL=cloud-root" rw console=tty1 console=ttyS0 panic=1
cat /boot/loader/loader.conf 

shows

default arch.conf

Command line

sudo cloud-hypervisor --kernel hypervisor-fw   --cpus boot=4 --memory size=4G --console off --serial tty --disk path=image.raw

Ubuntu Xenial does not boot on Cloud Hypervisor

I suspect there is an issue in the EFI compatibility layer.

Booting via PVH Boot Protocol
Setting up 4 GiB identity mapping
Page tables setup
Found PCI device vendor=8086 device=d57 in slot=0
Found PCI device vendor=1af4 device=1042 in slot=1
Found PCI device vendor=1af4 device=1041 in slot=2
Found PCI device vendor=1af4 device=1044 in slot=3
PCI Device: 0:1.0 1af4:1042
Bar: type=MemorySpace32 address=e7f80000
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=Unused address=0
Virtio block device configured. Capacity: 4612096 sectors
Found EFI partition
Filesystem ready
Error loading default entry: FileError(NotFound)
Using EFI boot.
Found bootloader (BOOTX64.EFI)
Executable loaded
TPM logging failed: Unsupported
error: no such device: root.

Booting via PVH Boot Protocol
Setting up 4 GiB identity mapping
Page tables setup
....etc

Question: Is there/Will there be support for UEFI SNP (Simple Network Protocol)?

Hi,
Thank you to all the developers for this amazing project.

I would like to use this project to install a VM on hardware during its early development stage to test applications, and I wish to be able to boot the VM via the network.

Questions:

  • Does the project support UEFI SNP (Simple Network Protocol) for booting via the network?
    Browsing through the code, it seems that it does not support the protocol, but I would just like to make sure.
    • If there is support or ongoing implementation, is there any way that I can contribute to it?
    • If there is no support for it but the project would like to add support, can I fork and do a pull request?
    • Or might it be that the project does not intend to add support for UEFI SNP?

Unable to boot with systemd-boot

I'm trying to add Ubuntu 24.04 to the integration test targets, but the guest fails to find the rootfs:

[    0.174931] /dev/root: Can't open blockdev
[    0.175095] VFS: Cannot open root device "LABEL=cloudimg-rootfs" or unknown-block(0,0): error -6

Ubuntu 24.04 uses systemd-boot, which sets up the rootfs information at startup. It installs some EFI protocols during the boot process, but the current RHF implementation does not support these operations. The boot log says it failed to install the protocols:

install_multiple_protocol_interfaces: 4006c0c1-fcb3-403e-996d-4a6c8724e06d
error: failed to install protocols.

To fix this issue, RHF needs to support EFI protocol installation operations.

Working branch: https://github.com/retrage/rust-hypervisor-firmware/tree/ubuntu-2404-integration-tests

hypervisor-fw doesn't work with RO root disk

When using a readonly root disk with hypervisor-fw. I get the following.

./target/release/cloud-hypervisor  --cpus boot=4 --memory size=2G --kernel ../hypervisor-fw --cmdline "root=/dev/vda1 console=hvc0 console=ttyS0 rw systemd.journald.forward_to_console=1 " --disk path=../focal-server-cloudimg-amd64.raw,readonly=true  --console tty
cloud-hypervisor: 16.17385915s: <_disk0_q0> ERROR:virtio-devices/src/block.rs:221 -- Request failed: Os { code: 9, kind: Uncategorized, message: "Bad file descriptor" }
cloud-hypervisor: 16.173971948s: <_disk0_q0> ERROR:virtio-devices/src/thread_helper.rs:55 -- Error running worker: HandleEvent(Failed to process queue (complete): AsyncRequestFailure)

It looks like hypervisor-fw is trying to write to the disk somehow?

Not sure how much we care about this use case.

unable to load EFI binaries with ImageBase != 0

Trying to load EFI binaries with a non-zero ImageBase field in the PE binary fails with errors like:

cloud-hypervisor: 61.698026ms: <_disk0_q0> ERROR:virtio-devices/src/thread_helper.rs:55 -- Error running worker: HandleEvent(Failed to process queue (submit): RequestExecuting(GetHostAddress(InvalidGuestAddress(GuestAddress(5603131392)))))

An example of such a binary is any UKI using the systemd boot stub compiled from recent versions of systemd; these set the ImageBase field to a pseudo-random value [1]. Forcing this to opt.ImageBase = 0 allows the stub to be loaded by hypervisor-fw.

objdump of working EFI binary:

src/boot/efi/linuxx64.efi.stub:     file format pei-x86-64
src/boot/efi/linuxx64.efi.stub
architecture: i386:x86-64, flags 0x00000103:
HAS_RELOC, EXEC_P, D_PAGED
start address 0x0000000000009135

Characteristics 0x22e
        executable
        line numbers stripped
        symbols stripped
        large address aware
        debugging information removed

Time/Date               Wed Jul 24 21:45:14 2024
Magic                   020b    (PE32+)
MajorLinkerVersion      0
MinorLinkerVersion      0
SizeOfCode              000000000001a8fe
SizeOfInitializedData   000000000000615e
SizeOfUninitializedData 0000000000000000
AddressOfEntryPoint     0000000000009135
BaseOfCode              0000000000001000
ImageBase               0000000000000000
SectionAlignment        00001000
FileAlignment           00000200
MajorOSystemVersion     0
MinorOSystemVersion     0
MajorImageVersion       257
MinorImageVersion       0
MajorSubsystemVersion   1
MinorSubsystemVersion   1
Win32Version            00000000
SizeOfImage             00026000
SizeOfHeaders           00000400
CheckSum                00000000
Subsystem               0000000a        (EFI application)
DllCharacteristics      00000160
                                        HIGH_ENTROPY_VA
                                        DYNAMIC_BASE
                                        NX_COMPAT
SizeOfStackReserve      0000000000100000
SizeOfStackCommit       0000000000001000
SizeOfHeapReserve       0000000000100000
SizeOfHeapCommit        0000000000001000
LoaderFlags             00000000
NumberOfRvaAndSizes     00000010

objdump of broken EFI binary:

src/boot/efi/linuxx64.efi.stub:     file format pei-x86-64
src/boot/efi/linuxx64.efi.stub
architecture: i386:x86-64, flags 0x00000103:
HAS_RELOC, EXEC_P, D_PAGED
start address 0x000000014df99135

Characteristics 0x22e
        executable
        line numbers stripped
        symbols stripped
        large address aware
        debugging information removed

Time/Date               Wed Jul 24 21:37:09 2024
Magic                   020b    (PE32+)
MajorLinkerVersion      0
MinorLinkerVersion      0
SizeOfCode              000000000001a8fe
SizeOfInitializedData   000000000000615e
SizeOfUninitializedData 0000000000000000
AddressOfEntryPoint     0000000000009135
BaseOfCode              0000000000001000
ImageBase               000000014df90000
SectionAlignment        00001000
FileAlignment           00000200
MajorOSystemVersion     0
MinorOSystemVersion     0
MajorImageVersion       257
MinorImageVersion       0
MajorSubsystemVersion   1
MinorSubsystemVersion   1
Win32Version            00000000
SizeOfImage             00026000
SizeOfHeaders           00000400
CheckSum                00000000
Subsystem               0000000a        (EFI application)
DllCharacteristics      00000160
                                        HIGH_ENTROPY_VA
                                        DYNAMIC_BASE
                                        NX_COMPAT
SizeOfStackReserve      0000000000100000
SizeOfStackCommit       0000000000001000
SizeOfHeapReserve       0000000000100000
SizeOfHeapCommit        0000000000001000
LoaderFlags             00000000
NumberOfRvaAndSizes     00000010

Footnotes

  1. https://github.com/systemd/systemd/blob/11d5e2b5fbf9f6bfa5763fd45b56829ad4f0777f/tools/elf2efi.py#L602

Rebooting fails with LA57 and hypervisor-fw

Reboot does not work on a machine with a 5-level page table when Cloud Hypervisor is booted with hypervisor-fw (0.3.1). It does work with direct kernel boot, though.

The guest image version is the Hirsute Hippo cloud image (21.04), including a 5.11 kernel. The kernel used for direct kernel boot is a 5.12 kernel coming from the ch-5.12 branch of cloud-hypervisor/linux.

Windows CI

We now have self-hosted runners, which gives us enough disk space to re-enable Windows CI (using the runner name "garm-jammy-16"). However, since a secret needs to be used to download the image and forks do not have access to secrets, we need to decide whether to go with a merge queue approach (like Cloud Hypervisor) or only test Windows after it has already been pushed to main.

@retrage What would your preference be?

aarch64: EFI get time fails in guest kernel

If efi_rtc is enabled in guest kernel, a warning occurs:

[    0.795685] [Firmware Bug]: Unable to handle paging request in EFI runtime service
[    0.858764] ------------[ cut here ]------------
[    0.860111] WARNING: CPU: 0 PID: 1 at drivers/firmware/efi/runtime-wrappers.c:346 __efi_queue_work+0xe8/0x11c
[    0.862929] Modules linked in:
[    0.863850] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W I        6.5.0-next-20230905+ #228
[    0.866502] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    0.868580] pc : __efi_queue_work+0xe8/0x11c
[    0.869863] lr : __efi_queue_work+0xd4/0x11c
[    0.871143] sp : ffff80008002bad0
[    0.872135] x29: ffff80008002bad0 x28: 0000000000000000 x27: ffff800081945868
[    0.874287] x26: ffff8000818500b0 x25: ffff8000817c48b0 x24: 0000000000000000
[    0.876425] x23: ffff800082425c18 x22: 0000000000000000 x21: ffff80008002bb98
[    0.878544] x20: ffff800082434a40 x19: ffff8000824349e0 x18: 3d464f6d78cd8426
[    0.880673] x17: ffff800034fc4000 x16: ffff800080000000 x15: 00003a1efce9c232
[    0.882805] x14: 0000000000000228 x13: 0000000000000001 x12: 0000000000000001
[    0.884950] x11: 0000000000000000 x10: 0000000000000980 x9 : ffff80008002b920
[    0.887077] x8 : ffff0000018e09e0 x7 : ffff0000b69f5780 x6 : 0000000000000000
[    0.889205] x5 : 00000000430f0af0 x4 : 0000000000f0000f x3 : 0000000000000000
[    0.891337] x2 : 0000000000000000 x1 : 8000000000000015 x0 : 8000000000000015
[    0.893475] Call trace:
[    0.894210]  __efi_queue_work+0xe8/0x11c
[    0.895384]  virt_efi_get_time+0x64/0xb4
[    0.896568]  efi_rtc_probe+0x40/0x170
[    0.897663]  platform_probe+0x68/0xdc
[    0.898756]  really_probe+0x148/0x2ac
[    0.899852]  __driver_probe_device+0x78/0x12c
[    0.901165]  driver_probe_device+0x3c/0x15c
[    0.902416]  __driver_attach+0x94/0x19c
[    0.903560]  bus_for_each_dev+0x74/0xd4
[    0.904714]  driver_attach+0x24/0x30
[    0.905789]  bus_add_driver+0xe4/0x1e8
[    0.906913]  driver_register+0x60/0x128
[    0.908064]  __platform_driver_probe+0x58/0xd0
[    0.909410]  efi_rtc_driver_init+0x24/0x30
[    0.910637]  do_one_initcall+0x6c/0x1b0
[    0.911785]  kernel_init_freeable+0x1b8/0x280
[    0.913096]  kernel_init+0x24/0x1e0
[    0.914140]  ret_from_fork+0x10/0x20
[    0.915211] ---[ end trace 0000000000000000 ]---

And the time is wrong in guest.

Integration tests need to finish quicker

Our integration tests are taking too long. Ideas:

  • Run the coreboot tests in parallel (on a second VM)
  • Work to remove --test-threads=1 to allow the tests to run in parallel

Bionic (EFI booting) fails to boot with master and cloud-hypervisor

After merging #26, the bionic image, which uses EFI to boot, no longer boots. I did a bisect to find the cause, but some of the commits failed to build. Here is what bisect suggests are the possible commits:

rob@rbradford-test:~/rust-hypervisor-firmware$ git bisect visualize
commit e3873908a69622001e35b6e1f85ee8fdb28d9fbf (HEAD, refs/bisect/bad)
Author: Joe Richey <[email protected]>
Date:   Sat Mar 28 19:30:22 2020 -0700

    main: Rewrite boot_from_device
    
    We now pass the boot::Info to `loader::load_default_entry` and
    `efi::efi_exec`. We also reorganize the code in this function to:
      - Avoid unnecessary nesting
      - log errors when they occur
    
    The code is now much more readable
    
    Signed-off-by: Joe Richey <[email protected]>

commit 1ff201b9d5bfa1ccdb911bd718203dee74a5e661 (refs/bisect/skip-1ff201b9d5bfa1ccdb911bd718203dee74a5e661)
Author: Joe Richey <[email protected]>
Date:   Sat Mar 28 18:54:55 2020 -0700

    bzimage: Rewrite Linux Kernel Loading code
    
    This allows Linux to be booted with any boot protocol.
    
    The old code took in the Zeropage passed in via the Linux Kernel Boot
    Protocol, modified it, and passed it into the Linux Kernel. This is not
    the correct way to boot Linux per the documentation:
        https://www.kernel.org/doc/Documentation/x86/boot.txt
    
    This code now correctly:
      - Uses a brand-new Zeropage inside the `Kernel` struct
      - Adds in the E820 map and RSDP pointer from the boot::Info
      - Reads the header from the file and copies it into the Zeropage
      - Loads the kernel and initrd into avalible memory
      - Properly manages the command-line at a fixed memory location
      - Jumps to the appropriate starting address
    
    Signed-off-by: Joe Richey <[email protected]>

commit e2d541041e11dc885ce8619c7cb811b7151e6e03 (refs/bisect/skip-e2d541041e11dc885ce8619c7cb811b7151e6e03)
Author: Joe Richey <[email protected]>
Date:   Sat Mar 28 17:53:35 2020 -0700

    efi: Use Info to setup allocator and EFI tables
    
    This allows efi_exec to work with multiple boot protocols.
    
    Signed-off-by: Joe Richey <[email protected]>

Here is the output

Booting via PVH Boot Protocol
Setting up 4 GiB identity mapping
Page tables setup
Found PCI device vendor=8086 device=d57 in slot=0
Found PCI device vendor=1af4 device=1042 in slot=1
Found PCI device vendor=1af4 device=1044 in slot=2
PCI Device: 0:1.0 1af4:1042
Bar: type=MemorySpace32 address=e7f80000
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=Unused address=0
Virtio block device configured. Capacity: 4612096 sectors
Found EFI partition
Filesystem ready
Error loading default entry: FileError(NotFound)
Using EFI boot.
Found bootloader (BOOTX64.EFI)
Executable loaded

It appears to be failing to jump to run the EFI binary itself.

I also forced it to use Linux boot vs PVH boot and got the same result (by reverting the addition of the PVH note.)

@josephlr would you mind taking a look please?

Explore running tests on the main target

Right now all the tests run on the host. In an ideal world, we would set .cargo/config's build.target to be our custom target and have everything (xbuild, xclippy, xtest, etc...) build for the main target. This is how blog_os does it.

We would have to set up a custom testing framework, however, which could be complex. We would also have to figure out an easy way to run the tests without too much hassle (maybe once #6 is done).

Panic when booting rhel9 guest image

$ target/debug/cloud-hypervisor --kernel ~/src/rust-hypervisor-firmware/target/target/release/hypervisor-fw --disk path=~/workloads/rhel-guest-image-9.0-20220420.0.x86_64.raw path=/tmp/ubuntu-cloudinit.img --serial tty --console off
Setting up 4 GiB identity mapping
Page tables setup

Booting with PVH Boot Protocol
Found PCI device vendor=8086 device=d57 in slot=0
Found PCI device vendor=1af4 device=1042 in slot=1
Found PCI device vendor=1af4 device=1042 in slot=2
Found PCI device vendor=1af4 device=1044 in slot=3
PCI Device: 0:1.0 1af4:1042
Bar: type=MemorySpace32 address=e7f80000
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Virtio block device configured. Capacity: 20971520 sectors
Found EFI partition
Filesystem ready
Error loading default entry: File(NotFound)
Using EFI boot.
Found bootloader (BOOTX64.EFI)
Executable loaded
Failed to set MokListRT: Unsupported
TPM logging failed: Unsupported
Could not create MokListRT: Unsupported
Failed to set MokListXRT: Unsupported
TPM logging failed: Unsupported
Could not create MokListXRT: Unsupported
Something has gone seriously wrong: import_mok_state() failed: Unsupported
TPM logging failed: Unsupported
PANIC: panicked at 'index out of bounds: the len is 16 but the index is 16', src/fat.rs:319:18

riscv64 support

First version chaining off SBI under QEMU

  • PCI BAR size discovery (see #224)
  • Memory range for PCI BARs discovery from FDT (needed to program BARs)
  • PCI BAR allocation & programming (under QEMU BAR addresses start as zero)
  • PCI device early setup (device programming needed)
  • Linker script/toolchain/stubbed functionality
  • MMIO UART
  • Non-EFI loading support
  • EFI support

Unable to boot Ubuntu 22.04 LTS (cloud image) on arm64

How to reproduce

#!/bin/bash

set -euo pipefail

QCOW2_NAME="jammy-server-cloudimg-arm64.img"

if [ ! -e "${QCOW2_NAME}" ]
then
	wget -O "${QCOW2_NAME}" https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-arm64.img
fi

RAW_NAME="jammy-server-cloudimg-arm64.raw"

if [ ! -e "${RAW_NAME}" ]
then
	qemu-img convert -p -f qcow2 -O raw "${QCOW2_NAME}" "${RAW_NAME}"
fi

if [ ! -d "rust-hypervisor-firmware" ]
then
	git clone https://github.com/cloud-hypervisor/rust-hypervisor-firmware.git
fi

FIRMWARE_PATH="rust-hypervisor-firmware/target/aarch64-unknown-none/release/hypervisor-fw"

if [ ! -e "${FIRMWARE_PATH}" ]
then
	pushd rust-hypervisor-firmware
	cargo build --release --target aarch64-unknown-none.json -Zbuild-std=core,alloc -Zbuild-std-features=compiler-builtins-mem
	popd
fi

cloud-hypervisor --serial tty --console pty --kernel "${FIRMWARE_PATH}" --disk path="${RAW_NAME}"

Expected output

The VM boots and shows ubuntu login: .

Actual output

$ ./run.sh
[...]
cloud-hypervisor: 8.063405ms: <vmm> WARN:arch/src/aarch64/fdt.rs:114 -- File: /sys/devices/system/cpu/cpu0/cache/index3/size does not exist.
cloud-hypervisor: 8.138801ms: <vmm> WARN:arch/src/aarch64/fdt.rs:148 -- File: /sys/devices/system/cpu/cpu0/cache/index3/coherency_line_size does not exist.
cloud-hypervisor: 8.187810ms: <vmm> WARN:arch/src/aarch64/fdt.rs:171 -- File: /sys/devices/system/cpu/cpu0/cache/index3/number_of_sets does not exist.
cloud-hypervisor: 8.266878ms: <vmm> WARN:arch/src/aarch64/fdt.rs:428 -- L2 cache shared with other cpus

Booting with FDT
Found PCI device vendor=8086 device=d57 in slot=0
Found PCI device vendor=1af4 device=1043 in slot=1
Found PCI device vendor=1af4 device=1042 in slot=2
Found PCI device vendor=1af4 device=1044 in slot=3
PCI Device: 0:2.0 1af4:1042
Bar: type=MemorySpace32 address=0x2ff80000 size=0x80000
Bar: type=MemorySpace32 address=0x0 size=0x0
Bar: type=MemorySpace32 address=0x0 size=0x0
Bar: type=MemorySpace32 address=0x0 size=0x0
Bar: type=MemorySpace32 address=0x0 size=0x0
Bar: type=MemorySpace32 address=0x0 size=0x0
Updated BARs: type=MemorySpace32 address=2ff80000 size=80000
Updated BARs: type=MemorySpace32 address=0 size=0
Updated BARs: type=MemorySpace32 address=0 size=0
Updated BARs: type=MemorySpace32 address=0 size=0
Updated BARs: type=MemorySpace32 address=0 size=0
Updated BARs: type=MemorySpace32 address=0 size=0
Virtio block device configured. Capacity: 4612096 sectors
Found EFI partition
Filesystem ready
Error loading default entry: File(NotFound)
Using EFI boot.
Found bootloader: \EFI\BOOT\BOOTAA64.EFI
Executable loaded
Failed to set MokListRT: Unsupported
TPM logging failed: Unsupported
Could not create MokListRT: Unsupported
Failed to set MokListXRT: Unsupported
TPM logging failed: Unsupported
Could not create MokListXRT: Unsupported
TPM logging failed: Unsupported
Could not create MokListTrustedRT: Unsupported
Something has gone seriously wrong: import_mok_state() failed: Unsupported
TPM logging failed: Unsupported

The VM hangs and 100% CPU usage by Cloud Hypervisor process can be observed.

Versions tested

Rust Hypervisor Firmware built from main, and:

$ cloud-hypervisor --version
cloud-hypervisor v36.0.0

Hardware used

a1.metal AWS EC2 instance running Debian 12 (arm64).

Notes

EDK2 works just fine.

Related: #198.

cloud-hypervisor fails while using hypervisor-fw

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.1 LTS
Release:        20.04
Codename:       focal

$ uname -r
5.4.0-53-generic

Build with:

$ cargo build --release --all  # commit b399287430801
$ md5sum hypervisor-fw
c14b3949f9062e434bd59037c3e9e5ac  hypervisor-fw

$ md5sum focal-server-cloudimg-amd64.*  # Latest images as of today
4b9ec490b1c8e1585faeb4235d72b162  focal-server-cloudimg-amd64.img
a3abcd22741e1147a043d3e32474291a  focal-server-cloudimg-amd64.raw

$ ../cloud-hypervisor/target/release/cloud-hypervisor \
	--kernel ./hypervisor-fw \
	--disk path=focal-server-cloudimg-amd64.raw \
	--console off \
	--serial tty \
	--cpus boot=4 \
	--memory size=1024M \
	--net "tap=,mac=,ip=,mask=" \
	--rng

Cloud Hypervisor Guest
        API server: /run/user/1000/cloud-hypervisor.3398
        vCPUs: 4
        Memory: 1024 MB
        Kernel: Some(KernelConfig { path: "./hypervisor-fw" })
        Initramfs: None
        Kernel cmdline:
        Disk(s): Some([DiskConfig { path: Some("focal-server-cloudimg-amd64.raw"), readonly: false, direct: false, iommu: false, num_queues: 1, queue_size: 128, vhost_user: false, vhost_socket: None, poll_queue: true, id: None }])

Booting via PVH Boot Protocol
Setting up 4 GiB identity mapping
Page tables setup
Found PCI device vendor=8086 device=d57 in slot=0
Found PCI device vendor=1af4 device=1042 in slot=1
Found PCI device vendor=1af4 device=1041 in slot=2
Found PCI device vendor=1af4 device=1044 in slot=3
PCI Device: 0:1.0 1af4:1042
Bar: type=MemorySpace32 address=e7f80000
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=Unused address=0
Virtio block device configured. Capacity: 4612096 sectors
Found EFI partition
Filesystem ready
Error loading default entry: FileError(NotFound)
Using EFI boot.
Found bootloader (BOOTX64.EFI)
Executable loaded
Failed to set MokListRT: Unsupported
Could not create MokListRT: Unsupported
Something has gone seriously wrong: import_mok_state() failed
: Unsupported
TPM logging failed: Unsupported
error: can't find command `hwmatch'.
exit_boot() failed!
efi_main() failed!

Any pointers on what could be causing this failure? I also installed a custom 5.8 kernel on the host and tried the above, still the same failure.

Stock groovy images don't boot

Enters reset loop:

rob@artemis:~/src/cloud-hypervisor (master)$ target/debug/cloud-hypervisor --kernel ~/src/rust-hypervisor-firmware/target/target/release/hypervisor-fw --disk path=~/workloads/groovy-server-cloudimg-amd64.raw --cpus boot=1 --memory size=512M  --serial tty --console off --api-socket=/tmp/api
Setting up 4 GiB identity mapping
Page tables setup

Booting with PVH Boot Protocol
Found PCI device vendor=8086 device=d57 in slot=0
Found PCI device vendor=1af4 device=1042 in slot=1
Found PCI device vendor=1af4 device=1044 in slot=2
PCI Device: 0:1.0 1af4:1042
Bar: type=MemorySpace32 address=e7f80000
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=Unused address=0
Virtio block device configured. Capacity: 4612096 sectors
Found EFI partition
Filesystem ready
Error loading default entry: FileError(NotFound)
Using EFI boot.
Found bootloader (BOOTX64.EFI)
Executable loaded
TPM logging failed: Unsupported
System BootOrder not found.  Initializing defaults.
Creating boot entry "Boot0000" with label "ubuntu" for file "\EFI\UBUNTU\shimx64.efi"

TPM logging failed: Unsupported
error: can't find command `hwmatch'.
Setting up 4 GiB identity mapping
Page tables setup
...

UEFI: correctly handle SetVirtualAddressMap

Right now, SetVirtualAddressMap in the UEFI compatibility layer just adjusts the virtual_start for the MemoryDescriptors returned by the allocator. However, this isn't what we should be doing (if I'm reading the spec correctly).

From the UEFI Specification 2.8 (Errata A), Section 8.4:

  • SetVirtualAddressMap should only be called exactly once during runtime (i.e. after calling ExitBootServices)
  • As Boot Services cannot be active when this is called, we don't need to modify the allocator at all (it won't be used again).
  • We do need to fixup any RuntimeServices code/data so that it can be called w/ non-identity paging.

The basic idea here would be to have separate ELF sections for EfiRuntimeServicesCode and EfiRuntimeServicesData. This would allow the remaining firmware to be unmapped by the OS. The EfiRuntimeServicesCode would need to be built with "relocation-model": "pic".

On a call to SetVirtualAddressMap, the code would then need to also fixup any pointers in static memory to use the new memory mapping. This can be automated by having the linker emit the necessary relocation entries.

EDK2's Implementation: https://github.com/tianocore/edk2/blob/3806e1fd139775610d8f2e7541a916c3a91ad989/MdeModulePkg/Core/RuntimeDxe/Runtime.c#L232

Reading directory entries via File Protocol of EFI misses some entries when caller's buffer is too small

An EFI application calls the Read function via the File protocol. The Read function of this firmware returns entries sequentially when the given handle is a directory handle.

The Read function is implemented such that it moves the handle's cluster/sector/offset ahead via next_node before it checks whether the given buffer's size is sufficient. As a result, an EFI application cannot retrieve a directory entry when it calls Read with a too-small buffer, gets BUFFER_TOO_SMALL, and then re-calls Read with a large enough buffer.

Remove Linux Boot Protocol Support

Is this something that we want to keep around long-term?

Right now we only support this because CH used to require it, but now both CH and this firmware support PVH. Also, unlike PVH, our final binary doesn't advertise support for the Linux Boot Protocol (i.e. we don't have the right header+magic bits), so it seems strange that we implement it by blindly assuming that anything that uses the main 64-bit ELF entry point expects us to be Linux.

Update copyright headers for @josephlr contributions

Google's preferred policy on this is actually public: https://opensource.google/documentation/reference/patching#license_headers_and_copyright_notices

When contributing to third-party projects, Googlers do not need to add Google's copyright notice when authoring patches to existing files. Googlers should add Google's copyright notice (or a "The Project Authors" style copyright notice) to new files being added to the library if permitted by the project maintainers.

So either:

// SPDX-License-Identifier: Apache-2.0
// Copyright <year> Google LLC.

or

// SPDX-License-Identifier: Apache-2.0
// Copyright <year> The Rust Hypervisor Firmware Authors

would work for any files I created.

Originally posted by @josephlr in #256 (comment)

Support parsing multiple virtio-blk device

At the moment, the code looks for the first device that is a virtio-blk device, and if it cannot find an EFI partition on it, it simply fails.
This is a problem when the first block device is not the actual boot device.
In order to solve this problem, we need the firmware to loop over the entire list of virtio-blk devices until it actually finds one with an EFI partition.

Support more VMMs (QEMU, crosvm, XEN)

Right now it's not possible to run the firmware with any non-Firecracker VMM. Ideally we would support a wide range of VMMs. There are many small changes that would fix this:

  • In the ELF binary, advertise support for a common boot specification. This would allow any VMM supporting that spec to run the firmware. Our options are:

    • Multiboot/Multiboot2: this is an older FSF common boot standard. It would also let us work with QEMU's -kernel option (might also work with Xen).
    • PVH direct boot: This standard started with XEN, but support was added to QEMU recently. To use this, we just need to advertise XEN_ELFNOTE_PHYS32_ENTRY as an ELFNOTE. Not supporting PVH is why running QEMU with -kernel does not work:
      > qemu-system-x86_64 -kernel target/target/release/hypervisor-fw   
      qemu-system-x86_64: Error loading uncompressed kernel without PVH ELF Note
      
    • Note that both of these specs would (probably) require a separate 32-bit entry point. The firmware would then need to setup paging before jumping to the normal 64-bit ELF entry point (i.e. _start).
  • Issue #5: support building the firmware as a flat BIOS binary (like OVMF.fd), allowing it to be directly loaded/executed by any VMM that supports SeaBIOS/OVMF.

Implementing either of these would let the firmware work on QEMU. I'm not sure about how to get it working for crosvm (it should "just work" with any ELF binary). I can ask around at work Monday to see what the deal is.

Boot Loop with RISC-V

The latest RHF/RISC-V enters a boot loop like:

Starting on RV64 0x0 0xbfe00000
Starting on RV64 0xbfe00000 0x0
PANIC: panicked at src/fdt.rs:31:27:
Failed to create device tree object: BadPtr

I found that the issue first appears after commit af0d3c7, which updates the Rust toolchain to nightly-2023-08-11 from nightly-2023-05-26. I also found, by bisecting the Rust nightly toolchain, that the issue appears from nightly-2023-06-07 onwards.

EFI: FileProtocol.open does not work for directories

Per the UEFI spec, calling open on a directory path should work, but right now it will fail, as the code here is only set up to return a file.

The solution to this is probably reworking the filesystem code to better handle working with directories.

Work out how to mitigate "nightly" risks

Nightly doesn't have clippy at the moment, so the CI has currently failed. Although it's possible to pin a particular version of nightly (via rust-toolchain), that sometimes goes wrong as it pulls down the latest rust-src, which might not compile with that compiler.

@joshtriplett do you have any thoughts on how we could move away from nightly or mitigate the risks? A docker container with a known good toolchain already installed?
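One hedged sketch of the rust-toolchain pinning mentioned above (the date below is purely illustrative; a known-good nightly would need to be chosen):

$ echo "nightly-2023-08-11" > rust-toolchain
$ rustup show    # rustup picks up the toolchain pinned for this directory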

Unable to load Windows boot manager EFI binary

Windows has a copy of its bootmgfw.efi as bootx64.efi inside the ESP, so hypervisor-fw should be able to find and load it.

In the attempt to load bootx64.efi, a debug build of hypervisor-fw panicked.

$ ./target/debug/cloud-hypervisor --kernel /tmp/hypervisor-fw --disk path=/home/wei/workloads/windows-server-2019.raw --cpus boot=2,kvm_hyperv=on,max_phys_bits=39 --memory size=4G --serial tty --console off --net tap= --api-socket=/tmp/ch.sock

Cloud Hypervisor Guest
        API server: /tmp/ch.sock
        vCPUs: 2
        Memory: 4096 MB
        Kernel: Some(KernelConfig { path: "/tmp/hypervisor-fw" })
        Initramfs: None
        Kernel cmdline:
        Disk(s): Some([DiskConfig { path: Some("/home/wei/workloads/windows-server-2019.raw"), readonly: false, direct: false, iommu: false, num_queues: 1, queue_size: 128, vhost_user: false, vhost_socket: None, poll_queue: true, id: None }])

Booting with PVH Boot Protocol
Setting up 4 GiB identity mapping
Page tables setup
Found PCI device vendor=8086 device=d57 in slot=0
Found PCI device vendor=1af4 device=1042 in slot=1
Found PCI device vendor=1af4 device=1041 in slot=2
Found PCI device vendor=1af4 device=1044 in slot=3
PCI Device: 0:1.0 1af4:1042
Bar: type=MemorySpace32 address=e7f80000
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=MemorySpace32 address=0
Bar: type=Unused address=0
Virtio block device configured. Capacity: 41943040 sectors
Found EFI partition
Filesystem ready
Error loading default entry: FileError(NotFound)
Using EFI boot.
Found bootloader (BOOTX64.EFI)
PANIC: panicked at 'attempt to subtract with overflow', src/pe.rs:199:71

This can be reproduced by replacing OVMF.fd with hypervisor-fw.

Bump version

The latest release, 0.4.2, was exactly a year ago, and 147 commits have been made since then.

It would be nice to have a new version cut, so people could try the Rust Hypervisor Firmware for aarch64 and all the other features and bug fixes.
