bpfman / bpfman

An eBPF Manager for Linux and Kubernetes

Home Page: https://bpfman.io

License: Apache License 2.0

Rust 61.48% C 1.47% Shell 2.41% Python 0.36% Makefile 1.59% Go 32.27% Dockerfile 0.08% Smarty 0.34%

ebpf kubernetes kubernetes-operator rust

bpfman's Introduction

bpfman: An eBPF Manager

Formerly known as bpfd

Welcome to bpfman

bpfman operates as an eBPF manager, focusing on simplifying the deployment and administration of eBPF programs. Its notable features encompass:

System Overview: Provides insights into how eBPF is utilized in your system.
eBPF Program Loader: Includes a built-in program loader that supports program cooperation for XDP and TC programs, as well as deployment of eBPF programs from OCI images.
eBPF Filesystem Management: Manages the eBPF filesystem, facilitating the deployment of eBPF applications without requiring additional privileges.

Our program loader and eBPF filesystem manager ensure the secure deployment of eBPF applications. Furthermore, bpfman includes a Kubernetes operator, extending these capabilities to Kubernetes. This allows users to confidently deploy eBPF through custom resource definitions across nodes in a cluster.

Here are some links to help in your bpfman journey (all links are from the bpfman website https://bpfman.io/):

Welcome to bpfman for an overview of bpfman.
Quick Start for a quick installation of bpfman without having to download or build the code from source. Good for just getting familiar with bpfman and playing around with it.
Deploying Example eBPF Programs On Local Host for some examples of running bpfman on local host and using the CLI to install eBPF programs on the host.
Deploying Example eBPF Programs On Kubernetes for some examples of deploying eBPF programs through bpfman in a Kubernetes deployment.
Setup and Building bpfman for instructions on setting up your development environment and building bpfman.
Example eBPF Programs for some examples of eBPF programs written in Go, interacting with bpfman.
Deploying the bpfman-operator for details on launching bpfman in a Kubernetes cluster.
Meet the Community for details on community meetings.

License

With the exception of eBPF code, everything is distributed under the terms of the Apache License (version 2.0).

eBPF

All eBPF code is distributed under either:

The terms of the GNU General Public License, Version 2 or the BSD 2 Clause license, at your option.
The terms of the GNU General Public License, Version 2.

The exact license text varies by file. Please see the SPDX-License-Identifier header in each file for details.

Files that originate from the authors of bpfman use (GPL-2.0-only OR BSD-2-Clause) - for example the TC dispatcher or our own example programs.

Files that were originally created in libxdp use GPL-2.0-only.

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this project by you, as defined in the GPL-2 license, shall be dual licensed as above, without any additional terms or conditions.

bpfman's Issues

Implement POC using dispatcher approach for adding tc bpf programs

The intent is to have the same user experience for managing multiple programs regardless of the program type. In particular, programs should be prioritized in the same way and we should support the atomic addition and deletion of programs.

The goal for this issue is to prove out the following design.

Instead of adding tc programs directly to the tc subsystem, the intent is to use a dispatcher in the same way as is currently done for xdp programs.

Atomic replacement will be accomplished as follows:

If none of the added tc programs returns a terminal value, the dispatcher will return a TC_ACT_OK so that no lower priority tc programs get executed.
When a program is added or deleted, we will create a new dispatcher program with the new list of programs as eBPF extension programs.
The new dispatcher program will first be added at a higher priority than the current dispatcher program, and then the old dispatcher will be removed.
A set of tc priority slots will be reserved for bpfd programs. We will need to handle the wrap case in which the current dispatcher is at the highest priority of the reserved priorities. We intend to handle this case by moving the current dispatcher to the lowest priority before the new dispatcher is added.
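
The replacement sequence above can be sketched as a small planning function. This is only an illustration of the proposed design, not bpfd's implementation: the `Step` enum, the `plan_swap` name, and the convention that the numerically lowest value in the reserved range is the highest priority are all assumptions made for the example.

```rust
/// One action in the atomic dispatcher swap (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
enum Step {
    MoveOld { from: u16, to: u16 },
    AddNew { at: u16 },
    RemoveOld { from: u16 },
}

/// Plan the swap for a reserved priority range `first..=last`, where `first`
/// is assumed to be the highest priority slot and `last` the lowest.
fn plan_swap(current: u16, first: u16, last: u16) -> Vec<Step> {
    assert!(first < last && (first..=last).contains(&current));
    let mut steps = Vec::new();
    // Wrap case: the old dispatcher already occupies the highest priority
    // slot, so park it in the lowest priority slot to make room above it.
    let old = if current == first {
        steps.push(Step::MoveOld { from: current, to: last });
        last
    } else {
        current
    };
    // The new dispatcher always goes one slot above the old one.
    steps.push(Step::AddNew { at: old - 1 });
    steps.push(Step::RemoveOld { from: old });
    steps
}
```

With a reserved range of 49 to 52, a dispatcher at 50 is replaced by adding the new one at 49 and removing the old one. A dispatcher already at 49 is first moved to 52, after which the new one is added at 51.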

Expose bpfman API over dbus

It might be desirable to expose the API over DBus. We should investigate if this is possible with https://crates.io/crates/zbus and what features we might not be able to support (not sure how authn would work).

bpfd: Be a well behaved modern daemon

See "New style daemons" in man daemon(7):
https://man7.org/linux/man-pages/man7/daemon.7.html

Hopefully we don't need to be a SysV daemon...

Add a config file for the bpfctl tool

This depends on #8.
Initial use case would be to provide the path to the certificate to avoid having to pass it via flag to every command.

Add Unit tests to bytecode image pulling

Follow up to #102

Simply add some unit tests for the new bpfd bytecode pulling functionality in pull_bytecode.go

Test...

A bad image url
A malformed image (empty or otherwise not following the spec)
A non-public image
etc ...

Interoperability: Attach a C Extension to a Rust Program and vice-versa

Assuming the BTF types are compatible, I'm hopeful that this is possible.

Add config file for bpfd

Initially this should allow setting the XDP mode for an interface

Fix bpfctl configuration

Current layout in /etc:

/etc/bpfctl/
├── bpfctl.toml
└── certs
    ├── bpfctl
    │   ├── bpfctl.key
    │   └── bpfctl.pem
    └── gocounter
        ├── gocounter.key
        └── gocounter.pem

Notes:

1. bpfctl should support local configuration with a higher precedence, e.g. in ~/.config/bpfctl or similar.
2. We use the SubjectCN of certificates as the username as far as bpfd is concerned. This will be used to set ownership of files in /var/run/bpfd/bpffs to ${username}:bpfd, to set the mode to something like 0664 (read/write for the owner and the bpfd group, read-only for everyone else; see #18), and to determine API access. We have 2 choices:
   a. Implement (1) and deprecate using /etc for bpfctl.
   b. Decouple the transport authentication (TLS certs) from the user authentication somehow, i.e. have our own username/apitoken database that we use for authentication. Then you can bootstrap a /etc/bpfd/certs/client/client.pem certificate that any user in the right group can use... but you must also provide an API token.

security: API Authorization (Access Control)

Since #8 the API is restricted to authenticated users.
The process of authentication is to provide $user with a certificate.

Currently the API consists of:

Load
Unload
GetMap

With #36 operations will only be permitted against resources that are owned by the user.
And GetMap is likely to be deprecated as pinned map paths can be sent in the LoadResponse.

That leaves only 2 APIs that would require authorization and it doesn't seem necessary at this point.
However if we decide to add #9 perhaps there is an argument for restricting the scope of calls made to only permit the loading of certain program types.

Return error after all available dispatcher slots are filled

Refine Map API

Since the map fd over UDS API is here to stay, we want to make sure we get a new fd to send to userspace.
That new fd should only allow the map operations that we want... and kernel support for that in unprivileged BPF is landing soon.

Would require some work in Aya to:

Call bpf(BPF_OBJ_GET_INFO_BY_FD) to populate the map info in the Map struct
Add an API to call Map::new_fd(flags), which would do bpf(BPF_MAP_GET_FD_BY_ID) where flags should be an enum that supports ReadOnly and ReadWrite where the former propagates BPF_F_RDONLY and the latter, none.
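
The flags enum described above might look roughly like the following. This is a sketch rather than Aya's API: the `MapAccess` name and the `open_flags` method are hypothetical, and only the `BPF_F_RDONLY` value is taken from the kernel's linux/bpf.h.

```rust
/// Value of BPF_F_RDONLY in the kernel UAPI header linux/bpf.h.
const BPF_F_RDONLY: u32 = 1 << 3;

/// Access mode requested for a new map fd (hypothetical enum name).
#[derive(Debug, Clone, Copy, PartialEq)]
enum MapAccess {
    ReadOnly,
    ReadWrite,
}

impl MapAccess {
    /// Flags to pass as `open_flags` when calling bpf(BPF_MAP_GET_FD_BY_ID).
    fn open_flags(self) -> u32 {
        match self {
            MapAccess::ReadOnly => BPF_F_RDONLY,
            // Read/write is the default, so no flag is propagated.
            MapAccess::ReadWrite => 0,
        }
    }
}
```

Keeping the mapping in one place means a caller of something like Map::new_fd(flags) never handles raw flag bits directly.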

docs: Add some useful documentation

CONTRIBUTING.MD

Explain the Github Workflow
Explain patch formatting with cargo +nightly fmt
Explain patch linting with cargo +nightly clippy

docs/design.md

Explains a little bit of the motivation behind bpfd and how it is designed

docs/quickstart.md

A walkthrough of how to start and configure bpfd
An example of how to deploy the sample programs in-tree
An example of how to manage (list/unload) applications

docs/go-client.md

A walkthrough of the example code example/gocounter that explains how to use the go bindings example/gobpfd

Update gocounter example

I think I removed the bpf code that this used 😅 It might make more sense to use whatever cilium/ebpf recommends or just write the prog in bpf asm directly as it's only small. Was inspired by this program
It now needs a certificate generated - ./scripts/certificates.sh should be updated to allow easy generation of a client cert.

bpfd won't run if "/etc/bpfd/programs.d" doesn't exist

Since the ability to load programs at daemon startup was added, bpfd won't run if "/etc/bpfd/programs.d" doesn't exist:

$ sudo ./target/debug/bpfd
Error: No such file or directory (os error 2)
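
One possible fix is to treat a missing directory as "nothing to load" instead of propagating the error. The sketch below uses only the standard library. The `program_configs` name and the choice to filter on a `.toml` extension are assumptions for illustration, not the actual bpfd code.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// List the `.toml` files in `dir`, treating a missing directory as empty
/// rather than as a fatal startup error.
fn program_configs(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let entries = match fs::read_dir(dir) {
        Ok(entries) => entries,
        // The directory is optional: no directory means no static programs.
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(Vec::new()),
        // Anything else (for example a permission error) is still reported.
        Err(e) => return Err(e),
    };
    let mut files = Vec::new();
    for entry in entries {
        let path = entry?.path();
        if path.extension().and_then(|ext| ext.to_str()) == Some("toml") {
            files.push(path);
        }
    }
    // Sort so that the load order does not depend on directory iteration order.
    files.sort();
    Ok(files)
}
```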

Switch to ifindex as map index instead of interface name

Some interfaces have an altname.

$ ip a
:
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:42:48:36 brd ff:ff:ff:ff:ff:ff
    altname enp0s3
    inet 192.168.122.118/24 brd 192.168.122.255 scope global dynamic noprefixroute ens3
       valid_lft 2686sec preferred_lft 2686sec
    inet6 fe80::5054:ff:fe42:4836/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

If a program is inserted on ens3 and then another is inserted on enp0s3, bpfd panics. Because the map is keyed by interface name, the two names are treated as different interfaces, and the second call panics when it attempts to take the interface. Keying the map by ifindex instead would ensure each interface appears only once and would be less error prone.
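
A minimal sketch of the idea, assuming a hypothetical `Interfaces` type. The name-to-ifindex table here is a stand-in for a real OS lookup, and its values mirror the `ip a` output above, where ens3 and its altname enp0s3 both belong to ifindex 2.

```rust
use std::collections::HashMap;

/// Per-interface state keyed by ifindex instead of by name (hypothetical type).
#[derive(Default)]
struct Interfaces {
    by_index: HashMap<u32, Vec<String>>,
}

impl Interfaces {
    /// Record `program` on the interface with the given ifindex. Callers
    /// resolve a name or altname to its ifindex before calling this.
    fn attach(&mut self, ifindex: u32, program: &str) {
        self.by_index
            .entry(ifindex)
            .or_default()
            .push(program.to_string());
    }
}

/// Load one program via each name and report how many interfaces are tracked.
fn interfaces_after_two_loads() -> usize {
    // Stand-in for the OS lookup: both names refer to the same device.
    let names: HashMap<&str, u32> = HashMap::from([("ens3", 2), ("enp0s3", 2)]);
    let mut ifaces = Interfaces::default();
    ifaces.attach(names["ens3"], "prog-a");
    ifaces.attach(names["enp0s3"], "prog-b");
    ifaces.by_index.len()
}
```

Both loads land on the same map entry, so the second one extends the existing interface state instead of trying to take the interface a second time.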

security: Program ACLs

Programs created by $user should only be allowed to be removed by $user or superuser (i.e. bpfctl)

Deployment on K8s

We should consider (and implement) the best way of deploying bpfd in a k8s environment.

A user would install bpfd-operator which would add a CRD that lets K8s users submit BPF programs to be loaded
The operator would install a new component bpfd-agent via a DaemonSet. bpfd-agent would be a K8s controller that watches for changes in the CRD, and (if applicable) uses bpfd's GRPC API to perform the operations against bpfd on the host.

Open questions:

Is bpfd installed on the host (i.e. via rpm), or is it running in a DaemonSet, or can we support both models?
Can we re-use the TOML config file format described in #30 for this?
Does bpfd somehow need to tag resources that are managed by k8s, i.e. so it can re-compute its state in case of a controller restart/upgrade?

Get XDP Multiprog Working Correctly with C eBPF Programs

Most of the upstream work that was blocking this has now been implemented in Aya

Ability to manipulate .rodata maps
Support for loading BTF into the kernel
Support for BPF_PROG_TYPE_EXT
Ability to load program from a pinned file

We may also need

Ability to load a map from a pinned file. See: aya-rs/aya#75

Consume the grpc API from Go

Generate a client using the protobuf file and load a program from Go.

Logging to systemd journal

According to this article stdout/stderr is automatically captured.
We should use the correct format for logs so that they are picked up at the right levels in the journal...
So either syslog, if /dev/log is supported/available or perhaps using a log file.

UPDATE: looks like https://docs.rs/systemd-journal-logger/0.5.0/systemd_journal_logger/ could solve this for us.
And using a snippet like this we can fall back to using env_logger if not connected to the journal

Switch to using env_logger

Aya is making this change also

security: API Fuzzing

We should use cargo-fuzz to fuzz the API to ensure that we don't panic on invalid input.
Fuzzing should be guided with valid input data for each API.

Implement POC of using bpfd to add individual tc bpf programs

Explore what can become optional in the BPFD GRPC API

From a bpfd standpoint we should:

Build config from BTF (skip that for now)
Apply overrides from image labels (if container)
Apply overrides from the API request

That probably means that a lot of what's in the GRPC api now can actually become optional... and the CLI design would follow.
So --path and --image are the only required items (mutually exclusive)
Everything else would be Option<...>

This issue will track the work to explore how GRPC handles optional fields, and how we can optimize the API and bpfd to reduce "data over the wire" as much as possible.
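
The layering described above (image labels first, then the API request) can be sketched with `Option` fields. The type and field names below are hypothetical and chosen only for illustration. The build-config-from-BTF step is left out, as the issue itself suggests skipping it for now.

```rust
/// Where the bytecode comes from: exactly one of the two, mirroring the
/// mutually exclusive --path and --image options.
#[derive(Debug, Clone, PartialEq)]
enum Source {
    Path(String),
    Image(String),
}

/// Optional settings; `None` means "not specified at this layer".
#[derive(Debug, Clone, Default, PartialEq)]
struct Overrides {
    section_name: Option<String>,
    priority: Option<u32>,
}

impl Overrides {
    /// Values in `higher` win; anything it leaves unset falls back to `self`.
    fn overridden_by(self, higher: Overrides) -> Overrides {
        Overrides {
            section_name: higher.section_name.or(self.section_name),
            priority: higher.priority.or(self.priority),
        }
    }
}

/// Resolve the effective settings: image labels first, then the API request.
fn effective(source: &Source, labels: Overrides, request: Overrides) -> Overrides {
    let base = match source {
        Source::Image(_) => labels,
        // A plain object file carries no labels, so start from nothing.
        Source::Path(_) => Overrides::default(),
    };
    base.overridden_by(request)
}
```

Modelling the source as an enum makes the "mutually exclusive" rule a property of the type rather than a runtime check.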

Use Rust for the dispatcher

This is blocked on having stable BTF from rustc. See aya-rs/bpf-linker#1

Add bpfd-api crate

Remove bpfd/build.rs
Follow https://github.com/hyperium/tonic/tree/master/tonic-build#google-apis-example to add a command to the xtask crate to generate the code, setting out_dir to bpfd-api/src/api_v0.rs and adding a lib.rs appropriately.
Update dev docs on new process for updating protobuf files.

Bonus

Migrate the bash script for updating Go bindings to the same xtask.

Motivation

We can remove the build script setting from rust-analyzer. Builds should be faster.

api: Add extension delete

bpfd: Support reload of config files on SIGHUP

api: Implement Listing of Programs/Extensions

Add config option to specify container registry credentials for pulling bytecode images

Originates from #102 (comment)

In #102 we introduced the ability to extract bytecode from container images.

These images are pulled from remote repositories and currently only public repositories are supported.

We need to add a config option to bpfd.toml to authenticate against private image repositories

something like

[registry]

[registry.docker-io]
username = "foo"
password = "bar"

and then infer the registry from the image path and log in if required.
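
Inferring the registry could follow the convention commonly used for Docker image references: the first path component is a registry host only if it contains a '.' or ':' or is "localhost"; otherwise Docker Hub is assumed. The sketch below is one way to express that rule. The `infer_registry` name is hypothetical, and how a host such as docker.io maps to a config table such as registry.docker-io is left open.

```rust
/// Return the registry host for an image reference. The first path component
/// names a registry only if it looks like a host ('.' or ':' present, or
/// "localhost"); otherwise the image is assumed to live on Docker Hub.
fn infer_registry(image: &str) -> &str {
    match image.split_once('/') {
        Some((first, _))
            if first.contains('.') || first.contains(':') || first == "localhost" =>
        {
            first
        }
        // No slash at all, or a first component that is just a namespace.
        _ => "docker.io",
    }
}
```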

Remove dnf references in README

and/or provide package names for Debian/Ubuntu and other distros

Get XDP MultiProg Working With Rust eBPF Programs

The current issue is "func#0 @0\nSubprog prog0 doesn't exist", which is likely related to the func_info and line_info that we get in .BTF.ext. We may not be handling this correctly in bpf-linker or, more likely, in aya.

grpc: Generate client API bindings for other languages

Targeting Go might be a good first step.
A go client should be able to load a program using this API client.

Program Persistence

Currently, the lifecycle of all BPF programs loaded by bpfd is tied to the lifetime of the bpfd process.
Once bpfd dies, all bpf programs are detached and unloaded from the kernel.

For systemctl restart bpfd, this behaviour isn't desirable.
We can either:

Pin all programs, links etc...

Pro:

eBPF code continues to run unhindered while the daemon restarts

Con:

We need to manage additional code to pin all objects... and to remove the reference from bpffs when we drop them to avoid leaking resources. Not only that, we also need to rebuild our runtime state using the BPF query APIs - some work may be necessary upstream in Aya to make this possible.

Persist state to /var/run/bpfd/$iface/$program.toml

Pro:

Less lifecycle code to maintain as all runtime state can successfully be rebuilt from /var/run

Cons:

eBPF programs will be unloaded/reloaded

I'm not really sure which option is best as of now.

Improve the Map API

I think the best way is probably to have an API like GetMap(iface, id, map_name, path_to_uds).
Since gRPC doesn't have a native API for passing FDs, we could probably clone the map fd and send a copy over a UDS.
Userspace programs would then need an API for creating a map from an FD, but hopefully some existing BPF libs already have this feature

packaging: Investigate rust2rpm

https://docs.fedoraproject.org/en-US/packaging-guidelines/Rust/

We should build RPMs in CI that we can use for testing...

Section name validation

If you pass a bad section name using:

sudo ./target/debug/bpfctl load -p xdp -i wlp2s0 -s "invalid" --priority 50 ~/dev/xdp-tutorial/basic01-xdp-pass/xdp_pass_kern.o

We return an error, but we never clean up the program metadata.
As a result, a subsequent call with a valid section name will fail, since it's considered a "new" extension, and the old failed one will still fail to load.

We should Bpf::load_file(...) and bpf.programs_get() BEFORE we add this to our list of extension programs.
We don't need to actually load the program (although we could, and just not attach it)
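
The validate-before-commit ordering can be illustrated with in-memory stand-ins. `ObjectFile` and `Extensions` below are hypothetical types, not Aya or bpfd types. The point is only that the program list is not modified until the section lookup has succeeded.

```rust
use std::collections::HashSet;

/// Minimal stand-in for a parsed object file: just its section names.
struct ObjectFile {
    sections: HashSet<String>,
}

/// Stand-in for the daemon's list of extension programs.
#[derive(Default)]
struct Extensions {
    loaded: Vec<String>,
}

impl Extensions {
    /// Check that the section exists first, and only touch `loaded` once that
    /// succeeds, so a failed request leaves no stale metadata behind.
    fn add(&mut self, object: &ObjectFile, section: &str) -> Result<(), String> {
        if !object.sections.contains(section) {
            return Err(format!("section '{}' not found", section));
        }
        self.loaded.push(section.to_string());
        Ok(())
    }
}
```

With this ordering, a request for a section named "invalid" returns an error and leaves the list empty, so a later request with a valid section name starts from a clean state.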

Create a test deployment

A simple script to standup a VM, install Fedora 36, install the RPMs from #5 and provide SSH access for testing.
For bonus points we could do with a kernel that's built on net-next with unprivileged BPF enabled.

`unload` the last program and system hangs

Running in an F36 VM, if two programs are added to an interface and then unloaded, I lose access to the VM and need to reboot it.

$ sudo ./target/debug/bpfctl load -p xdp -i ens3 --priority 55 -s "xdp" /home/bmcfall/src/xdp-tutorial/basic01-xdp-pass/xdp_pass_kern.o
f7fe8625-5453-4eca-bb0b-c9130f15d6ea
$ sudo ./target/debug/bpfctl load -p xdp -i ens3 --priority 60 -s "xdp" /home/bmcfall/src/xdp-tutorial/basic01-xdp-pass/xdp_pass_kern.o
909edd07-f023-4dd1-9fee-16b0975bffd6

$ sudo ./target/debug/bpfctl list -i ens3
ens3
xdp_mode: skb

0: f7fe8625-5453-4eca-bb0b-c9130f15d6ea
	name: "xdp"
	priority: 55
	path: /home/bmcfall/src/xdp-tutorial/basic01-xdp-pass/xdp_pass_kern.o
1: 909edd07-f023-4dd1-9fee-16b0975bffd6
	name: "xdp"
	priority: 60
	path: /home/bmcfall/src/xdp-tutorial/basic01-xdp-pass/xdp_pass_kern.o

$ sudo ./target/debug/bpfctl unload -i ens3 f7fe8625-5453-4eca-bb0b-c9130f15d6ea
$ sudo ./target/debug/bpfctl unload -i ens3 909edd07-f023-4dd1-9fee-16b0975bffd6
packet_write_wait: Connection to 192.168.122.118 port 22: Broken pipe

Dispatcher program + some loaded maps are left orphaned

After loading and unloading an XDP program that defines a map, such as the one in #53, the dispatcher program is left attached to the specified interface and the program's map is orphaned.

To recreate the stale map scenario simply load and unload a BPF program
which defines a new map.

An easy way to reproduce this is to check out #53, run the Go example, and
exit it while leaving bpfd running. The stats program is gone, but its map
xdp_stats_map is left orphaned even after the FD is closed.

[astoycos@localhost bpfd]$ sudo bpftool map list name xdp_stats_map
733: percpu_array  name xdp_stats_map  flags 0x0
        key 4B  value 16B  max_entries 5  memlock 4096B
        pids bpfd(644765)

Clearly bpfd still owns or maintains a reference to this map even though the program that
defined it (the xdp counter prog) no longer exists.

#55 is a first attempt at fixing this; however, I don't know if it's the right method.

Use Aya Upstream

Currently depends on aya-rs/aya#282
And the MapFd trait I added...
And various other things.

Multiprog for TC

The dispatcher code is already written but hasn't been tested. We can already attach multiple programs to a TC hook but having deterministic ordering using a dispatcher seems a better option - I think. Either way we should investigate whether a dispatcher is necessary, or whether the existing mechanism in kernel can be tamed.

Support other program types?

While the API is built specifically to handle network (TC/XDP) program types, there is an argument to be made that supporting other programs could be useful.

In which case we'd need to:

Rename our current Load API to LoadXDP and maybe LoadTC depending on #9
Add new APIs for the other supported program types

Expanding the API to support more program types would likely require #37 to reduce the scope of authorized APIs for each user.

What happens if the bytecode is removed AFTER load

Extend API to allow passing of chain call actions

See: https://github.com/xdp-project/xdp-tools/blob/master/lib/libxdp/protocol.org#populating-the-dispatcher-configuration-map

From the command line, we should have something like --proceed-on pass --proceed-on redirect, where the argument can be repeated multiple times.
The default would be to assume --proceed-on pass. Should be a Vec<XdpAction>, where XdpAction is an enum that has a TryFrom<String> implementation - see ProgramType for reference.

The bpfd.proto file will need a new argument added to the Load RPC. I would suggest a repeated arg of type String.

On the server side, we will need some code to convert the Vec<String> into a bitmap (see: libxdp protocol linked above).
The bitmap would be cached in ExtensionProgram for re-use...
And it needs to be populated into the XdpDispatcherConfig
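
A rough sketch of the conversion follows, assuming one bit per XDP return code (bit N set means the chain continues when a program returns N). The numeric values follow the kernel's `enum xdp_action`. The `proceed_on_bitmap` function name and the accepted spellings of each action are assumptions for this example.

```rust
use std::convert::TryFrom;

/// XDP return codes, numbered as in the kernel's `enum xdp_action`.
#[derive(Debug, Clone, Copy, PartialEq)]
enum XdpAction {
    Aborted = 0,
    Drop = 1,
    Pass = 2,
    Tx = 3,
    Redirect = 4,
}

impl TryFrom<String> for XdpAction {
    type Error = String;

    fn try_from(value: String) -> Result<Self, Self::Error> {
        match value.to_lowercase().as_str() {
            "aborted" => Ok(XdpAction::Aborted),
            "drop" => Ok(XdpAction::Drop),
            "pass" => Ok(XdpAction::Pass),
            "tx" => Ok(XdpAction::Tx),
            "redirect" => Ok(XdpAction::Redirect),
            other => Err(format!("unknown XDP action: {}", other)),
        }
    }
}

/// Convert the repeated `--proceed-on` values into a bitmap. An empty list
/// falls back to the default of proceeding on `pass`.
fn proceed_on_bitmap(args: Vec<String>) -> Result<u32, String> {
    if args.is_empty() {
        return Ok(1u32 << (XdpAction::Pass as u32));
    }
    let mut bitmap = 0u32;
    for arg in args {
        let action = XdpAction::try_from(arg)?;
        bitmap |= 1u32 << (action as u32);
    }
    Ok(bitmap)
}
```

For example, --proceed-on pass --proceed-on redirect sets bits 2 and 4, and an unknown action name is rejected before anything is stored.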

Loading from `/etc/bpfd/programs.d`

We should support loading from /etc/bpfd/programs.d/foo.toml.
On start, the daemon would parse files from this directory and load them to the specified interface.
The configuration would look something like this:

[program]
interface = "eth0"
path = "/opt/bin/myapp/lib/myebpf.o"
section_name = "firewall"
program_type = "xdp"
priority = 50

Using OCI images to ship eBPF bytecode

It should be relatively easy to do using oci-rs.

Define the rules that images should conform to. At the minimum we'll need an annotation/label to identify this as eBPF.
We may want to add optional labels/annotations to include API defaults - like program_section etc.. - but we would also need to document how these can be overridden.
Create an image repository: /var/lib/bpfd/images
Add the ability for bpfd to pull an image from a registry and extract the bytecode from the rootfs.
As a bonus we might want to consider integration with cosign to prevent the load of unsigned images.