grafana / beyla

eBPF-based autoinstrumentation of web applications and network metrics

Home Page: https://grafana.com/oss/beyla-ebpf/

License: Apache License 2.0

Dockerfile 0.18% Makefile 0.14% C 84.55% Go 14.45% Shell 0.04% Jsonnet 0.03% Python 0.03% Ruby 0.27% Rust 0.17% Java 0.08% JavaScript 0.05% HTML 0.01% C# 0.01% Smarty 0.03%
ebpf metrics-gathering observability traces

beyla's Introduction

Grafana Beyla logo

Grafana Beyla

Open source zero-code automatic instrumentation with eBPF and OpenTelemetry.

Build Status

🟢 We are hiring! 🟢 If you want to become a Beyla engineer, find our job post here.

Introduction

Beyla is a vendor agnostic, eBPF-based, OpenTelemetry/Prometheus application auto-instrumentation tool, which lets you easily get started with Application Observability. eBPF is used to automatically inspect application executables and the OS networking layer, allowing us to capture essential application observability events for HTTP/S and gRPC services. From these captured eBPF events, we produce OpenTelemetry web transaction trace spans and Rate-Errors-Duration (RED) metrics. As with most eBPF tools, all data capture and instrumentation occurs without any modifications to your application code or configuration.

Community

To engage with the Beyla community and chat with us on our community Slack channel, invite yourself to the Grafana Slack at https://slack.grafana.com/ and join the #beyla channel.

We also run a monthly Beyla community call, on the second Wednesday of the month at 4pm UTC. You can find all of the details about our community call on the Grafana Community Calendar.

Getting Started

To try out Beyla, you need to run a network service for Beyla to instrument. Beyla supports a wide range of programming languages (Go, Java, .NET, NodeJS, Python, Ruby, Rust, etc.), so if you already have an example service you can use it. If you don't have an example, you can download and run example-http-service.go from the examples/ directory:

curl -OL https://raw.githubusercontent.com/grafana/beyla/main/examples/example-http-service/example-http-service.go
go run ./example-http-service.go
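
If you would rather write the example service yourself, a minimal Go HTTP server along these lines is enough for Beyla to instrument (a sketch; the actual example-http-service.go in the repository may differ):

package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	// Respond to every request so the generated traffic produces RED metrics.
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "hello from the example service")
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}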

Next, generate some traffic. The following command will trigger a GET request to http://localhost:8080 every two seconds.

watch curl -s http://localhost:8080

Now that we have an example running, we are ready to download and run Beyla.

First, download and unpack the latest release from the GitHub releases page. The release should contain the ./beyla executable.

Beyla supports multiple ways to find the service to be instrumented (by network port, executable name, process ID), and multiple exposition formats (Prometheus, OpenTelemetry metrics, Distributed Traces for Go, Single Span traces for other languages).

To get started, we'll tell Beyla to instrument the service running on port 8080 (our example service) and expose the metrics in Prometheus format on port 9400.

export BEYLA_PROMETHEUS_PORT=9400
export BEYLA_OPEN_PORT=8080
sudo -E ./beyla

Now, you should see metrics on http://localhost:9400/metrics.

See Documentation and the tutorials for more info.

Requirements

  • Linux with Kernel 5.8 or higher with BTF enabled. BTF is enabled by default on most Linux distributions with kernel 5.14 or higher. You can check whether your kernel has BTF enabled by verifying that /sys/kernel/btf/vmlinux exists on your system. If you need to recompile your kernel to enable BTF, set the configuration option CONFIG_DEBUG_INFO_BTF=y.
  • eBPF enabled in the host
  • For instrumenting Go programs, they must have been compiled with at least Go 1.17. We currently support Go applications built with a major Go version that is at most three releases behind the current stable major release.
  • Administrative access to execute the instrumenter
    • Alternatively, run it as a user with the SYS_ADMIN capability enabled. This might not work in some container environments.
Library                  | Working
-------------------------|--------
Kernel-level HTTP calls  | ✅
OpenSSL library          | ✅
Standard Go net/http     | ✅
Gorilla Mux              | ✅
Gin                      | ✅
gRPC-Go                  | ✅

Kubernetes

You can deploy Beyla in Kubernetes by applying the descriptors in the deployments/ folder.

  1. Provide your Grafana credentials. Use the following K8s Secret template to fill in the endpoints, usernames and API keys for Mimir and Tempo:

    $ cp deployments/01-grafana-credentials.template.yml 01-grafana-credentials.yml
    $ # EDIT the fields
    $ vim 01-grafana-credentials.yml
    $ kubectl apply -f 01-grafana-credentials.yml 
    
  2. Deploy the Grafana Agent:

    kubectl apply -f deployments/02-grafana-agent.yml
    
  3. Deploy a demo app with the auto-instrumenter as a sidecar. You can use the blog example in the deployments/03-instrumented-app.yml file.

    $ kubectl apply -f ./deployments/03-instrumented-app.yml
    $ kubectl port-forward service/goblog 8443:8443
    

You should be able to query traces and metrics from your Grafana dashboards.

Development recipes

How to regenerate the eBPF Kernel binaries

The eBPF program is embedded into the pkg/internal/ebpf/bpf_* generated files. This step is generally not needed unless you change the C code in the bpf folder.

If you have Docker installed, you just need to run:

make docker-generate

If you can't install Docker, install the following required packages locally:

dnf install -y kernel-devel make llvm clang glibc-devel.i686
make generate

Tested on Fedora 35 and 38, and on Red Hat Enterprise Linux 8.

Credits

Part of the code is taken from: https://github.com/open-telemetry/opentelemetry-go-instrumentation

beyla's People

Contributors

arramos84, baptisteroseau, biubiubiuboomboomboom, dashpole, esara, fridgepoet, fstab, georgesouzafarias, github-actions[bot], gouthamve, grafanabot, grafsean, grcevski, jdbaldry, jjo, jtheory, lwangrabbit, marctc, marevers, mariomac, mattfrick, msvechla, myhro, nicolevanderhoeven, obito1903, richih, sadikkuzu, seamusgrafana, vakalapa, vgnanasekaran


beyla's Issues

Get struct field offsets from DWARF info

Instead of hardcoding the field offsets (e.g. the HTTP request path), get them at runtime from DWARF information.

Technically this is possible. For example, if you run:

objdump -Wi executable

Then you will find something like:

 <1><9bf94>: Abbrev Number: 39 (DW_TAG_structure_type)
    <9bf95>   DW_AT_name        : net/http.Request
    <9bfa6>   DW_AT_byte_size   : 248
    <9bfa8>   Unknown AT value: 2900: 25
    <9bfa9>   Unknown AT value: 2904: 0xac940
 <2><9bfb1>: Abbrev Number: 24 (DW_TAG_member)
    <9bfb2>   DW_AT_name        : Method
    <9bfb9>   DW_AT_data_member_location: 0
    <9bfba>   DW_AT_type        : <0x65741>
    <9bfbe>   Unknown AT value: 2903: 0
 <2><9bfbf>: Abbrev Number: 24 (DW_TAG_member)
    <9bfc0>   DW_AT_name        : URL
    <9bfc4>   DW_AT_data_member_location: 16
    <9bfc5>   DW_AT_type        : <0x8a4bb>
    <9bfc9>   Unknown AT value: 2903: 0
 <2><9bfca>: Abbrev Number: 24 (DW_TAG_member)
    <9bfcb>   DW_AT_name        : Proto
    <9bfd1>   DW_AT_data_member_location: 24
    <9bfd2>   DW_AT_type        : <0x65741>
    <9bfd6>   Unknown AT value: 2903: 0
 <2><9bfd7>: Abbrev Number: 24 (DW_TAG_member)
    <9bfd8>   DW_AT_name        : ProtoMajor
    <9bfe3>   DW_AT_data_member_location: 40
    <9bfe4>   DW_AT_type        : <0x65731>
    <9bfe8>   Unknown AT value: 2903: 0
 <2><9bfe9>: Abbrev Number: 24 (DW_TAG_member)
    <9bfea>   DW_AT_name        : ProtoMinor
    <9bff5>   DW_AT_data_member_location: 48
    <9bff6>   DW_AT_type        : <0x65731>
    <9bffa>   Unknown AT value: 2903: 0
 <2><9bffb>: Abbrev Number: 24 (DW_TAG_member)
    <9bffc>   DW_AT_name        : Header
    <9c003>   DW_AT_data_member_location: 56
    <9c004>   DW_AT_type        : <0x98c7d>
    <9c008>   Unknown AT value: 2903: 0
 <2><9c009>: Abbrev Number: 24 (DW_TAG_member)
    <9c00a>   DW_AT_name        : Body
    <9c00f>   DW_AT_data_member_location: 64
    <9c010>   DW_AT_type        : <0x95a2d>
    <9c014>   Unknown AT value: 2903: 0
 <2><9c015>: Abbrev Number: 24 (DW_TAG_member)
    <9c016>   DW_AT_name        : GetBody
    <9c01e>   DW_AT_data_member_location: 80
    <9c01f>   DW_AT_type        : <0x9c158>
    <9c023>   Unknown AT value: 2903: 0
 <2><9c024>: Abbrev Number: 24 (DW_TAG_member)
    <9c025>   DW_AT_name        : ContentLength
    <9c033>   DW_AT_data_member_location: 88
    <9c034>   DW_AT_type        : <0x6606b>
    <9c038>   Unknown AT value: 2903: 0
 <2><9c039>: Abbrev Number: 24 (DW_TAG_member)
    <9c03a>   DW_AT_name        : TransferEncoding
    <9c04b>   DW_AT_data_member_location: 96
    <9c04c>   DW_AT_type        : <0x69294>
    <9c050>   Unknown AT value: 2903: 0
 <2><9c051>: Abbrev Number: 24 (DW_TAG_member)
    <9c052>   DW_AT_name        : Close
    <9c058>   DW_AT_data_member_location: 120
    <9c059>   DW_AT_type        : <0x65091>
    <9c05d>   Unknown AT value: 2903: 0
 <2><9c05e>: Abbrev Number: 24 (DW_TAG_member)
    <9c05f>   DW_AT_name        : Host
    <9c064>   DW_AT_data_member_location: 128
    <9c066>   DW_AT_type        : <0x65741>
    <9c06a>   Unknown AT value: 2903: 0
 <2><9c06b>: Abbrev Number: 24 (DW_TAG_member)
    <9c06c>   DW_AT_name        : Form
    <9c071>   DW_AT_data_member_location: 144
    <9c073>   DW_AT_type        : <0x8a5a5>
    <9c077>   Unknown AT value: 2903: 0
 <2><9c078>: Abbrev Number: 24 (DW_TAG_member)
    <9c079>   DW_AT_name        : PostForm
    <9c082>   DW_AT_data_member_location: 152
    <9c084>   DW_AT_type        : <0x8a5a5>
    <9c088>   Unknown AT value: 2903: 0
 <2><9c089>: Abbrev Number: 24 (DW_TAG_member)
    <9c08a>   DW_AT_name        : MultipartForm
    <9c098>   DW_AT_data_member_location: 160
    <9c09a>   DW_AT_type        : <0x94646>
    <9c09e>   Unknown AT value: 2903: 0
 <2><9c09f>: Abbrev Number: 24 (DW_TAG_member)
    <9c0a0>   DW_AT_name        : Trailer
    <9c0a8>   DW_AT_data_member_location: 168
    <9c0aa>   DW_AT_type        : <0x98c7d>
    <9c0ae>   Unknown AT value: 2903: 0
 <2><9c0af>: Abbrev Number: 24 (DW_TAG_member)
    <9c0b0>   DW_AT_name        : RemoteAddr
    <9c0bb>   DW_AT_data_member_location: 176
    <9c0bd>   DW_AT_type        : <0x65741>
    <9c0c1>   Unknown AT value: 2903: 0
 <2><9c0c2>: Abbrev Number: 24 (DW_TAG_member)
    <9c0c3>   DW_AT_name        : RequestURI
    <9c0ce>   DW_AT_data_member_location: 192
    <9c0d0>   DW_AT_type        : <0x65741>
    <9c0d4>   Unknown AT value: 2903: 0
 <2><9c0d5>: Abbrev Number: 24 (DW_TAG_member)
    <9c0d6>   DW_AT_name        : TLS
    <9c0da>   DW_AT_data_member_location: 208
    <9c0dc>   DW_AT_type        : <0x8e6f5>
    <9c0e0>   Unknown AT value: 2903: 0
 <2><9c0e1>: Abbrev Number: 24 (DW_TAG_member)
    <9c0e2>   DW_AT_name        : Cancel
    <9c0e9>   DW_AT_data_member_location: 216
    <9c0eb>   DW_AT_type        : <0x7d89a>
    <9c0ef>   Unknown AT value: 2903: 0
 <2><9c0f0>: Abbrev Number: 24 (DW_TAG_member)
    <9c0f1>   DW_AT_name        : Response
    <9c0fa>   DW_AT_data_member_location: 224
    <9c0fc>   DW_AT_type        : <0x9c198>
    <9c100>   Unknown AT value: 2903: 0
 <2><9c101>: Abbrev Number: 24 (DW_TAG_member)
    <9c102>   DW_AT_name        : ctx
    <9c106>   DW_AT_data_member_location: 232
    <9c108>   DW_AT_type        : <0x7d950>
    <9c10c>   Unknown AT value: 2903: 0
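
The same information can be read programmatically with Go's debug/dwarf package. A minimal sketch (assuming the target binary still carries its DWARF info, i.e. was not built with -ldflags="-w"; error handling simplified) that prints the member offsets of net/http.Request:

package main

import (
	"debug/dwarf"
	"debug/elf"
	"fmt"
	"log"
	"os"
)

func main() {
	f, err := elf.Open(os.Args[1])
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	dw, err := f.DWARF()
	if err != nil {
		log.Fatal(err) // e.g. the binary was stripped of debug info
	}

	r := dw.Reader()
	for {
		e, err := r.Next()
		if err != nil || e == nil {
			break
		}
		if e.Tag != dwarf.TagStructType {
			continue
		}
		if name, _ := e.Val(dwarf.AttrName).(string); name != "net/http.Request" {
			continue
		}
		// The children of the struct entry are its members; a Tag 0 entry ends them.
		for {
			m, err := r.Next()
			if err != nil || m == nil || m.Tag == 0 {
				break
			}
			if m.Tag != dwarf.TagMember {
				continue
			}
			fieldName, _ := m.Val(dwarf.AttrName).(string)
			offset, _ := m.Val(dwarf.AttrDataMemberLoc).(int64)
			fmt.Printf("%-20s offset %d\n", fieldName, offset)
		}
		return
	}
}

Run it as go run . ./executable; the printed offsets should match the DW_AT_data_member_location values in the objdump output above.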

Investigate adding a few kprobes for more accurate request time for Go

Right now we don't fully capture the time a request takes from its start for Go. I have some use cases where the Go instrumentation reports timings similar to manual instrumentation, whereas the kprobes socket filter does capture the full time. I think that by enabling a separate type of kprobes for the Go instrumentation we'll be able to get similar timings.

Traces tests fail with keepalive off

Running our integration-test HTTP client without keepalive (keepalive is enabled by default in Go) causes a few of the Traces tests to fail.

To make this happen I ran the test suite with the following addition to the Transport configuration for the HTTP client:

DisableKeepAlives: true,
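
For context, a minimal sketch of a client configured that way (the target URL is illustrative), forcing a new TCP connection per request instead of Go's default keepalive:

package main

import (
	"io"
	"log"
	"net/http"
)

func main() {
	client := &http.Client{
		Transport: &http.Transport{
			DisableKeepAlives: true, // new connection per request
		},
	}
	resp, err := client.Get("http://localhost:8080/ping")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	io.Copy(io.Discard, resp.Body)
	log.Println("status:", resp.Status)
}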

Prepare BPF code for CPU architecture portability

The current C code has some hacks that only work for x86, e.g. assuming little endian or using specific CPU registers. Move all of this to a separate .h file and prepare it to support other architectures, e.g. ARM64.

Instrument multiple executables at once

Currently, a single agent instance only instruments a single executable.

Instead of providing a single executable name, provide a list of regular expressions that can match multiple processes on the system, and instrument all of them from the same instrumenter instance.
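
A minimal sketch of the matching logic (the configuration shape and example patterns are hypothetical; only the regular-expression handling is the point here):

package main

import (
	"fmt"
	"regexp"
)

// compileMatchers turns the user-provided patterns into compiled regexps.
func compileMatchers(patterns []string) ([]*regexp.Regexp, error) {
	matchers := make([]*regexp.Regexp, 0, len(patterns))
	for _, p := range patterns {
		re, err := regexp.Compile(p)
		if err != nil {
			return nil, fmt.Errorf("invalid executable pattern %q: %w", p, err)
		}
		matchers = append(matchers, re)
	}
	return matchers, nil
}

// shouldInstrument reports whether an executable path matches any pattern.
func shouldInstrument(exePath string, matchers []*regexp.Regexp) bool {
	for _, re := range matchers {
		if re.MatchString(exePath) {
			return true
		}
	}
	return false
}

func main() {
	matchers, err := compileMatchers([]string{`greeting$`, `testserver`})
	if err != nil {
		panic(err)
	}
	fmt.Println(shouldInstrument("/usr/local/bin/greeting", matchers)) // true
	fmt.Println(shouldInstrument("/usr/bin/bash", matchers))          // false
}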

Current spans map: consider using RemoteAddr field

During the aggregation of HTTP call spans, the eBPF code stores ongoing spans by using the request context pointer as key (private field ctx in type http.Request).

While this is performant (we only have to store a pointer as the map key), it is an implementation detail that might break if the http.Request implementation changes.

We could consider using something more stable, such as the public field http.Request.RemoteAddr. However this might involve some extra processing because this field is a variable-length string.

Provide grafana public repository for test images

Java integration tests are failing on ARM64 hosts when trying to search by executable name because, instead of greeting, the process is named something like: {greeting} /usr/bin/qemu-x86_64 /greeting /greeting.

The current testing image is personal, so we should create a grafana/ebpf-autoinstrumenter-tests Hub entry with multi-arch images.

CI: automatically push image to Grafana DockerHub organization

For containerized demos, we are currently using my personal Docker account (mariomac/ebpf-autoinstrument:latest), which is updated and pushed manually.

We need to create a GitHub action that pushes a new image to the Grafana Docker organization on each merge into the main branch.

Run with least permissions in Linux

Basically, investigate and document which capabilities are needed to run the executable without requiring full privileged mode. Provide some examples for e.g. Docker and Kubernetes deployments.

Support executables with both Gin and net/http

The way the instrumenter currently works, it searches either for net/http.HandlerFunc.ServeHTTP or github.com/gin-gonic/gin.(*Engine).ServeHTTP, and then adds an instrumentation point to whichever function it finds last, but not to both.

This could prevent us from instrumenting customers that are progressively migrating away from Gin to another framework.

Report timeout errors

Timeouts aren't properly reported, probably because the ServeHTTP handler does not finish by the usual means and the trace is never completed.

Report extra metadata

From other services in the App O11y plugin, I see that they report many resource attributes that we don't, and these could be useful, e.g. to differentiate between instances:

  • host.name
  • os.description
  • os.type
  • process.command_args
  • process.executable.name
  • process.executable.path
  • process.owner
  • process.pid
  • process.runtime.description (e.g. go version go1.19.2 linux/amd64)
  • process.runtime.name
  • service.instance.id
  • service.namespace
  • telemetry.sdk.language
  • telemetry.sdk.name
  • telemetry.sdk.version

If we detect that the process runs in a container:

  • container.id

If we detect that the process runs in K8s:

  • k8s.node.name
  • k8s.pod.name
  • k8s.namespace.name

For Kubernetes, the easiest way would be to allow setting these values via environment variables and then use the valueFrom clause in the deployment descriptors.
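
As a rough sketch of that idea (the environment variable names and the use of the OpenTelemetry Go SDK here are illustrative, not Beyla's actual implementation):

package main

import (
	"fmt"
	"os"

	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/sdk/resource"
)

// resourceFromEnv builds OTel resource attributes from environment variables
// that a Kubernetes deployment could populate through the valueFrom clause.
// The variable names below are hypothetical.
func resourceFromEnv() *resource.Resource {
	var attrs []attribute.KeyValue
	for env, key := range map[string]string{
		"BEYLA_K8S_NODE_NAME": "k8s.node.name",
		"BEYLA_K8S_POD_NAME":  "k8s.pod.name",
		"BEYLA_K8S_NAMESPACE": "k8s.namespace.name",
	} {
		if v := os.Getenv(env); v != "" {
			attrs = append(attrs, attribute.String(key, v))
		}
	}
	return resource.NewSchemaless(attrs...)
}

func main() {
	fmt.Println(resourceFromEnv())
}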

Filter http.routes

To minimize traffic, allow customers to report only routes matching a given pattern.

This task can reuse the pattern definitions from issue #6

Allow users defining http.route

Currently, metrics and traces report the http.target attribute as something like:

/users/123/article/456

We should let users define regular expressions so they can provide http.route according to the semantic conventions document.

E.g.

target: /users/\d+/article/\d+
route: /users/{:userId}/article/{:articleId}
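
A minimal sketch of how such a user-provided mapping could be applied (the configuration shape is hypothetical; the pattern and route come from the example above):

package main

import (
	"fmt"
	"regexp"
)

// routeMatcher maps an http.target pattern to the http.route to report.
type routeMatcher struct {
	pattern *regexp.Regexp
	route   string
}

// httpRoute returns the configured route for a target, or "" if none matches.
func httpRoute(target string, matchers []routeMatcher) string {
	for _, m := range matchers {
		if m.pattern.MatchString(target) {
			return m.route
		}
	}
	return ""
}

func main() {
	matchers := []routeMatcher{{
		pattern: regexp.MustCompile(`^/users/\d+/article/\d+$`),
		route:   "/users/{:userId}/article/{:articleId}",
	}}
	// Prints /users/{:userId}/article/{:articleId}
	fmt.Println(httpRoute("/users/123/article/456", matchers))
}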

Provide integration tests

In e.g. a docker-compose file, deploy:

  • a simple http service (e.g. ping server)
  • http-autoinstrumenter
  • grafana-agent
  • open source tempo/mimir containers
    (optionally replace grafana agent+tempo+mimir by an OTEL traces+metrics collector)

On each testing scenario, play with the test service and verify that the data queried from tempo/mimir contains the generated traces/metrics.

Auto-regenerate offsets periodically

The make update-offsets task should be run each time the Go version, or any of the libraries we instrument, is updated.

We should create a periodic GitHub action that runs make update-offsets and submits a pull request each time there are new changes in the destination file.

Performance evaluation (and possible optimization?)

In projects previously using a similar architecture, 200K events/second consumed 0.5 CPUs. The impact was caused mostly by the userspace process having to wake up on each message in the ringbuffer. In the case of the linked project, moving from a ringbuffer to a hashmap allowed aggregating millions of events/second with 0.1% CPU.

In the case of HTTP/GRPC servers, it seems unlikely that a single host will need to process so many events per second, so our current implementation should be fine. However, we could create performance tests to see how many resources we consume in typical high-load scenarios.

If we decide to optimize performance, I would recommend addressing the userspace wakeups on each HTTP request message.

We can configure the ringbuffer to accumulate messages and send them to user space in batches of X (user configurable). If, after a user-configured timeout, the batch size hasn't reached X, we send the batch anyway. This will decrease the number of userspace wakeups by orders of magnitude.
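
In userspace Go terms, the batching policy would look roughly like this (a sketch of the policy only; the actual change would happen around the eBPF ringbuffer handling, and the event type, batch size and timeout are illustrative):

package main

import (
	"fmt"
	"time"
)

type event struct{ durationMs int }

// batchEvents forwards events from in to out in slices of up to batchSize,
// flushing a partial batch after flushTimeout so events don't get stuck.
func batchEvents(in <-chan event, out chan<- []event, batchSize int, flushTimeout time.Duration) {
	batch := make([]event, 0, batchSize)
	timer := time.NewTimer(flushTimeout)
	defer timer.Stop()
	flush := func() {
		if len(batch) > 0 {
			out <- batch
			batch = make([]event, 0, batchSize)
		}
		timer.Reset(flushTimeout)
	}
	for {
		select {
		case ev, ok := <-in:
			if !ok {
				flush()
				close(out)
				return
			}
			batch = append(batch, ev)
			if len(batch) == batchSize {
				flush()
			}
		case <-timer.C:
			flush()
		}
	}
}

func main() {
	in := make(chan event)
	out := make(chan []event)
	go batchEvents(in, out, 10, 100*time.Millisecond)
	go func() {
		for i := 0; i < 25; i++ {
			in <- event{durationMs: i}
		}
		close(in)
	}()
	for b := range out {
		fmt.Println("flushed batch of", len(b), "events")
	}
}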

GRPC integration tests are flaky

From time to time, integration tests fail with the following message:

?   	github.com/grafana/ebpf-autoinstrument/test/integration/components/testserver/std	[no test files]
2023/04/19 12:37:40 INFO Getting feature for point lat=409146138 long=-746188906
2023/04/19 12:37:40 name:"Berkshire Valley Management Area Trail, Jefferson, NJ, USA" location:{latitude:409146138 longitude:-746188906}
2023/04/19 12:37:40 INFO Getting feature for point lat=409146138 long=-746188906
2023/04/19 12:37:40 name:"Berkshire Valley Management Area Trail, Jefferson, NJ, USA" location:{latitude:409146138 longitude:-746188906}
2023/04/19 12:37:40 INFO Getting feature for point lat=409146138 long=-746188906
2023/04/19 12:37:40 name:"Berkshire Valley Management Area Trail, Jefferson, NJ, USA" location:{latitude:409146138 longitude:-746188906}
2023/04/19 12:38:40 INFO Getting feature for point lat=409146138 long=-746188906
2023/04/19 12:38:40 name:"Berkshire Valley Management Area Trail, Jefferson, NJ, USA" location:{latitude:409146138 longitude:-746188906}
2023/04/19 12:38:40 INFO Getting feature for point lat=409146138 long=-746188906
2023/04/19 12:38:40 name:"Berkshire Valley Management Area Trail, Jefferson, NJ, USA" location:{latitude:409146138 longitude:-746188906}
2023/04/19 12:38:40 INFO Getting feature for point lat=409146138 long=-746188906
2023/04/19 12:38:40 name:"Berkshire Valley Management Area Trail, Jefferson, NJ, USA" location:{latitude:409146138 longitude:-746188906}
--- FAIL: TestSuite_NoDebugInfo (59.90s)
    --- FAIL: TestSuite_NoDebugInfo/GRPC_RED_metrics (5.01s)
        eventually.go:72: 
            	Error Trace:	/home/runner/work/ebpf-autoinstrument/ebpf-autoinstrument/test/integration/red_test.go:188
            	            				/home/runner/work/ebpf-autoinstrument/ebpf-autoinstrument/vendor/github.com/mariomac/guara/pkg/test/eventually.go:51
            	            				/opt/hostedtoolcache/go/1.20.3/x64/src/runtime/asm_amd64.s:1598
            	Error:      	"[]" should have 1 item(s), but has 0

Attaching test logs:

Test Logs.zip

Fix integration tests coverage metrics

It seems that integration test coverage files aren't always generated:

image

This leads to some PRs decreasing the coverage by 30%, and others restoring it.

Deal with chains of handlers

When a handler invokes another handler (e.g. a security filter), we might report the same transaction twice. We'd need to add some kind of "cookie" that discards an HTTP handling if the entry is already in the map. The same applies to the return handler.


Status code does not work with spanMetrics

In metrics that are generated by SpanMetrics from our Spans, the status code remains unset.
You can see here how status is correctly reported in normal metrics (left) but not in span metrics (right):

image

Below is the content of a trace, as reported. It seems like they expect a status.code attribute (we use http.status_code):

image

Allow selecting executable by used port

Currently, you can select which executable to instrument by its name. This could lead to some issues in the Kubernetes scenario where deploying as a daemonset is not an option, and you need to deploy as a sidecar. For example, when you have 100 nodes and 10 instances of the service.

In that case, to avoid a sidecar instrumenting processes from other pods on the same node, we could configure the instrumenter to target e.g. "the process listening on port 8080". Since the port space is internal to the pod, you can be sure you are instrumenting only the right executable.
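
A rough sketch of the port-based lookup on Linux, parsing /proc/net/tcp directly (IPv4 only; mapping the socket inode back to the owning PID via the /proc/<pid>/fd symlinks is omitted for brevity):

package main

import (
	"bufio"
	"fmt"
	"log"
	"os"
	"strconv"
	"strings"
)

// listensOnPort reports whether some IPv4 socket is in LISTEN state on port.
func listensOnPort(port uint64) (bool, error) {
	f, err := os.Open("/proc/net/tcp")
	if err != nil {
		return false, err
	}
	defer f.Close()
	sc := bufio.NewScanner(f)
	sc.Scan() // skip the header line
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 4 || fields[3] != "0A" { // 0A == TCP_LISTEN
			continue
		}
		local := strings.Split(fields[1], ":") // "IP:PORT", both hexadecimal
		if len(local) != 2 {
			continue
		}
		p, err := strconv.ParseUint(local[1], 16, 16)
		if err == nil && p == port {
			return true, nil
		}
	}
	return false, sc.Err()
}

func main() {
	found, err := listensOnPort(8080)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("something listening on 8080:", found)
}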

Provide Go runtime information

In the App O11y Grafana plugin, there is a placeholder for Go runtime information but we aren't providing data:

image

We should look into their code to see which metric names and formats each section uses, so we can be compliant with them.

Create and deploy grafana/ebpf-autoinstrument-build image

It would be similar to the image that is created locally to generate the eBPF code. It would be integrated within our CI to allow:

  • Check drone drift (requires adding drone to the image)
  • Check that ebpf code is generated
  • Run tests from drone before creating/pushing the image

Separate code for different eBPF samplers

Currently we track GRPC, Goroutines and HTTP in a single file. This will become unmaintainable as we keep growing in languages and libraries.

Currently, the instrumenter looks for the offsets of all the instrumentable functions (GRPC, goroutine queuing, HTTP...) and returns a map with all the found offsets, which are passed to a single eBPF loader.

I'd suggest the following architectural change:

The instrumenter should look for all the instrumentable functions, but should return multiple maps that would trigger different parts of the code selectively (for example, if only net/http Serve functions are found, load the BPF code of the HTTP tracer and the BPF code for the Goroutine tracer).

This will require:

  • splitting the go_nethttp.c code into go_grpc.c, go_server_common.c etc.... This will require multiple ebpf/... subpackages.
  • Optionally, sharing ringbuffers when HTTP and GRPC are traced at the same time.
  • Modifying the executable offsets inspector.

Not loading in some kernel versions

When enabling the OS-level HTTP tracer, loading fails on some kernel versions with this message:

integration-javaautoinstrumenter-1  | time=2023-06-06T15:36:29.797Z level=ERROR msg="can't instantiate instrumentation pipeline" err="instantiating start instance \"ebpf\": loading and assigning BPF objects: field KprobeSysExit: program kprobe_sys_exit: apply CO-RE relocations: load kernel spec: no BTF found for kernel version 5.15.49-linuxkit: not supported"

Might it be that the function is defined as int BPF_KPROBE(kprobe_sys_exit, int status) { ?

GRPC instrumentation

Tasks

  1. test (grcevski)

K8s operator for installation and configuration

Will go into another repo.

Placeholder task. To be completed with subtasks when we design it.

