Comments (11)

nsavoire avatar nsavoire commented on August 17, 2024 3

I think the root cause of the failure is the presence of compressed debug sections in the binary.
I reproduced the issue with a simple C program compiled with compressed debug sections:

// test.c
__attribute__((noinline)) void foo() {
  while (1) {}
}

int main() {
  foo();
}

gcc -g test.c -o test -gz=zlib

from blazesym.

d-e-s-o avatar d-e-s-o commented on August 17, 2024 3

I think the root cause of the failure is the presence of compressed debug sections in the binary.

Interesting. Yes, we do not currently support compressed debug sections and Go seems to have enabled them by default from what I understand.
I am still occupied wrapping up other work, but will hopefully be able to take a closer look by Friday.

danielocfb avatar danielocfb commented on August 17, 2024 2

So I didn't have much time to look at it, but I suspect we need two things. First, support for parsing the compression headers of compressed ELF sections. I think that should be reasonably straightforward to add. Then, presumably we'd decompress the data on the fly while reading. That should also not be much work, though it could be a bit problematic that we currently rely on memory mapping everywhere, so decompression could translate to large allocations. Ideally we'd eventually rework those bits to support more gradual/on-demand reads. What I haven't gotten to is what, if anything, needs to happen on the DWARF side of things. I've certainly come across special section names being used, but there could be more. Will have more time to look into it next week.

r1viollet avatar r1viollet commented on August 17, 2024 2

I would find the first approach of just decompressing on the fly very valuable. I think users can expect some overhead when they ask for debug information.

On ideas to support more gradual reads, one thing I noticed was that when only asking for location information (without inlining), we seem to be paying the cost of more than just the debug_line section. I think the gimli dwarf loader just opens all of the dwarf it can find. I think I was seeing an 80 megs difference symbolizing an Envoy binary with debug information (and requesting code locations, no inlining info).

I think the gradual reads would be nice. Though the information for inlined functions is scattered around in the debug info section (afaik), so we would still pay a significant cost when parsing it. That seems like a generally hard thing to optimize.

d-e-s-o avatar d-e-s-o commented on August 17, 2024 1

Thanks for the report. May only get to looking at this issue later this week.

r1viollet avatar r1viollet commented on August 17, 2024 1

This could be dwarf related: --no-debug-syms allows us to get the symbol.

blazecli symbolize elf --no-debug-syms --path ./main "000000000045aac0"
0x0000000045aac0: runtime.rt0_go.abi0 @ 0x45aac0+0x0

d-e-s-o avatar d-e-s-o commented on August 17, 2024 1

Thanks for the feedback! After looking into this some more, I believe that in the case of compression, there really is no (reasonably straightforward) way to have incremental reads. The reason is simple: you (generally) don't have random access on compressed data. That's perhaps a bit of a broad statement: as I understand it, there may be compression formats that allow for that to some extent, or trickery based on implementation details (edit: potentially better reference) that effectively allows for doing so. But in the Rust realm most decompressors only implement std::io::Read and not std::io::Seek. While I think it would be very interesting work to change that for chosen implementations, for the time being I think we should take it as a given.
At the same time, that kind of makes me scratch my head a bit given the presence of multi-GiB debug information, which I suspect could easily map to gigabytes of data in a single section (though perhaps I am missing some detail where sections end up being capped in size and then spilled elsewhere or whatever). If one were to decode that, the result would pretty much necessarily be a multi-gigabyte allocation.

On ideas to support more gradual reads, one thing I noticed was that when only asking for location information (without inlining), we seem to be paying the cost of more than just the debug_line section. I think the gimli dwarf loader just opens all of the dwarf it can find. I think I was seeing an 80 megs difference symbolizing an Envoy binary with debug information (and requesting code locations, no inlining info).

This is interesting information, thanks for sharing. I haven't looked that deeply under the hood of gimli (and have forgotten most of the details of the few things I did look at...), so it's possible that they or we do some unnecessary work. I think it would be best if we had a concrete example to compare, discuss, and optimize. If you have something like that, that could be useful. Of course that's independent of this work.

javierhonduco avatar javierhonduco commented on August 17, 2024 1

Just wanted to say thank you for opening the issue and @d-e-s-o for very quickly implementing the feature! Been using blazesym in a new project and so far things have been super smooth. Go symbolization was the only thing I found missing!

r1viollet avatar r1viollet commented on August 17, 2024

For reproduction, any Go binary should do. Here is a simple hello world.

package main

import "fmt"

func main() {
    fmt.Println("Hello, World!")
}

Then build the example:

go build main.go

You should then find that the associated symbols cannot be symbolized.

r1viollet avatar r1viollet commented on August 17, 2024

I am curious as to why the elf resolver would have trouble with Go binaries. Feel free to reach out if I can help with repro / other cases.

d-e-s-o avatar d-e-s-o commented on August 17, 2024

All that being said, I opened #590 which should add the necessary support. Feel free to give it a try. Right now it only supports zlib compression. Let me know if you need zstd as well and I should be able to add it quickly.
