Comments (8)

 avatar commented on June 6, 2024

"In my exploration this happens despite drgn using the libdw packaged in the wheel, which is version 188, the latest."

Just chiming in to say that I wouldn't expect the libdw version to have any bearing here, as the error comes from drgn's own DWARF parser, which explicitly does not yet support DW_OP_entry_value or DW_OP_GNU_entry_value.

If this is currently preventing you from using drgn, a temporary workaround (assuming you're compiling the kernel yourself) is to add -gdwarf-4 -gstrict-dwarf to CFLAGS, which should prevent DW_OP_GNU_entry_value from being emitted.

from drgn.

brenns10 commented on June 6, 2024

Thanks! I had gotten to that part of the code, and since I know pretty much nothing about DWARF I figured I'd pause and wait for a second opinion.

I originally encountered this with Oracle's UEK7 kernel (5.15 based) -- maybe we can update the compiler flags but I'd imagine that would be a hard sell. On the other hand, this isn't currently blocking anything much. I encountered it during normal continuous integration tests and for now, I've marked the tests as xfail.

If I find free time I might try to learn more about what's going on here and see if I can help with adding support. I'll have to see what that entails though :)

osandov commented on June 6, 2024

Yeah, when I wrote the DWARF unwinder, I left those unimplemented so I could see if I ran into them in practice. I have, and now you have, too, sorry! This wasn't a big deal for me since I knew how to interpret it, but from what I see in your work-in-progress, it'd definitely be better to handle this gracefully rather than blow up.

DW_OP_{,GNU_}entry_value means "the value of the given location upon entry into this function". The main use case is for parameters that were passed in caller-saved registers when that register gets clobbered in the middle of a function. The parameter was essentially optimized out, but if the debugger saves the values of the registers upon entry into the function, then it can still recover the parameter.

Unfortunately, we don't (and can't) do that. So for us, I think we just have to treat these as optimized out. Something like:

diff --git a/libdrgn/dwarf_info.c b/libdrgn/dwarf_info.c
index 2350c1b1..58beb7eb 100644
--- a/libdrgn/dwarf_info.c
+++ b/libdrgn/dwarf_info.c
@@ -4436,6 +4436,9 @@ branch:
 		/* Special operations. */
 		case DW_OP_nop:
 			break;
+		case DW_OP_entry_value:
+		case DW_OP_GNU_entry_value:
+			return &drgn_not_found;
 		/* Location description operations. */
 		case DW_OP_reg0 ... DW_OP_reg31:
 		case DW_OP_regx:
@@ -4451,7 +4454,6 @@ branch:
 		 *
 		 * - DW_OP_push_object_address
 		 * - DW_OP_form_tls_address
-		 * - DW_OP_entry_value
 		 *   DW_OP_implicit_pointer
 		 * - Procedure calls: DW_OP_call2, DW_OP_call4, DW_OP_call_ref.
 		 * - Typed operations: DW_OP_const_type, DW_OP_regval_type,
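
To illustrate the idea behind the patch above, here is a minimal Python sketch of a toy stack-machine evaluator. It is not drgn's code: it models an expression as a plain list of opcode integers with no operands (an assumption made for brevity, since real DWARF expressions carry encoded operands), and `NOT_FOUND` and `evaluate` are hypothetical names. The opcode values are the standard DWARF ones, plus the GNU extension value for `DW_OP_GNU_entry_value`.

```python
from typing import List

# Opcode values from the DWARF specification and the GNU extension.
DW_OP_plus = 0x22
DW_OP_lit0 = 0x30
DW_OP_lit31 = 0x4F
DW_OP_nop = 0x96
DW_OP_entry_value = 0xA3
DW_OP_GNU_entry_value = 0xF3

# Hypothetical sentinel meaning "no value available" (optimized out).
NOT_FOUND = object()


def evaluate(ops: List[int]):
    """Evaluate a toy, operand-free expression.

    Returns the top of the stack, or NOT_FOUND if the expression needs the
    value of a location at function entry, which this toy cannot recover.
    """
    stack: List[int] = []
    for op in ops:
        if op == DW_OP_nop:
            continue
        elif op in (DW_OP_entry_value, DW_OP_GNU_entry_value):
            # Treat as optimized out instead of failing with an error.
            return NOT_FOUND
        elif DW_OP_lit0 <= op <= DW_OP_lit31:
            stack.append(op - DW_OP_lit0)
        elif op == DW_OP_plus:
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(lhs + rhs)
        else:
            raise ValueError(f"unknown DWARF expression opcode {op:#x}")
    if not stack:
        raise ValueError("expression left an empty stack")
    return stack[-1]
```

The point of the design is that an entry-value opcode short-circuits to "not found" while genuinely unknown opcodes still raise, so callers can tell "optimized out" apart from "unsupported".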

osandov commented on June 6, 2024

The (null) in the exception seems busted, though. That should be the path of the offending file.

brenns10 commented on June 6, 2024

Oh wow, that explanation was worth a few dozen pages of DWARF5 spec at least! Thanks. I'll try out the patch tomorrow and see how it works.

osandov commented on June 6, 2024

As a follow up, I compiled a test program with -g -gdwarf-4 and -g -gdwarf-4 -gstrict-dwarf to compare the output (thanks @Svetlitski-FB for providing the flags to test this with).

The function in question is https://github.com/osandov/osandov-linux/blob/801ae515d22d689265a6940d8ef4aa9d99b81bf7/scripts/debuginfod_client.c#L14-L28, which compiles to

00000000004014a0 <usage>:
  4014a0:       53                      push   %rbx
  4014a1:       40 84 ff                test   %dil,%dil
  4014a4:       89 fb                   mov    %edi,%ebx
  4014a6:       48 8b 15 93 2c 00 00    mov    0x2c93(%rip),%rdx        # 404140 <progname>
  4014ad:       48 8b 3d ec 2c 00 00    mov    0x2cec(%rip),%rdi        # 4041a0 <stderr@GLIBC_2.2.5>
  4014b4:       48 0f 44 3d a4 2c 00    cmove  0x2ca4(%rip),%rdi        # 404160 <stdout@GLIBC_2.2.5>
  4014bb:       00
  4014bc:       31 c0                   xor    %eax,%eax
  4014be:       be 10 20 40 00          mov    $0x402010,%esi
  4014c3:       e8 e8 fb ff ff          call   4010b0 <fprintf@plt>
  4014c8:       0f b6 fb                movzbl %bl,%edi
  4014cb:       e8 c0 fb ff ff          call   401090 <exit@plt>

The error parameter is passed in %rdi (DWARF register 5, shown as reg5 below) and then clobbered by the instruction at 4014ad. The debug info without -gstrict-dwarf uses DW_OP_GNU_entry_value starting from 4014b4 to represent this:

 [   448] range 4014a0, 4014b4
          0x00000000004014a0 <usage>..
          0x00000000004014b3 <usage+0x13>
           [ 0] reg5
          range 4014b4, 4014d0
          0x00000000004014b4 <usage+0x14>..
          0x00000000004014cf <usage+0x2f>
           [ 0] GNU_entry_value:
                [ 0] reg5
           [ 3] stack_value

But with -gstrict-dwarf, it doesn't have a location starting from 4014b4, meaning the variable is optimized out:

 [   3be] range 4014a0, 4014b4
          0x00000000004014a0 <usage>..
          0x00000000004014b3 <usage+0x13>
           [ 0] reg5

So I'm more confident now that my patch is the correct thing to do (at least its intention, since I haven't tested that it does what it claims to do 😉)
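
The comparison above can be sketched as a small location-list lookup in Python. This is a toy model, not drgn's implementation: the two lists are transcribed from the dumps above with the end address treated as exclusive, operations are represented as plain strings, and `describe` is a hypothetical helper name. It shows that treating an entry-value expression as optimized out gives the same answer as the -gstrict-dwarf output, where no range covers the address at all.

```python
from typing import List, Tuple

# (start, end_exclusive, operations) entries, transcribed from the dumps above.
LocationList = List[Tuple[int, int, List[str]]]

WITHOUT_STRICT_DWARF: LocationList = [
    (0x4014A0, 0x4014B4, ["reg5"]),
    (0x4014B4, 0x4014D0, ["GNU_entry_value", "reg5", "stack_value"]),
]
WITH_STRICT_DWARF: LocationList = [
    (0x4014A0, 0x4014B4, ["reg5"]),
]

ENTRY_VALUE_OPS = ("entry_value", "GNU_entry_value")


def describe(loclist: LocationList, pc: int) -> str:
    """Return where the variable lives at pc, or "optimized out"."""
    for start, end, ops in loclist:
        if start <= pc < end:
            if any(op in ENTRY_VALUE_OPS for op in ops):
                # We cannot recover register values from function entry.
                return "optimized out"
            return " ".join(ops)
    # No range covers this address: the variable has no location here.
    return "optimized out"


for name, loclist in (("without -gstrict-dwarf", WITHOUT_STRICT_DWARF),
                      ("with -gstrict-dwarf", WITH_STRICT_DWARF)):
    print(name, "->", describe(loclist, 0x4014A6), "/", describe(loclist, 0x4014C8))
```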

brenns10 commented on June 6, 2024

After a bit of confusion with the vmtest, I did go ahead and test this patch on my branch, and it works great. For my part, I'll probably catch this exception and regex match it. Since the next version won't be around for a while, it's no big deal.

Let me know if you need more testing help or anything else on this. I consider it resolved, but I suppose you'll want to keep it open until there's a proper fix for these opcodes.

brenns10 commented on June 6, 2024

Strangely enough, I just encountered "unknown DWARF expression opcode 0xa3", corresponding to the non-GNU entry_value opcode, on a crash dump on my Arch Linux desktop.
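
The catch-and-match workaround mentioned earlier in the thread could look roughly like the Python sketch below. Only the message fragment "unknown DWARF expression opcode 0x..." comes from this thread; the exception type drgn raises and the rest of the message format are not shown here, so the helper accepts any exception and searches its text. `is_entry_value_error` is a hypothetical name, and 0xf3 is the GNU extension's value for DW_OP_GNU_entry_value.

```python
import re

# 0xa3 is DW_OP_entry_value (DWARF 5); 0xf3 is DW_OP_GNU_entry_value.
ENTRY_VALUE_OPCODES = {0xA3, 0xF3}

# Message fragment quoted in the thread; the surrounding text is unknown,
# so search for it anywhere in the message.
_OPCODE_RE = re.compile(r"unknown DWARF expression opcode 0x([0-9a-fA-F]+)")


def is_entry_value_error(exc: BaseException) -> bool:
    """Return True if exc looks like the unsupported entry_value error."""
    match = _OPCODE_RE.search(str(exc))
    return bool(match) and int(match.group(1), 16) in ENTRY_VALUE_OPCODES


def read_or_none(read_value):
    """Call read_value(); treat the entry_value error as "optimized out"."""
    try:
        return read_value()
    except Exception as exc:  # assumption: exact drgn exception type unknown
        if is_entry_value_error(exc):
            return None
        raise
```

Here `read_value` stands for whatever callable performs the lookup that may fail; any other error is re-raised unchanged so unrelated failures are not hidden.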
