Comments (4)
The mangled name grammar intends to provide a unique (and ideally as short as is reasonably possible) name for each external-linkage symbol, and to be (sufficiently-experienced-)human readable as well as sufficient to allow a demangler such as `__cxa_demangle` to produce a more pleasant human-readable description of the symbol. But it's not designed to be a sufficiently rigid representation of the original source to allow full reconstruction of the original declaration, and I'd view that as scope creep that we should not succumb to (despite being the author of your (1)). For example, I think it would be reasonable to add a namespace/class distinction only if it provides benefits for one of the intended use cases.
from cxx-abi.
@zygoloid Appreciate the context, thank you.
The main use-case that I'm interested in involves LD_PRELOAD. Given the importance of function interposition in the developer's toolkit, I don't think it's unreasonable to request that LD_PRELOAD use-cases be admitted to the "intended" category if they aren't there already. You may disagree, but let me elaborate:
99% of the time, effective function interposition through LD_PRELOAD has:
- An explicit dependency on mangled names; and
- An implicit dependency on demangled names.

The explicit dependency on the mangled name is due to use of `dlsym(...)` as the primary mechanism of runtime interposition. Yes, some LD_PRELOAD use-cases may not involve interposition at all, but those are in the 1% as far as I can tell.
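As a rough illustration of that explicit dependency, here is a minimal sketch of a shim meant to be built as a shared object and loaded with LD_PRELOAD. The target `foo::bar(int)` and the helper `next_symbol` are hypothetical names, not taken from any real library, and the sketch assumes glibc, where `RTLD_NEXT` is available as a GNU extension.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // RTLD_NEXT is a GNU extension (g++ normally predefines this)
#endif
#include <dlfcn.h>
#include <cassert>
#include <cstdio>

// Hypothetical interposition target. Declaring it as a namespace-scope
// function is a guess made from the demangled name alone.
namespace foo {
int bar(int x);
}

// Looks up the next definition of `mangled` in the objects loaded after
// the one containing this code. Returns nullptr if there is none.
void* next_symbol(const char* mangled) {
    assert(mangled != nullptr);
    return dlsym(RTLD_NEXT, mangled);
}

// The shim: it has the same mangled name as the target (_ZN3foo3barEi),
// so the dynamic linker binds callers to it when it is preloaded.
int foo::bar(int x) {
    using fn_t = int (*)(int);
    // The string passed to the lookup must be the mangled name, spelled by hand.
    static fn_t real = reinterpret_cast<fn_t>(next_symbol("_ZN3foo3barEi"));
    std::fprintf(stderr, "trace: foo::bar(%d)\n", x);
    // -1 is an arbitrary placeholder for "no real implementation found".
    return real ? real(x) : -1;
}
```

Under these assumptions the shim could be built with something like `g++ -shared -fPIC shim.cpp -o libshim.so` (older glibc versions also need `-ldl`) and activated with `LD_PRELOAD=./libshim.so ./app`, where the file names are placeholders.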
The implicit dependency on the demangled name comes when trying to get LD_PRELOAD to correctly catch flow control in the first place. To do this, the author of the interposition shim has to correctly reverse-engineer the declaration for the callable that s/he wants to get in front of. The workflow there almost always starts with demangling. Due to the ambiguities in the mangling grammar, source code exporting the target symbol is required to achieve certainty about the callable declaration's precise syntax. This is problematic, especially when no source code is available.
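The ambiguity can be seen with a small sketch around `abi::__cxa_demangle`. The `demangle` wrapper and the `foo::bar` symbol are hypothetical and used for illustration only. The mangled name in question is the one the Itanium ABI assigns to `bar(int)` whether `foo` is a namespace, or a class with `bar` as a static member or a non-const non-static member, so the demangled text alone does not say whether a `this` pointer is passed.

```cpp
#include <cxxabi.h>
#include <cassert>
#include <cstdlib>
#include <memory>
#include <string>

// Returns the demangled form of `mangled`, or the input unchanged if the
// demangler rejects it.
std::string demangle(const char* mangled) {
    assert(mangled != nullptr);
    int status = 0;
    // __cxa_demangle returns a malloc'ed buffer that the caller must free.
    std::unique_ptr<char, decltype(&std::free)> text(
        abi::__cxa_demangle(mangled, nullptr, nullptr, &status), &std::free);
    return (status == 0 && text) ? std::string(text.get())
                                 : std::string(mangled);
}
```

Calling `demangle("_ZN3foo3barEi")` yields `foo::bar(int)` in all three cases, which is why the shim author still has to find out by other means what kind of entity `foo` is.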
On the whole, the argument is: (a) LD_PRELOAD is important and (b) critical LD_PRELOAD workflows depend on demangling, so (c) the ABI should provide enhanced support for those use-cases. Interested to hear your thoughts.
Cheers!
from cxx-abi.
> To do this, the author of the interposition shim has to correctly reverse-engineer the declaration for the callable that s/he wants to get in front of. The workflow there almost always starts with demangling. Due to the ambiguities in the mangling grammar, source code exporting the target symbol is required to achieve certainty about the callable declaration's precise syntax. This is problematic, especially when no source code is available.
I don't think it is true that reverse-engineering is the first step. The natural first step is reading the documentation and sources (possibly partial sources, like a header provided to compile plugins). If you do not have any sources, that means that whoever provided the object to you does not support your interposition, and they could have obfuscated it by renaming all the mangled symbols to just "a", "b", etc. Even if you do have the original mangled names, the class/namespace distinction is likely to be much less of an issue than finding the layout of classes, expected semantics, etc.
from cxx-abi.
@mglisse Thanks for your thoughts here as well. I think you may be overlooking the most critical and popular use-case for interposition, namely, tracing. Given that context, I offer a rebuttal:
- RE: "Natural first steps" -- Reading docs and sources (if available) is neither efficient nor to-the-point, as the goal of tracing is to understand runtime behavior that can't be deduced in a straightforward manner from manual static analysis. This generally involves mass interposition that would simply take too much time without direct support from the tooling.
- RE: "Tracing support" in provided shared objects -- I have never heard anyone engaged in a tracing exercise say, "Wow, this library provides great mass interposition support." This just isn't a thing, as the tracing use-case presumes a lack of source-level instrumentation -- "logging" -- in the first place. In the subset of cases where source code is available, the burden of tracing support has fallen to the compilers, via `-finstrument-functions`. Aside from not working on existing shared objects, this method is a very blunt instrument with several drawbacks: (a) it is inappropriate for latency-sensitive applications; (b) it generates truly massive amounts of trace data, much of which is superfluous; (c) it does not allow for the easy extraction of arguments to the callable at call time.
- RE: "Expected semantics" of the underlying code -- Again, for the mass interposition use-case, the underlying semantics aren't relevant from the outset. The engineer just wants to get traces back with as little work as possible while not compromising the application. Not having to deal with the semantics of the underlying code is important due to the size of the problem.
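For readers who have not used it, the hooks that GCC calls when code is compiled with `-finstrument-functions` look roughly like the sketch below. The hook names and signatures are the ones GCC documents; the counters `g_depth` and `g_events` are hypothetical additions for illustration. Each hook receives only the address of the function and of its call site, which is why arguments are not readily available at call time.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical bookkeeping, not part of the GCC interface.
static int g_depth = 0;            // current nesting depth of instrumented calls
static unsigned long g_events = 0; // total number of enter/exit events seen

extern "C" {

// Called on entry to every instrumented function.
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void* this_fn, void* call_site) {
    ++g_events;
    ++g_depth;
    std::fprintf(stderr, "enter %p (called from %p), depth %d\n",
                 this_fn, call_site, g_depth);
}

// Called on exit from every instrumented function.
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void* this_fn, void* call_site) {
    assert(g_depth > 0);
    std::fprintf(stderr, "exit  %p (called from %p), depth %d\n",
                 this_fn, call_site, g_depth);
    --g_depth;
    ++g_events;
}

}  // extern "C"
```

Every instrumented call produces two events regardless of whether the function is of interest, which gives a sense of how quickly the trace volume grows.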
I look forward to your feedback.
Cheers!
from cxx-abi.