Comments (11)
There might be small amounts of data in the .rodata section if the dep uses a lot of &'static strs
I don't think "small" or a focus on string literals is quite right: there are quite a few crates with large static tables, e.g. unicode properties, text encoding, cached computations for performance (~100K), time-zone look-ups (~270K). The first two in particular turn up in central crates with a lot of dependent crates. I personally have a few crates with large tables like the above, and it would be nice if they were highlighted in cargo bloat output, so I can easily understand how much effort I should put into optimising them.
For a specific example, https://crates.io/crates/encoding_rs seems to have ~320K of static tables itself, and the ~120K which end up in the final binary of ripgrep are about 3% of the stripped binary size (for cargo build --release with the default features). This means the bloat contribution of encoding_rs is dramatically underestimated: the 35K that cargo bloat --crates --release lists seems to be at least ~4× smaller than the actual value. In total, on Mac, the various __const sections make up nearly 20% of ripgrep's stripped size:
$ size -A -t -d rg-stripped | sort -n -k 2
rg-stripped :
section size addr
__mod_init_func 8 4298888344
__nl_symbol_ptr 16 4298887168
__got 56 4298887184
__thread_bss 192 4299084184
__thread_vars 240 4299082408
__thread_data 368 4299083816
__bss 724 4299087136
__stubs 828 4297882320
__la_symbol_ptr 1104 4298887240
__data 1160 4299082656
__stub_helper 1396 4297883148
__common 2744 4299084384
__cstring 10444 4298546272
__unwind_info 35932 4298556716
__gcc_except_tab 105724 4297884544
__const 194056 4298888352
__eh_frame 294496 4298592648
__const 556000 4297990272
__text 2911152 4294971168
Total 4116640
The code is still the largest, but it's not as completely one-sided as that comment suggests.
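As a rough check on the "nearly 20%" figure, the two __const rows can be summed against the total that size reports. This is only a minimal Python sketch: the numbers are copied by hand from the listing above, the helper name section_share is made up for illustration, and the denominator is the section total printed by size, which is assumed here to be a fair stand-in for the stripped file size.

```python
# Sizes in bytes, copied from the `size -A -t -d rg-stripped` listing above.
# `__const` appears twice in that listing, so both rows are summed.
CONST_SECTIONS = [194056, 556000]
TEXT_SECTION = 2911152
TOTAL = 4116640  # the "Total" row printed by `size`


def section_share(size_bytes, total_bytes):
    """Return size_bytes as a percentage of total_bytes."""
    return 100.0 * size_bytes / total_bytes


const_share = section_share(sum(CONST_SECTIONS), TOTAL)
text_share = section_share(TEXT_SECTION, TOTAL)

print(f"__const: {const_share:.1f}%")  # about 18%
print(f"__text:  {text_share:.1f}%")   # about 71%
```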
A more extreme (and slightly less "real-world") example of something using encoding_rs is https://github.com/hsivonen/recode_rs, where various tables end up being 30% of the (non-stripped!) binary size, which is something that bloaty (the encoding_rs::data symbols) and size (the __const sections) both highlight, but cargo bloat doesn't at the moment:
$ cargo bloat --release -n 10
Compiling ...
Analyzing target/release/recode_rs
File .text Size Crate Name
27.7% 61.3% 302.3KiB [760 Others]
3.4% 7.6% 37.7KiB encoding_rs encoding_rs::variant::VariantEncoder::encode_from_utf8_raw
3.4% 7.4% 36.7KiB encoding_rs encoding_rs::variant::VariantEncoder::encode_from_utf16_raw
2.9% 6.5% 32.2KiB encoding_rs encoding_rs::variant::VariantDecoder::decode_to_utf8_raw
2.6% 5.9% 28.9KiB encoding_rs encoding_rs::variant::VariantDecoder::decode_to_utf16_raw
1.0% 2.2% 10.8KiB getopts getopts::Options::parse
1.0% 2.1% 10.5KiB [Unknown] _read_line_info
0.9% 1.9% 9.4KiB [Unknown] _stats_arena_print
0.8% 1.9% 9.2KiB std std::sys_common::backtrace::output
0.7% 1.6% 7.9KiB std _je_stats_print
0.7% 1.6% 7.8KiB std _je_mallocx
45.1% 100.0% 493.3KiB .text section size, the file size is 1.1MiB
$ size -A -t -d target/release/recode_rs | sort -k 2 -n
section size addr
target/release/recode_rs :
__mod_init_func 8 4295852808
__nl_symbol_ptr 16 4295852032
__got 40 4295852048
__thread_data 48 4295880392
__thread_vars 96 4295880296
__thread_bss 104 4295880440
__bss 468 4295880544
__stubs 540 4295472448
__la_symbol_ptr 720 4295852088
__stub_helper 916 4295472988
__data 968 4295879328
__common 2600 4295881024
__unwind_info 2784 4295832924
__gcc_except_tab 3460 4295473904
__cstring 10444 4295822480
__eh_frame 16288 4295835712
__const 26512 4295852816
__const 345104 4295477376
__text 501872 4294970576
Total 912988
$ bloaty -d symbols -n 10 target/release/recode_rs
VM SIZE FILE SIZE
-------------- --------------
57.4% 631Ki [1226 Others] 627Ki 57.4%
12.0% 131Ki [__LINKEDIT] 128Ki 11.8%
3.7% 41.0Ki encoding_rs::data::BIG5_UNIFIED_IDEOGRAPH_BYTES::hb0140e06e71bb28e 41.0Ki 3.8%
3.7% 40.9Ki encoding_rs::data::GBK_HANZI_BYTES::h59ac8ab5fd0c5593 40.9Ki 3.7%
3.7% 40.9Ki encoding_rs::data::JIS0208_KANJI_BYTES::hc7e4b09543cf47f7 40.9Ki 3.7%
3.7% 40.9Ki encoding_rs::data::KSX1001_UNIFIED_HANJA_BYTES::h11b55d33a7050459 40.9Ki 3.7%
3.4% 37.8Ki encoding_rs::variant::VariantEncoder::encode_from_utf8_raw::h791216902f079374 37.8Ki 3.5%
3.4% 36.9Ki encoding_rs::data::BIG5_LOW_BITS::h7673f2a02219b92a 36.9Ki 3.4%
3.3% 36.8Ki encoding_rs::variant::VariantEncoder::encode_from_utf16_raw::hee0803b3fb2af4f3 36.8Ki 3.4%
2.9% 32.3Ki encoding_rs::variant::VariantDecoder::decode_to_utf8_raw::hf798d70e99362630 32.3Ki 3.0%
2.6% 29.0Ki encoding_rs::variant::VariantDecoder::decode_to_utf16_raw::h3df9b02e7fbdf4b7 29.0Ki 2.7%
100.0% 1.07Mi TOTAL 1.07Mi 100.0%
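To make the data-versus-code split concrete, the symbols that bloaty lists can be grouped by their leading path segments. The sketch below is illustrative only: it covers just the symbols visible in the truncated listing above (with the hash suffixes dropped), and the grouping rule (first two `::` segments) is an assumption for this example, not how bloaty or cargo bloat attribute symbols.

```python
from collections import defaultdict

# (symbol, size in KiB) pairs copied from the bloaty listing above.
SYMBOLS = [
    ("encoding_rs::data::BIG5_UNIFIED_IDEOGRAPH_BYTES", 41.0),
    ("encoding_rs::data::GBK_HANZI_BYTES", 40.9),
    ("encoding_rs::data::JIS0208_KANJI_BYTES", 40.9),
    ("encoding_rs::data::KSX1001_UNIFIED_HANJA_BYTES", 40.9),
    ("encoding_rs::variant::VariantEncoder::encode_from_utf8_raw", 37.8),
    ("encoding_rs::data::BIG5_LOW_BITS", 36.9),
    ("encoding_rs::variant::VariantEncoder::encode_from_utf16_raw", 36.8),
    ("encoding_rs::variant::VariantDecoder::decode_to_utf8_raw", 32.3),
    ("encoding_rs::variant::VariantDecoder::decode_to_utf16_raw", 29.0),
]


def totals_by_module(symbols):
    """Sum sizes per crate::module, using the first two path segments as the key."""
    totals = defaultdict(float)
    for name, size_kib in symbols:
        key = "::".join(name.split("::")[:2])
        totals[key] += size_kib
    return dict(totals)


totals = totals_by_module(SYMBOLS)
for key, size_kib in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{key}: {size_kib:.1f} KiB")
```

Among the listed symbols alone, the encoding_rs::data tables come to roughly 200 KiB, against roughly 136 KiB for the encoding_rs::variant functions that cargo bloat reports.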
from cargo-bloat.
what I actually want to do when I use it is optimize the size of my binary as a whole. So ideally, I'd like cargo bloat to tell me from which libraries everything in my binary originates, so that I can make a decision that is not exclusively based on the text section size.
Dependencies don't really make up much of any other section besides .text, which is where all the code lives. There might be small amounts of data in the .rodata section if the dep uses a lot of &'static strs, or in both .eh_frame and .gcc_except_table for error cases. But you're talking bytes.
If you're concerned with super small binaries, turning off Rust debug symbols (they're off by default in release builds) and stripping the binary of any other debug symbols is far more effective than scrounging for bytes in anything other than .text.
Debug symbols (both Rust's and others) are tens of megabytes large. As @RazrFalcon said, they're just public type and function names though.
Here's this repo with debug symbols on, off, and fully stripped
debug=true | debug=false | stripped + debug=false |
---|---|---|
87.0M | 13.0M | 7.2M |
If we look at the stripped version we can see that .text takes up ~5.2M:
kevin@beefcake: ~/Projects/cargo-bloat
➜ size -A -t -d target/release/cargo-bloat
target/release/cargo-bloat :
section size addr
.interp 28 624
.note.ABI-tag 32 652
.note.gnu.build-id 36 684
.gnu.hash 176 720
.dynsym 12096 896
.dynstr 7369 12992
.gnu.version 1008 20362
.gnu.version_r 576 21376
.rela.dyn 151488 21952
.rela.plt 11112 173440
.init 23 184552
.plt 7424 184576
.plt.got 48 192000
.text 5367936 192048 <---- .text
.fini 9 5559984
.rodata 662089 5560000 <---- .rodata
.eh_frame_hdr 100756 6222092
.eh_frame 462984 6322848 <---- .eh_frame
.gcc_except_table 539080 6785832 <---- .gcc_except_table
.tdata 552 9425248
.init_array 16 9425800
.fini_array 8 9425816
.data.rel.ro 113232 9425824
.dynamic 640 9539056
.got 3984 9539696
.data 4625 9543680
.bss 6336 9548320
.comment 96 0
Total 7453759
The next largest are in fact .rodata, .eh_frame, and .gcc_except_table. But I'm not sure how much of that is from this repo or its deps... even so, they're dwarfed by .text and debug symbols.
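For a sense of proportion, the four marked sections can be expressed as shares of the total that size prints. This is a minimal sketch using only the numbers from the listing above; the helper name share is made up for illustration, and the shares are relative to the section total rather than the on-disk file size.

```python
# Sizes in bytes, copied from the `size -A -t -d` listing above.
SECTIONS = {
    ".text": 5367936,
    ".rodata": 662089,
    ".eh_frame": 462984,
    ".gcc_except_table": 539080,
}
TOTAL = 7453759  # the "Total" row printed by `size`


def share(size_bytes, total_bytes=TOTAL):
    """Return size_bytes as a percentage of total_bytes."""
    return 100.0 * size_bytes / total_bytes


shares = {name: share(size) for name, size in SECTIONS.items()}
non_text = sum(pct for name, pct in shares.items() if name != ".text")

for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
print(f"marked sections other than .text: {non_text:.1f}%")
```

With these numbers .text comes to roughly 72% of the total, and the other three marked sections to a bit over 22% combined.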
from cargo-bloat.
I'll look into whether this is supported by goblin.
from cargo-bloat.
Yes, I should note somewhere that it's only the .text section.
About what kind of debug symbols are you asking? The one from the debug build or the symbols table that also exists in the release build?
from cargo-bloat.
About what kind of debug symbols are you asking? The one from the debug build or the symbols table that also exists in the release build?
The ones in the symbols table that also exist in the release build. I like debug-symbols in release builds while debugging, but I don't like to ship release builds with debug symbols "in general".
from cargo-bloat.
This has been implemented now, no? 😄
kevin@chickenlegs: ~/Projects/cargo-bloat
➜ cargo bloat --release --crates
[.. snip compiling ..]
File .text Size Name
13.0% 33.6% 1.7MiB std
7.1% 18.2% 930.6KiB cargo
5.3% 13.7% 702.9KiB [Unknown]
2.3% 5.9% 304.2KiB libgit2_sys
2.0% 5.2% 263.5KiB toml
1.4% 3.5% 178.8KiB regex
1.2% 3.0% 155.7KiB goblin
1.1% 2.9% 150.4KiB serde_ignored
1.0% 2.6% 134.7KiB curl_sys
0.8% 2.1% 107.1KiB serde_json
0.5% 1.4% 70.4KiB docopt
0.4% 1.1% 56.8KiB regex_syntax
0.3% 0.9% 44.3KiB url
0.3% 0.7% 38.0KiB libssh2_sys
0.3% 0.7% 35.4KiB git2
0.3% 0.7% 34.6KiB serde
0.2% 0.6% 30.7KiB globset
0.2% 0.5% 25.9KiB cargo_bloat
0.2% 0.4% 20.7KiB tar
0.1% 0.3% 15.6KiB aho_corasick
38.8% 100.0% 5.0MiB .text section size, the file size is 12.9MiB
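The two percentage columns in this output can be approximately reproduced from the Size column. The sketch below assumes, based only on the column headers and the summary row, that "File" is a crate's size divided by the file size and ".text" is its size divided by the .text section size; this is inferred from the printed numbers, not taken from cargo-bloat's source. Because the 5.0MiB and 12.9MiB figures are already rounded, the results can differ from the listing by about 0.1.

```python
KIB_PER_MIB = 1024.0

# Rounded totals from the last row of the listing above.
TEXT_SIZE_MIB = 5.0
FILE_SIZE_MIB = 12.9


def columns(size_kib):
    """Return (file %, .text %) for a crate whose code takes size_kib KiB."""
    size_mib = size_kib / KIB_PER_MIB
    file_pct = 100.0 * size_mib / FILE_SIZE_MIB
    text_pct = 100.0 * size_mib / TEXT_SIZE_MIB
    return file_pct, text_pct


# (crate, size in KiB) rows copied from the listing above.
for crate, size_kib in [("cargo", 930.6), ("libgit2_sys", 304.2), ("regex", 178.8)]:
    file_pct, text_pct = columns(size_kib)
    print(f"{crate}: {file_pct:.1f}% of file, {text_pct:.1f}% of .text")
```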
from cargo-bloat.
@kbknapp I thought he was asking about these sections.
from cargo-bloat.
@gnzlbg can you explain in detail what you need?
from cargo-bloat.
Currently, when I use cargo bloat, the total size reported by cargo bloat is the size of the text section, which is much smaller than the size of my binary. While cargo bloat allows me to optimize the size of the text section, what I actually want to do when I use it is optimize the size of my binary as a whole. So ideally, I'd like cargo bloat to tell me from which libraries everything in my binary originates, so that I can make a decision that is not exclusively based on the text section size.
I think debug symbols is a good place to start next, because the largest part of at least my rust binaries are debug symbols. I'd just like to know "where do they come from", that is, from which library do they originate.
I don't know if this is possible though. If it isn't, please just close this issue.
from cargo-bloat.
Currently, when I use cargo bloat, the total size reported by cargo bloat is the size of the text section, which is much smaller than the size of my binary.
It's already fixed.
what I actually want to do when I use it is optimize the size of my binary as a whole
But there is nothing to optimize, imho. The debug section contains the names of the methods. That's it. Fewer methods means a smaller size.
I think debug symbols is a good place to start next, because the largest part of at least my rust binaries are debug symbols.
I strip my executables, so I don't have this problem.
from cargo-bloat.
We have #66 for .rodata, but everything else is way too complicated to implement and will not provide much benefit.
from cargo-bloat.