zombocom / heapy
Got a Ruby heap dump? Great. Use this tool to see what's in it!
License: MIT License
First of all, thanks for the great work!
I spent several days debugging a memory leak and finally found the problem.
heapy helped a bit, but it lacked one important piece of information: the object types.
Basically, printing the count of objects per class, as done in drbrain/net-http-persistent#96.
Can this be done from heap dumps?
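One way to get per-class counts from a dump could look like the sketch below. It is not part of heapy; it assumes the dump has one JSON object per line, that CLASS and MODULE entries carry a "name" field, and that other objects reference their class through the "class" address. The sample dump is made up for illustration.

```ruby
require 'json'
require 'stringio'

# Count heap dump objects per class name.
# Assumption: CLASS/MODULE entries carry a "name" field, and other objects
# point at their class through the "class" address.
def count_objects_by_class(io)
  class_names = {}       # class address => class name
  counts = Hash.new(0)   # class address (or type when no class is recorded) => count

  io.each_line do |line|
    begin
      obj = JSON.parse(line)
    rescue JSON::ParserError
      next # skip lines that are not valid JSON
    end

    if %w[CLASS MODULE].include?(obj["type"]) && obj["name"]
      class_names[obj["address"]] = obj["name"]
    end
    counts[obj["class"] || obj["type"]] += 1
  end

  # Replace addresses with names where known, then sort by count, largest first.
  named = Hash.new(0)
  counts.each { |key, n| named[class_names.fetch(key, key)] += n }
  named.sort_by { |_name, n| -n }
end

# Hypothetical three-line dump, made up for illustration.
sample = StringIO.new(<<~DUMP)
  {"address":"0x1","type":"CLASS","class":"0x9","name":"String"}
  {"address":"0x2","type":"STRING","class":"0x1","memsize":40}
  {"address":"0x3","type":"STRING","class":"0x1","memsize":40}
DUMP

count_objects_by_class(sample).each { |name, n| puts "#{name}: #{n}" }
```

Classes whose CLASS entry is missing from the dump show up under their raw address, which at least keeps them countable.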
It works without being in a project's Gemfile
~/P/o/Review-collector ❯❯❯ heapy --help ⏎master
Traceback (most recent call last):
5: from /home/braulio/.rvm/gems/ruby-2.5.1/bin/ruby_executable_hooks:24:in `<main>'
4: from /home/braulio/.rvm/gems/ruby-2.5.1/bin/ruby_executable_hooks:24:in `eval'
3: from /home/braulio/.rvm/gems/ruby-2.5.1/bin/heapy:23:in `<main>'
2: from /home/braulio/.rvm/gems/ruby-2.5.1/bin/heapy:23:in `load'
1: from /home/braulio/.rvm/gems/ruby-2.5.1/gems/heapy-0.1.3/bin/heapy:4:in `<top (required)>'
/home/braulio/.rvm/gems/ruby-2.5.1/gems/heapy-0.1.3/bin/heapy:4:in `require': cannot load such file -- heapy (LoadError)
Could not parse {"address":"0x7f3184423a30", "type":"STRING", "class":"0x7f3171af0060", "frozen":true, "bytesize":1438, "value":" ...\" />\n<span style='position: absolute; right: 1em; top: 5px'>\u-016\u-097\u-108\u-115</span>\n</div>\n</form>\n\n</li>\n<li class='pull-right'>\n<div><a class=\"btn bug\" href=\"/bugs/new?locale=en\">Report a bug</a></div>\n</li>\n<li class='pull-right'><a class=\"logout\" href=\"/en/logout\">Log out</a></li>\n</ul>\n</div>\n</div>\n<script>\n $(function() {\n function resizeSearch() {\n var maxWidth = ( $('.navbar').innerWidth() - $('.navbar .nav:first').outerWidth()\n - $('.brand').outerWidth() - $('a.bug').outerWidth() - $('a.logout').outerWidth()\n - 70 /* all the margins */)\n if (maxWidth > parseInt($(this).css(\"width\"), 10)) {\n $(this).css(\"width\", maxWidth)\n }\n }\n \n $('.navbar-search input').on('focus', resizeSearch).on('blur', function() {\n $(this).css(\"width\", \"\")\n })\n })\n</script>\n\n<div class='container'>\n<div class='insets'>\n\u0000\u-122\u-1261�\u0000\u0000\u-096\u-120\u-120", "file":"/usr/local/lib/ruby/2.3.0/psych.rb", "line":379, "method":"parse", "generation":40, "memsize":40, "flags":{"wb_protected":true, "old":true, "uncollectible":true, "marked":true}}
The last version is a few years old, and lacks some important changes:
I was about to add the latter feature myself, only to realize when I forked & cloned the repo that this functionality is already present in the latest master code. 😉
Thanks for a great tool!
Hi,
Again, thanks for a great gem. 👍 I am investigating a probable leak, have been doing so for about a week or more without any real luck yet... but eventually, I will find it. 😄
One thing that currently puzzles me is that the number of objects, and amount of heap memory being used, varies so greatly depending on where I look.
When I've done a dump using this code:
require 'objspace'

GC.start
File.open('/tmp/dump.json', 'w') do |file|
  ObjectSpace.dump_all(output: file)
end
...and then ran it via heapy, I got this output:
$ bundle exec bin/heapy read dump.json
Analyzing Heap
==============
Generation: nil object count: 140137, mem: 0.0 kb
Generation: 53 object count: 31, mem: 12.2 kb
Generation: 54 object count: 5018, mem: 551.0 kb
Generation: 55 object count: 21337, mem: 5383.3 kb
Generation: 56 object count: 18861, mem: 1816.3 kb
Generation: 57 object count: 5087, mem: 503.4 kb
Generation: 58 object count: 18637, mem: 1142.0 kb
Generation: 59 object count: 4873, mem: 4667.3 kb
Generation: 60 object count: 124, mem: 5.5 kb
Generation: 61 object count: 202, mem: 24.1 kb
Generation: 62 object count: 120, mem: 8.9 kb
Generation: 63 object count: 99, mem: 8.4 kb
Generation: 66 object count: 157, mem: 22.4 kb
Generation: 95 object count: 55, mem: 4.0 kb
Generation: 96 object count: 28, mem: 2.2 kb
Generation: 99 object count: 16, mem: 1.3 kb
Generation: 100 object count: 14, mem: 1.0 kb
Generation: 104 object count: 8, mem: 0.6 kb
Generation: 248 object count: 1648, mem: 146.3 kb
Generation: 249 object count: 388, mem: 33.0 kb
Generation: 250 object count: 8, mem: 0.7 kb
Generation: 387 object count: 32, mem: 2.8 kb
Generation: 417 object count: 16, mem: 1.4 kb
Generation: 418 object count: 28, mem: 2.5 kb
Generation: 419 object count: 8, mem: 0.7 kb
Generation: 549 object count: 32, mem: 2.8 kb
Generation: 577 object count: 16, mem: 1.2 kb
Generation: 655 object count: 7, mem: 0.3 kb
Generation: 709 object count: 21, mem: 0.8 kb
Generation: 840 object count: 3, mem: 0.1 kb
Generation: 858 object count: 36, mem: 3.2 kb
Generation: 863 object count: 94, mem: 27.7 kb
Generation: 865 object count: 4, mem: 0.4 kb
Generation: 938 object count: 1, mem: 0.1 kb
Generation: 946 object count: 1, mem: 0.1 kb
Generation: 1018 object count: 16, mem: 1.4 kb
Generation: 1021 object count: 1, mem: 0.0 kb
Generation: 1022 object count: 3940, mem: 283.8 kb
Generation: 1023 object count: 66, mem: 3.8 kb
Generation: 1024 object count: 240, mem: 17.5 kb
Generation: 1025 object count: 937, mem: 68.1 kb
Generation: 1026 object count: 1781, mem: 134.5 kb
Generation: 1027 object count: 24, mem: 1.9 kb
Generation: 1089 object count: 61, mem: 5.5 kb
Generation: 1091 object count: 197, mem: 37.3 kb
Generation: 1092 object count: 4, mem: 0.3 kb
Generation: 1103 object count: 4, mem: 0.3 kb
Generation: 1116 object count: 12, mem: 1.1 kb
Generation: 1117 object count: 24, mem: 2.0 kb
Generation: 1125 object count: 10, mem: 0.4 kb
Generation: 1126 object count: 64, mem: 5.5 kb
Generation: 1127 object count: 991, mem: 2141.3 kb
Generation: 1128 object count: 15, mem: 1.0 kb
Heap total
==============
Generations (active): 53
Count: 225534
Memory: 17083.5 kb
However, when I looked at the GC stats (GC.stat) from roughly the same point in time, it looked like this:
{
"count": 1127,
"heap_allocated_pages": 2390,
"heap_sorted_length": 3695,
"heap_allocatable_pages": 0,
"heap_available_slots": 974164,
"heap_live_slots": 713354,
"heap_free_slots": 260810,
"heap_final_slots": 0,
"heap_marked_slots": 450378,
"heap_eden_pages": 2390,
"heap_tomb_pages": 0,
"total_allocated_pages": 13653,
"total_freed_pages": 11263,
"total_allocated_objects": 188993554,
"total_freed_objects": 188280200,
"malloc_increase_bytes": 24150864,
"malloc_increase_bytes_limit": 33554432,
"minor_gc_count": 1026,
"major_gc_count": 101,
"remembered_wb_unprotected_objects": 1406,
"remembered_wb_unprotected_objects_limit": 2548,
"old_objects": 430356,
"old_objects_limit": 445420,
"oldmalloc_increase_bytes": 92467016,
"oldmalloc_increase_bytes_limit": 126476361
}
The old_objects value of 430356 is quite a lot more than the 225534 objects reported in the heap dump via heapy. Shouldn't these be roughly the same?
Also, the total memory being used differs: 17083.5 kb above, whereas ObjectSpace.memsize_of_all gave a value of 83409077 (is that bytes or kilobytes? Either way, it differs).
The total RSS size of the Ruby process is at this time 1267 MiB.
(I have done quite a bit of digging trying to find extension-based leaks with jemalloc etc., but haven't yet concluded that anything there is proven to be leaking. On the contrary, when I ran some isolated tests trying to reproduce the problems, all the allocations seemed to come from ObjectSpace and other Ruby-related allocations. Of course this can still be an extension leaking, but the call graphs haven't pointed me in that direction; I can give more details about this if you like.)
Any suggestions are helpful. I am very much wandering in the dark here, looking for something that looks fishy, but... it's very dark tonight. 🌑 😉
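To compare these numbers from inside the same process, a minimal sketch along these lines may help. It only uses calls from Ruby's standard library; ObjectSpace.memsize_of_all is documented as returning bytes. It does not explain the gap to RSS, which also includes memory the Ruby heap does not account for.

```ruby
require 'objspace'

GC.start
stat   = GC.stat
counts = ObjectSpace.count_objects

# Slots that currently hold an object, according to ObjectSpace.
live_slots = counts[:TOTAL] - counts[:FREE]

puts "GC.stat heap_live_slots: #{stat[:heap_live_slots]}"
puts "GC.stat old_objects:     #{stat[:old_objects]}"
puts "count_objects live:      #{live_slots}"
puts "memsize_of_all (bytes):  #{ObjectSpace.memsize_of_all}"
```

Printing these right before calling ObjectSpace.dump_all makes it easier to check the dump's object count against the live slot count from the same moment.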
memsize for a string is 40 bytes (as for any Ruby object). But there is also memory consumed for storing the actual string content.
Maybe use something like this?
x["memsize"] + x["value"].to_s.bytes.count
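A variant of the same idea, as a rough sketch: string entries in the dump also carry a "bytesize" field, so the content length can be read from there instead of from the possibly truncated "value". The helper name estimated_size is hypothetical, and whether "memsize" already includes the string buffer may depend on the Ruby version, so treat the result as an upper-bound estimate.

```ruby
require 'json'

# Estimate the memory of one dumped object.
# Assumption: for strings, "memsize" does not already include the content buffer.
def estimated_size(obj)
  size = obj["memsize"].to_i
  size += obj["bytesize"].to_i if obj["type"] == "STRING"
  size
end

# Trimmed-down string entry using the sizes from a real dump line.
line = '{"address":"0x7f3184423a30","type":"STRING","bytesize":1438,"memsize":40}'
puts estimated_size(JSON.parse(line))
```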
I've been looking at heapy diff, and one thing I noticed is that it keys on an object's address to decide whether it was present in the heap dump we compare against.
Is this reliable? I looked at MRI, and address is just an object's VALUE pointer: https://github.com/ruby/ruby/blob/e315f3a1341f123051b75e589b746132c3510079/ext/objspace/objspace_dump.c#L238
The way I understood the Ruby GC to work is that each 40B slot (once populated) always contains an RVALUE, which is a union type so it can "morph" into a different type. If this object gets GC'ed, it is not removed from the heap page, but rather a flag is cleared that tags this slot (or object) as "empty": https://github.com/ruby/ruby/blob/6ef46f71c743507a0e2ae0eef14dce0539b0ff52/gc.c#L569. This makes a slot reusable by changing its union type, but its memory address does not change.
Wouldn't this mean that if an object is GC'ed between two snapshots and the same slot is reused for a completely different object, it would then be omitted from the heapy diff, because the slot address already appeared in the first snapshot?
I'm sure I'm missing something but I wanted to make sure I understand how this works. Thanks!
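The concern can be illustrated with a small sketch. This is not heapy's actual implementation; both helpers and the sample entries are hypothetical. The second helper widens the key with fields that should stay fixed for one object's lifetime ("generation" is only present when allocation tracing was enabled), which catches some reused slots but not a slot reused by an object with identical type, class, and generation.

```ruby
require 'json'
require 'set'

# Address-only diff: anything whose address appeared in the first dump is dropped.
def diff_by_address(before, after)
  seen = before.map { |o| o["address"] }.to_set
  after.reject { |o| seen.include?(o["address"]) }
end

# Stricter, hypothetical key: address plus fields that should not change
# while the same object lives in that slot.
def diff_by_identity(before, after)
  key = ->(o) { o.values_at("address", "type", "class", "generation") }
  seen = before.map(&key).to_set
  after.reject { |o| seen.include?(key.call(o)) }
end

before = [JSON.parse('{"address":"0x10","type":"STRING","class":"0x1","generation":5}')]
after = [
  # Same slot address, but a different object now lives there.
  JSON.parse('{"address":"0x10","type":"ARRAY","class":"0x2","generation":9}'),
  JSON.parse('{"address":"0x20","type":"HASH","class":"0x3","generation":9}')
]

puts diff_by_address(before, after).length   # the reused slot is not reported
puts diff_by_identity(before, after).length  # the reused slot is reported
```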
👍
Option to generate the following CSV:
generation, object count, memory, memory-to-objects ratio
So in gnuplot you can do:
gnuplot
gnuplot > set logscale y
gnuplot > plot 'generations.csv' using 1:2 with lines title 'objects'
gnuplot > plot 'generations.csv' using 1:3 with lines title 'memory'
gnuplot > plot 'generations.csv' using 1:4 with lines title 'ratio'
#!/usr/bin/env ruby
# Prints one CSV row per GC generation:
# generation, object count, memory (bytes), memory per object
require 'oj'

class Analyzer
  def initialize(filename)
    @filename = filename
  end

  # Sum memsize over all objects; for strings, add the byte length of the dumped value.
  def real_memory(objects)
    objects.sum do |x|
      memsize = x["memsize"] || 0
      memsize += x["value"].to_s.bytesize if x["type"] == "STRING"
      memsize
    end
  end

  def analyze
    data = []
    File.open(@filename) do |f|
      f.each_line do |line|
        begin
          data << Oj.load(line)
        rescue Oj::ParseError => e
          # Report unparsable lines on stderr so the CSV on stdout stays clean.
          warn e
          warn line
        end
      end
    end

    # Objects without a recorded generation are grouped under generation 0.
    rows = data
      .group_by { |x| x["generation"] || 0 }
      .map { |gen, objs| [gen, objs.count, real_memory(objs), real_memory(objs) / objs.count] }
    return if rows.empty?

    # Emit a row for every generation up to the newest, with zeros for the gaps.
    max = rows.map(&:first).max
    new_rows = (0..max).map { |gen| [gen, 0, 0, 0] }
    rows.each { |row| new_rows[row[0]] = row }
    puts new_rows.map { |row| row.join(", ") }
  end
end

Analyzer.new(ARGV[0]).analyze
Get this diff script as part of the gem http://blog.skylight.io/hunting-for-leaks-in-ruby/
Also, I'm looking for help maintaining this project in the future. I ask anyone interested to please send a PR and mention that you're interested in helping. You can email me if you've previously had a commit merged in this project.
A maintainer will gain commit access and deploy access on rubygems if they don't have it already. A maintainer will be expected to:
➜ project git:(ZUG-0000-memory) gem install heapy
Fetching: heapy-0.1.4.gem (100%)
Successfully installed heapy-0.1.4
1 gem installed
➜ project git:(ZUG-0000-memory) heapy -v
/Users/kevin/.rvm/gems/ruby-2.3.5/gems/heapy-0.1.4/bin/heapy:4:in `require': cannot load such file -- heapy (LoadError)
from /Users/kevin/.rvm/gems/ruby-2.3.5/gems/heapy-0.1.4/bin/heapy:4:in `<top (required)>'
from /Users/kevin/.rvm/gems/ruby-2.3.5/bin/heapy:23:in `load'
from /Users/kevin/.rvm/gems/ruby-2.3.5/bin/heapy:23:in `<main>'
from /Users/kevin/.rvm/gems/ruby-2.3.5/bin/ruby_executable_hooks:24:in `eval'
from /Users/kevin/.rvm/gems/ruby-2.3.5/bin/ruby_executable_hooks:24:in `<main>'
Any ideas why I can't run it?