Johan Sørensen

Finding leaking Ruby Objects

I’ve spent far too many nights hunting down a huge “memory leak” in Collaboa, which, through the power of Google and seasoned Ruby hackers, led me across a bunch of little tricks. This article will attempt to summarize those.
There’s some excellent reading to be done in why’s The Fully Upturned Bin, for those curious about how the garbage collector works in Ruby and some common pitfalls.

The scenario in Collaboa was that with each request the memory usage of the dispatchers would grow. By a lot. And by “a lot” I mean easily up to a few megabytes under OS X. Quite embarrassing, to say the least.

The first thing you have to do in this case is to try and isolate the memory-hungry parts. Since Collaboa uses the Subversion Ruby bindings to talk to the repository, that part was the prime suspect.

The Subversion Ruby bindings are a lovely (and I don’t mean that sarcastically) blend of SWIG-generated C, hand-rolled C and Ruby. Really, they’re great, and with complete API coverage too (in 1.3).

So, I wrote a little script which did the bare essentials of listing the directory entries in the repository, and indeed the memory kept growing, so I needed to figure out what happened.

Enter ObjectSpace which can be seen as the info kiosk for the garbage collection facility. The guy we want to talk to regarding our problem is #each_object. He does the nitty-gritty job of traversing all the currently alive objects and sends each to the block.

Let’s play with it a bit:


irb(main):001:0> ObjectSpace.each_object{|obj| puts "#{obj.class}: #{obj.inspect}"}
String: "irb/ext/tracer.rb"
String: "irb/ext/history.rb"
String: "\tend\n"
String: "_org"
...
Array: ["@prompt_mode"]
String: "@io"
String: "@irb"
Array: ["@irb", "@io"]
...

Oh yes, lots of objects (about 5000 in fact). Let’s see how many Class objects we currently have:


irb(main):002:0> ObjectSpace.each_object(Class){}
=> 312

Let’s see how many there are of each class:


irb(main):001:0> objects = Hash.new(0) 
=> {}
irb(main):002:0> ObjectSpace.each_object{|obj| objects[obj.class] += 1 }
=> 4991
irb(main):003:0> require 'pp'; pp objects.sort_by{|k,v| -v}
[[String, 3303],
 [Array, 811],
 [Class, 312],
 [Hash, 128],
 [MatchData, 121],
 [Regexp, 89],
... and so on

This little script by John Carter will do just that and a little more. Save it as MemoryProfile.rb and run:
ruby -rMemoryProfile myscript.rb and you’ll get some nice data to look at. Things to look for are unusually large objects. It’ll also dump /proc/pid/status, which will give you some memory usage info (but only on Linux).
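
Carter’s script isn’t reproduced here, so what follows is only a rough sketch of the same idea, not his code: an at_exit hook that prints a per-class census of live objects and, where /proc exists, the kernel’s status file for the process. The file name memory_profile_sketch.rb and the method names live_object_counts and memory_profile_report are made up for this illustration.

```ruby
# memory_profile_sketch.rb (hypothetical file name)
# A minimal exit-time object census; an illustration only, not MemoryProfile.rb.

# Count live objects per class. each_object(Object) skips BasicObject
# instances, which do not respond to #class.
def live_object_counts
  counts = Hash.new(0)
  ObjectSpace.each_object(Object) { |obj| counts[obj.class] += 1 }
  counts
end

# Write the `limit` most numerous classes to `io`, followed by
# /proc/<pid>/status when that file is readable (Linux only).
def memory_profile_report(io = $stderr, limit = 20)
  GC.start # ask for a collection first so garbage is less likely to be counted
  top = live_object_counts.sort_by { |_klass, count| -count }.first(limit)
  top.each { |klass, count| io.printf("%-30s %10d\n", klass, count) }

  status = "/proc/#{Process.pid}/status"
  io.puts(File.read(status)) if File.readable?(status)
end

# Run the report when the interpreter shuts down.
at_exit { memory_profile_report }
```

Saved under that name in the current directory, it could be loaded the same way: ruby -r./memory_profile_sketch myscript.rb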

Robert Klemme also shares this little thing:

Observe the following little script:


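# Print a per-class count of live objects, plus the change since the
# previous snapshot when one is passed in. Returns the new snapshot.
def ostats(last_stat = nil)
  stats = Hash.new(0)
  ObjectSpace.each_object { |o| stats[o.class] += 1 }

  # Most numerous classes first.
  stats.sort_by { |_klass, count| -count }.each do |klass, count|
    printf "%-30s  %10d", klass, count
    printf " | delta %10d", (count - last_stat[klass]) if last_stat
    puts
  end

  stats
end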

stats = nil
c = Class.new
stats = ostats(stats)
250.times{ Class.new }
stats = ostats(stats)

And now we can see there are 250 more Class objects in the second run:


Class 148
….
Class 398 | delta 250

Add the following to the end:


GC.start
stats = ostats(stats)

…And we’ll see our Class objects in the 250.times block being garbage collected right before the third time we call ostats():


Class 148 | delta -250

Very nice. We could even combine it with some things from MemoryProfile.rb, if we were on Linux, to get memory usage info each time.
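
As a sketch of what that combination might look like (my own guess at it, not code taken from MemoryProfile.rb), the helper below pulls the VmRSS line out of /proc/<pid>/status and returns nil anywhere that file doesn’t exist. The names resident_kb and report_memory are invented for this example.

```ruby
# Resident set size of the current process in kilobytes, or nil when
# /proc is unavailable (for example on OS X or Windows).
def resident_kb
  status = "/proc/#{Process.pid}/status"
  return nil unless File.readable?(status)

  line = File.foreach(status).find { |l| l.start_with?("VmRSS:") }
  line && line[/\d+/].to_i
end

# Print a labelled memory reading and return it, so that readings can be
# compared between calls.
def report_memory(label, io = $stdout)
  kb = resident_kb
  io.puts(kb ? "#{label}: #{kb} kB resident" : "#{label}: resident size unavailable")
  kb
end

report_memory("before")
junk = Array.new(100_000) { |i| "string #{i}" }
report_memory("after allocating #{junk.size} strings")
```

Calling report_memory next to each ostats call would show whether a growing object count actually lines up with a growing process.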

So all these little snippets are very helpful for someone like me, who’s both a full-on Mac zealot and someone whose workday heavily involves *nix-specific things. For Windows users, the Ruby Memory Validator seems golden, but it’s Windows-only so far. I’ve been told a Linux version, and then an Intel OS X version, might see the light of day next year, but until then all I can do is wish.

Oh, and the Collaboa leak? That’s really a different story altogether (this post is already too long), but it was ultimately because the APR (the Apache Portable Runtime library, which Subversion uses) memory pool kept growing and growing on me. Clearing that one out each time I wanted a different handle to the repository filesystem made the memory usage constant again.


Comments:

  1. gene tani Says:


    nice summary. Conceivably, you could have lots of Fixnums, True/false/NilObjects being allocated and above techniques wouldn’t catch them but otherwise..

  2. Mauricio Fernandez Says:


    Fixnums/true/false/nil are immediate values and need not be “allocated”, so the fact that they don’t get listed in ObjectSpace.each_object doesn’t really matter. Now, if you have some pathological code like
    def foo; 10_000_000.times{|x| eval("foo#{x}=x")}; lambda{} end; a = foo
    … you have worse problems than Fixnums not being reported ;)

  3. Jesper Laursen Says:


    Nice, Now I just need you to make it Subversion 1.3 compatible :)

    And btw. Merry Christmas

  4. Stefan Kaes Says:


    I think the sleep(n) calls in the provided code snippets are superfluous, since the Ruby interpreter is just a single OS thread.

  5. johan Says:


    stefan, you’re right, it’s just a single os thread, so it doesn’t even have an illustrative point..