Memory leak
From HalyardWiki
This page is for notes on our memory leak bug. Any relevant information, such as assumptions we've made, dead ends we've hit, possible ideas to try next, debugging information dumps, etc. should be on this page or linked from here, so we have one place to keep track of the information and one place to refer anyone who might be able to help.
Contents |
[edit] Matthew Flatt's debugging instructions
These are from a personal e-mail I received from Matthew Flatt.
Meanwhile, here's how I track down memory leaks:1. Build 3m with
--enable-backtrace.2. Pick a built-in datatype to use as a kind of radioactive tracer. Create an instance of the datatype and reference it from the data that isn't being GC'ed.
A will executor usually works well as a tracer (because they're rarely used otherwise). So, maybe run
(define w (make-will-executor))in the sandbox.
3. Create several instances of the thing that should be GC'ed (but isn't), each of which should have an associated tracer.
4. Use
(dump-memory-stats '<....>)to get information on why the tracer isn't collected.Example:
(dump-memory-stats '<will-executor>)You should get a path from the root to each tracer, and that path might tell you something interesting. You may still need help understanding the trace, though. If an element in the trace is something like
<#158>, that's an instance of an internal type that doesn't have a name; the number corresponds to an entry instypes.h.
[edit] Experimental engine branch
We have an experimental version of Halyard that uses sandbox.ss and disables all compilation and errortrace machinery:
git clone git://imlsrc.dartmouth.edu/halyard cd halyard git checkout --track -b sandbox origin/sandbox
This version of Halyard is quite slow, because we've disabled all script-compilation code.
[edit] Background information
We've had a serious memory leak in our engine for at least 2 years now, if not more. The symptoms are that you press reload or Control-R to reload the script, and the engine leaks several megabytes of memory; as of now, it's somewhere around 20-50 MBs of memory per reload, depending on how much code is being reloaded and other factors. This means that an engine can quickly eat up all available memory after a dozen or so reloads, and as it takes more memory, garbage collection slows down, since there is more memory to scan.
The memory leak has been around since at least the 0.0.x series of engines, as far as I can recall, but it became much worse after the upgrade to PLT v360 (commit 87e1e1e7). It has been getting steadily worse as time goes on, as our code base in both the engine and our programs grow; the amount of memory leaks seems to be roughly proportional to the amount of code we are loading.
[edit] Halyard's Scheme interface
Our C++/Scheme interface lives in Common/lang/scheme. The important classes are:
- TSchemeInterpreterManager: Created when the application starts, shut down on exit.
- TSchemeInterpreter: Created when a script is loaded, and destroyed before reloading the script or shutting down.
- TSchemePtr: A smart pointer based on an immobile box.
- TSchemeReg: An exception-safe version of MZ_GC_REG.
Conversion to and from Scheme values is performed in TSchemeConv.cpp.
On the Scheme side, the important files are in test/Runtime/halyard:
- loader.ss (or the loader subdirectory): Manages namespaces and/or sandboxes.
- kernel.ss: The guts of our runtime environment. Lots of C++/Scheme interfaces in here.
[edit] Attempted solutions
We've thought we found the issue many times, and in some cases we did find some other bug or a bug that was exacerbating the memory leak, but none of these seem to have fixed the underlying problem.
- Fixed bug in
loader.ssthat caused ourcurrent-load/used-compiledhandler to keep a reference to the previous version on every reload. - Upgraded to PLT v372 with the precise garbage collector
- Fixed a bug where we forgot to clear our
TSchemePtrs in their destructors. - Added custodians and sandboxes to each script load in order to try to forcibly dump all state when we reload and retain as little as possible between reloads.
- Added a count of all of the
TSchemePtrs we retain, and assert that there is exactly 1 (the pointer to thescheme_basic_env) between loads.
[edit] Debugging data
We've tried various approaches to collecting more data on the problem. After running the memory tools from Apple's Instruments on the Mac version of the engine, it turns out that these tools aren't going to be very helpful for tracking down the leak; they only track allocations performed by malloc (and the other standard allocator functions), while the garbage collector uses vm_allocate to allocate memory a page at a time. You can verify that the leaked memory is indeed coming from vm_allocate by inspecting the output of vmmap Halyard after reloading the program; the number that increases after each reload is the amount marked with region type VM_ALLOCATE.
There is some memory debugging machinery built into the precise garbage collector. You can run (dump-memory-stats) to get some memory statistics printed to standard out (which you can get at from Console.app on Mac OS X). Here's an example of the output after a few reloads:
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Begin Dump 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Begin MzScheme3m 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <syntax-code>: 23685 378960 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <application-code>: 57215 2129784 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <unary-application-c: 168281 2692496 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <binary-application-: 90173 1803460 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <sequence-code>: 10313 230600 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <branch-code>: 71216 1424320 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <procedure-code>: 38152 1526080 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <let-value-code>: 5467 131208 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <let-void-code>: 2914 46624 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <letrec-code>: 89 1780 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <let-one-code>: 77208 1235328 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <with-continuation-m: 1333 26660 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <procedure-semi-code: 22 880 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <let-value-semi-code: 939 26292 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <let-void-semi-code>: 724 14480 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <syntax-semi-code>: 278 4448 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <global-variable-cod: 1267022 30408528 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <module-variable-cod: 30043 841204 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <primitive>: 182573 6572356 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <primitive-closure>: 10825 346400 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <procedure>: 4104 49248 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <escape-continuation: 2 64 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <struct>: 39227 927688 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <procedure>: 934796 16939284 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <struct>: 57102 1319344 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <bignum-integer>: 25 400 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <fractional-number>: 3 48 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <complex-number>: 2 32 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <string>: 22630 362080 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <byte-string>: 5757 92112 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <unix-path>: 9164 146624 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <pair>: 2467905 39486480 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <vector>: 215931 7388168 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <inspector>: 54088 865408 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <input-port>: 29 5220 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <output-port>: 8 960 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <macro>: 188185 2258220 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <box>: 48100 577200 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <thread>: 2 1064 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <semaphore>: 2750 55000 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <hash-table>: 36320 1452800 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <weak-box>: 19078 457872 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <ephemeron>: 271 5420 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <struct-type>: 25254 1630560 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <module-path-index>: 112316 3144848 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <set!-transformer>: 28544 342528 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <namespace>: 91861 10288432 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <parameterization>: 331 7944 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <syntax>: 552659 17685088 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <custodian>: 4 256 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <regexp>: 2560 560956 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <hash-table-bucket>: 208150 3330400 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <hash-table>: 187515 6750540 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <wrap-chunk>: 15106 743624 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <struct-property>: 3043 48688 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <case-lambda-code>: 424 12760 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <begin0-code>: 122 2560 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <rename-table>: 3828 137808 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <module-code>: 775 105400 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <lazy-macro>: 22 352 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <resolve-prefix>: 3565 128340 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <security-guard>: 2 48 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <rename-transformer>: 765 9180 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <evt-set>: 1 20 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <channel>: 3 72 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <module-registry>: 5 200 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <thread-set>: 2 64 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <thread-cell>: 11926 238520 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <channel-syncer>: 1 36 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <certifications>: 416224 14984064 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <internal-definition: 8 128 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <internal:lexical-ri: 20 400 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <chain>: 142 3408 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <raw-pair>: 262359 4197744 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,139: 106 5512 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <continuation-prompt: 1809 28944 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,145: 3 533736 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,146: 160 16000 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,148: 1 76 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,149: 1 72 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,152: 2 48 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,154: 859 20616 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,156: 1 124 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,157: 35 2100 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,160: 70 3920 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,164: 30 960 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,168: 7 168 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,170: 27 864 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,178: 10792 259008 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,181: 129482 2589640 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,184: 35 840 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,185: 35 1120 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,190: 2 24 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,191: 31488 881664 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,193: 1 48 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,194: 105 2520 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,201: 14 3808 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,204: 25252 1010080 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,205: 1420 62480 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,207: 752 18048 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,208: 2256 90240 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] unknown,212: 4 160768 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] <ctype>: 25 500 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] End MzScheme3m 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Generation 0: 22486628 of 33554432 bytes used 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Generation 1 [tagged]: 192964952 bytes used in 11789 pages 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Generation 1 [atomic]: 12525176 bytes used in 768 pages 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Generation 1 [array]: 44169132 bytes used in 2802 pages 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Generation 1 [tagged array]: 1024 bytes used in 1 pages 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Generation 1 [xtagged]: 0 bytes used in 0 pages 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Generation 1 [big]: 5826480 bytes used in 350 pages 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Current memory use: 277973392 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Peak memory use after a collection: 255486764 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Allocated (+reserved) page sizes: 297107456 (+163840) 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] # of major collections: 4 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] # of minor collections: 61 4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] # of installed finalizers: 8159
[edit] current-memory-use
Here are some memory use statistics collected from our listener, when running under the "sandbox" build:
(collect-garbage) ==> #<void> (values (current-memory-use) (current-memory-use (current-custodian))) ==> (values 89189140 85929544)
I then reloaded the code (without making any changes) and got the following numbers:
(collect-garbage) ==> #<void> (values (current-memory-use) (current-memory-use (current-custodian))) ==> (values 169492232 164279680)
That's a memory leak of over 74 MB.

