Memory leak

From HalyardWiki

Jump to: navigation, search

This page is for notes on our memory leak bug. Any relevant information, such as assumptions we've made, dead ends we've hit, possible ideas to try next, debugging information dumps, etc. should be on this page or linked from here, so we have one place to keep track of the information and one place to refer anyone who might be able to help.

Contents

[edit] Matthew Flatt's debugging instructions

These are from a personal e-mail I received from Matthew Flatt.

Meanwhile, here's how I track down memory leaks:

1. Build 3m with --enable-backtrace.

2. Pick a built-in datatype to use as a kind of radioactive tracer. Create an instance of the datatype and reference it from the data that isn't being GC'ed.

A will executor usually works well as a tracer (because they're rarely used otherwise). So, maybe run

(define w (make-will-executor))

in the sandbox.

3. Create several instances of the thing that should be GC'ed (but isn't), each of which should have an associated tracer.

4. Use (dump-memory-stats '<....>) to get information on why the tracer isn't collected.

Example:

(dump-memory-stats '<will-executor>)

You should get a path from the root to each tracer, and that path might tell you something interesting. You may still need help understanding the trace, though. If an element in the trace is something like <#158>, that's an instance of an internal type that doesn't have a name; the number corresponds to an entry in stypes.h.

[edit] Experimental engine branch

We have an experimental version of Halyard that uses sandbox.ss and disables all compilation and errortrace machinery:

git clone git://imlsrc.dartmouth.edu/halyard
cd halyard
git checkout --track -b sandbox origin/sandbox

This version of Halyard is quite slow, because we've disabled all script-compilation code.

[edit] Background information

We've had a serious memory leak in our engine for at least 2 years now, if not more. The symptoms are that you press reload or Control-R to reload the script, and the engine leaks several megabytes of memory; as of now, it's somewhere around 20-50 MBs of memory per reload, depending on how much code is being reloaded and other factors. This means that an engine can quickly eat up all available memory after a dozen or so reloads, and as it takes more memory, garbage collection slows down, since there is more memory to scan.

The memory leak has been around since at least the 0.0.x series of engines, as far as I can recall, but it became much worse after the upgrade to PLT v360 (commit 87e1e1e7). It has been getting steadily worse as time goes on, as our code base in both the engine and our programs grow; the amount of memory leaks seems to be roughly proportional to the amount of code we are loading.

[edit] Halyard's Scheme interface

Our C++/Scheme interface lives in Common/lang/scheme. The important classes are:

  • TSchemeInterpreterManager: Created when the application starts, shut down on exit.
  • TSchemeInterpreter: Created when a script is loaded, and destroyed before reloading the script or shutting down.
  • TSchemePtr: A smart pointer based on an immobile box.
  • TSchemeReg: An exception-safe version of MZ_GC_REG.

Conversion to and from Scheme values is performed in TSchemeConv.cpp.

On the Scheme side, the important files are in test/Runtime/halyard:

  • loader.ss (or the loader subdirectory): Manages namespaces and/or sandboxes.
  • kernel.ss: The guts of our runtime environment. Lots of C++/Scheme interfaces in here.

[edit] Attempted solutions

We've thought we found the issue many times, and in some cases we did find some other bug or a bug that was exacerbating the memory leak, but none of these seem to have fixed the underlying problem.

  • Fixed bug in loader.ss that caused our current-load/used-compiled handler to keep a reference to the previous version on every reload.
  • Upgraded to PLT v372 with the precise garbage collector
  • Fixed a bug where we forgot to clear our TSchemePtrs in their destructors.
  • Added custodians and sandboxes to each script load in order to try to forcibly dump all state when we reload and retain as little as possible between reloads.
  • Added a count of all of the TSchemePtrs we retain, and assert that there is exactly 1 (the pointer to the scheme_basic_env) between loads.

[edit] Debugging data

We've tried various approaches to collecting more data on the problem. After running the memory tools from Apple's Instruments on the Mac version of the engine, it turns out that these tools aren't going to be very helpful for tracking down the leak; they only track allocations performed by malloc (and the other standard allocator functions), while the garbage collector uses vm_allocate to allocate memory a page at a time. You can verify that the leaked memory is indeed coming from vm_allocate by inspecting the output of vmmap Halyard after reloading the program; the number that increases after each reload is the amount marked with region type VM_ALLOCATE.

There is some memory debugging machinery built into the precise garbage collector. You can run (dump-memory-stats) to get some memory statistics printed to standard out (which you can get at from Console.app on Mac OS X). Here's an example of the output after a few reloads:

4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Begin Dump 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Begin MzScheme3m 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]          <syntax-code>:      23685     378960 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]     <application-code>:      57215    2129784 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]   <unary-application-c:     168281    2692496 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]   <binary-application-:      90173    1803460 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]        <sequence-code>:      10313     230600 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]          <branch-code>:      71216    1424320 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]       <procedure-code>:      38152    1526080 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]       <let-value-code>:       5467     131208 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]        <let-void-code>:       2914      46624 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]          <letrec-code>:         89       1780 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]         <let-one-code>:      77208    1235328 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]   <with-continuation-m:       1333      26660 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]   <procedure-semi-code:         22        880 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]   <let-value-semi-code:        939      26292 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]   <let-void-semi-code>:        724      14480 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]     <syntax-semi-code>:        278       4448 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]   <global-variable-cod:    1267022   30408528 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]   <module-variable-cod:      30043     841204 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            <primitive>:     182573    6572356 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]    <primitive-closure>:      10825     346400 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            <procedure>:       4104      49248 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]   <escape-continuation:          2         64 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]               <struct>:      39227     927688 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            <procedure>:     934796   16939284 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]               <struct>:      57102    1319344 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]       <bignum-integer>:         25        400 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]    <fractional-number>:          3         48 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]       <complex-number>:          2         32 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]               <string>:      22630     362080 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]          <byte-string>:       5757      92112 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            <unix-path>:       9164     146624 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]                 <pair>:    2467905   39486480 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]               <vector>:     215931    7388168 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            <inspector>:      54088     865408 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]           <input-port>:         29       5220 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]          <output-port>:          8        960 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]                <macro>:     188185    2258220 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]                  <box>:      48100     577200 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]               <thread>:          2       1064 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            <semaphore>:       2750      55000 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]           <hash-table>:      36320    1452800 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]             <weak-box>:      19078     457872 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            <ephemeron>:        271       5420 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]          <struct-type>:      25254    1630560 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]    <module-path-index>:     112316    3144848 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]     <set!-transformer>:      28544     342528 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            <namespace>:      91861   10288432 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]     <parameterization>:        331       7944 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]               <syntax>:     552659   17685088 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            <custodian>:          4        256 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]               <regexp>:       2560     560956 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]    <hash-table-bucket>:     208150    3330400 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]           <hash-table>:     187515    6750540 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]           <wrap-chunk>:      15106     743624 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]      <struct-property>:       3043      48688 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]     <case-lambda-code>:        424      12760 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]          <begin0-code>:        122       2560 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]         <rename-table>:       3828     137808 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]          <module-code>:        775     105400 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]           <lazy-macro>:         22        352 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]       <resolve-prefix>:       3565     128340 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]       <security-guard>:          2         48 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]   <rename-transformer>:        765       9180 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]              <evt-set>:          1         20 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]              <channel>:          3         72 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]      <module-registry>:          5        200 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]           <thread-set>:          2         64 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]          <thread-cell>:      11926     238520 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]       <channel-syncer>:          1         36 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]       <certifications>:     416224   14984064 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]   <internal-definition:          8        128 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]   <internal:lexical-ri:         20        400 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]                <chain>:        142       3408 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]             <raw-pair>:     262359    4197744 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,139:        106       5512 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]   <continuation-prompt:       1809      28944 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,145:          3     533736 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,146:        160      16000 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,148:          1         76 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,149:          1         72 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,152:          2         48 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,154:        859      20616 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,156:          1        124 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,157:         35       2100 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,160:         70       3920 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,164:         30        960 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,168:          7        168 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,170:         27        864 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,178:      10792     259008 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,181:     129482    2589640 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,184:         35        840 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,185:         35       1120 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,190:          2         24 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,191:      31488     881664 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,193:          1         48 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,194:        105       2520 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,201:         14       3808 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,204:      25252    1010080 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,205:       1420      62480 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,207:        752      18048 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,208:       2256      90240 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]            unknown,212:          4     160768 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902]                <ctype>:         25        500 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] End MzScheme3m 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Generation 0: 22486628 of 33554432 bytes used 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Generation 1 [tagged]: 192964952 bytes used in 11789 pages 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Generation 1 [atomic]: 12525176 bytes used in 768 pages 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Generation 1 [array]: 44169132 bytes used in 2802 pages 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Generation 1 [tagged array]: 1024 bytes used in 1 pages 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Generation 1 [xtagged]: 0 bytes used in 0 pages 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Generation 1 [big]: 5826480 bytes used in 350 pages 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Current memory use: 277973392 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Peak memory use after a collection: 255486764 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] Allocated (+reserved) page sizes: 297107456 (+163840) 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] # of major collections: 4 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] # of minor collections: 61 
4/25/08 11:59:15 PM [0x0-0x358358].Halyard[35902] # of installed finalizers: 8159 

[edit] current-memory-use

Here are some memory use statistics collected from our listener, when running under the "sandbox" build:

(collect-garbage)
==> #<void>

(values (current-memory-use) (current-memory-use (current-custodian)))
==> (values 89189140 85929544)

I then reloaded the code (without making any changes) and got the following numbers:

(collect-garbage)
==> #<void>

(values (current-memory-use) (current-memory-use (current-custodian)))
==> (values 169492232 164279680)

That's a memory leak of over 74 MB.

Personal tools