Edge of the Stack: Find Memleaks Without Process Restarts

Alexey Odinokov - November 18, 2013

Articles and tutorials in the "Edge of the Stack" series cover fundamental programming issues and concerns that might not come up when dealing with OpenStack directly, but are certainly relevant to the OpenStack ecosystem. For example, drivers are often written in C rather than Python, leading to concerns about memory leaks and other C-specific issues. The "Edge of the Stack" series is meant to provide information on these peripheral, but still important areas.

Memory leaks are among the most frequent problems in programs written in C, and there are many ways to hunt them down. You can use a profiler such as valgrind. But what do you do if you need to find memory leaks on an embedded system or a server without special software? This article describes one solution: the glibc functions mtrace and muntrace.

To get some information about these functions, just type “man 3 mtrace” in the Unix console or see the Wikipedia definition.

The most common way to use them is shown below.

#include <stdlib.h>
#include <mcheck.h>

int main(void) { 

       mtrace(); /* Starts the recording of memory allocations and releases */

       int* a = NULL;

       a = malloc(sizeof(int)); /* Allocates memory and assigns it to the pointer */
       if (a == NULL) {
               return 1; /* error */
       }

       free(a); /* Frees the allocated memory to avoid leaks */
       muntrace();

       return 0; /* exit */

}

For mtrace to record anything, you must also export the MALLOC_TRACE variable before starting the program, as in the following example:

MALLOC_TRACE=/home/YourUserName/path/to/program/MallocTraceOutputFile.txt
export MALLOC_TRACE;

If you run this application, you will see /home/YourUserName/path/to/program/MallocTraceOutputFile.txt appear. This file will contain a log of alloc and free invocations. Now you can find all of the places where memory leaks happened using the mtrace script (man 1 mtrace):

 > mtrace /home/YourUserName/path/to/program/MallocTraceOutputFile.txt
- 0x011ed0e0 Free 2 was never alloc'd 0x2dfbab8c
- 0x011d22a0 Free 302 was never alloc'd 0x2df8b518
- 0x01a2f898 Free 19840 was never alloc'd 0x2e5791c4

Memory not freed:
-----------------
   Address 	Size 	Caller
0x011d22a0	0x168     at 0x2df8c180
0x011ed0e0  	0xf     at 0x2dfa8340
0x01a2f898 	0x10     at 0x2e57934c
0x01ad0148 	0x20     at 0x680418
0x01ad0170 	0x20     at 0x680418
0x01ad01c0 	0x34     at 0x2e502424
0x01ad01f8 	0x14     at 0x2e502424
0x01ad0210	0x400     at 0x2e502424
0x01ad0618	0xccc     at 0x506fa4

It’s a pity, but the mtrace script doesn’t show the debug info (filename and line number). You have several ways to find this information:

If you’ve compiled your application with the -g flag and didn’t strip it, you can open /home/YourUserName/path/to/program/MallocTraceOutputFile.txt and find the line with the same “address” and “caller” as the report entry you are interested in, for example:
```
for report line
0x01a2f898 	0x10  at 0x2e57934c
you can find line
@ /isan/lib/libavl.so:(avl_probe+0x40)[0x2e57934c] + 0x1a2f898 0x10
Now you know that the allocation happened in the avl_probe function
```
If you run a stripped binary file (for example, because there is not enough space on the hardware for an unstripped one), this information will be absent from the log file.
But if you have not stripped the binary file, you can use addr2line (man 1 addr2line) to convert the address to source filename:linenumber format. Just use addr2line -e <exename> <addr>. This won’t work if the allocation is made inside a library. The other way is to use the GNU Debugger (gdb). gdb also lets you examine compiled applications that don’t call mtrace themselves.
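To make the log format concrete, here is a minimal sketch in C that pulls the module, function, offset, and caller address out of a trace line such as the avl_probe example above. The names call_site and parse_call_site are hypothetical, and the format string assumes only the layout of that one sample line. Lines without a function name (as from a stripped binary) are rejected.

```c
#include <stdio.h>

/* Hypothetical container for the call-site part of one trace line. */
struct call_site {
    char module[256];      /* e.g. /isan/lib/libavl.so */
    char function[128];    /* e.g. avl_probe */
    unsigned long offset;  /* offset inside the function, e.g. 0x40 */
    unsigned long caller;  /* address printed in square brackets */
};

/*
 * Parse a line shaped like
 *   @ <module>:(<function>+<offset>)[<caller>] ...
 * Returns 0 on success, -1 if the line does not have that shape.
 */
int parse_call_site(const char *line, struct call_site *out)
{
    int fields = sscanf(line, "@ %255[^:]:(%127[^+]+%lx)[%lx]",
                        out->module, out->function,
                        &out->offset, &out->caller);

    return fields == 4 ? 0 : -1;
}
```

The caller address recovered this way is the same value that appears in the “Caller” column of the mtrace report, so it can be used to look up the matching function name in the log.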

If you have a daemon that starts on system startup, you can attach gdb to it, call mtrace, and get the memory log at runtime:

> pidof cfgmgr
1057
> gdb
(gdb) attach 1057
Attaching to process 1057
Reading symbols from /itasca/bin/cfgmgr...done.
Using host libthread_db library "/lib/tls/libthread_db.so.1".
Reading symbols from /lib/tls/libpthread.so.0...done.
[Thread debugging using libthread_db enabled]
[New Thread 0x2e778310 (LWP 1057)]
[New Thread 0x3c26e4d0 (LWP 1086)]
[New Thread 0x3b9ff4d0 (LWP 1085)]
[New Thread 0x3b0c94d0 (LWP 1084)]
[New Thread 0x3a8c94d0 (LWP 1083)]
[New Thread 0x322024d0 (LWP 1075)]
[New Thread 0x305e14d0 (LWP 1074)]
[New Thread 0x2fde14d0 (LWP 1073)]
Loaded symbols for /lib/tls/libpthread.so.0
...<some info was removed>...
0x2aae0168 in pthread_join () from /lib/tls/libpthread.so.0
(gdb)

Now gdb is attached to cfgmgr. Next, set MALLOC_TRACE and call mtrace:

(gdb)  call setenv("MALLOC_TRACE", "/mnt/cf/MallocTrace.txt",1)
(gdb) call mtrace()
$4 = 0

If you have opened another console to reproduce leaks, you can just continue with the process:

(gdb) continue
Continuing.

To stop and call muntrace, just hit Ctrl+C and do the following:

(gdb) call muntrace()
$5 = 6774264
(gdb) detach
(gdb) quit

Now you can analyze your /mnt/cf/MallocTrace.txt.

If you don’t have another console to reproduce leaks, you should detach from the daemon and quit gdb right after the mtrace invocation. To stop logging, attach to the daemon once more and call muntrace. Do not forget to detach gdb afterwards.

Analyzing the logs is difficult, mainly because they contain no stacktrace. I’ve found only one way so far to get one: set a gdb breakpoint on the caller address, as in the following example:

(gdb) break *0x2e502424
Breakpoint 1 at 0x2e502424
(gdb) continue
Continuing.
[Switching to Thread 0x3b9ff4d0 (LWP 1085)]

Breakpoint 1, 0x2e502424 in g_malloc () from /isan/lib/libglib-2.0.so.0
(gdb) bt
#0  0x2e502424 in g_malloc () from /isan/lib/libglib-2.0.so.0
#1  0x2e4e7a7c in g_hash_table_new_full () from /isan/lib/libglib-2.0.so.0
#2  0x2dd5b9c8 in acfg_elem_info_create () from /isan/lib/libacfg.so
#3  0x2dd5bab4 in acfg_elem_add () from /isan/lib/libacfg.so
#4  0x0080a478 in cm_process_acfg_msg (msg_ref=0x3b9f9ef4)
	at ../security/cfgmgr/cm_acfg_gen.c:21801
#5  0x005b0d58 in cm_process_mts_msg_show (msg=0x3b9f9ef4)
	at ../security/cfgmgr/cm_main.c:1652
#6  0x0094cff4 in cm_ui_subsys_thread (arg=)
	at ../security/cfgmgr/cm_ui_interface.c:83
#7  0x2aade590 in start_thread () from /lib/tls/libpthread.so.0
#8  0x2e005dcc in __thread_start () from /lib/tls/libc.so.6
Backtrace stopped: frame did not save the PC

So, there’s the stacktrace.

To round things out, I’ll explain some possible improvements for this method.

Large log files are not allowed on some systems, for example, on embedded systems. To avoid them, we can create a FIFO object (a named pipe in Unix-like systems) and attach the mtrace script to it.

mkfifo /tmp/MallocTrace.txt
mtrace /tmp/MallocTrace.txt > /mnt/cf/FinalResult.txt &
... our manipulation with gdb
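The matching that the script performs on this stream can be pictured with a short C sketch. This is not the mtrace script itself, only an illustration under assumptions: the names live_block, track_line, and outstanding_bytes are hypothetical, the table size is arbitrary, and only “+” (allocation) and “-” (free) records are handled, so any other record type is ignored.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LIVE 1024   /* assumed capacity, chosen arbitrarily for the sketch */

struct live_block {
    unsigned long addr;
    unsigned long size;
    unsigned long caller;
    int used;
};

static struct live_block live[MAX_LIVE];

/* Feed one trace line. Returns 0 if it was recorded, -1 if it was ignored. */
int track_line(const char *line)
{
    unsigned long caller, addr, size = 0;
    char op;
    const char *bracket = strrchr(line, '[');
    int i;

    if (line[0] != '@' || bracket == NULL)
        return -1;
    if (sscanf(bracket, "[%lx] %c %lx %lx", &caller, &op, &addr, &size) < 3)
        return -1;

    if (op == '+') {                      /* allocation: remember the block */
        for (i = 0; i < MAX_LIVE; i++) {
            if (!live[i].used) {
                live[i].addr = addr;
                live[i].size = size;
                live[i].caller = caller;
                live[i].used = 1;
                return 0;
            }
        }
        return -1;                        /* table full */
    }
    if (op == '-') {                      /* free: forget the matching block */
        for (i = 0; i < MAX_LIVE; i++) {
            if (live[i].used && live[i].addr == addr) {
                live[i].used = 0;
                return 0;
            }
        }
        return -1;                        /* free of a block never seen */
    }
    return -1;                            /* other record types are ignored */
}

/* Sum of the sizes of all blocks allocated but not yet freed. */
unsigned long outstanding_bytes(void)
{
    unsigned long total = 0;
    int i;

    for (i = 0; i < MAX_LIVE; i++)
        if (live[i].used)
            total += live[i].size;
    return total;
}
```

Because only the blocks that are still live are kept in memory, a reader of this kind does not need the whole log on disk, which is the point of feeding it through a FIFO.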

I’ve mentioned that the final mtrace result doesn’t contain the caller name, so the first improvement is to make the mtrace script show this information. (See the fixed mtrace script.)
The next improvement is to show the trace at any time you want, without calling muntrace. I’d like to improve the mtrace script by adding a signal handler that shows run-time information about leaks. This will allow us to trace memory over a long period. (See the fixed mtrace script.)

I’ve also modified the script to show links between the alloc and free invocations, which I think you will find helpful. (See the modified script.)

legend
alloc: <place of allocation> <times of allocation>:
      alloc_free: <times of 'free' invocation> -> <place of 'free' invocation (previously allocated)> (overall <overall times of 'free' invocation>)

The final working sequence of commands follows:

mkfifo /tmp/mtrace.fifo
/mnt/cf/mtracemap.pl /tmp/mtrace.fifo &
pidof cfgmgr
1065
/itasca/bin/gdb
attach 1065
call setenv("MALLOC_TRACE", "/tmp/mtrace.fifo",1)
call mtrace()
detach
quit

...do something to get leaks...
e.g.
checkpoint rollback new
y
checkpoint rollback xxx
y

/itasca/bin/gdb
attach 1065
call muntrace()
detach
quit

Now you’ll be able to find memory leaks in your programs using mtrace. Tracing memory allocations is very useful when you can’t use special software. It can also save you time once you have a leak suspect. In that case, just put the debug output into the suspicious segment of code and execute the program.

I hope that you find this information helpful. You can take a look at the final code in appendixes A and B.
