"new" not failing correctly, malloc sometimes not failing correctly, and sometimes failing too soon

24 Mar 2014

So a nice cryptic title.

Problem 1: When 'new' is called, it is supposed to return a NULL pointer when it fails. Currently it is simply crashing. Malloc does work correctly.

Which brings us to problem 2: I lied. On the KL05 a failing malloc also crashes. I guess that comes down to uARM vs regular ARM.

Those are the easy ones; now the one I could really use some feedback on. The issue: I have a program which runs out of memory sooner than I would expect. I won't bore you with the program, so let's go to the test program:

#include "mbed.h"

int main() {
    printf("Start\r\n");
    int i = 1024;  // number of bytes requested per malloc call
    printf("i is at %u\r\n", (unsigned)&i);
    while(1) {
        void* test = malloc(i);
        int i2 = 2;  // fresh local, its address marks the current stack top
        printf("Mallocced %d bytes at %u, diff = %d\r\n", i, (unsigned)test, (int)((uint32_t)(&i2) - (uint32_t)test));
        if (test == NULL)
            break;
    }
    printf("SP = %u\r\n", (unsigned)__current_sp());
}

Now from what I understand of the memory model, i and i2 start at the top, and everything which is malloc'ed/new'ed starts at the bottom. When they touch each other you are screwed (technically malloc should place a block wherever it fits, but in this simple case that description should be correct afaik).

Now the results on the KL05Z (with a 512-byte increment, since it is fairly small):

Start
i is at 536873952
Mallocced 512 bytes at 536870256, diff = 3700
Mallocced 512 bytes at 536870776, diff = 3180
Mallocced 512 bytes at 536871296, diff = 2660
Mallocced 512 bytes at 536871816, diff = 2140
Mallocced 512 bytes at 536872336, diff = 1620
Mallocced 512 bytes at 536872856, diff = 1100
Mallocced 512 bytes at 536873376, diff = 580

It never returns the NULL pointer (here it could also be crashing on the printf, but I tried other increments also). This looks fine to me.

Now on the KL46Z, but same for some others:

i is at 536895456
Mallocced 1024 bytes at 536863416, diff = 32044
Mallocced 1024 bytes at 536864448, diff = 31012
Mallocced 1024 bytes at 536865480, diff = 29980
Mallocced 1024 bytes at 536866512, diff = 28948
Mallocced 1024 bytes at 536867544, diff = 27916
Mallocced 1024 bytes at 536868576, diff = 26884
Mallocced 1024 bytes at 536869608, diff = 25852
Mallocced 1024 bytes at 536870640, diff = 24820
Mallocced 1024 bytes at 536871672, diff = 23788
Mallocced 1024 bytes at 536872704, diff = 22756
Mallocced 1024 bytes at 536873736, diff = 21724
Mallocced 1024 bytes at 536874768, diff = 20692
Mallocced 1024 bytes at 536875800, diff = 19660
Mallocced 1024 bytes at 536876832, diff = 18628
Mallocced 1024 bytes at 536877864, diff = 17596
Mallocced 1024 bytes at 536878896, diff = 16564
Mallocced 1024 bytes at 536879928, diff = 15532
Mallocced 1024 bytes at 536880960, diff = 14500
Mallocced 1024 bytes at 536881992, diff = 13468
Mallocced 1024 bytes at 536883024, diff = 12436
Mallocced 1024 bytes at 536884056, diff = 11404
Mallocced 1024 bytes at 536885088, diff = 10372
Mallocced 1024 bytes at 536886120, diff = 9340
Mallocced 1024 bytes at 536887152, diff = 8308
Mallocced 1024 bytes at 536888184, diff = 7276
Mallocced 1024 bytes at 536889216, diff = 6244
Mallocced 1024 bytes at 0, diff = 536895460
SP = 536895456

So the good part is that it nicely returns a NULL pointer. The bad news is that it does so when there should be over 5kB of memory left. For the KL25Z and LPC1768 I get values between 4kB and 5kB of memory left.

How is this possible? Even if there is overhead, where can it be? i2 is created right there at the bottom of the stack. Test is malloced right there at the top of the heap. Is there some overhead I am not aware of that accounts for 4-5kB of SRAM memory? Did Sam add a bitcoin miner to the code? Is the NSA watching us?

So TL;DR: New doesn't fail correctly, neither does malloc on uARM (that might be by design, dunno), and something is stealing my memory. This is by the way with the latest mbed lib and the non-beta compiler. An mbed lib a few versions older shows the same behavior.

24 Mar 2014

I've noticed this before on the LPC11U24 and LPC11U35, heap and stack collision is definitely broken/disabled. I stumbled across a forum post from a few years ago about heap and stack collision being broken when the RTOS was in use, and that it had to be disabled to make it work. I'm guessing they may have disabled it across the board at that time in lieu of only disabling it if the RTOS was in use.

25 Mar 2014

Indeed I know in the past it was broken, and in RTOS it still kinda is. But while I first used RTOS, I actually moved my program away from it since I suspected it of stealing my memory. And sure, it uses a bunch of memory, but it is definitely not the source of my memory disappearing.

Now I don't have RTOS in use anymore, but in the past as you say they had broken heap/stack collision by disabling the check (although that was fixed a long time ago). And in theory that could still account for "new" crashing. But my problem here is actually the opposite: It returns a NULL pointer long before I see a reason for it to return one. From the ARM information center:

Quote:

One-region model

The application stack and heap grow towards each other in the same region of memory. See Figure 7. In this run-time memory model, the heap is checked against the value of the stack pointer when new heap space is allocated, for example, when malloc() is called.

From this I read it should keep allocating as long as the heap will not collide with the current stack pointer.

However what I get is:

Mallocced 2048 bytes at 268436144, Left over = 25864
Mallocced 2048 bytes at 268438200, Left over = 23808
Mallocced 2048 bytes at 268440256, Left over = 21752
Mallocced 2048 bytes at 268442312, Left over = 19696
Mallocced 2048 bytes at 268444368, Left over = 17640
Mallocced 2048 bytes at 268446424, Left over = 15584
Mallocced 2048 bytes at 268448480, Left over = 13528
Mallocced 2048 bytes at 268450536, Left over = 11472
Mallocced 2048 bytes at 268452592, Left over = 9416
Mallocced 2048 bytes at 268454648, Left over = 7360
Mallocced 2048 bytes at 268456704, Left over = 5304
Mallocced 2048 bytes at 0, Left over = 268462008

This one is slightly modified compared to previous one. Left over is the current stack pointer, minus the heap pointer, minus the amount I am allocating: It is how much memory should be left over after that allocation. So I have over 5kB left, yet I cannot allocate another 2kB? Why not? Also 4kB of memory has been placed on the stack by me here, which is why it starts at 26k left after the first malloc.
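
The "Left over" arithmetic described above can be sketched as a small helper. This is only an illustration: left_over is a hypothetical name, and the stack pointer value used in the usage note below (268464056) is not printed in the log but inferred from it (block address + size + left over).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: bytes that remain between the end of a freshly
// allocated heap block and the current stack pointer.
// A failed malloc (block address 0) produces a large meaningless value,
// which is what the last line of the log shows.
int32_t left_over(uint32_t stack_pointer, uint32_t heap_block, uint32_t size) {
    return static_cast<int32_t>(stack_pointer - heap_block - size);
}
```

Calling left_over(268464056u, 268456704u, 2048u) reproduces the 5304 of the last successful allocation, and passing 0 as the block address reproduces the 268462008 printed for the failed one.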

There is also the two-region model, which is less flexible and not used with the mbed setup. However the values are set to be able to use it, and they are set such that the heap is allowed to grow up to the heap limit, which is the initial stack pointer. Since I defined a 4kB block of memory on the stack, if I enable the two-region memory this is the result:

#pragma import(__use_two_region_memory)

Mallocced 2048 bytes at 268436144, Left over = 25864
Mallocced 2048 bytes at 268438200, Left over = 23808
Mallocced 2048 bytes at 268440256, Left over = 21752
Mallocced 2048 bytes at 268442312, Left over = 19696
Mallocced 2048 bytes at 268444368, Left over = 17640
Mallocced 2048 bytes at 268446424, Left over = 15584
Mallocced 2048 bytes at 268448480, Left over = 13528
Mallocced 2048 bytes at 268450536, Left over = 11472
Mallocced 2048 bytes at 268452592, Left over = 9416
Mallocced 2048 bytes at 268454648, Left over = 7360
Mallocced 2048 bytes at 268456704, Left over = 5304
Mallocced 2048 bytes at 268458760, Left over = 3248
Mallocced 2048 bytes at 268460816, Left over = 1192
Mallocced 2048 bytes at 268462872, Left over = -864
Mallocced 2048 bytes at 268464928, Left over = -2920
Mallocced 2048 bytes at 0, Left over = 268462008

So now it happily keeps allocating until it reaches the top of the SRAM memory (the initial stack pointer), and I have effectively disabled the collision detection. Of course that is far from ideal; on the other hand, if they collide your program is generally going to crash anyway (you can check if a NULL pointer is returned on large allocations of memory, but you cannot check each small one, and new variables on the stack don't have collision detection anyway: they can be happily placed in heap memory).

Of course this is a very simple program. Normally a program will crash once the heap and stack collide; nothing happens here only because I am just allocating the memory without using it, and it is only colliding with an unused array. So later I hope to try again what happens with my program which lacked memory when I set it to use the two-region memory model: Is there actually a reason why the correct model stops 4-5kB before it should stop, or do I simply have that much more memory when it is set to the two-region model?

The only thing I can think of myself is that it is a built-in 'safety margin', so the stack always has plenty of space to grow, since the stack has no collision detection. However, then why isn't this indicated in the ARM info center (http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0471c/CHDJEDFG.html)? And secondly, 4-5kB as a safety margin is a ridiculously large amount on a device which has 16kB of memory in total. 512B sounds more reasonable to me.

Still would be nice to have some input from mbed staff, or someone else who actually knows how it is supposed to work :)

26 Mar 2014

Hello erik,

Does new return NULL? I can't find how ARMCC is set up in the online compiler regarding exceptions, but new might throw an exception (if exceptions are enabled), or just signal the end of the application. Use new with std::nothrow by default (I can't recall if the ARM lib supports new without exceptions). As a result, malloc is preferred in "our" world.

Regarding malloc failing prematurely: I am not able to find the ARMCC malloc implementation, so I stepped through the disassembly a bit. The rt_heap_expand function subtracts 0x1000 from the stack pointer, so it leaves some space, as you have noticed. That would explain 4kB of difference; add some internal overhead and we are at the numbers you shared.
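
As a minimal sketch of the std::nothrow suggestion above: standard C++ offers a non-throwing form of new that yields a null pointer on failure. Whether the ARM library in the online compiler honours it is not confirmed in this thread, so treat this as an assumption to test on the target. try_alloc is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical helper: allocate a byte buffer with the non-throwing form
// of new. On failure it returns a null pointer instead of throwing
// std::bad_alloc, so the caller can test the result like with malloc.
char* try_alloc(std::size_t bytes) {
    return new (std::nothrow) char[bytes];
}
```

A caller would compare the returned pointer against NULL before using it, and release a successful allocation with delete[].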

One more thing to consider, I believe printf uses malloc to allocate internal buffers, that can also reduce the size of heap.

Regards,
0xc0170

26 Mar 2014

Hey,

Thanks for the reply. New does not return NULL and just crashes (or signals end of application :P). Exceptions should be disabled inside the online compiler at least.

Printf indeed also uses malloc, which is why I sometimes had it crash: it would succeed in allocating my test array, but then the printf function would fail. But since the test array was allocated first, and nicely returned a NULL pointer when it failed, I knew printf couldn't be the problem: if printf mallocs data inside the call itself, this does not affect the mallocing of the array. If it mallocs data only the first time it runs, then this just takes some heap space, but since I calculated the amount of space left using the stack pointer and the returned malloc pointer, this is already taken into account.

But then we come to the interesting part, the disassembly. So it subtracts 4kB from the stack pointer; add some overhead, and the fact that I was mallocing fairly large blocks, and we indeed get exactly what I observed. Now I assume that is indeed so the stack still has some space to grow, since the stack has no collision detection.

Now I don't know what others think about it, but to me reserving 4kB of memory on a device with 16kB of memory just to be sure your stack doesn't collide seems enormous overkill. Especially since in practice the result is simply that you have 4kB less memory before your program stops functioning. And then the question: Can this be changed? Or would the entire heap_extend function have to be reimplemented?
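
To make that reasoning concrete, here is a simplified model of a heap that refuses any block ending above the stack pointer minus a fixed margin. It is not the actual ARM library code (nobody in this thread has seen the source); count_successful_mallocs and its parameters are hypothetical, the 8 bytes of per-block overhead are read off the 2056-byte spacing between addresses in the 2048-byte log, and the stack pointer 268464056 is inferred from that log rather than printed in it.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model, NOT the real allocator: blocks are placed one after
// another, and a request fails as soon as the block would end above
// (stack_pointer - margin). per_block_overhead stands in for the
// bookkeeping bytes the allocator stores with each block.
int count_successful_mallocs(uint32_t first_block, uint32_t stack_pointer,
                             uint32_t size, uint32_t per_block_overhead,
                             uint32_t margin) {
    uint32_t limit = stack_pointer - margin;
    uint32_t block = first_block;
    int count = 0;
    while (block + size <= limit) {
        ++count;
        block += size + per_block_overhead;
    }
    return count;
}
```

With the values from the 2048-byte log (first block at 268436144, margin 0x1000) this model gives 11 successful allocations, the same number as in the log before the NULL pointer appears.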

26 Mar 2014

Erik - wrote:

...but in the past as you say they had broken heap/stack collision by disabling the check (although that was fixed long time ago).

When I tested heap/stack collision detection a few months ago, it crashed in the most fantastic way with gibberish spewing all over the terminal and everything. I just re-tested it using your code, and sure enough, unless the LPC11U24 has 31KB of RAM that nobody knows about, I'd say it's still broken:

Output

Start
i is at 268443616
Mallocced 1024 bytes at 268435824, diff = 7796
Mallocced 1024 bytes at 268436856, diff = 6764
Mallocced 1024 bytes at 268437888, diff = 5732
Mallocced 1024 bytes at 268438920, diff = 4700
Mallocced 1024 bytes at 268439952, diff = 3668
Mallocced 1024 bytes at 268440984, diff = 2636
Mallocced 1024 bytes at 268442016, diff = 1604
Mallocced 1024 bytes at 268443048, diff = 572
Mallocced 1024 bytes at 268444080, diff = -460
Mallocced 1024 bytes at 268445112, diff = -1492
Mallocced 1024 bytes at 268446144, diff = -2524
Mallocced 1024 bytes at 268447176, diff = -3556
Mallocced 1024 bytes at 268448208, diff = -4588
Mallocced 1024 bytes at 268449240, diff = -5620
Mallocced 1024 bytes at 268450272, diff = -6652
Mallocced 1024 bytes at 268451304, diff = -7684
Mallocced 1024 bytes at 268452336, diff = -8716
Mallocced 1024 bytes at 268453368, diff = -9748
Mallocced 1024 bytes at 268454400, diff = -10780
Mallocced 1024 bytes at 268455432, diff = -11812
Mallocced 1024 bytes at 268456464, diff = -12844
Mallocced 1024 bytes at 268457496, diff = -13876
Mallocced 1024 bytes at 268458528, diff = -14908
Mallocced 1024 bytes at 268459560, diff = -15940
Mallocced 1024 bytes at 268460592, diff = -16972
Mallocced 1024 bytes at 268461624, diff = -18004
Mallocced 1024 bytes at 268462656, diff = -19036
Mallocced 1024 bytes at 268463688, diff = -20068
Mallocced 1024 bytes at 268464720, diff = -21100
Mallocced 1024 bytes at 268465752, diff = -22132
Mallocced 1024 bytes at 268466784, diff = -23164

26 Mar 2014

And that is the up to date lib? (just making sure) That is again the problem of never getting a NULL pointer. Different from the other problems here, but good chance it is at least partially related.

Also weird that it doesn't crash earlier: Malloc also has to write a few bytes with information. You would think writing that to non-existent memory causes a crash.

26 Mar 2014

Erik - wrote:

And that is the up to date lib? That is again the problem of never getting a NULL pointer. Different from the other problems here, but good chance it is at least partially related.

Also weird that it doesn't crash earlier: Malloc also has to write a few bytes with information. You would think writing that to non-existent memory causes a crash.

Yes, revision 81.

26 Mar 2014

In this companion thread, you'll see a link I found to a reference for the ARM compiler on how it operates per the C++ standard: new does not return NULL, it throws an exception instead. I don't know what's under the covers in the C++ edition, but the interface is "class-like" and constructors don't return values...

01 Aug 2014

Hi everyone, I'm a newbie to this forum. I found this thread to be related to an issue which I'm currently debugging. Please take a look at the thread I created, https://mbed.org/questions/4220/Different-behaviors-of-New-operator-Dyna/, and share your thoughts on it.