General C Help; Memory Management, Passing Pointers, Designing Objects

# 10 Mar 2014

Hi everyone, I'm hoping someone versed in C can help me to understand some basic concepts.

I seem to run into trouble when I try to factor my code into nice neat objects intended to encapsulate some functionality.

I also am unclear on when I need to use "delete" to clean up memory I've allocated.

In general, I'm trying to do simple things, like write an object with some methods to gather MCU data and provide a string that can be written to a serial instance. What's the best way to go about something like this?

Suppose I want to be able to write something like (pseudo code):

  MessageBuilder builder = new MessageBuilder();
  char* result = "";
  builder.getSomeData(param1, param2, result);
  pc.printf(result);
  delete result;

So I've defined a "MessageBuilder" object. This object offers a "getSomeData()" method that requires two input parameters and then fills a "string" output parameter. How should the method be written? How should the result variable be passed into the method? Should I be calling "delete result;" after I'm done? If I assign the value to result to some other "string", did I make a copy or will a call to "delete" corrupt my original result variable? Is an output parameter the way to go, or should the method return the result (I get non-pod class passed through an ellipsis, or something like that, warning when I try to return a "string" or the program just locks up).

I'm just not clear on when to use char[xxx], char*, or <string> and what the ramifications of each might be (or how to cleanup after myself properly when using any one of the three). I'm also a little confused on when a variable declaration only creates a pointer and when it also generates a value (how exactly value types and reference types are handled and where arrays fit in).

I know I have code that "works", but it is subject to random lockups which I'm pretty sure are related to poor memory management on my part. I'm pretty keen in a managed environment like .Net so have a solid grasp on the concepts involved, I just don't entirely understand how it works in low level C in the mBed environment (the mBed firmware is performing a certain amount of memory management, right?).

I'm also not clear on the allowed usage of statements like printf. I believe I remember reading some caveats about it but am not clear if I can, say, create a serial instance, pass a reference to it to another class instance, and then have that class execute its printf() statement. For example, suppose the above code was more like:

  MessageBuilder builder = new MessageBuilder();
  char* result = "";
  builder.getSomeData(param1, param2, result);
  remote.write(result);
  delete result;

And then "remote" was an instance of an object with a method like:

  void MyObject::write(char* data) {
      radio.printf(data);
  }

Where "radio" is a class level instance of Serial. Is it valid to pass the char* off like that and call printf() on radio?

Would anyone be willing to work with me briefly to help me understand some of these basic concepts?


Erik -

# 10 Mar 2014

Every time you use new or malloc, you need to release the memory again with delete (or free in the case of malloc). The only exception is something that is created once and should never be removed.

However you often do not need to use new. The correct way to use new btw is:

MessageBuilder *builder = new MessageBuilder();

Since it returns a pointer to an object of type MessageBuilder. However what you can also do is:

MessageBuilder builder;

This also gives you a MessageBuilder, but as an ordinary local variable instead of one allocated with new. Sometimes you do need new, but generally this is the way to go. The object is only valid within the scope it was created in, so the good part is that it is automatically deleted for you. If, for example, you make it inside a function, it will be gone when the function finishes. The bad part is that it is deleted even if you don't want it to be (then you need new, but usually you just define it at a higher level).
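To make the difference in lifetime concrete, here is a minimal sketch that should build with any desktop C++ compiler. The Tracker class and its liveCount counter are invented for illustration and are not part of mbed; the counter simply records how many objects exist at the moment.

```cpp
// Hypothetical helper class: counts how many instances are currently alive.
class Tracker {
public:
    Tracker()  { ++liveCount; }
    ~Tracker() { --liveCount; }
    static int liveCount;
};
int Tracker::liveCount = 0;

// Automatic object: destroyed when the function returns.
int countInsideAutomaticScope() {
    Tracker t;                    // constructed here
    return Tracker::liveCount;    // value is read while t is still alive
}                                 // t is destroyed here

// Heap object: stays alive until someone calls delete on it.
Tracker *makeOnHeap() {
    return new Tracker();         // the caller now owns the object
}
```

After countInsideAutomaticScope() returns, the counter is back at zero without any delete. The object from makeOnHeap() keeps the counter at one until the caller deletes the returned pointer.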

You can indeed pass a serial instance to another function/class. The correct way to do this is to pass only a pointer to the serial instance. (Make sure this serial instance is not destroyed while it is still needed! For example, if you make a Serial object in a function, pass a pointer to it to a class to use later, and the function ends, then the pointer becomes invalid.)

Also take into account when you are using a pointer to an object, you call functions using '->'. For example:

#include "mbed.h"

void functionWhichUsesSerial(Serial *serial) {
  serial->printf("Hello\r\n");
}

Serial pc(USBTX, USBRX);

int main() {
  functionWhichUsesSerial(&pc);  // pass pointer to pc
}

In general using <string> is not a good idea on a microcontroller: it is fairly memory intensive, and a char array is better. If you do char* test;, you define a pointer to a char, without it actually pointing to anything. If you do char test[100];, then 100 chars are reserved in memory, and when you pass test to a function it behaves like a pointer to the first element of that array. (Again, if you do this in a function, the array will be automatically removed once the function ends.)
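A small sketch of the array versus pointer distinction, assuming a desktop C++ compiler. The function names are made up for illustration. The point is that the function owning the array knows its real size, while a function that receives it only sees a pointer.

```cpp
#include <stddef.h>

// A char array passed to a function arrives as a plain pointer:
// the size information is lost here.
size_t sizeSeenByCallee(char *text) {
    return sizeof(text);          // size of a pointer, not of the array
}

// Returns the size the owner sees for a local 100-element array.
size_t sizeSeenByOwner() {
    char test[100];               // 100 bytes reserved, gone when we return
    test[0] = '\0';
    return sizeof(test);          // the owner still knows the real size
}
```

This is why a function that fills a char buffer usually needs the buffer length as a separate argument.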

In general I think it is a good idea to start with something a bit more basic. Dynamic memory allocation can be very useful, but it is not something you need that much at the beginning.


Reed Kimble

# 10 Mar 2014

Thank you so much! This is a great help, but I have some follow up questions (hope that's ok!).

If I understood you correctly about "new", you say that within a method:

MessageBuilder builder;

Will fall out of scope and be destroyed automatically when the function ends. However,

 MessageBuilder *builder = new MessageBuilder();

Will leave the builder instance alive even after the currently executing method exits? Did I get that right, or misunderstand?

If I use char* str = char[xxx] within a function, the value that str points to will be lost when the function ends, correct?

If I declare my method as

void myMethod(char* result)

How do I allocate the memory for result? Suppose that the caller does not know how many characters result will be? The code within myMethod determines the final size and contents of the result parameter. Do I need to initialize the array to some arbitrarily large enough value? Here is where I resorted to just using string because I didn't need to worry about the length...

Erik -

# 11 Mar 2014

First part: correct.

Second part, I don't think that is correct syntax, simply use:

char str[xxx];

As for your function: the advantage of doing it like that, with the caller supplying the buffer, is that the caller takes care of the memory and the function itself doesn't need to worry about it. This is a way better method than returning a pointer, where you always have to remember to deallocate it:

char result[100];
myMethod(result);

This will do fine. And yes, you need to make the array large enough that it is always bigger than what myMethod requires (if that size varies, you can add a second argument which is the length). A string has advantages, and I do see you have the LPC1768, which has quite a lot of memory, but I haven't used it myself, so I'm not familiar with it.

Andy A

# 11 Mar 2014

void someFunction() {
  char result[100];
  MessageBuilder builder;
  builder.getSomeData(param1, param2, result);
  remote.write(result);
}

void MessageBuilder::getSomeData(int param1, int param2, char *result) {
  sprintf(result, "The values were %d and %d", param1, param2);
}

All works fine and will automatically de-allocate all the memory used when the function exits.

There are however 2 potential issues, both to do with the size of result[]. Keep in mind that neither of these are really issues for typical home/learners code, just things to keep in mind for the future so that you don't slip into bad habits.

Firstly and not too critical is that result will always cause 100 bytes to be allocated even if you only end up needing a few. Probably not an issue in this situation but something to keep in mind if memory starts getting tight or if you are creating arrays of something larger than a byte.

The second potential issue is that there is no way for builder.getSomeData() to know how big the result string can be. If it tries to write past the size allocated then all sorts of nasty things can happen. Normally things simply crash, but in theory this is a potential security hole. As I said, probably not much of a problem for you right now; you weren't expecting someone to try to hack your mbed over the internet, were you? However, as a style issue this is always worth keeping in mind. If nothing else it could cause random crashes that are hard to track down, e.g. it would work fine normally, but when the values passed are very large the string gets longer and overflows.

The simple solution is to make result[] far bigger than you'll ever need, see issue 1 for the problem with that.

The other option is to also pass the max size to the function.

#define _maxMessageSize_ 100

void someFunction() {
  ...
  char result[_maxMessageSize_];
  builder.getSomeData(param1, param2, result, _maxMessageSize_);
  ...
}

void MessageBuilder::getSomeData(int param1, int param2, char *result, int maxLength) {
  snprintf(result, maxLength, "The values were %d and %d", param1, param2);
}

Generally for mbed type code this is the route I'd recommend going, it makes it easy to change the amount of memory used while protecting against buffer overflows.

The other option is to allocate the memory in your object (either by defining a char array in the object or by dynamically allocating it as needed) and then make sure the object destructor cleans things up:

class MessageBuilder {
public:
  MessageBuilder();
  ~MessageBuilder();
  char *getSomeData(int param1, int param2);
private:
  char *messageBuffer;
  int bufferSize;
};

MessageBuilder::MessageBuilder() { // constructor, flag buffer as not being allocated.
messageBuffer= NULL;
bufferSize  = 0;
}

MessageBuilder::~MessageBuilder() { // destructor, free any allocated buffer.
if (messageBuffer!=NULL)
  free(messageBuffer);
}

char *MessageBuilder::getSomeData(int param1, int param2) {
  int bufferSizeNeeded = /* calculate how big the buffer needs to be (or just a constant number) */;

  if (bufferSize < bufferSizeNeeded) { // check if we already have a buffer big enough. If so keep it to save on freeing and allocating memory
    if (messageBuffer != NULL) // buffer is too small
      free(messageBuffer);

    messageBuffer = (char*)malloc(bufferSizeNeeded * sizeof(char));
    if (messageBuffer == NULL) {  // FAILED TO ALLOCATE MEMORY...
      bufferSize = 0;
      return NULL;
    }
    bufferSize = bufferSizeNeeded;
  }

  // buffer is now allocated; snprintf keeps the write inside it
  snprintf(messageBuffer, bufferSize, ".......");

  return messageBuffer;
}

As you can see this is a lot more complex than simply defining a char array in the calling object and is probably overkill for your use.

Also keep in mind that as soon as the MessageBuilder object is destroyed (either at the end of the function or when you call delete, depending on how it was created), the pointer it returned will no longer be valid. That pointer will point to a block of memory that has been freed and so could have been overwritten. Also, due to the way the buffer is reused, calling getSomeData() a second time will invalidate the result of the first call (there are ways around this, but they make memory management more of a pain). So why do it this way at all? Because it puts all of the memory allocation associated with the object into that object, which for larger projects is normally a nice thing.
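One simple way around the "second call invalidates the first" problem is for the caller to copy a result it wants to keep into storage of its own. The sketch below uses a deliberately simplified stand-in class with a fixed internal buffer. SimpleBuilder and keepCopy are hypothetical names for illustration, not the MessageBuilder above.

```cpp
#include <stdio.h>
#include <string.h>

// Hypothetical, simplified builder with one fixed internal buffer.
class SimpleBuilder {
public:
    const char *getSomeData(int value) {
        snprintf(buffer, sizeof(buffer), "value=%d", value);
        return buffer;            // always the same internal buffer
    }
private:
    char buffer[32];
};

// Copies a message into storage owned by the caller, so that it survives
// later calls to getSomeData().
void keepCopy(const char *message, char *dest, size_t destSize) {
    if (destSize == 0) {
        return;
    }
    strncpy(dest, message, destSize - 1);
    dest[destSize - 1] = '\0';    // strncpy does not always terminate
}
```

The pointer returned by the first call and the one returned by the second call are the same address, so only the copy still holds the first message afterwards.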


Reed Kimble

# 11 Mar 2014

Hi everyone, and thanks SO MUCH to Erik and Andrew for their help so far!

I recognize why most people always suggest the short and simple route and why Andrew's second example would generally be considered "overkill" for a beginner/learning application. But for me, the "overkill" solution is exactly what I'm looking for... here's why:

I'm only a pseudo-beginner LOL. I actually have many years of development experience in high-level languages and have been working with MCUs for quite some time as well. So I've already done many of the simple things and understand the basic concepts of memory management when everything is simple and straightforward within a single code file. My problem is that I naturally think in an object-oriented manner and it is actually harder for me to write an application without the object modeling. So for my own sanity and productivity, I need to be able to write the "overkill" version of the code in this example (which, to me, doesn't look like overkill at all, but rather, like the way it should be done!).

To be honest, I am not currently working with an mBed. We started the design there, but now have a custom board hosting the 1768. Our device is already being tested in our production line, and is working well from a functionality standpoint, but is subject to the occasional lockup. With the above information I can see where I am probably accessing corrupt memory (reading values that have been "freed" and potentially overwritten).

In my real application there will be a couple of objects like the example "MessageBuilder". These objects would have a single instance which would be declared in the main code file and would persist for the life of the application. Technically they could be atomic classes if I knew that there was such a thing in C or how to create one. I'd like one class that encapsulates the functionality for generating the string message to be sent to the Wifly, and another class which manages the Wifly itself. (I started out using the mBed supplied Wifly library, but found it didn't perform well for my purposes, required several tweaks to work with firmware 4, and since I'm using the Wifly as a UDP bridge I opted for my own simple control code).

In Andrew's second example, I understand that I am keeping one block of memory (whose size can dynamically grow) and returning it whenever getSomeData() is called. I understand that my instance of MessageBuilder must survive for as long as I want to use the result of a call to getSomeData(). I also understand that I can only have one valid result from getSomeData() at one time.

So what is that workaround that would let me have multiple results from getSomeData()? To be honest, my real app would most likely be done with the result of the method call right after receiving it so I probably don't need more than one at a time, but I'd like to know just for future reference. Would the answer be to allocate a second char array the size of the result and then copy the memory from the result to the new array? What about what Erik said about using "new"? Could I allocate the memory for result within the getSomeData() method itself by using "new" (and accepting that the caller has to "delete" the result of the method call when it is finished with it)?

char *MessageBuilder::getSomeData(int param1, int param2) {
int bufferSizeNeeded = // calculate how big the buffer needs to be
 
messageBuffer = new char[bufferSizeNeeded];
 
// buffer is now allocated
  sprintf(messageBuffer,".......");
 
  return messageBuffer;
}

One other question... On the out parameter versus the return value, is this choice determined by where the memory for the result is allocated? By that I mean here

void MessageBuilder::getSomeData(int param1, int param2, char *result, int maxLength) {
  snprintf(result, maxLength, "The values were %d and %d", param1, param2);
}

Result is filled through an output parameter and in this case the result variable was allocated outside of the method call.

Whereas in this code

char *MessageBuilder::getSomeData(int param1, int param2) {

Result is the return value and the memory was allocated inside the object instance providing this method.

This is another thing about C that confuses me... when to use a return value and when to use an output parameter.

P.S. I just caught that "snprintf"... not familiar with that one and will have to look it up - I've only been using "sprintf".

I should also note that passing a length parameter is not desirable as the method in question determines the size and the caller doesn't know what it might be - it could be 12 characters, or 200... at least, I would ideally like to have that kind of flexibility (I'm confident that as long as I know how to clean up after myself properly, which I think I now do, I will not have to worry about running out of memory based on what I have available and the real-world execution of the code; that is to say, while I don't know whether one message is 12 or 200 characters, and may not know exactly how many messages I have at one time, I always know that the total number of active messages does not exceed my available memory, even if they are all 200 characters long).


Erik -

# 11 Mar 2014

[quote]So what is that workaround that would let me have multiple results from getSomeData()? To be honest, my real app would most likely be done with the result of the method call right after receiving it so I probably don't need more than one at a time, but I'd like to know just for future reference. Would the answer be to allocate a second char array the size of the result and then copy the memory from the result to the new array? What about what Erik said about using "new"? Could I allocate the memory for result within the getSomeData() method itself by using "new" (and accepting that the caller has to "delete" the result of the method call when it is finished with it)?[/quote]

You could create the new data with new/malloc and return a pointer to it. Then the caller does indeed need to delete it. But I really would advise against that method: it is incredibly easy to create memory leaks that way.

If you know roughly the size of what you need, it is imo easiest to simply allocate it in the caller (generally you can do it statically, so without new and have it automatically destroyed, but you can also do it with new), and give a pointer to that as argument to the function. If you do it statically you cannot get a memory leak that way. If you do it dynamically with new it is still alot easier to see where something is allocated, so that you also should de-allocate it.

So this is also related to your next paragraphs: yes the difference is where the memory is allocated. Advantage of doing it in the function is that you can there exactly allocate what you require. If this is variable, this is obviously an advantage. If it isn't variable it is irrelevant. Downside is that imo it is harder that way to keep track of where you allocate new memory, increasing risks regarding both de-allocating memory too soon, or forgetting it.

When to use return, and when an extra argument really depends on your situation (short disclaimer, I just do it as hobby which I teached mainly myself with the interwebs, I am not a proffesional programmer). If you just have a function which simply returns a single value, you return that. More complex functions where you return arrays, several values, etc, you use arguments. Then you can return a status (so for example 0 if you it was succesfull, something else if the function failed) if you want to.

What you can also do is make, for example, a container class which contains the length and the data. It then looks a bit like using string, but it is probably a lot more efficient (although string is still an option if you have the space). Your function then returns an object of that class. And if you make sure all the dynamic memory stuff is done inside that class, you cannot make a memory leak somewhere else. So a very short pseudo-code example (it may well contain syntax errors):

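#include <cstring>

class ContainerClass {
public:
  ContainerClass(int len) : data(new char[len]), length(len) {}
  // Copy constructor: a returned or copied object gets its own buffer,
  // so two objects never delete the same memory.
  ContainerClass(const ContainerClass &other)
      : data(new char[other.length]), length(other.length) {
    std::memcpy(data, other.data, length);
  }
  ~ContainerClass() {
    delete[] data;  // memory from new[] is released with delete[]
  }
  char *data;
  int length;
private:
  ContainerClass &operator=(const ContainerClass &);  // assignment deliberately not supported
};

ContainerClass getSomeData(void) {
  int calcLength = 10+30+20/2;
  ContainerClass retval(calcLength);
  retval.data[0] = 59;
  return retval;
}

ContainerClass value = getSomeData();
value.data[0] = ...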

Whether that is worth it or handy depends on your exact situation :).

I haven't used anything like that myself, so I just wrote something based on something I once saw plus what makes sense to me. But the advantage is that once you have written your ContainerClass, you don't have to dynamically allocate any memory yourself. (I am sure there are similar standard C++ options, but generally, like string, they are a lot less efficient with memory. For example, I use a LinkedList in a library. There is an mbed lib which does this, and C++ can also do something similar enough by default, but the difference in memory consumption was huge.)

Edit: Technically the class doesn't need to store the length variable, but generally it is handy to know how long the array is, even when it for example always ends with a null terminator.
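For comparison, here is a rough sketch of the standard-library route mentioned above, assuming there is room for `<string>` on the target. The function name `getSomeDataString` and the 64-byte scratch buffer are assumptions for illustration only.

```cpp
#include <stdio.h>
#include <string>

// Hypothetical variant that returns a std::string. The string owns its
// buffer and releases it automatically when it goes out of scope.
std::string getSomeDataString(int param1, int param2) {
    char buffer[64];  // scratch space; assumed large enough for this message
    snprintf(buffer, sizeof(buffer), "The values were %d and %d", param1, param2);
    return std::string(buffer);  // copies the text into the string's own storage
}
```

When printing such a result, pass `result.c_str()` to printf rather than the string object itself. Passing the object through printf's variable argument list is what triggers the "non-POD class passed through ellipsis" style of warning.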


Reed Kimble

# 11 Mar 2014

Thank you for the detail, Erik. While I recognize the potential to delete memory too soon, or not at all, when using new/malloc within a function, I'm not too frightened by the prospect. The way I structure my code makes it generally obvious when something has been allocated and when it is known to no longer be needed. And if the code ever evolves into something which makes this questionable, I have no problem writing a little memory manager class to handle allocations for everything in the application.

On parameters and return values... so there are no rules that say you should use one versus the other for specific data types?

When I write high-level code, I rarely use out parameters. A function will typically return a single value if that is its purpose (a function designed to calculate some specific integer, float, or string output for instance), or an instance of another object which encapsulates multiple values if the function has more than one result value. The exception is any kind of "TryDoSomething" function, which returns a Boolean success flag and uses an out parameter to fill the actual result value when the function returns True.

In C, I would return an int or bool from a function with a single value, use out parameters for any other data type, and use a custom struct if there were multiple related return values. But from what I'm understanding here, so long as the memory allocation is done correctly, you can return a pointer to any new object from a function and do not "need" to use out parameters for anything in particular.

So if that is correct, then I'm just left with understanding the difference between "new" and "malloc". What is the difference (just in general) and when is each appropriate?

Andy A

# 11 Mar 2014

Multiple valid return strings from the example...

If this would be a rare instance then, rather than building the complexity into the object, I'd be tempted to let the calling function make a copy of the string, which it is then responsible for cleaning up.

e.g.

char *tempString = builder.getSomeData();
size_t stringLength = strlen(tempString);
char *permanentString = (char *)malloc((stringLength + 1) * sizeof(char)); // +1 for the terminating '\0'
strcpy(permanentString, tempString);

The caller is then responsible for freeing permanentString when it is done with it. Clearly, measuring the size of the returned string here is a bit inefficient; it would make more sense to have getSomeData() also return the string length. sprintf (and snprintf) return the length of the final string, so it is easily available.
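If the length is already known, the copy could be wrapped up roughly like this. `copyMessage` is a hypothetical helper, not part of the MessageBuilder code, and it assumes `length` is the number of characters excluding the terminating null (which is what sprintf reports).

```cpp
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: returns a malloc'd copy of the first `length`
// characters of `source`, or NULL if the allocation fails.
// The caller must free() the result.
char *copyMessage(const char *source, size_t length) {
    char *copy = (char *)malloc(length + 1);  // +1 for the terminating '\0'
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, source, length);
    copy[length] = '\0';
    return copy;
}
```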

If you did want it all in the object then you'd need a second function. You could either have a second version of getSomeData() (getSomeDataForever() maybe) that always creates a new buffer and internally tracks all the created buffers in order to delete them, or you could have a function that makes the current buffer permanent so that the next call creates a new one. Again, it would need to track this in order to clean things up. Personally I'd use a std::list container class to track the buffers created, but there are a number of ways of doing that.
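A rough sketch of the std::list idea, under the assumption that the buffers are malloc'd copies which should all be released when the owning object is destroyed. The class name `MessageKeeper` and its methods are invented for this example.

```cpp
#include <list>
#include <stdlib.h>
#include <string.h>

// Hypothetical owner of "permanent" message copies.
// Every buffer it hands out is freed when the keeper is destroyed.
class MessageKeeper {
public:
    MessageKeeper() {}
    ~MessageKeeper() {
        for (std::list<char *>::iterator it = buffers.begin(); it != buffers.end(); ++it) {
            free(*it);
        }
    }
    // Stores a private copy of text and returns it, or NULL if out of memory.
    char *keep(const char *text) {
        size_t length = strlen(text);
        char *copy = (char *)malloc(length + 1);
        if (copy == NULL) {
            return NULL;
        }
        memcpy(copy, text, length + 1);  // includes the terminating '\0'
        buffers.push_back(copy);
        return copy;
    }
    size_t count() const { return buffers.size(); }
private:
    std::list<char *> buffers;
    // Copying a keeper would free the same buffers twice, so it is disabled.
    MessageKeeper(const MessageKeeper &);
    MessageKeeper &operator=(const MessageKeeper &);
};
```

The caller never frees the returned pointers itself; they stay valid for as long as the keeper object lives.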

When to use a return value and when to use a parameter... To a certain extent this comes down to personal preference. If you're allocating the memory in the calling function (i.e. defining a char array) then you need to pass it as a parameter, if the function is allocating the memory then you can do it either way.

Returning it as a return value is more useful if you want to check something passed or failed. For that reason you may actually do both.

e.g.

char *MessageBuilder::getSomeData(int param1, int param2, char **result) {
...
*result = messageBuffer;
return messageBuffer;
}

...
char *tmpString;
if (builder.getSomeData(param1, param2, &tmpString)) {
  // output tmpString;
}

In this case the return string parameter is passed as a pointer to a pointer so that the function can change it to point at the message buffer. Pointers to pointers can seem a little weird if you're not used to them, but they have their uses.

Clearly you could do exactly the same functionality with just a return value or a return parameter but by having both you can save a line of code. That doesn't sound much of a big deal but if you have lots of lines which are basically a = b; then they distract from the lines that actually do something.

This is making use of the fact that in c any non-zero value is true and so a null pointer is always false. But the same code would work just as well if the return value was a length or a success/failure flag.

Personally, if I were to modify the code to return the string length then I'd make that the returned value, since the printf family of functions returns the number of characters written and consistency is always nice.

The only real issue with return values is returning structures or objects: there you generally want to pass or return a pointer (by any method) rather than the object itself. For anything of 32 bits or less this is a moot point, since a pointer is 32 bits here, but for large blocks it can get important. Similarly, if you find you are passing a large number of parameters then consider putting them in a structure and passing a pointer to the structure; it cuts down the amount of pushing and pulling values to/from the stack.

snprintf is simply sprintf but with a maximum length for safety. The length you pass is only ever a maximum, not a required length; snprintf won't touch any bytes beyond the minimum needed for the result. The only reason for giving it a maximum is to protect against buffer overflows. There is also strncpy, which copies at most n characters, for similar reasons. I didn't use it in the code above because the destination is allocated from the measured length of the source, so as long as that allocation leaves room for the terminating null, strcpy is safe (although I should check that the malloc didn't return NULL).
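One caveat worth a sketch: strncpy does not write a terminating null when the source is as long as or longer than the limit, so a bounded copy usually terminates the buffer by hand. `boundedCopy` below is a hypothetical helper written for this post, not a standard function.

```cpp
#include <string.h>

// Hypothetical helper: copies source into dest, which holds `capacity` bytes,
// and always leaves dest null-terminated.
// Returns true if the whole string fitted, false if it was cut short.
bool boundedCopy(char *dest, size_t capacity, const char *source) {
    if (capacity == 0) {
        return false;
    }
    strncpy(dest, source, capacity - 1);
    dest[capacity - 1] = '\0';  // strncpy does not add this when source is too long
    return strlen(source) < capacity;
}
```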


Andy A

# 11 Mar 2014

<<quote>>So if that is correct, then I'm just left with understanding the difference between "new" and "malloc". What is the difference (just in general) and when is each appropriate? <</quote>>

The main difference is that new is C++ and malloc is C.

new returns a pointer to a new instance of an object after calling that object's constructor (and the constructors of any classes it inherits from). If you don't declare any constructor at all, the compiler generates an empty default constructor that takes no parameters; you can also write constructors which take parameters if you want. As you can probably appreciate, new can be a fairly expensive call depending on the complexity of the constructor and of any classes it inherits from.

malloc() returns a void pointer to a block of memory on the heap of the size requested, which in C++ must then be cast to the type required (e.g. char*). The memory itself isn't initialized or touched in any way; it is simply marked as reserved until it is released with free. Still not stunningly fast, dynamic memory allocation never is, but typically faster than new when a constructor has to run.

Similarly, delete is C++ and calls the destructor of the class before freeing the memory allocated by new, which allows you to free/destroy any memory allocated within that instance of the class. free() is C and simply marks the memory as available without executing any other code.

So new for C++ classes, malloc for arrays, structures and any other unmanaged blocks of memory. Whichever you use, don't mix them: memory from new is released with delete (new[] with delete[]) and memory from malloc with free.
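To make the constructor/destructor difference visible, here is a small sketch. The `Tracked` struct and the function `demoNewVersusMalloc` are invented for the illustration; the counter only exists so the effect can be observed.

```cpp
#include <stdlib.h>

// Hypothetical class that counts live instances, so that constructor
// and destructor calls become visible.
struct Tracked {
    static int liveCount;
    Tracked()  { ++liveCount; }
    ~Tracked() { --liveCount; }
};
int Tracked::liveCount = 0;

int demoNewVersusMalloc() {
    Tracked *viaNew = new Tracked;          // constructor runs, liveCount is now 1
    int duringNew = Tracked::liveCount;
    delete viaNew;                          // destructor runs, liveCount back to 0

    void *raw = malloc(sizeof(Tracked));    // raw bytes only, no constructor call
    int duringMalloc = Tracked::liveCount;  // still 0
    free(raw);                              // no destructor call either

    return duringNew * 10 + duringMalloc;   // 10 when the behaviour is as described
}
```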

One issue Erik didn't mention is that, in addition to being safer, static memory allocation is generally faster. The downside is that it lacks flexibility, and in some systems stack size is a limitation.

Reed Kimble

# 11 Mar 2014

Andrew this is brilliant! I'm so thankful that you seem to understand exactly what I am trying to ask and are able to frame your response in such a way that it is perfectly clear to my way of thinking.

A pointer-to-a-pointer actually makes perfect sense because that is very much what .Net actually does. I just didn't know how to write it in C or if that was even the correct way to go about resolving my issue. I see now that it very well could be just what I needed.

I'm not sure that I'm entirely clear on * and &... if I am declaring a pointer I use * and when I am passing a pointer I use &... is it that simple? So...

MyStruct getAResult(int param1) {
   MyStruct result = (MyStruct *)malloc(sizeof(MyStruct));
   //...set result fields
   return &result;
}

...would that be correct? ~~Still a little fuzzy on new versus malloc...~~ (EDIT: reading your post on the subject now! I think that will make it all clear)

Also, one other question on scope...

Does every code block have its own scope, or is it limited to the class and method level?

In this pseudo program:

int stateParam1;
int stateParam2;
char* result;
WiflyDevice wifly;

while(1) {
   updateState();
   result = makeSomeDecision(stateParam1, stateParam2); //result value allocated with new/malloc inside method
   if(strncmp(result, "expected start", 14) == 0) {
      char output[128];
      char* strValue  = determineSomeStringValue(); //memory allocated with new/malloc inside method
      int numValue = determineSomeNumberValue();
      sprintf(output, "<cmdname strvalue=%s,numvalue=%i", strValue, numValue);
      delete strValue; //or, free
      wifly.write(output);
   }
   delete result; //or, free result
}

What happens to "output" in this code when the if{} block ends? Does the variable fall out of scope and get destroyed? In .Net I am used to every code block having its own scope. So in .Net the answer would be yes, as soon as you exit that If-block, the output variable will fall out of scope and be destroyed (well, released to the GC anyway).


Andy A

# 11 Mar 2014

Your C# bad habits are showing ;-) One thing I disliked about the small amount of C# I've done is that it blurred the line between pointers and the actual object; coming from a C/C++ background, I hated that.

& and * do get a little weird in places...

When declaring a variable, * means "a pointer to", e.g.

char *myCharPtr;

creates a char pointer called myCharPtr

But in the code

*myCharPtr = 0x55;

means set the value pointed to by myCharPtr to 0x55. (Obviously, if myCharPtr hasn't been set to point to a valid memory location somewhere between those two lines, nasty things will happen.)

In other words * can have two different meanings depending on context.

& also has two meanings in c++ but only one in c.

The most common one is "the address of" e.g.

char myChar;
char *myCharPtr = &myChar;

The other one is C++ only: a function parameter or return type with a & means a reference to that type.

int &myFunction() {...}

In that context consider it to mean the same as a pointer and you'll be ok 99% of the time.

So &(a pointer type) is the address of the pointer or a pointer to a pointer.
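A small sketch of the same output parameter written both ways; the two function names are made up for the example.

```cpp
// Hypothetical helpers: the same "output parameter" written two ways.
void setViaPointer(int *value)   { *value = 5; }  // caller passes &variable
void setViaReference(int &value) { value = 5; }   // caller passes the variable itself
```

With `int a; int b;` the calls are `setViaPointer(&a);` and `setViaReference(b);`, and both variables end up holding 5. The main practical differences are that a reference cannot be null and cannot be made to refer to something else later, whereas a pointer can.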

In your code

MyStruct getAResult(int param1) {
   MyStruct result = (MyStruct *)malloc(sizeof(MyStruct));
   //...set result fields
   return &result;
}

Is wrong in a couple of places.

The function definition has a return type of MyStruct, but you return &result, which is the address of result; in other words, it's a MyStruct *. Secondly, on your malloc line you define result as a MyStruct but then set its value to a MyStruct pointer.

Assuming you want to use pointers everywhere the code should be

MyStruct *getAResult(int param1) {  // return type of MyStruct pointer
   MyStruct *result = (MyStruct *)malloc(sizeof(MyStruct)); // result is of type MyStruct *
   //...set result fields
   return result; // result is already a pointer, &result would be a pointer to a pointer.
}

At the risk of confusing you...

typedef struct aCustomStruct {
int a;
...
} CustomStructType;

void someFunction() {
CustomStructType myStruct;  // creates a variable myStruct of type CustomStructType.
CustomStructType *myStructPtr = &myStruct; // create a variable of type CustomStructType* and point it to myStruct

// these lines all do exactly the same thing
myStruct.a = 5;
(*myStructPtr).a = 5;
myStructPtr->a = 5;
}

(*myStructPtr) is the thing pointed to by myStructPtr, in other words myStruct. myStructPtr->a means value a of the thing pointed to by myStructPtr.

You'd normally use -> in this situation. *(some variable pointer) is generally only used when setting the actual variable itself (e.g. a pass by reference parameter), but that doesn't mean it can't be used at other times.

On the scope issue: yes, exactly as you would expect. Any variables declared within a {} are destroyed at the closing brace. Any memory obtained with malloc or new and held only by a pointer declared in that block must be freed/deleted before then, or you'll have a memory leak (unless you pass the pointer out to some variable with a larger scope).
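A minimal sketch of the difference, using an invented `ScopeProbe` struct that counts how many instances are alive:

```cpp
// Hypothetical probe that counts how many instances currently exist.
struct ScopeProbe {
    static int alive;
    ScopeProbe()  { ++alive; }
    ~ScopeProbe() { --alive; }
};
int ScopeProbe::alive = 0;

int demoScope() {
    ScopeProbe *onHeap = 0;
    {
        ScopeProbe local;         // automatic variable, like output[128] in the question
        onHeap = new ScopeProbe;  // heap allocation
        // both objects exist here, so ScopeProbe::alive is 2
    }                             // local is destroyed at this closing brace
    int afterBlock = ScopeProbe::alive;  // 1: only the heap object is left
    delete onHeap;                       // now 0; skipping this would leak it
    return afterBlock;
}
```

The heap object survives the block only because its pointer was stored in a variable with a larger scope, which is the exception mentioned above.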

In pure C code, older standards (C89/C90) and some compilers require variable definitions to come directly after the { that defines their scope, before any other statements, e.g.

void myFunction() {
int x = 4;
x++;
int y = 3; // <-- some c compilers would complain about this line
}

however the scoping itself is still as you would expect.


Reed Kimble

# 11 Mar 2014

Hahaha! Actually to make it even worse, my flavor is VB, not C#! I started programming in BASIC when it was the OS on a C64 and AppleIIe. :)

I actually had a huge "ah-ha" moment when you mentioned returning the actual object versus the pointer when the memory allocated was a lot more than the size of a pointer. That's when I sobered up off the .Net juice and saw the single line. :)

I think I'm clear on * and &. I'll probably refer to this post umpteen times while rewriting, but I think I get it.

"At the risk of confusing [me]..." hahaha! You should have said "with the intent of crystalizing this for you"!

I can't tell you how helpful all of this has been. Especially the "between-the-lines" information, like this most recent "ah-ha" moment where I realized there is a place called "oblivion" in C and pointers live there until they are assigned something!

I know I'm spoiled by a managed environment ;) but I do have a pretty good understanding of how the CLR does what it does so with a little nudge in the right direction (which is exactly what I've gotten) I'm pretty sure I can come up with something that would be considered acceptable by unmanaged C standards. :)

Quick question, what's the deal with two type names?

typedef struct aCustomStruct {
int a;
...
} CustomStructType;

I've never put the second, apparently "friendly" name, at the end and would just use "aCustomStruct".

So I guess my only remaining decision now is whether or not to use <string> for my dynamic string composition or stick to managing variable size char arrays. I'm already using <list> for my queuing needs versus managing a static array, but now I understand how I could write a simple dynamic queue to hold an internal array and resize it as necessary (grow and shrink) but I'm not sure there would be any advantage (especially with supporting methods) versus just using the <list>. But if using a list is just as good or better than managing a reference array, then perhaps using string is just as good or better than managing character arrays?


Andy A

# 12 Mar 2014

C64? That's a bit modern, isn't it? I was a BBC Micro kid (which will mean nothing to you if you're not in the UK, but you can trace ARM's origins back to that computer). Also, I should warn you that my formal training is in hardware, so the software side is purely what I've picked up over the years. Given the amount of stuff covered here, I'm sure there must be at least some minor technical error somewhere, but it should be close enough to make things work.

The two names issue is a little clearer if you imagine brackets around it (you can't put them in the actual code). One is the name of the structure; the other is the name of the variable type that we are defining.

typedef ( struct structureName { structure contents } ) typeName

Alternatively you could do:

  struct aCustomStruct {
    int a;
    ...
  };

  typedef struct aCustomStruct CustomStructType;

The typedef is purely optional; it saves you putting the struct prefix in front of the structure name every time you want to declare one in the code. Assuming the definitions above, these two lines would be identical.

  struct aCustomStruct myStruct;
  CustomStructType myStruct;

Often people will use the same name for both. Since structure tags and type names live in different name spaces, that is OK and there is no risk of the compiler getting confused. I've avoided that here simply because seeing the same name twice in a type definition gets confusing.

There are also big coding style debates about whether to typedef or not to typedef. The argument against is that by forcing you to remember that something is a struct rather than a primitive data type, you will be more aware of its memory footprint and so more mindful of things like where you allocate the memory and how you pass it between functions. Personally, I feel that if you don't at least have a rough idea of what a variable type that you are using contains, then you shouldn't be sitting near the computer to start with. It's one of those personal style issues again.
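To make the naming point concrete, here is a minimal sketch of the declarations side by side. The Reading struct and the sumFields() function are made-up names for illustration only. It also shows something not mentioned above: a C++ compiler (which is what mbed uses) accepts the bare structure name without any typedef, so the struct prefix is only really needed in plain C.

```cpp
#include <cassert>

// Structure tag and typedef name declared separately.
struct aCustomStruct {
    int a;
};
typedef struct aCustomStruct CustomStructType;

// Same name used for both the tag and the typedef (hypothetical type).
typedef struct Reading {
    int value;
} Reading;

// Hypothetical helper that declares one variable of each flavour.
int sumFields() {
    struct aCustomStruct first;       // tag with the struct prefix
    first.a = 1;

    CustomStructType second = first;  // typedef name; copying compiles because it is the same type
    second.a = 2;

    aCustomStruct third = second;     // C++ only: the tag works without the prefix
    third.a = 3;

    Reading fourth;                   // tag and typedef share one name
    fourth.value = 4;

    assert(first.a == 1);             // the copies are independent values, not aliases
    return first.a + second.a + third.a + fourth.value;
}
```

Calling sumFields() returns 10, which only works because all three spellings refer to the same structure type.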

Standard libraries vs custom code: The standard libraries are nice because a) they are standard (ish; Visual C++ has some minor differences because it's Microsoft), b) they don't need debugging and c) they are optimized. Having said that, they are generally optimized for speed rather than size, which can be an issue on embedded systems. Also, they will often have features which you don't need, meaning that you may be able to do better by only implementing the subset of features you need.

If nothing else, the standard library is probably a good starting point, and if you find you have memory issues, look at manually implementing just the parts you need.
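As a rough illustration of the string question, the sketch below builds the same message twice: once with std::string, which grows on the heap and cleans up after itself, and once into a caller-supplied char buffer, which needs no heap at all. The function names and the "ch3=42" message format are assumptions made up for this example, not anything from the code discussed in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Library version: std::string grows as needed and frees its own memory.
std::string buildWithString(int channel, int value) {
    char digits[12];                       // large enough for a 32-bit int as text
    std::string msg("ch");
    std::snprintf(digits, sizeof(digits), "%d", channel);
    msg += digits;
    msg += '=';
    std::snprintf(digits, sizeof(digits), "%d", value);
    msg += digits;
    return msg;                            // returned by value, nothing to delete
}

// Manual version: the caller owns the buffer, so there is no allocation.
// Returns the length the full message needs; a result >= outSize means
// the text was truncated to fit.
int buildWithBuffer(char* out, std::size_t outSize, int channel, int value) {
    assert(out != NULL && outSize > 0);
    return std::snprintf(out, outSize, "ch%d=%d", channel, value);
}
```

With the first version the result can be handed to printf through c_str(). With the second, the buffer is typically a local char array, so it disappears when the calling function returns and nothing has to be deleted.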


Erik -

# 12 Mar 2014

I agree with Andrew's post; however, the memory usage of the standard C++ libs is really something you need to take seriously. Now the LPC1768 has quite a bit of memory, but you probably also chose it because you want to use quite a bit. As a random example, the standard vector lib adds some nice functions compared to a standard array.

However, imagine the only thing you want is to dynamically create an array and store its length, that after creation it never has to change, and that you don't need to worry about deleting the dynamically created stuff again. Then you can use the example I posted earlier. Meanwhile, using the vector lib requires something like 20kB of flash. That can be ignored when you are making a PC program. On your LPC1768 it is probably acceptable, but if you use many similar libs you might run into a problem. On an LPC812? It won't even fit, even without anything else.

So using them is fine, as long as you remember where your memory consumption is coming from. Also it is hard to make something which is faster than simply accessing array elements.
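For readers without the earlier example to hand, the kind of minimal container described above might look something like the sketch below. This is not the code from the earlier post; FixedArray and fillAndSum are hypothetical names, and the class is only an assumption about what "create once, remember the length, clean up automatically" could mean in practice.

```cpp
#include <cassert>
#include <cstddef>

// Array whose size is chosen once at creation and never changes.
class FixedArray {
public:
    explicit FixedArray(std::size_t length)
        : data_(new int[length]()),   // () zero-initialises every element
          length_(length) {}

    ~FixedArray() { delete[] data_; } // freed automatically when the object dies

    std::size_t length() const { return length_; }

    int& operator[](std::size_t index) {
        assert(index < length_);      // catch out-of-range access in debug builds
        return data_[index];
    }

private:
    // Copying is disabled so two objects never delete the same memory.
    FixedArray(const FixedArray&);
    FixedArray& operator=(const FixedArray&);

    int* data_;
    std::size_t length_;
};

// Hypothetical usage: fill the array with 0..count-1 and add it up.
int fillAndSum(std::size_t count) {
    FixedArray values(count);
    for (std::size_t i = 0; i < count; ++i) {
        values[i] = static_cast<int>(i);
    }
    int total = 0;
    for (std::size_t i = 0; i < values.length(); ++i) {
        total += values[i];
    }
    return total;                     // values is destroyed here, no delete needed
}
```

Element access is still a plain array index underneath, and the only extra cost over a raw array is the stored length.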
