Challenging issue with ticker interrupts messed up by serial communication

10 Mar 2014

Hi All,

I have a problem in my code and have been struggling with it for a couple of days now, so I could use some help.

Here is the case:

I have a complex motion loop running at 500Hz by means of a Ticker. So the Ticker interrupt calls the motion loop function at a frequency of 500Hz.

Besides this, I communicate with two OLED displays via serial communication (the serial ports are OLED_L and OLED_R). I start two animations, one after the other, on both OLEDs by sending them the required commands. After each command an OLED responds with an ACK (0x06) back to the MBED. This response has to be caught properly before the second animation can be sent.

I catch this ACK response as follows:

#ifdef FULL_VERSION
    LED_1 = 1; // DEBUG: on while waiting for both ACKs

    // Block until the left OLED sends something, then drain its receive buffer
    while (OLED_L.readable() == 0) {
        wait(0.001);
    }
    while (OLED_L.readable() == 1) {
        ACK_L = OLED_L.getc(); // printf("    ACK_L %X [Hex]\n", ACK_L);
    }

    // Same for the right OLED
    while (OLED_R.readable() == 0) {
        wait(0.001);
    }
    while (OLED_R.readable() == 1) {
        ACK_R = OLED_R.getc(); // printf("    ACK_R %X [Hex]\n", ACK_R);
    }

    LED_1 = 0; // DEBUG: off once both ACKs are in
#endif

This works fine. However, here is the issue: when the snippet above is enabled (by defining FULL_VERSION) to catch the ACK response from both OLEDs, the motion loop is badly disturbed. I notice the following behavior: as long as LED_1 is on, the motion loop interrupt 'hangs' until LED_1 is off again. (I visualized this by having the motion loop blink an LED at a fixed rate as well; it stops blinking.) As soon as LED_1 is off, so once the OLEDs' serial feedback has been processed properly, the motion loop interrupt picks up again, but initially it executes much faster than 500Hz, as if it is making up for the 'hanging' period by compressing the missed interrupts.

This is strange, as the motion loop interrupt should have priority over this part of the code.

Does anybody have an idea what causes my motion interrupt to hang when my program executes the serial communication snippet above?

Additional notes:

  • I also tested it with a heavily stripped-down version of my complete program to reduce complexity. There the motion interrupt performs fine during execution of the code snippet, confirming its higher priority. So could this have anything to do with memory issues, as the complete program does much more complex processing?
  • When the code snippet is disabled, so FULL_VERSION is not defined, the motion loop works fine on the interrupt. Obviously this causes the OLEDs to become unresponsive to the second animation call, as the ACK isn't properly caught before the new command is sent.
  • I also tried to use RxIrq as an alternative. It is shown below, but this doesn't properly catch the OLEDs' feedback over the serial communication. I don't get clean acknowledges back; instead it seems to miss some ACKs and then pick up garbage. (I updated the MBED library as I read about some issues with this.)

OLED_L.attach(&OLED_L_SerialRecvInterrupt, OLED_L.RxIrq);
OLED_R.attach(&OLED_R_SerialRecvInterrupt, OLED_R.RxIrq);

void OLED_L_SerialRecvInterrupt(void)
{
    printf("OLED_L_SerialRecvInterrupt\n");
    char ACK_L = 0;
    while (OLED_L.readable() == 1) {
        ACK_L = OLED_L.getc();
        printf("    ACK_L %X [Hex]\n", ACK_L);
    }
}

void OLED_R_SerialRecvInterrupt(void)
{
    printf("OLED_R_SerialRecvInterrupt\n");
    char ACK_R = 0;
    while (OLED_R.readable() == 1) {
        ACK_R = OLED_R.getc();
        printf("    ACK_R %X [Hex]\n", ACK_R);
    }
}

10 Mar 2014

Does nobody have an idea what causes the interrupt to hang, and speed up afterwards, when the snippet for processing the incoming serial communication is enabled?

If there are questions, or if more specific information is needed, please let me know and I will provide it.

Hope you can help.

10 Mar 2014

Using printf in interrupts is not the right way of doing things...

It's better to use volatile variables (as flags or otherwise), and then check the value of these variables in the main loop of your software to find out what happened during the interrupt.

Something like:

volatile char ACK_L = 0;

void OLED_L_SerialRecvInterrupt(void)
{
    while (OLED_L.readable() == 1) {
        ACK_L = OLED_L.getc();
    }
}

and in the main loop:

...

if (ACK_L) {
    printf("    ACK_L %X [Hex]\n", ACK_L);
    ACK_L = 0; // clear the flag so the same ACK is not reported again
}
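A single volatile variable only remembers the last byte, so if more than one byte can arrive between two passes of the main loop, a small queue between the interrupt and the main loop may be safer. Below is a minimal sketch in plain C++ (no mbed calls), assuming one interrupt handler writes and one main loop reads. The name RxQueue and its members are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical single-producer/single-consumer byte queue.
// The receive interrupt calls push(); the main loop calls pop().
struct RxQueue {
    static const std::size_t kSize = 16;  // must be a power of two
    volatile uint8_t data[kSize];
    volatile std::size_t head;            // written only by the interrupt
    volatile std::size_t tail;            // written only by the main loop

    RxQueue() : head(0), tail(0) {}

    // Store one byte; returns false (byte dropped) when the queue is full.
    bool push(uint8_t byte) {
        std::size_t next = (head + 1) & (kSize - 1);
        if (next == tail) {
            return false;
        }
        data[head] = byte;
        head = next;
        return true;
    }

    // Fetch the oldest byte; returns false when the queue is empty.
    bool pop(uint8_t &byte) {
        if (tail == head) {
            return false;
        }
        byte = data[tail];
        tail = (tail + 1) & (kSize - 1);
        return true;
    }
};
```

In the handler above, the body of the while loop would then become something like a push of the byte returned by getc(), and the main loop would pop and print until the queue is empty. One slot is always left unused, so a queue of 16 holds at most 15 bytes.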

10 Mar 2014

Hi Raph, thanks for the response. It applies to one of the alternatives I tried in order to get around the issue, namely using the RxIrq.

The printf statements are not present when I have the problem. I only added them afterwards for debugging purposes.

I will try the RxIrq again with the printf statements disabled. But if I am not mistaken I already did this and it did not help. Any other ideas maybe?

Any ideas why the original solution doesn't work properly while it should?

10 Mar 2014

I think your best bet is to selectively start disabling parts of your code until it does function. In principle you can easily show whether it happens with some blinking LEDs. Since you say that it does work fine with parts disabled, I think it will be very hard to find out what the issue is without having the same software and corresponding hardware.

10 Mar 2014

Hi Erik,

Thank you for your response. The last couple of days I did just that to find out what causes the problem; it is the code snippet that I showed above which is responsible for taking care of catching the OLEDs feedback over the serial communication.

When the part is disabled the motion loop works fine on the Ticker's interrupts. However, in that case the OLEDs become unresponsive to a new command as their response isn't properly taken care of.

When the part is enabled, the OLEDs function just fine, but strangely enough the motion loop interrupt waits/hangs until the serial feedback in this part is finished, and then the interrupt is executed again: first very fast, as if it makes up for the missed interrupts, and then continuing nicely at the desired 500Hz.

Could it have something to do with lower-level stuff in the MBED library that I don't know of (yet)?

10 Mar 2014

However, you said it also functions with both enabled and other code disabled (the stripped-down version); at least I assume that is with the LCD running and motion processing enabled. So then the question is which of the parts you disabled makes it function.

If you have your LCD code running, and your ticker doing something very simple, like just blinking an LED, does that work?

IIRC, running very fast to make up for missed interrupts is indeed Ticker behavior: after one handler finishes, the Ticker schedules the next interrupt at the previous absolute ticker time plus the ticker period. If that is in the past it will execute it directly, and do so again, etc., until the scheduled time is in the future again. So then the question is why your Ticker doesn't fire.
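To illustrate the catch-up rule described above, here is a small stand-alone model in plain C++. It is not mbed code and is not taken from the mbed library source; the function simulate_ticker and its parameters are made up for this sketch, and it assumes the handler itself takes no time to run.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model: returns the times (in microseconds) at which the
// handler actually runs when interrupts cannot be serviced during
// [blocked_from_us, blocked_until_us).
std::vector<uint64_t> simulate_ticker(uint64_t period_us,
                                      uint64_t blocked_from_us,
                                      uint64_t blocked_until_us,
                                      uint64_t end_us)
{
    std::vector<uint64_t> run_times;
    uint64_t next_due = period_us;  // absolute time of the next scheduled event
    uint64_t now = 0;               // time of the most recent handler run

    while (next_due <= end_us) {
        // The handler runs when it is due, or immediately if already overdue.
        uint64_t start = (next_due > now) ? next_due : now;
        // While the interrupt is held off, the run is delayed to the end of the block.
        if (start >= blocked_from_us && start < blocked_until_us) {
            start = blocked_until_us;
        }
        if (start > end_us) {
            break;
        }
        run_times.push_back(start);
        now = start;
        // Key point: reschedule from the previous due time, not from "now".
        next_due += period_us;
    }
    return run_times;
}
```

With a period of 2000 us (500Hz) and the interrupt held off from 5000 us to 15000 us, this model runs the handler five times back to back at 15000 us and then returns to the normal 2000 us spacing, which matches the "making up time" effect described in the first post.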

And to do that I really think you need to try to make it less complex. So for example just basic code in the Ticker. Or with only one LCD. If it still doesn't work with one LCD, replace it with simple PC serial, and manually send commands. If that still doesn't work, then we can also try your code since we don't need your hardware anymore :).

11 Mar 2014

My problem seems solved now but to give feedback for others that might be struggling with the same kind of issue:

First of all I back up Erik's last reply that it is wise to start with enabling/disabling code and to try less complex (stripped-down) dummy versions. That is exactly what I did before starting this forum thread, and it made me focus on the OLED serial communication in combination with the Ticker behavior of my motion loop. However, I was completely stuck on what the cause of the problem could be.

To solve it I have used the serial receive interrupt version (RxIrq) to catch the responses from both OLEDs, as described in my original question (of course without the printf statements). As I mentioned before, this didn't work properly for me at first. It turned out to be a timing issue with another timeout that I had in place to report when the OLEDs were finished and could receive new commands. Somehow these could not co-exist, and receiving the OLEDs' interrupts had to take the lead.

I still don't understand why the original solution with the 'while' and '.readable' for catching the OLEDs' response does not work and ruins the motion loop's interrupt. But now, with the RxIrq handlers in place and working, the motion loop is no longer disturbed, so this works for me.
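For others reading along, one way to let the receive interrupts take the lead is to have each handler only record that an ACK arrived, and let the main loop send the next command once both displays have answered. This is a minimal sketch in plain C++ of that idea, not the code from my program; DualAckGate and its member names are made up for this example, and 0x06 is the ACK value mentioned in the first post.

```cpp
#include <cassert>
#include <cstdint>

const uint8_t kAck = 0x06;  // ACK byte sent by the OLEDs

// Hypothetical helper: the two receive interrupts report bytes,
// the main loop asks whether both displays have acknowledged.
struct DualAckGate {
    volatile bool left_acked;
    volatile bool right_acked;

    DualAckGate() : left_acked(false), right_acked(false) {}

    // Call from each display's receive interrupt with the byte just read.
    void on_left_byte(uint8_t b)  { if (b == kAck) { left_acked = true; } }
    void on_right_byte(uint8_t b) { if (b == kAck) { right_acked = true; } }

    // Call from the main loop. Returns true once when both ACKs have
    // arrived, and clears the flags so the next command starts fresh.
    bool take_if_both_acked() {
        if (!(left_acked && right_acked)) {
            return false;
        }
        left_acked = false;
        right_acked = false;
        return true;
    }
};
```

The main loop would call take_if_both_acked() on each pass and only send the second animation command when it returns true, so nothing ever blocks waiting for the serial port.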

Thanks for the help!