LPC1768 FastOut and DigitalOut are crippled

22 Jan 2014

There seems to be a problem with the mbed LPC1768 library that slows down writing to the GPIO ports. Both DigitalOut and FastOut (which uses the FIO registers) are affected.

Back in June 2011 I wrote a simple test program, the gist of which is below.

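#include "mbed.h"
#include "FastOut.h"

FastOut<LED1> led1;   // FastOut drives the pin through the FIO registers

int main() {
    while (1) {
        led1 = 1;     // set the pin high
        led1 = 0;     // set the pin low
    }
}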

On an LPC1768 running at 96MHz and with library version 26, this code produces a 24MHz waveform. In other words, the loop takes 4 clocks. This is as expected. With the current version of the library (76), the same code produces a 3.4MHz waveform. This equates to a 28 clock loop, i.e. 7 times slower!
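For anyone who wants to check the arithmetic, here is a minimal sketch in plain C++ (it runs on a PC, not on the mbed). The name `clocksPerLoop` and the constants are made up for illustration; the 96MHz clock and the two waveform frequencies are the figures quoted above.

```cpp
// One pass of the loop produces one full period of the output waveform,
// so clocks per loop = CPU clock / waveform frequency.
constexpr double clocksPerLoop(double cpuHz, double waveformHz) {
    return cpuHz / waveformHz;
}

constexpr double kCpuHz    = 96e6;                           // LPC1768 core clock
constexpr double kOldLoop  = clocksPerLoop(kCpuHz, 24e6);    // library version 26
constexpr double kNewLoop  = clocksPerLoop(kCpuHz, 3.4e6);   // library version 76
constexpr double kSlowdown = kNewLoop / kOldLoop;            // how much slower the new loop is
```

This gives 4 clocks for the old loop, a little over 28 for the new one, and a slowdown of about 7.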

Erik Olieman has independently confirmed this. Erik also reports that DigitalOut is more than 3 times slower with the current library than with the earlier one. Erik has done some digging and believes that the problem was introduced with library version 40.

Can I request that you fix this problem ASAP?

@mbed team: look at the programs FastOut and newFastOut in my repository. I can email you 'scope screen dumps if you wish.

23 Jan 2014

To add to Paul's post, here is what I get when I set the output several times in a loop (10 times per pass, to reduce the loop overhead), with mbed libs before version 40 (which introduced a 'new build system') and after version 40. I just measured it with a timer; normally I would assume I had done something wrong, but since it confirms Paul's result it must be the library:

Before:

DigitalOut: 0.552084 seconds (55 ns per iteration).
FastOut: 0.135417 seconds (13 ns per iteration).

So roughly 5 clock cycles for DigitalOut, and 1 for FastOut (there is still some overhead). FastOut is really impressive, but DigitalOut is also very good, considering it is a regular C function that has to work out the correct registers/bits (IIRC Arduino requires 16 clock cycles for the same).

However, after:

DigitalOut: 1.760418 seconds (176 ns per iteration).
FastOut: 1.416667 seconds (141 ns per iteration).

I don't think I have to comment on those numbers. The problem is that I have no idea what changed in revision 40, or how it can be fixed.
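For reference, the per-write times above can be converted to clock cycles with a rough sketch like the one below (plain C++, runs on a PC). It assumes the 96MHz core clock from Paul's post; `nsToCycles` and the constant names are illustrative only.

```cpp
// Convert a time per write (in nanoseconds) to CPU clock cycles.
constexpr double nsToCycles(double ns, double cpuHz) {
    return ns * 1e-9 * cpuHz;
}

constexpr double kCpuHz = 96e6;  // assumed core clock, taken from the first post

constexpr double kDigitalOutBefore = nsToCycles(55.0,  kCpuHz);  // libs before version 40
constexpr double kFastOutBefore    = nsToCycles(13.0,  kCpuHz);
constexpr double kDigitalOutAfter  = nsToCycles(176.0, kCpuHz);  // libs after version 40
constexpr double kFastOutAfter     = nsToCycles(141.0, kCpuHz);
```

That works out to roughly 5.3 and 1.2 cycles per write before, and roughly 17 and 13.5 cycles per write after. The FastOut figure agrees with Paul's 28 clock loop, which contains two writes.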

25 Jan 2014

I took a quick look at Paul's FastOut scenario with both builds 39 and 76. It appears to be a codegen issue and not directly related to the SDK code itself.

This is what the while(1) loop looks like with the latest online compiler setup. The BL instructions are the calls to the write() method in the FastOut class. Not the fastest of code.

00001B2E  2101       MOVS R1, #1 ; 1 == 0x1
00001B30  4807       LDR R0, .+32 ; .+32 == 0x1B50
00001B32  F000 FBA5  BL.W .+1870 ; .+1870 == 0x2280
00001B36  2100       MOVS R1, #0 ; 0 == 0x0
00001B38  4805       LDR R0, .+24 ; .+24 == 0x1B50
00001B3A  F000 FBA1  BL.W .+1862 ; .+1862 == 0x2280
00001B3E  E7F6       B .-16 ; .-16 == 0x1B2E

Compare that to the code generated for the same loop when you revert to revision 39. Everything was inlined back then:

000021AE  4807       LDR R0, .+30 ; .+30 == 0x21CC
000021B0  F44F 2180  MOV.W R1, #0x00040000
000021B4  6381       STR R1, [R0, #56] ; 56 == 0x38
000021B6  63C1       STR R1, [R0, #60] ; 60 == 0x3C
000021B8  E7FC       B .-4 ; .-4 == 0x21B4
000021CC  DW $2009C000 
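
The inlined listing can be read as two stores to the fast GPIO registers of port 1. The host-side sketch below reconstructs the addresses and the bit mask. It assumes LED1 is P1.18 (as on the mbed LPC1768) and uses the FIO register offsets from the LPC17xx user manual; the helper names are made up for illustration.

```cpp
#include <cstdint>

// LPC17xx fast GPIO layout: the port blocks start at 0x2009C000,
// each port block is 0x20 bytes long, and within a block
// FIOSET sits at offset 0x18 and FIOCLR at offset 0x1C.
constexpr std::uint32_t kGpioBase   = 0x2009C000u;
constexpr std::uint32_t kPortStride = 0x20u;
constexpr std::uint32_t kFioSetOff  = 0x18u;
constexpr std::uint32_t kFioClrOff  = 0x1Cu;

// Address of the FIOSET register for a given port number.
constexpr std::uint32_t fioSetAddr(unsigned port) {
    return kGpioBase + port * kPortStride + kFioSetOff;
}

// Address of the FIOCLR register for a given port number.
constexpr std::uint32_t fioClrAddr(unsigned port) {
    return kGpioBase + port * kPortStride + kFioClrOff;
}

// Bit mask for a single pin within a port.
constexpr std::uint32_t pinMask(unsigned bit) {
    return 1u << bit;
}
```

For port 1, bit 18, this gives base + 0x38 for FIOSET, base + 0x3C for FIOCLR and a mask of 0x00040000, matching the two STR offsets and the MOV.W constant in the listing.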

Obviously the latest online build setup isn't as aggressive with inlining of short methods. I don't know if the compiler version was changed with revision 40 or maybe just some compiler flags. Maybe there are some flags that the mbed team can tweak to make the compiler more aggressive about inlining again. I did try adding the inline keyword to the FastOut methods but I still didn't get the desired inlining with the current compiler setup.
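
One more thing that might be worth trying with an offline GCC toolchain is a force-inline attribute rather than the plain inline keyword. The sketch below is only an illustration of the idea, not the real FastOut code: `FakeGpioPort` and `PinSketch` are hypothetical names, the register block is an ordinary struct so the example runs on a PC, and `__attribute__((always_inline))` is the GCC/Clang spelling. Whether the online compiler honours an equivalent is untested here.

```cpp
#include <cstdint>

// Stand-in for the two FIO registers of one GPIO port. On real hardware
// these would be memory-mapped; here they are plain variables that just
// remember the last value written.
struct FakeGpioPort {
    volatile std::uint32_t FIOSET;
    volatile std::uint32_t FIOCLR;
};

// Hypothetical FastOut-like class for a single pin (bit) of a port.
template <unsigned Bit>
class PinSketch {
public:
    explicit PinSketch(FakeGpioPort& port) : port_(port) {}

    // Ask the compiler to inline the write even where its own
    // heuristics would otherwise emit a call.
    __attribute__((always_inline)) inline void write(int value) {
        if (value) {
            port_.FIOSET = 1u << Bit;   // writing a 1 bit sets the pin
        } else {
            port_.FIOCLR = 1u << Bit;   // writing a 1 bit clears the pin
        }
    }

    __attribute__((always_inline)) inline PinSketch& operator=(int value) {
        write(value);
        return *this;
    }

private:
    FakeGpioPort& port_;
};
```

With a `FakeGpioPort` instance `port` and `PinSketch<18> led(port);`, the statements `led = 1;` and `led = 0;` each reduce to a single store of 0x00040000 once inlined, which is the shape of the revision 39 listing.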

I hope that helps,

Adam

25 Jan 2014

Question 1: I did look briefly for it, but where do you get the assembly code from? At least I assume you don't paste the bin in a hex editor and do it manually ;).

It might be worth getting the code into an offline compiler and changing the compiler's optimization settings.

25 Jan 2014

Erik - wrote:

Question 1: I did look briefly for it, but where do you get the assembly code from? At least I assume you don't paste the bin in a hex editor and do it manually ;).

Nothing quite as hardcore as hand disassembling :) I wrote a Thumb2 disassembler several years ago and I just used it to peek at the code. I did have to add a line like this to main() and run it so that I knew where in the binary to start disassembling.

    printf("main = %08X\n", (unsigned int)main);

27 Jan 2014

Adam, thank you for this - a very nice piece of detective work!

I find it disappointing that the mbed team have not even commented on this issue. After all, the LPC1768 is still their flagship platform. One would think that they would want it to perform as well as possible.

To me, this issue demonstrates the good and bad parts of the mbed project. The good part is undoubtedly the excellent user community. This is a huge asset to ARM but, I suspect, a grossly underappreciated one. The bad part is the isolation and lack of customer focus of the mbed team. They don't communicate with the user base and seem to have a "couldn't care less" attitude to problems.

But hey, that's just my opinion. What do other mbed users think?

27 Jan 2014

Paul, no problem.

I did want to point out that Erik also started a thread on the mbed-devel mailing list last week (which is what first sparked my investigation) and Simon Ford from the mbed team has responded there.

Thanks,

Adam

27 Jan 2014

Adam,

I don't follow the mbed-devel mailing list so was unaware of Simon's response.

I started this topic in the mbed Bugs and Suggestions forum, so to my mind, that is where the response should have gone. At the least, post a short reply in the forum with a link to the mailing list.

What is the point of having a Bugs and Suggestions forum if it is going to be ignored?

27 Jan 2014

Do you get the same slow results when you compile online with mbed-src instead of the precompiled mbed library?

Regards
Neni

27 Jan 2014

Nenad, I haven't tried this, so I don't know.

Having to do stuff like this defeats the whole point of mbed. To quote the Explore section of the site:

Quote:

We've worried about creating and testing startup code, C runtime, libraries and peripheral APIs, so you can worry about coding the smarts of your next product.

09 Feb 2014

Open-sourcing the MBED platform is great, but this problem shows a huge weakness: the proprietary compiler, which can potentially stop a project dead in its tracks for an unspecified time until it is fixed (maybe never? Don't choose MBED for your next project!).

Why not open-source the compiler? Or at least, provide the full instructions needed to reproduce the 'official' compiler, even if some required components are not free.

09 Feb 2014

You can always export it to an offline toolchain where you can set all the optimization settings yourself. And I can think of plenty of reasons not to open-source the compiler. I do agree it would be nice to know which settings are used (i.e., that it is effectively Keil with optimization at x and settings at y). But your advice not to use mbed seems a bit on the overkill side, to put it mildly. Especially since, as I said at the start, you can always compile offline. There is nothing magical about the mbed libs.

What is, imo, a weakness is communication. I now know from another thread that they are working on it, but it would be nice if they acknowledged it here.

15 Feb 2014

Library version 77 appears to solve this problem. However, it is not mentioned in the library change comments so I am not sure if it was intentional or accidental!

Thanks Emilio. I award you 7/10 for effort and 2/10 for communication.

15 Feb 2014

I saw it in one of the commits, but when I imported mbed-src it didn't change anything. I guess the compiler setting changes weren't in place yet, or something like that. Good that it is fixed, but your 2/10 for communication seems a bit generous tbh.

15 Feb 2014

Hi Paul,

Paul Griffith wrote:

Thanks Emilio. I award you 7/10 for effort and 2/10 for communication.

You are welcome.

As Erik noticed, the changes we make to the mbed SDK are not immediately synced to the online IDE. I was waiting for the changes to go live before communicating them.

Cheers,
Emilio