
Significant inaccuracy with wait_ms/wait_us calls

12 May 2016

I'm trying to use a Zero Gecko for a motor control application (a simple high-amp H-bridge), and I need to control motor timings as well as bit-bang a protocol with about 5us accuracy. However, the wait calls are incredibly inaccurate. I have no problem with this using ST parts, but we're looking at the Gecko series for power savings when everything is idle.

Unfortunately it's not even limited to just us timings, but ms timings too. When using the low-power ticker I can get rather good timings down to about 2ms, but waits are nowhere near as accurate.

Running this code:

toggle speed

    myled = 0;      // back-to-back toggles to measure raw GPIO toggle speed
    myled = 1;
    myled = 0;

    myled = 1;      // nominally a 1ms high pulse
    wait_ms(1);
    myled = 0;

Using a Keysight DSOX2014A scope I measure the first high toggle at 1.6us and the low toggle at 1.55us - which is on par with what I'd expect to see from mbed code.

The wait_ms(1) high width, however, is 1.2390ms... far too long. And it gets much worse when trying to use wait_us.

wait_us speed

    myled = 1;
    wait_us(100);
    myled = 0;
    
    myled = 1;
    wait_us(10);
    myled = 0;
    
    myled = 1;
    wait_us(1);
    myled = 0;

The first pulse width is 348.6us. Well over 3 times longer than it should be. The low pulse between is 1.55us, same as I established previously.

The wait_us(10) pulse is then 348.6us.

Finally, the wait_us(1) pulse is 348.45us.

So it seems nothing shorter than about 348.6us is achievable with the Zero Gecko?

I then tried the same code with 500/50/5us wait times. The 1ms pulse is again 1.2391ms.

The 500us pulse measures 684.2us, the 50us pulse 351.3us, and the 5us pulse 361.9us.

Trying again with 900/800/700us pulses to see if the timings have an even offset, or if they are just all over the place. Again, a 1.2390ms pulse for the 1ms wait, then:

    900us == 1201.6us
    800us == 1044us
    700us == 1054.5us

So very non-linear and not even remotely accurate.

There is no other code running, no timers, no tickers - the contents of main() is just what I posted.

Is this something that SiLabs would be willing to fix? At the moment the board is unusable with mbed for my application - and the client wants it to be written using mbed libraries.

Oscilloscope last calibrated Jan 2016.

12 May 2016

With an EFM32HG (STK3400) I'm getting the same results.

The 900/800/700us times are 1.2028ms, 1.0536ms, 1.0572ms.

25 May 2016

Hi Mark,

Are you running with the latest version of the mbed library? We are definitely more than willing to take a look at this, since the behaviour you describe is indeed pretty unacceptable.

24 Jun 2016

Hi Steven,

Sorry for the late reply, I didn't get an email notification about your reply!

Yes, I'm running the latest mbed library. We ended up switching that project over to emlib; however, we still have a number of other projects for which mbed is just faster/easier to use - so I'd still like to see if we can get a resolution to the timing issues. The EFM32s are a really nice series of MCUs!

11 Jul 2016

Hi Steven,

I have the same problem with wait_us. I tested the following boards with a negative result: EFM32 Happy Gecko, EFM32 Zero Gecko and LPC824. With the KL05Z and KL25Z boards, wait_us works perfectly.

I would have liked to use the EFM32 Zero board, but with the inaccurate timing it is not usable for me.

09 Sep 2016

Hey folks,

I'm so, so sorry for my very late answer. I completely lost track of this thread. Again, I apologize for the delay. TL;DR: there are some pretty big optimizations missing, and I'll update our implementation ASAP.

I investigated this today, and wanted to explain a bit about what is going on under the hood when you call wait_ms()/wait_us(). In effect, what we're doing is setting up a timer with >= 1MHz resolution, grabbing the microsecond timestamp right when you call the wait, and then continually grabbing new microsecond timestamps until the requested number of microseconds has passed. Internally, mbed uses a 32-bit microsecond timestamp value.
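
In (simplified) code, the delay boils down to a loop like this - a sketch of the mechanism rather than the actual mbed source, with us_ticker_read() being the HAL call that produces the 32-bit microsecond timestamp:

    #include <stdint.h>

    // Sketch only: busy-wait delay on top of a microsecond ticker.
    // us_ticker_read() is the mbed HAL function (declared in the mbed HAL headers).
    uint32_t us_ticker_read(void);

    void delay_us_sketch(uint32_t us)
    {
        uint32_t start = us_ticker_read();          // one timer-to-us conversion here...
        while ((us_ticker_read() - start) < us) {   // ...and one more on every loop iteration
            // busy-wait; the unsigned subtraction handles timestamp wrap-around
        }
    }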

Let me give some background info as well:

1) The EFM32 family currently only has 16-bit timers. There is a way to create a 32-bit timer from two 16-bit timers, but since we need a separate timer to implement PWM, and the smallest EFM32 part only has two timers, that is not an option for us. That means we need to extend the hardware timer in software, by incrementing a variable on every overflow of the timer, which creates the potential for race conditions.

2) Since the clock speed and source are configurable, and the timer's prescaler is a power of two (rather than an arbitrary integer), there is no way for us to guarantee that the timer can get a clean 1MHz clock source. That means we need to count at a higher frequency, and then divide the count value to get a microsecond timestamp. As you are probably aware, divisions are expensive operations on MCUs, but in this case we unfortunately cannot avoid doing one.
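
To make point 1 concrete, the software extension looks roughly like this (illustrative names, not the EFM32 HAL code):

    #include <stdint.h>

    static volatile uint16_t overflow_count;    // incremented by the timer overflow IRQ

    void timer_overflow_irq(void)               // hypothetical overflow ISR
    {
        overflow_count++;
    }

    uint16_t read_hw_counter(void);             // hypothetical: reads the 16-bit TIMER count

    uint32_t ticks32(void)
    {
        uint16_t hi, lo;
        do {                                    // re-read to catch the race where an
            hi = overflow_count;                // overflow lands between the two reads
            lo = read_hw_counter();
        } while (hi != overflow_count);
        return ((uint32_t)hi << 16) | lo;       // software-extended 32-bit tick count
    }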

That said, I did find a way to optimize our conversion from a software-extended timer value to a microsecond timestamp. Currently, it does a 64-bit division, which is a *HORRIBLY* expensive computation, especially on Cortex-M0+ which doesn't have hardware division support (Zero and Happy). On Cortex-M0+, compiled with GCC, optimized for speed, the function we provide to get a 32-bit microsecond timestamp executes in 1344 cycles. At full speed (24MHz) that translates to 56 us, leading to a minimum delay of 113us. Since the online IDE compiles with Keil (potentially also on a lighter optimization setting), it is entirely possible that that function takes twice as long to execute, leading to the horrible results posted here.
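
For reference, that conversion is essentially of this shape; the 64-bit divide turns into a runtime-library call on a core without a hardware divider (illustrative only - timer_freq_hz here stands for the timer clock after prescaling, not a real variable in our code):

    #include <stdint.h>

    static uint32_t timer_freq_hz;              // assumed: timer clock after prescaling

    // Illustrative: ticks-to-microseconds via a 64-bit multiply and divide.
    uint32_t ticks_to_us_slow(uint32_t ticks)
    {
        return (uint32_t)(((uint64_t)ticks * 1000000u) / timer_freq_hz);
    }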

I switched the calculation over to a 32-bit multiplication, 32-bit addition and 16-bit division, and the results are better: On Cortex-M0+ (speed optimized GCC), it now takes 320 cycles for timer to microsecond conversion. It's still far from ideal, but already something. On Cortex-M3/4, which have hardware division and a larger instruction set, the operation takes 78 cycles.
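
Roughly speaking, the reworked conversion handles the overflow count and the 16-bit counter value separately, along these lines - my sketch of the idea (assuming a clock configuration where the timer ticks an integer number of times per microsecond), not the literal code going into the pull request:

    #include <stdint.h>

    // Constants precomputed once for the active clock configuration (assumed here):
    static uint16_t ticks_per_us;               // timer ticks per microsecond
    static uint32_t us_per_overflow;            // microseconds per 16-bit overflow period

    uint32_t ticks_to_us_fast(uint16_t overflows, uint16_t hw_count)
    {
        return (uint32_t)overflows * us_per_overflow    // 32-bit multiply
             + (uint32_t)(hw_count / ticks_per_us);     // divide of the 16-bit count, 32-bit add
    }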

Considering you need to read the timestamp at least twice in a delay function (once to get your start time, then continuously to check whether the requested time has elapsed), the minimum delay at the default clock speed becomes 25us for Zero/Happy Gecko, and 4us for Giant/Wonder/Leopard.
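
(Back-of-the-envelope: two conversions at roughly 320 cycles each is about 640 cycles, which at 24MHz lands in the same ballpark as that ~25us floor; 2 x 78 cycles at the higher clock speeds of the Cortex-M3/M4 parts similarly works out to a few microseconds.)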

I will try to get my optimizations merged into mainline mbed ASAP.

Kind regards,

Steven Cooreman
Software Engineering Manager, IoT MCU & Wireless
Silicon Labs Norway AS

09 Sep 2016

Thanks Steven, and thanks for the in depth explanation!

12 Sep 2016

Pull request for the fix is open, see https://github.com/ARMmbed/mbed-os/pull/2666