Bug: I2c block read/write fails to send repeated start when in fast mode

31 Jan 2013

Hi,

I believe I've found a bug in the I2C block read implementation involving repeated starts and fast mode (400kbps).

Here's the scenario. I have a sensor which requires a write operation immediately followed by a read, with no stop bit in between. This can be implemented using a repeated start like so:

Edit: I've added the full function body including input validation and NoAck based on Wim's comments.

STATUS WriteAndRead(I2cFrequency frequency, int address, unsigned char* txBuffer, int txLength, unsigned char* rxBuffer, int rxLength){
    // Validate inputs
    assert(txBuffer);
    assert(rxBuffer);
    assert(txLength > 0);
    assert(rxLength > 0);

    // Set frequency
    i2c->frequency(frequency);

    // Perform write followed by read
    if(0 != i2c->write(address, (char*)txBuffer, txLength, 1))
        goto NoAck;
    if(0 != i2c->read(address, (char*)rxBuffer, rxLength, 0))
        goto NoAck;
    return SUCCESS;
NoAck:
    return FAILURE;
}

The above code works fine when the frequency is set to 100kbps, however it fails when set to 400kbps. Examining what was happening with an oscilloscope, I saw that at 400kbps the line was never being driven low for the repeated start bit. In other words, while repeated start should omit the stop bit at the end of the write, I was seeing both the stop and the start bits omitted. There was no separation between the last byte of the write and the first byte of the read. This meant that the sensor never responded to the read because it interpreted the address byte as another data byte in the write. At 100kbps the start bit was written correctly and so the sensor responded.

I re-wrote the above code using explicit starts and stops with the single-byte reads and writes, as shown below.

STATUS WriteAndRead(I2cFrequency frequency, int address, unsigned char* txBuffer, int txLength, unsigned char* rxBuffer, int rxLength){
    // Validate inputs
    assert(txBuffer);
    assert(rxBuffer);
    assert(txLength > 0);
    assert(rxLength > 0);

    // Prepare for write
    i2c->frequency(frequency);
    i2c->start();
    if(!i2c->write(address & 0xfe))
        goto NoAck;
    
    // Perform write
    for(int i=0; i<txLength; i++){
        if(!i2c->write(*txBuffer++))
            goto NoAck;
    }
    
    // Prepare for read; send a repeated start bit
    i2c->start();
    if(!i2c->write(address | 0x01))
        goto NoAck;
    
    // Perform read
    for(int i=0; i<rxLength; i++)
        *rxBuffer++ = i2c->read(1);
    
    // Finish by sending stop bit
    i2c->stop();
    return SUCCESS;

NoAck:
    i2c->stop();
    return FAILURE;
}

This version works fine at all frequencies. Any idea why I am seeing this behaviour?

29 Jan 2013

Could be a bug.

What is SCL doing when you expect the Repeated Start. Which slave device are you using, are you sure it can handle 400kbit/s. Note that some slaves do not behave as they should according to the I2C standard.

29 Jan 2013

Officially the LPC1768 doesnt support fast mode on its I2C ports (well those available to the mbed), I assume you use the LPC1768? That said I had it working together with a fast mode slave device at 1mb/s, and that included reading operations. So this is weird. You can try if MODI2C (http://mbed.org/users/Sissors/code/MODI2C/) does work without the added start/stop stuff, should be drop in replacement (as long as you dont use start/stop).

Edit: my bad, got confused, indeed support fast mode.

29 Jan 2013

Erik - wrote:

Officially the LPC1768 doesnt support fast mode on its I2C ports (well those available to the mbed)

The LPC1768 has three I2C ports (I2C0, I2C1, I2C2). Only I2C0 can be configured to support Fast Mode Plus (1Mbit/s). I2C1 and I2C2 support fast mode (400 kHz) and standard mode (100 kHz). However, as you say, the mbed pins only provide access to I2C1 and I2C2. So 400kbit/s should work on mbed as long as the slave supports it and the wiring is not too long or noisy. I have run fast mode ports up to 500 kbit/s with an MCP23017 port expander without problems.

30 Jan 2013

Hi guys,

Thanks for the quick responses. To clarify, I'm running at 400kHz, which as Wim says should be supported. I am using the LPC11U24. I'd rather not say what slave I'm using, but it absolutely supports FM and FM+. The second bit of code *does* work in both FM and FM+, suggesting that software is definitely the problem: either I made a mistake with the first program or there is a bug in the library.

Erik: Thanks for the suggestion. However, this is commercial work so we don't want to use anything other than the official mbed libraries. My second bit of code works anyway, so this is more about reporting a bug in the official libraries: I don't need an urgent fix.

Wim: I've taken pictures of all four cases on the scope. In order these are: 100kHz using block reads/writes (i.e. the first bit of code from my last post), 100kHz using manual start/stops (i.e. the second bit of code from my last post), 400kHz using block reads/writes, and 400kHz using manual start/stops. Yellow is data, purple is clock. To help decode it, observe that the slave drives the line more strongly low than the master during an ACK, and the start bit corresponds to an elongated high clock pulse (except in the third image where it is missing). The last byte of the write is 0x40. The first byte of the read (i.e. the address byte) is 0xC1.

As you can see, in all cases except the 400kHz using block write/read, there is an elongated high clock pulse between write and read, during which the start bit is transmitted. You'll also notice that no ACK or data follows the address byte in the third picture, because the slave fails to correctly interpret the read due to the missing start bit.

/media/uploads/calew/100khz_block.jpg /media/uploads/calew/100khz_byte.jpg /media/uploads/calew/400khz_block.jpg /media/uploads/calew/400khz_byte.jpg

31 Jan 2013

I see you use the LPC11U24. There may be differences between that device and the LPC1768. Never tested the LPC11U24 myself.

Not sure why you use the repeated start. The only reasons for using repeated start rather than regular stop+start are (maybe) a slightly faster communication and in cases where you have multiple I2C masters. The repeated start will prevent another master from seizing control of the I2C bus from Master1 between its write operation (setting a register address) and its subsequent read from that register address. That could cause problems when Master2 changes the registeraddress of that Slave.

Anyhow, the repeated start mode should work, but apparently something is wrong.

 if(0 != i2c->write(address, (char*)txBuffer, txLength, 1))
        goto NoAck;
 if(0 != i2c->read(address, (char*)rxBuffer, rxLength, 0))
        goto NoAck;

The third picture shows two things:

  1. No Repeated Start, but the transmitted databyte (i.e. slave readaddress) should be (wrongly) interpreted by the slave as a regular write instead. So why is there no slave acknowledge. Does the slave stop responding when you send more than a certain number of bytes (ie txLength) in one message?
  2. Why do we see a regular Stop following that address. Is that produced by the blockread or by the NoAck code? Please show NoAck code.

I have seen strange behaviour of I2C software in border conditions. This is often due to incomplete error/bordercase handling in the I2C statemachine code. Could it be that the value 'rxLength' is 0 in case you select the 400kbit/s AND use blockread mode. That 0 rxLength value could somehow cause the I2C statemachine software to skip the repeated start and return.

31 Jan 2013

Hi Wim,

I use the repeated start because it is required by the slave. What is actually happening here is the write is a request for a particular type of data (as the slave can output data in various formats). If the slave ever sees a stop bit after the write, it assumes the communication is finished and doesn't send any data.

1) I suspect this is what is happening, yes. The slave has a fixed message format, so it seems likely it responds with a NACK if this is not followed precisely.

2) This stop is sent by the block read. The NoAck label just returns from the function with an error code, it doesn't call anything else on I2C. In contrast, the NoAck label for the second set of code *does* manually send a stop bit on a NACK, as otherwise the line would be left low.

A zero value for rxLength is a good thought. However, there are some assert statements which I didn't post which already check this at the top of the function(e.g. "assert(rxLength > 0);"). Plus, it's actually a constant which is being passed in.

I've updated my original post with the full function body, save for some name changes, which show the NoAck and assertions.

Thanks for your help, Carl

31 Jan 2013

Hi Carl,

I did a quick test with your code in a simple framework with some modifications:

  • used the LPC1768 mbed
  • used a PCF8574 port expander as slave
  • max i2c freq was set at 400kbit/s. Actual freq is less with mbed libs (about 360). Comms break down at higher freq. No surprise, it at the limit for the PCF8574.
  • used some defines and other fixes to be able to run your code
  • i2c message = start, address+W, write 4 bytes, restart, address+R, read 2 bytes, stop
  • disabled asserts

Anyhow, I see no issues with the restart on my logic analyser. See pics below. I also tested with rxLength=0. Turns out that the blockread will do a restart, send the address+R, read one byte (NoAck) and then stop. Maybe your mbed lib is an older version and some bugs have been fixed. I used rev 58.

Complete interaction

/media/uploads/wim/_scaled_screenshot_i2c_400_4.jpg

High Res is here

Write interaction

/media/uploads/wim/_scaled_screenshot_i2c_400_1.jpg

High Res is here

Restart

/media/uploads/wim/_scaled_screenshot_i2c_400_2.jpg

High Res is here

Read interaction

/media/uploads/wim/_scaled_screenshot_i2c_400_3.jpg

High Res is here

#include "mbed.h"

// mbed Interface Hardware definitions
DigitalOut myled1(LED1);
DigitalOut myled2(LED2);
DigitalOut myled3(LED3);
DigitalOut heartbeatLED(LED4);

typedef int STATUS;
#define SUCCESS  0
#define FAILURE -1

typedef int I2cFrequency;

#define TXBUF_SIZE 16
#define RXBUF_SIZE 16

unsigned char txBuffer[TXBUF_SIZE];
int txLength = 4;

unsigned char rxBuffer[RXBUF_SIZE];
int rxLength = 2;

//Pin Defines for I2C Bus
//#define D_SDA                  p9
//#define D_SCL                  p10
#define D_SDA                  p28
#define D_SCL                  p27

// Host PC Communication channels
Serial pc(USBTX, USBRX); // tx, rx

//I2C Bus
I2C _i2c(D_SDA, D_SCL);
I2C *i2c = &_i2c;

//int slaveaddress = 0xC0;
int slaveaddress = 0x40;

void InitBuf () {
   for(int i=0; i<txLength; i++){
     txBuffer[i] = i;
   }    

   for(int i=0; i<rxLength; i++){
     rxBuffer[i] = 0;
   }  
}

STATUS WriteAndRead(I2cFrequency frequency, int address, unsigned char* txBuffer, int txLength, unsigned char* rxBuffer, int rxLength){
    // Validate inputs
//    assert(txBuffer);
//    assert(rxBuffer);
//    assert(txLength > 0);
//    assert(rxLength > 0);
 
    // Set frequency
    i2c->frequency(frequency);
 
    // Perform write followed by read
    if(0 != i2c->write(address, (char*)txBuffer, txLength, 1))
        goto NoAck;
    if(0 != i2c->read(address, (char*)rxBuffer, rxLength, 0))
        goto NoAck;
    return SUCCESS;
NoAck:
    return FAILURE;
}

int main() {
   STATUS status;
   
   pc.printf("I2C Test\n\r");
   
   while (1) {
     InitBuf();
     status = WriteAndRead(400000, slaveaddress, txBuffer, txLength, rxBuffer, rxLength);
     if (status!=SUCCESS) {
       pc.printf("I2C Test FAILURE\n\r");   
     }
   }
}
01 Feb 2013

Hi Wim,

Thanks; you've put a lot of work into this. As you say, your trace looks fine. I am using revision 58, unfortunately, so the only other thing I can think of is that there is a bug in the library specific to the LPC11U24, or I've got a bug elsewhere in my code. Though I can't imagine what sort of bug could make me consistently miss out a single bit in a block operation.

In any case, my second set of code works perfectly well so I'll just stick with that. Thanks for all your help, Carl

P.S. Nice scope, by the way

01 Feb 2013

A problem with the LPC11U24 libs is the most likely explanation. I dont have access to an LPC11U24 so can't check. Maybe someone else (at mbed) can try the above code and see what happens.

The logic analyser I use is the Saleae Logic16. It works great, is affordable and has many protocol decoders (eg I2C, SPI, UART). That makes interpreting the data real easy. Certainly beats counting clockpulses and writing down 0s and 1s....