12 years, 2 months ago.

RAM addresses & DMA channel used by Ethernet Interface?

I'd like to implement a low-latency Ethernet-to-I2S design.

DMA transfers from the Ethernet packets directly to the peripheral seem like a good solution.

I imagine that the Ethernet handler uses a DMA channel, and maybe some AHB1 RAM to manage the transfers - but what are the DMA channel and addresses?

Any tips for achieving the fastest redirection of the data? 5.5Mbit/s or even higher is desirable. Little or no processing of the data is needed.

Thanks!

Rod

5 Answers

12 years, 2 months ago.

I assume you have already found this?

http://www.nxp.com/documents/user_manual/UM10360.pdf

12 years, 2 months ago.

I2S with DMA can be quite fast: based on the work of Ivo van Poorten, I have made the I2S output run at 25 MHz in DMA mode (for VGA output).

Assuming the Ethernet input also arrives by DMA and no processing is needed, it is just a matter of triggering the I2S DMA. If the data does need processing, use alternating buffers and process one while outputting the other. The I2S DMA output routines can be linked together so this happens automatically. (I used the link function to emit 'blank' after emitting one scanline.)
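
A minimal sketch of the alternating-buffer idea, assuming two linked-list items that point at each other so the DMA keeps cycling A, B, A, B without CPU intervention. The names `pingpong_init`, `pingpong_fill_target` and `BUF_WORDS` are hypothetical and not part of fastlib; the control word is passed in untouched, and the FIFO address is whatever the caller supplies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Four-field linked-list item: source, destination, next item, control word. */
struct dma_lli {
    const void *source;
    void *dest;
    const struct dma_lli *next;
    uint32_t control_word;
};

#define BUF_WORDS 128u              /* hypothetical: 512 bytes per buffer */

static uint32_t buf_a[BUF_WORDS];
static uint32_t buf_b[BUF_WORDS];
static struct dma_lli lli_a, lli_b;

/* Link the two items into a ring: A -> B -> A -> ... */
static void pingpong_init(void *peripheral_fifo, uint32_t control_word)
{
    assert(peripheral_fifo != NULL);

    lli_a.source = buf_a;
    lli_a.dest = peripheral_fifo;
    lli_a.next = &lli_b;
    lli_a.control_word = control_word;

    lli_b.source = buf_b;
    lli_b.dest = peripheral_fifo;
    lli_b.next = &lli_a;
    lli_b.control_word = control_word;
}

/* Given the item the DMA is currently emitting, return the buffer
 * that the CPU (or the receive path) may safely fill. */
static uint32_t *pingpong_fill_target(const struct dma_lli *active)
{
    return (active == &lli_a) ? buf_b : buf_a;
}
```

On real hardware the items would also need to be word-aligned and placed in memory the DMA controller can reach; this sketch only shows the linking.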

Rod Coleman
poster
12 years, 2 months ago.

Gert, thanks for your suggestion! Ivo's work looks to be a useful starting point for handling I2S, and I am encouraged to hear that you have achieved 25 MHz. That is plenty!

For the Ethernet side, I am hoping to start with the newly released official libraries from the mbed team. I assume that DMA is used in that implementation, but confirmation of the details would be useful.

I like your alternating buffer scheme, too. Maybe I can receive UDP packets of 512 bytes into alternating buffers. Do you have a sample of the link function?
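
For a rough feel of the timing budget: a 512-byte packet carries 4096 payload bits, so at 5.5 Mbit/s of payload one buffer has to be consumed about every 745 us (roughly 1343 packets per second). A small sketch of that arithmetic follows; the function names are hypothetical, and UDP/IP/Ethernet header overhead is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Time available per packet, in microseconds, at a given payload bit rate. */
static double packet_period_us(uint32_t payload_bytes, double payload_bits_per_s)
{
    assert(payload_bits_per_s > 0.0);
    return (payload_bytes * 8.0) / payload_bits_per_s * 1e6;
}

/* Packets per second needed to sustain the payload bit rate. */
static double packets_per_second(uint32_t payload_bytes, double payload_bits_per_s)
{
    assert(payload_bytes > 0u);
    return payload_bits_per_s / (payload_bytes * 8.0);
}
```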

Thanks!

Rod

12 years, 2 months ago.

This is a small excerpt from the 640x480 VGA library. Note: most of the settings are done using functions from Ivo's fastlib (very handy): http://mbed.org/users/Ivop/code/fastlib/

// -----------------------------------------------------------------------------------
// define structure for DMA linked lists
struct dma_lli {
    void *source;
    void *dest;
    struct dma_lli *next;
    unsigned control_word;
};

// some arbitrary blank data for I2S used for blanking
// even after DMA the I2S output will keep on emitting zeroes (= blank)
static unsigned char blanking[]={0,0,0,0};

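// forward declaration of our blanking DMA linked-list item
extern const struct dma_lli blank_lli;

// the blanking item ends the DMA chain (next = 0)
// (no 'static' here, so the linkage matches the extern declaration above)
const struct dma_lli blank_lli = {
    blanking, (void*)FL_I2STXFIFO, 0, 4   // source, dest, next, transfer count = 4
    | (1 << 12)     // source burst size = 4
    | (1 << 15)     // destination burst size = 4
    | (2 << 18)     // source width = 32 bits
    | (2 << 21)     // destination width = 32 bits
    | (0 << 26)     // do not increment source address
    | (0 << 27)     // do not increment destination address
    | (0 << 31)     // no terminal count interrupt
};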

// setup the DMA controller
static void init_dma_controller(void) {
    fl_dma_enable(FL_ENABLE);
    while (!fl_dma_is_enabled()) ;
}

// -----------------------------------------------------------------------------------
// setup I2S for 25 MHz dot/pixel clock
static void init_i2s(void) {
    // I2S on P0.9 (DIP5)
    fl_select_clock_i2s(FL_CLOCK_DIV1);                     // assume 100MHz
    fl_pinsel(0, 9, FL_FUNC_ALT1, FL_IGNORE, FL_IGNORE);    // emit I2S data on pin 5 (I2STX_SDA)
    fl_i2s_set_tx_rate(1,4);
    fl_i2s_output_set_config(FL_8BITS, FL_STEREO, 8, 0, 0, 0, 0);

}

// -----------------------------------------------------------------------------------
// create HSYNC output with PWM
static void init_hsync(void) {
    // PWM1.2 on P2.1 (DIP25)
    fl_select_clock_pwm1(FL_CLOCK_DIV1);
    fl_pinsel(2, 1, FL_FUNC_ALT1, FL_FLOATING, FL_FUNC_DEFAULT);    // PWM1.2, no pullup/down, no open drain

    fl_pwm_set_prescale(4);         // 100/25 = 4

#define HSHIFT 0

    // main PWM
    fl_pwm_set_match(0, 800);   // 800 color clocks

    // generate line interrupts from PWM MR0
    fl_pwm_config_match(0, FL_ON, FL_ON, FL_OFF);   // interrupt, reset, !stop

    // this PWM generates the HSYNC pulse
    fl_pwm_set_match(2, 752+HSHIFT);         // go low at 752
    fl_pwm_set_match(1, 24+HSHIFT);         // go high at 24
    fl_pwm_config_edges(2, FL_DOUBLE_EDGE); // need this for negative sync
    fl_pwm_output_enable(2, FL_ENABLE);     // enable this output

}

// -----------------------------------------------------------------------------------
// state machine list for the complete screen output
static void state_before_vsync(void);

static void (*state)(void) = state_before_vsync;

// emit a line from the visible area (framebuffer)
static void state_visible_area(void) {
    extern const struct dma_lli blank_lli;
    // limit visible area to the size of the framebuffer
    if (line_counter != (VISIBLE+1)) {

        // reset DMA parameters for active line
        fl_dma_set_srcaddr (0,(unsigned char *)pointer);  // source is our current framebuffer pointer
        fl_dma_set_destaddr(0, (void*)FL_I2STXFIFO);      // destination is I2S
        fl_dma_set_next_lli(0, &blank_lli);               // connect to blanking list
        fl_dma_channel_control(0,                  // control word
                               20,                 // count (20*4 = 80 bytes active)
                               4, 4,               // src and dest burst size
                               32, 32,             // src and dest width
                               FL_SRC_INCREMENT,   // increment source address
                               FL_NO_DEST_INCREMENT, // do not increment destination address
                               FL_OFF              // no interrupts
                              );

        // restart DMA sequence
        fl_dma_channel_config(0, FL_ENABLE,
                              FL_DMA_PERIPHERAL_IS_MEMORY, FL_DMA_BURST_REQUEST_I2S_CH0,
                              FL_DMA_MEMORY_TO_PERIPHERAL,
                              FL_ON, FL_ON
                             );
        // increment framebuffer pointer every other line (line doubling)
        if (!(line_counter & 1)) pointer+=80;
    } else   state = state_before_vsync;
}
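
The shifted constants in blank_lli appear to follow the DMACCxControl register layout described in UM10360: transfer size in bits 11:0, source and destination burst size at bits 12 and 15, source and destination width at bits 18 and 21, the increment flags at bits 26 and 27, and the interrupt enable at bit 31. A minimal sketch of a helper that packs such a word; `dma_control_word` and the enum names are hypothetical and not part of fastlib.

```c
#include <assert.h>
#include <stdint.h>

/* Field encodings as listed in UM10360 for DMACCxControl. */
enum { DMA_BURST_1 = 0, DMA_BURST_4 = 1, DMA_BURST_8 = 2 };
enum { DMA_WIDTH_8 = 0, DMA_WIDTH_16 = 1, DMA_WIDTH_32 = 2 };

/* Pack the individual fields into one 32-bit control word. */
static uint32_t dma_control_word(uint32_t count,
                                 uint32_t src_burst, uint32_t dst_burst,
                                 uint32_t src_width, uint32_t dst_width,
                                 int src_inc, int dst_inc, int irq)
{
    assert(count <= 0xFFFu);                  /* 12-bit transfer size */
    assert(src_burst <= 7u && dst_burst <= 7u);
    assert(src_width <= 2u && dst_width <= 2u);

    return count
         | (src_burst << 12)
         | (dst_burst << 15)
         | (src_width << 18)
         | (dst_width << 21)
         | ((uint32_t)(src_inc != 0) << 26)
         | ((uint32_t)(dst_inc != 0) << 27)
         | ((uint32_t)(irq != 0) << 31);
}
```

With these encodings, `dma_control_word(4, DMA_BURST_4, DMA_BURST_4, DMA_WIDTH_32, DMA_WIDTH_32, 0, 0, 0)` gives the same value as the literal expression used for blank_lli.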

Thanks Gert, that is very helpful! Time for me to try some of these functions.

posted by Rod Coleman 19 Nov 2012
12 years, 1 month ago.

Hi Rod, I would love to know how you get on with this project, as I am trying to shift 100 x 540-byte UDP packets per second using the new stack and it struggles to cope.

The packets come in groups of 4 with 56 us between packets, and once 4 have come in there is a delay of 35 ms before the next group arrives. All I'm trying to do is get each of the 4 packets into its own array (4 arrays in total) and then work on the data.
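
The average rate here is modest (100 x 540 bytes = 432 kbit/s), so the difficulty is the burst: four packets land 56 us apart. One way to absorb such a burst, sketched below under assumptions, is a four-slot ring that the receive path fills with a plain copy and the main loop drains during the 35 ms gap. The names `packet_ring`, `ring_put`, `ring_peek` and `ring_release` are hypothetical and not tied to any mbed network API; the sketch assumes a single producer and a single consumer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_COUNT 4u     /* one slot per packet in a burst (power of two) */
#define SLOT_BYTES 540u   /* packet size quoted above */

struct packet_ring {
    uint8_t data[SLOT_COUNT][SLOT_BYTES];
    uint16_t length[SLOT_COUNT];
    volatile uint32_t head;   /* count of packets written by the receive path */
    volatile uint32_t tail;   /* count of packets released by the main loop */
};

/* Receive path: copy the packet in and return quickly.
 * Returns 1 on success, 0 if the packet is too long or the ring is full. */
static int ring_put(struct packet_ring *r, const uint8_t *pkt, uint16_t len)
{
    assert(r != NULL && pkt != NULL);
    if (len > SLOT_BYTES || (r->head - r->tail) >= SLOT_COUNT)
        return 0;
    uint32_t slot = r->head % SLOT_COUNT;
    memcpy(r->data[slot], pkt, len);
    r->length[slot] = len;
    r->head++;                /* publish only after the copy is complete */
    return 1;
}

/* Main loop: look at the oldest packet without copying it.
 * Returns NULL when the ring is empty. */
static const uint8_t *ring_peek(const struct packet_ring *r, uint16_t *len)
{
    assert(r != NULL && len != NULL);
    if (r->head == r->tail)
        return NULL;
    uint32_t slot = r->tail % SLOT_COUNT;
    *len = r->length[slot];
    return r->data[slot];
}

/* Main loop: free the oldest slot once it has been processed. */
static void ring_release(struct packet_ring *r)
{
    assert(r != NULL);
    if (r->head != r->tail)
        r->tail++;
}
```

Keeping the receive side to a single copy and a counter increment leaves the heavier work for the idle period between bursts.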

This is for an Art-Net decoder for use with a media server driving RGB LEDs for a low-resolution video display.

Many thanks

Chris

Chris, I am afraid of something like that happening. Emilio remarked recently that the stack is being revised again now, with changes expected to the memory map, so it may not be ready for a while. I may well return to the very first LwIP stack used here, Rolf's 2009 issue. I have had fine results with it, working with 220-byte packets in TCP.

posted by Rod Coleman 21 Nov 2012