Rewriting SD wav player library to utilise DMA, running into issues.

08 Sep 2016

Hi All,

This is on an LPC1768 chip.

I'm rewriting the wave_player library, which was originally designed to read a wav file from an SD card and output it to the DAC via a ticker running at the file's sample rate. The original lib is below:

https://developer.mbed.org/users/sravet/code/wave_player/

However, I want to use DMA to transfer the data from the SD card to the DAC. To start with, I'm only using DMA to transfer the slices after they have been read from the SD file into memory. I can't get the sound to play properly, though, and I'm not sure why. I was hoping someone could look over my code and see if they notice something wrong.

My suspicion is that it has something to do with the way I've configured the DMA transfer to the DAC. Currently I'm attempting to send 256 (an arbitrary number) 16-bit slices per DMA transfer. I have read a lot about the DMA config for the LPC1768 and seen that there are options for source/destination transfer width as well as source/destination burst size. But in the example for MODDMA (https://developer.mbed.org/users/AjK/code/MODDMA/file/97a16bf2ff43/example4.h), which is a continuous memory-to-peripheral (DAC) transfer, the author doesn't bother with width or burst size, and that program works perfectly on my chip. What I'm attempting to do is not much different, except that my DACCNTVAL setting is much faster. Much of my DMA code is based on that MODDMA example.

Which brings me to the DACCNTVAL value. Since this is a 16-bit, 22.1 kHz wav file, I'm using the math below to come up with 4.24 for the value (rounded up to 5, since fractions aren't allowed).

CCLK = 96 MHz
PCLKDAC = 96 MHz / 4 = 24 MHz
24 MHz / 256 (slices) / 22.1 kHz = 4.24

I think the math is right but I'm not totally sure. The sound coming out seems long enough to be the sample I'm attempting to play (a bass drum), but I can definitely hear slices and glitches. It's not easily recognisable as the sample, so I'm not ready to say that it's actually playing it. I have double-checked the data before it's transferred via DMA and it is correct; I did this by comparing the values with the ones from the original wave_player library, which works.
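As a cross-check of the arithmetic above, here is a small host-side sketch that computes the reload value two ways. The first follows the formula in this post. The second rests on an assumption that nothing in this thread confirms and that should be checked against the LPC17xx user manual: that the DAC counter raises one DMA request every DACCNTVAL ticks of PCLKDAC, so the buffer length would not enter the formula at all. The function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Reload value with the buffer length divided out, as in the post above.
// Integer division truncates, so 24 MHz / 256 / 22.1 kHz gives 4.
uint32_t reload_per_buffer(uint32_t pclk_hz, uint32_t sample_rate_hz,
                           uint32_t buffer_len) {
    assert(sample_rate_hz != 0 && buffer_len != 0);
    return pclk_hz / buffer_len / sample_rate_hz;
}

// Reload value under the ASSUMPTION that one sample is requested every
// DACCNTVAL ticks of PCLKDAC, so only the sample rate matters.
uint32_t reload_per_sample(uint32_t pclk_hz, uint32_t sample_rate_hz) {
    assert(sample_rate_hz != 0);
    return (pclk_hz + sample_rate_hz / 2) / sample_rate_hz; // round to nearest
}

// Sample rate a given reload value would produce under the same assumption.
double resulting_rate_hz(uint32_t pclk_hz, uint32_t reload) {
    assert(reload != 0);
    return static_cast<double>(pclk_hz) / reload;
}
```

With PCLKDAC = 24 MHz and a 22.1 kHz file, the per-sample version gives 1086. If that assumption holds, a reload value of 5 would clock samples out at about 4.8 MHz instead of 22.1 kHz.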

Any ideas?

Thanks

wav_player_dma_rewrite.cpp

#include <mbed.h>
#include <stdio.h>
#include <wave_player.h>
#include "MODDMA.h"
#include "SDFileSystem.h"

DigitalOut led1(LED1);
DigitalOut led3(LED3);
DigitalOut led4(LED4);

AnalogOut signal(p18);

SDFileSystem    sd(p5, p6, p7, p8, "sd"); //PinName mosi, PinName miso, PinName sclk, PinName cs, const char* name

MODDMA dma;
//MODDMA_Config *conf0, *conf1;
MODDMA_Config *conf0;

void TC0_callback(void);
void ERR0_callback(void);

//void TC1_callback(void);
//void ERR1_callback(void);

unsigned chunk_id,chunk_size,channel;
unsigned data,samp_int,i;
short unsigned dac_data;
long long slice_value;
char *slice_buf;
short *data_sptr;
unsigned char *data_bptr;
int *data_wptr;
FMT_STRUCT wav_format;
long slice,num_slices;
unsigned short DAC_fifo[256];
short DAC_wptr;
int read_slices = 0;
int DMA_complete = 0;
int file_end = 0;

FILE *wav_file4 = fopen("/sd/wf/bd_04.wav", "r"); //16 bit 22.1 khz

/*
Reads wave file header info, and sets a flag when it gets to the slice data
*/
void read_wav_file(FILE *& wavefile) 
{
  fread(&chunk_id,4,1,wavefile); //get chunk id
  fread(&chunk_size,4,1,wavefile); //get chunk size
  //printf("chunk type 0x%x, size %d\n",chunk_id,chunk_size);
  if (!feof(wavefile)) //sequentially reading the wave file, separating it into different chunk_ids
  { 
    switch (chunk_id) 
    {
      case 0x46464952: //RIFF CHUNK
        fread(&data,4,1,wavefile); 
        break;
      case 0x20746d66: //WAV FORMAT DATA
        fread(&wav_format,sizeof(wav_format),1,wavefile); //get wave format info
        //printf("wav_format - %d\n", wav_format.block_align);
        if (chunk_size > sizeof(wav_format))
          fseek(wavefile,chunk_size-sizeof(wav_format),SEEK_CUR);
        break;
      case 0x61746164: //AUDIO DATA
        read_slices = 1; //set read_slices flag, start reading slices 
        num_slices=chunk_size/wav_format.block_align; //at least 16 bits / 16 bits
        //printf("num_slices = %d / %d = %d \n", chunk_size, wav_format.block_align, chunk_size / wav_format.block_align); 
        break;
      case 0x5453494c: //INFO CHUNK
        fseek(wavefile,chunk_size,SEEK_CUR);
        break;
      default: //UNKNOWN CHUNK
        //printf("unknown chunk type 0x%x, size %d\n",chunk_id,chunk_size);
        data=fseek(wavefile,chunk_size,SEEK_CUR);
        break;
    }
    file_end = 0;
  } else if (feof(wavefile)) 
  {
    file_end = 1;
    
  }
}

/*
Reads slice by slice, averaging channels to mono. Saves to the DAC_fifo[slicesRead]
*/
void read_and_avg_slices(FILE *& wavefile, short DAC_wptr)
{
  if (!feof(wavefile)) //only read a slice if we are not already at the end of the file
  { 
    //allocate slice buffer big enough to hold a slice
    slice_buf=(char *)malloc(wav_format.block_align); //set a 16 bit slice buffer (block_align is type short)  
    if (!slice_buf) 
    {
      printf("Unable to malloc slice buffer");
      exit(1);
    }     
    //samp_int=1000000/(wav_format.sample_rate
    fread(slice_buf,wav_format.block_align,1,wavefile);
    if (feof(wavefile)) 
    {
      printf("Oops -- not enough slices in the wave file\n");
      exit(1);
    }
    data_sptr=(short *)slice_buf;     // 16 bit samples
    slice_value=0;
    for (channel=0;channel<wav_format.num_channels;channel++) 
    {
      switch (wav_format.sig_bps) 
      {
        case 16:
            //printf("16 bit channel %d data=%d ",channel,data_sptr[channel]);
            slice_value+=data_sptr[channel];
            break;
      }
    }
    slice_value/=wav_format.num_channels; // summed and averaged
    //slice_value is now averaged.  Next it needs to be scaled to an unsigned 16 bit value with DC offset so it can be written to the DAC.
    switch (wav_format.sig_bps) 
    {
      case 16:   
        slice_value+=32768; //scaling to unsigned 16 bit
        break;
    }
  
    dac_data=(short unsigned)slice_value;//16 bit
    DAC_fifo[DAC_wptr]=dac_data; //put slice value into dac fifo        
    free(slice_buf);
  } else if (feof(wavefile))
  {
    file_end = 1; 
       
  }
}

/*
Config and enable for the DMA
*/
void startDMA(int numOfSlices)
{
  //printf("num of slices to send - %d\r\n", numOfSlices);
  conf0 = new MODDMA_Config;

  conf0
   ->channelNum    ( MODDMA::Channel_0 )
   ->srcMemAddr    ( (uint32_t) &DAC_fifo)
   ->dstMemAddr    ( MODDMA::DAC )
   ->transferSize  ( numOfSlices )
   ->transferType  ( MODDMA::m2p )
   ->dstConn       ( MODDMA::DAC )
   ->attach_tc     ( &TC0_callback )
   ->attach_err    ( &ERR0_callback )     
  ; // config end
  
  //DAC frequency
  LPC_DAC->DACCNTVAL = 5; // 24 MHz / 2 bytes for 1 hz... /  =  
                                       // 24 MHz / 2 / 44.1 KHz = 272.1
                                       // 24 MHz / 2 / 22.1 KHz = 542.98 
                                       // 24 MHZ / 256/ 22.1 Khz = 4.24
  // Prepare first configuration.
  if (!dma.Prepare( conf0 )) {
      error("dma conf0 not loaded");
  }
  // Begin (enable DMA and counter). Note, don't enable
  // DBLBUF_ENA as we are using DMA double buffering.
  LPC_DAC->DACCTRL |= (3UL << 2); //CNT_ENA time out counter is enabled, DMA_ENA is enabled
}

int main()
{
  fseek(wav_file4, 0, SEEK_SET);
  int slicesRead = 0;
  int slice_num = 0;
  file_end = 0;
  for (i=0;i<256;i+=2) {
    DAC_fifo[i]=0;
    DAC_fifo[i+1]=3000;
  }
  while (file_end == 0)
  {
    do
    {
      if(read_slices == 0) 
      {
        read_wav_file(wav_file4); // otherwise wav file data is read until slice data is found, feof sets file_end to 1
      } else if (read_slices == 1)           
      {      
        read_and_avg_slices(wav_file4, slicesRead);
        slicesRead = (slicesRead+1); //increment to the next DAC_fifo position
        //slices ++;
        if (num_slices == slice_num) //if all slices are read for this file, turn off read_slices flag 
        {
          read_slices = 0;
        } else 
        {
          slice_num++; //increment to the next slice
        }
      }
    } while (slicesRead != 256 && file_end == 0); //256 slices read, time for DMA
    if (slice_num == 0)
    {
        // nothing to send
        break; 
    } else 
    {      
      startDMA(slicesRead);
      slicesRead = 0;
      /*
      while (!DMA_complete) //wait for the DMA completion flag to set
      {
        //printf("DMA in progress - DMA_complete = %d\n", DMA_complete);
        ;
      }
      printf("DMA_Complete - %d\n", DMA_complete);
      DMA_complete = 0; //reset DMA flag
      */
    
    }
  }
}

// Configuration callback on TC
void TC0_callback(void) {
    // Just show sending complete.
    led3 = !led3;
    // Get configuration pointer.
    MODDMA_Config *config = dma.getConfig();
    
    // Finish the DMA cycle by shutting down the channel.
    dma.Disable( (MODDMA::CHANNELS)config->channelNum() );

    //setup and enable
    //dma.Prepare( conf0 ); //cant prepare conf0 in TC0

    // Clear DMA IRQ flags.
    if (dma.irqType() == MODDMA::TcIrq) dma.clearTcIrq();

    DMA_complete = 1;
}

// Configuration callback on Error
void ERR0_callback(void) {
    error("TC0 Callback error");
}
08 Sep 2016

I see from your comments that you expect DMA double buffering, but you only provide one DAC_fifo[] array. Do you expect the DMA code to provide the double buffering by making its own copy of DAC_fifo? Is "dma.Prepare( conf0 )" supposed to handle the double buffering and also stall the loop in main() to set the net transfer pace? If dma.Prepare() cannot do the copying fast enough, perhaps you need to read the file directly into 2 ping-pong buffers? If dma.Prepare() is stalling the loop, that may eat up time which is needed to read from the SD card at a pace that matches the DAC. What happens if you 'fake' reading the file and instead generate a simple square or sawtooth wave segment?
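A minimal sketch of the 'fake' input suggested here: two helpers that fill a buffer the same size as DAC_fifo with a sawtooth or a square wave, so the DMA path can be tested without touching the SD card. The helper names are hypothetical. The values span the full unsigned 16-bit range, matching what read_and_avg_slices() stores; whether the DAC register expects exactly that layout is not settled in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Fill buf with one rising ramp from 0 to 65535 (one sawtooth period).
void fill_sawtooth(uint16_t *buf, size_t len) {
    assert(buf != nullptr && len > 1);
    for (size_t i = 0; i < len; ++i) {
        buf[i] = static_cast<uint16_t>((i * 65535u) / (len - 1));
    }
}

// Fill buf with one square-wave period: first half low, second half high.
void fill_square(uint16_t *buf, size_t len, uint16_t low, uint16_t high) {
    assert(buf != nullptr && len > 0);
    for (size_t i = 0; i < len; ++i) {
        buf[i] = (i < len / 2) ? low : high;
    }
}
```

Calling fill_sawtooth(DAC_fifo, 256) once and then starting the DMA repeatedly on the same buffer should give a steady tone if the transfer setup is right. Glitches in that tone would point at the DMA or DAC configuration and not at the file reading.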

09 Sep 2016

Hi Fred,

I currently don't have DMA double buffering enabled in the config, nor two buffers in use. However, I think you are right that I eventually want double buffering; I was just trying to get something relatively simple working so I could build on it.

I'm not sure if the loop is being stalled by dma.Prepare() at the moment.

The 'fake' scenario you are speaking of is already in place in example4.h for MODDMA, which is here:

https://developer.mbed.org/users/AjK/code/MODDMA/file/97a16bf2ff43/example4.h

In the above example a sine wave is created in one buffer and then duplicated in a second buffer. Then two DMA channels are used to alternate sending out the entirety of each buffer (360 elements) per transfer to the DAC. I have tried to model this in an alternate version and it still doesn't really work.

Here's my version, using two buffers and two DMA channels.

Double DMA channel version

#include <mbed.h>
#include <stdio.h>
#include <wave_player.h>
#include "MODDMA.h"
#include "SDFileSystem.h"

DigitalOut led1(LED1);
DigitalOut led3(LED3);
DigitalOut led4(LED4);

AnalogOut signal(p18);

SDFileSystem    sd(p5, p6, p7, p8, "sd"); //PinName mosi, PinName miso, PinName sclk, PinName cs, const char* name

MODDMA dma;
MODDMA_Config *conf0, *conf1;
//MODDMA_Config *conf0;

void TC0_callback(void);
void ERR0_callback(void);

void TC1_callback(void);
void ERR1_callback(void);

unsigned chunk_id,chunk_size,channel;
unsigned data,samp_int,i;
short unsigned dac_data;
long long slice_value;
char *slice_buf;
short *data_sptr;
unsigned char *data_bptr;
int *data_wptr;
FMT_STRUCT wav_format;
long slice,num_slices;
unsigned short DAC_fifo[256], DAC_fifo2[256];
short DAC_wptr;
int read_slices = 0;
int DMA_complete = 0;
int file_end = 0;
int buf_sel = 0;
int dma_sel = 0;

FILE *wav_file4 = fopen("/sd/wf/bd_04.wav", "r"); //16 bit 22.1 khz

/*
Reads wave file header info, and sets a flag when it gets to the slice data
*/
void read_wav_file(FILE *& wavefile) 
{
  //printf("read_wav_file function\r\n");
  fread(&chunk_id,4,1,wavefile); //get chunk id
  fread(&chunk_size,4,1,wavefile); //get chunk size
  //printf("chunk type 0x%x, size %d\n",chunk_id,chunk_size);
  if (!feof(wavefile)) //sequentially reading the wave file, separating it into different chunk_ids
  { 
    switch (chunk_id) 
    {
      case 0x46464952: //RIFF CHUNK
        fread(&data,4,1,wavefile); 
        break;
      case 0x20746d66: //WAV FORMAT DATA
        fread(&wav_format,sizeof(wav_format),1,wavefile); //get wave format info
        //printf("wav_format - %d\n", wav_format.block_align);
        if (chunk_size > sizeof(wav_format))
          fseek(wavefile,chunk_size-sizeof(wav_format),SEEK_CUR);
        break;
      case 0x61746164: //AUDIO DATA
        read_slices = 1; //set read_slices flag, start reading slices 
        //printf("read_slices 3 %d\r\n", read_slices);    
        num_slices=chunk_size/wav_format.block_align; //at least 16 bits / 16 bits
        //printf("num_slices = %d / %d = %d \n", chunk_size, wav_format.block_align, chunk_size / wav_format.block_align); 
        break;
      case 0x5453494c: //INFO CHUNK
        fseek(wavefile,chunk_size,SEEK_CUR);
        break;
      default: //UNKNOWN CHUNK
        //printf("unknown chunk type 0x%x, size %d\n",chunk_id,chunk_size);
        data=fseek(wavefile,chunk_size,SEEK_CUR);
        break;
    }
    file_end = 0;
  } else if (feof(wavefile)) 
  {
    file_end = 1;
    //printf("1 feof - %d\n", file_end);
    
  }
}

/*
Reads slice by slice, averaging channels to mono. Saves to the DAC_fifo[slicesRead]
*/
void read_and_avg_slices(FILE *& wavefile, short DAC_wptr)
{
  if (!feof(wavefile)) //only read a slice if we are not already at the end of the file
  { 
    //printf("read_and_avg_slices\n");
    //allocate slice buffer big enough to hold a slice
    slice_buf=(char *)malloc(wav_format.block_align); //set a 16 bit slice buffer (block_align is type short)  
    if (!slice_buf) 
    {
      printf("Unable to malloc slice buffer");
      exit(1);
    }     
    //samp_int=1000000/(wav_format.sample_rate
    fread(slice_buf,wav_format.block_align,1,wavefile);
    if (feof(wavefile)) 
    {
      printf("Oops -- not enough slices in the wave file\n");
      exit(1);
    }
    data_sptr=(short *)slice_buf;     // 16 bit samples
    slice_value=0;
    for (channel=0;channel<wav_format.num_channels;channel++) 
    {
      switch (wav_format.sig_bps) 
      {
        case 16:
            //printf("16 bit channel %d data=%d ",channel,data_sptr[channel]);
            slice_value+=data_sptr[channel];
            break;
      }
    }
    slice_value/=wav_format.num_channels; // summed and averaged
    //slice_value is now averaged.  Next it needs to be scaled to an unsigned 16 bit value with DC offset so it can be written to the DAC.
    switch (wav_format.sig_bps) 
    {
      case 16:   
        slice_value+=32768; //scaling to unsigned 16 bit
        break;
    }
  
    dac_data=(short unsigned)slice_value;//16 bit
    //printf("buf_sel - %d\n", buf_sel);
    if (buf_sel == 0)
    {
      DAC_fifo[DAC_wptr]=dac_data; //put slice value into dac fifo
    } else if (buf_sel == 1)
    {
      DAC_fifo2[DAC_wptr]=dac_data; //put slice value into dac fifo
    }
    free(slice_buf);
  } else if (feof(wavefile))
  {
    file_end = 1;        
  }
}

/*
Config and enable for the DMA
*/
void startDMA(int numOfSlices)
{
  //printf("numOfSlices to send - %d\r\n", numOfSlices);
  conf0
   ->transferSize  ( numOfSlices ) //in bytes
  ; // config end

  conf1
   ->transferSize  ( numOfSlices ) //in bytes
  ; // config end  
  
  //DAC frequency
  //note: DACCNTVAL is an integer register, so the 4.24 assigned below is stored as 4
  LPC_DAC->DACCNTVAL = 4.24; // 24 MHz / 2 bytes for 1 hz... /  =  
                                       // 24 MHz / 2 / 44.1 KHz = 272.1
                                       // 24 MHz / 2 / 22.1 KHz = 542.98 
                                       // 24 MHZ / 256/ 22.1 Khz = 4.24
  // Prepare config
  //printf("dma_sel = %d\n", dma_sel);
  if(dma_sel == 0) 
  {
    if (!dma.Prepare( conf0 )) {
        error("dma conf0 not loaded");
    }
  }else if (dma_sel == 1)
  {
    if (!dma.Prepare( conf1 )) {
        error("dma conf1 not loaded");
    }      
  }
  dma_sel = (dma_sel+1) & 1;
  
  // Begin (enable DMA and counter). Note, don't enable
  // DBLBUF_ENA as we are using DMA double buffering.
  LPC_DAC->DACCTRL |= (3UL << 2); //CNT_ENA time out counter is enabled, DMA_ENA is enabled
}

int main()
{  
  fseek(wav_file4, 0, SEEK_SET);
  int slicesRead = 0;
  int slice_num = 0;
  file_end = 0;
  for (i=0;i<256;i+=2) {
    DAC_fifo[i]=0;
    DAC_fifo[i+1]=3000;
  }
  conf0 = new MODDMA_Config;
  conf0
   ->channelNum    ( MODDMA::Channel_0 )
   ->srcMemAddr    ( (uint32_t) &DAC_fifo)
   ->dstMemAddr    ( MODDMA::DAC )
   ->transferType  ( MODDMA::m2p )
   ->dstConn       ( MODDMA::DAC )
   ->attach_tc     ( &TC0_callback )
   ->attach_err    ( &ERR0_callback ) 
  ; // config end

  conf1 = new MODDMA_Config;  
  conf1
   ->channelNum    ( MODDMA::Channel_1 )
   ->srcMemAddr    ( (uint32_t) &DAC_fifo2)
   ->dstMemAddr    ( MODDMA::DAC )
   ->transferType  ( MODDMA::m2p )
   ->dstConn       ( MODDMA::DAC )
   ->attach_tc     ( &TC1_callback )
   ->attach_err    ( &ERR1_callback )     
  ; // config end     
  while (file_end == 0)
  {
    //printf("file_end - %d\n", file_end);
    do
    {
      if(read_slices == 0) 
      {
        read_wav_file(wav_file4); // otherwise wav file data is read until slice data is found, feof sets file_end to 1
        //printf("read_wav_file \r\n");
        //printf("slicesRead %d\r\n", slicesRead); 
      } else if (read_slices == 1)           
      {      
        read_and_avg_slices(wav_file4, slicesRead);
        //printf("read_and_avg_slices \r\n");
        //printf("DAC_fifo[%d] - %d\n", slicesRead, DAC_fifo[slicesRead]); //print slice data
        //printf("%d\n", DAC_fifo[slicesRead]); //print slice data
        slicesRead = (slicesRead+1); //increment to the next DAC_fifo position
        if (num_slices == slice_num) //if all slices are read for this file, turn off read_slices flag 
        {
          read_slices = 0;
        } else 
        {
          slice_num++; //increment to the next slice
        }
      }
    } while (slicesRead != 256 && file_end == 0); //256 slices read, time for DMA
    if (slice_num == 0)
    {
        // nothing to send
        break; 
    } else 
    {
      //printf("2 - chunk type 0x%x, size %d\n",chunk_id,chunk_size);        
      //printf("slice_num = %d\r\n", slice_num); //this and slicesRead should match at the end 
      //printf("slicesRead - %d\n", slicesRead);
      buf_sel = (buf_sel+1) & 1;
      startDMA(slicesRead); 
      slicesRead = 0;
      /*
      while (!DMA_complete) //wait for the DMA completion flag to set
      {
        //printf("DMA in progress - DMA_complete = %d\n", DMA_complete);
        //this doesnt work unless there is some command in here, need to research busy wait..
        ;
      }
      printf("DMA_Complete - %d\n", DMA_complete);
      DMA_complete = 0; //reset DMA flag
      */
    
    }
  }
}

// Configuration callback on TC
void TC0_callback(void) {
    // Just show sending complete.
    led3 = !led3;
    // Get configuration pointer.
    MODDMA_Config *config = dma.getConfig();
    
    // Finish the DMA cycle by shutting down the channel.
    dma.Disable( (MODDMA::CHANNELS)config->channelNum() );

    //setup and enable
    //dma.Prepare( conf1 );

    // Clear DMA IRQ flags.
    if (dma.irqType() == MODDMA::TcIrq) dma.clearTcIrq();

    //DMA_complete = 1;
}

// Configuration callback on Error
void ERR0_callback(void) {
    error("TC0 Callback error");
}

// Configuration callback on TC
void TC1_callback(void) {
    // Just show sending complete.
    led4 = !led4;
    // Get configuration pointer.
    MODDMA_Config *config = dma.getConfig();
    
    // Finish the DMA cycle by shutting down the channel.
    dma.Disable( (MODDMA::CHANNELS)config->channelNum() );

    //setup and enable
    //dma.Prepare( conf0 );

    // Clear DMA IRQ flags.
    if (dma.irqType() == MODDMA::TcIrq) dma.clearTcIrq();

    //DMA_complete = 1;
}

// Configuration callback on Error
void ERR1_callback(void) {
    error("TC1 Callback error");
}
09 Sep 2016

R T,

One small point of communication - in my education there is a difference between "double buffering" and what the example does (which I know as dual-buffering or 'ping-pong' buffering).

I think you are on the right track and need to use 2 DMA configuration structures that use 2 (different) DAC_fifo[] arrays. Most of your code could be the same if DAC_fifo was just a pointer to either array. Set DAC_fifo to initially point to the first array (call it DAC_fifo1[256]), and have the two TC call-back routines change which array is pointed to for the next chunk.

That is, in TC0_callback(), set "DAC_fifo = &DAC_fifo2[0];", etc.

Completing the use of "ping-pong" buffering should eliminate any wasted waiting, and allow the SD file reading to properly overlap the DMA transfer without corrupting the DMA data stream.
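The pointer swap described above can be sketched on a desktop compiler as follows. This is only an illustration of the suggestion, not tested MODDMA code: the callbacks are stand-ins with made-up names (the real ones would also disable the channel and clear the IRQ, as in the listings), store_slice() is a hypothetical helper, and which buffer is actually safe to refill at each moment depends on when each DMA channel is started.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

const size_t kBufLen = 256;      // same length as DAC_fifo in the thread

uint16_t DAC_fifo1[kBufLen];
uint16_t DAC_fifo2[kBufLen];

// Buffer the file reader writes into; only the callbacks change it.
// The pointer itself is volatile because interrupt code modifies it.
uint16_t * volatile DAC_fifo = &DAC_fifo1[0];

// Stand-ins for the two transfer-complete callbacks: each one redirects
// the reader to the other array for the next chunk.
void TC0_callback_sketch() { DAC_fifo = &DAC_fifo2[0]; }
void TC1_callback_sketch() { DAC_fifo = &DAC_fifo1[0]; }

// The reading code always goes through the pointer, so it does not need
// to know which array is currently selected.
void store_slice(size_t index, uint16_t value) {
    assert(index < kBufLen);
    DAC_fifo[index] = value;
}
```

With this arrangement read_and_avg_slices() would no longer need the buf_sel test. It would write through DAC_fifo, and the callbacks would decide where the data goes.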

It will possibly also help (be necessary?) to have globals (that are used by the call-backs and DMA) declared with the 'volatile' modifier.
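A small sketch of why 'volatile' matters for a flag like DMA_complete. Without it, the compiler is allowed to read the flag once and never again inside an empty wait loop, which may be why the commented-out wait loops in the listings only behaved when something else was placed in the loop body. The code below runs on a desktop compiler; simulate_hardware_tick() is a hypothetical stand-in for the DMA interrupt firing and would not exist on the real target.

```cpp
#include <cassert>

// volatile: the value can change outside the normal flow of the loop
// (from an interrupt callback), so every read must be a real memory load.
volatile int DMA_complete = 0;

// Stand-in for the transfer-complete callback in the thread.
void TC0_callback_sketch() { DMA_complete = 1; }

// Desktop-only stand-in for the hardware: "fires" the callback on the
// third poll. On the real target the interrupt does this by itself.
static int polls_until_done = 3;
void simulate_hardware_tick() {
    if (polls_until_done > 0 && --polls_until_done == 0) {
        TC0_callback_sketch();
    }
}

// Busy-wait until the flag is set, then clear it for the next transfer.
// Returns the number of polls, purely for demonstration.
int wait_for_dma() {
    int polls = 0;
    while (!DMA_complete) {
        simulate_hardware_tick();
        ++polls;
    }
    DMA_complete = 0;
    return polls;
}
```

On the LPC1768 the body of the loop would be empty and the flag would be set by the real TC callback.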