Neil Tan / dsp

Fork of dsp by Simon Ford

Committer:
simon
Date:
Thu Mar 10 15:07:50 2011 +0000
Revision:
0:1014af42efd9

        

Who changed what in which revision?

UserRevisionLine numberNew contents of line
simon 0:1014af42efd9 1 /* ----------------------------------------------------------------------
simon 0:1014af42efd9 2 * Copyright (C) 2010 ARM Limited. All rights reserved.
simon 0:1014af42efd9 3 *
simon 0:1014af42efd9 4 * $Date: 29. November 2010
simon 0:1014af42efd9 5 * $Revision: V1.0.3
simon 0:1014af42efd9 6 *
simon 0:1014af42efd9 7 * Project: CMSIS DSP Library
simon 0:1014af42efd9 8 * Title: arm_lms_norm_f32.c
simon 0:1014af42efd9 9 *
simon 0:1014af42efd9 10 * Description: Processing function for the floating-point Normalised LMS.
simon 0:1014af42efd9 11 *
simon 0:1014af42efd9 12 * Target Processor: Cortex-M4/Cortex-M3
simon 0:1014af42efd9 13 *
simon 0:1014af42efd9 14 * Version 1.0.3 2010/11/29
simon 0:1014af42efd9 15 * Re-organized the CMSIS folders and updated documentation.
simon 0:1014af42efd9 16 *
simon 0:1014af42efd9 17 * Version 1.0.2 2010/11/11
simon 0:1014af42efd9 18 * Documentation updated.
simon 0:1014af42efd9 19 *
simon 0:1014af42efd9 20 * Version 1.0.1 2010/10/05
simon 0:1014af42efd9 21 * Production release and review comments incorporated.
simon 0:1014af42efd9 22 *
simon 0:1014af42efd9 23 * Version 1.0.0 2010/09/20
simon 0:1014af42efd9 24 * Production release and review comments incorporated
simon 0:1014af42efd9 25 *
simon 0:1014af42efd9 26 * Version 0.0.7 2010/06/10
simon 0:1014af42efd9 27 * Misra-C changes done
simon 0:1014af42efd9 28 * -------------------------------------------------------------------- */
simon 0:1014af42efd9 29
simon 0:1014af42efd9 30 #include "arm_math.h"
simon 0:1014af42efd9 31
simon 0:1014af42efd9 32 /**
simon 0:1014af42efd9 33 * @ingroup groupFilters
simon 0:1014af42efd9 34 */
simon 0:1014af42efd9 35
simon 0:1014af42efd9 36 /**
simon 0:1014af42efd9 37 * @defgroup LMS_NORM Normalized LMS Filters
simon 0:1014af42efd9 38 *
simon 0:1014af42efd9 39 * This set of functions implements a commonly used adaptive filter.
simon 0:1014af42efd9 40 * It is related to the Least Mean Square (LMS) adaptive filter and includes an additional normalization
simon 0:1014af42efd9 41 * factor which increases the adaptation rate of the filter.
simon 0:1014af42efd9 42 * The CMSIS DSP Library contains normalized LMS filter functions that operate on Q15, Q31, and floating-point data types.
simon 0:1014af42efd9 43 *
simon 0:1014af42efd9 44 * A normalized least mean square (NLMS) filter consists of two components as shown below.
simon 0:1014af42efd9 45 * The first component is a standard transversal or FIR filter.
simon 0:1014af42efd9 46 * The second component is a coefficient update mechanism.
simon 0:1014af42efd9 47 * The NLMS filter has two input signals.
simon 0:1014af42efd9 48 * The "input" feeds the FIR filter while the "reference input" corresponds to the desired output of the FIR filter.
simon 0:1014af42efd9 49 * That is, the FIR filter coefficients are updated so that the output of the FIR filter matches the reference input.
simon 0:1014af42efd9 50 * The filter coefficient update mechanism is based on the difference between the FIR filter output and the reference input.
simon 0:1014af42efd9 51 * This "error signal" tends towards zero as the filter adapts.
simon 0:1014af42efd9 52 * The NLMS processing functions accept the input and reference input signals and generate the filter output and error signal.
simon 0:1014af42efd9 53 * \image html LMS.gif "Internal structure of the NLMS adaptive filter"
simon 0:1014af42efd9 54 *
simon 0:1014af42efd9 55 * The functions operate on blocks of data and each call to the function processes
simon 0:1014af42efd9 56 * <code>blockSize</code> samples through the filter.
simon 0:1014af42efd9 57 * <code>pSrc</code> points to input signal, <code>pRef</code> points to reference signal,
simon 0:1014af42efd9 58 * <code>pOut</code> points to output signal and <code>pErr</code> points to error signal.
simon 0:1014af42efd9 59 * All arrays contain <code>blockSize</code> values.
simon 0:1014af42efd9 60 *
simon 0:1014af42efd9 61 * The API functions operate on a block-by-block basis.
simon 0:1014af42efd9 62 * Internally, the filter coefficients <code>b[n]</code> are updated on a sample-by-sample basis.
simon 0:1014af42efd9 63 * The convergence of the LMS filter is slower compared to the normalized LMS algorithm.
simon 0:1014af42efd9 64 *
simon 0:1014af42efd9 65 * \par Algorithm:
simon 0:1014af42efd9 66 * The output signal <code>y[n]</code> is computed by a standard FIR filter:
simon 0:1014af42efd9 67 * <pre>
simon 0:1014af42efd9 68 * y[n] = b[0] * x[n] + b[1] * x[n-1] + b[2] * x[n-2] + ...+ b[numTaps-1] * x[n-numTaps+1]
simon 0:1014af42efd9 69 * </pre>
simon 0:1014af42efd9 70 *
simon 0:1014af42efd9 71 * \par
simon 0:1014af42efd9 72 * The error signal equals the difference between the reference signal <code>d[n]</code> and the filter output:
simon 0:1014af42efd9 73 * <pre>
simon 0:1014af42efd9 74 * e[n] = d[n] - y[n].
simon 0:1014af42efd9 75 * </pre>
simon 0:1014af42efd9 76 *
simon 0:1014af42efd9 77 * \par
simon 0:1014af42efd9 78 * After each sample of the error signal is computed the instanteous energy of the filter state variables is calculated:
simon 0:1014af42efd9 79 * <pre>
simon 0:1014af42efd9 80 * E = x[n]^2 + x[n-1]^2 + ... + x[n-numTaps+1]^2.
simon 0:1014af42efd9 81 * </pre>
simon 0:1014af42efd9 82 * The filter coefficients <code>b[k]</code> are then updated on a sample-by-sample basis:
simon 0:1014af42efd9 83 * <pre>
simon 0:1014af42efd9 84 * b[k] = b[k] + e[n] * (mu/E) * x[n-k], for k=0, 1, ..., numTaps-1
simon 0:1014af42efd9 85 * </pre>
simon 0:1014af42efd9 86 * where <code>mu</code> is the step size and controls the rate of coefficient convergence.
simon 0:1014af42efd9 87 *\par
simon 0:1014af42efd9 88 * In the APIs, <code>pCoeffs</code> points to a coefficient array of size <code>numTaps</code>.
simon 0:1014af42efd9 89 * Coefficients are stored in time reversed order.
simon 0:1014af42efd9 90 * \par
simon 0:1014af42efd9 91 * <pre>
simon 0:1014af42efd9 92 * {b[numTaps-1], b[numTaps-2], b[N-2], ..., b[1], b[0]}
simon 0:1014af42efd9 93 * </pre>
simon 0:1014af42efd9 94 * \par
simon 0:1014af42efd9 95 * <code>pState</code> points to a state array of size <code>numTaps + blockSize - 1</code>.
simon 0:1014af42efd9 96 * Samples in the state buffer are stored in the order:
simon 0:1014af42efd9 97 * \par
simon 0:1014af42efd9 98 * <pre>
simon 0:1014af42efd9 99 * {x[n-numTaps+1], x[n-numTaps], x[n-numTaps-1], x[n-numTaps-2]....x[0], x[1], ..., x[blockSize-1]}
simon 0:1014af42efd9 100 * </pre>
simon 0:1014af42efd9 101 * \par
simon 0:1014af42efd9 102 * Note that the length of the state buffer exceeds the length of the coefficient array by <code>blockSize-1</code> samples.
simon 0:1014af42efd9 103 * The increased state buffer length allows circular addressing, which is traditionally used in FIR filters,
simon 0:1014af42efd9 104 * to be avoided and yields a significant speed improvement.
simon 0:1014af42efd9 105 * The state variables are updated after each block of data is processed.
simon 0:1014af42efd9 106 * \par Instance Structure
simon 0:1014af42efd9 107 * The coefficients and state variables for a filter are stored together in an instance data structure.
simon 0:1014af42efd9 108 * A separate instance structure must be defined for each filter and
simon 0:1014af42efd9 109 * coefficient and state arrays cannot be shared among instances.
simon 0:1014af42efd9 110 * There are separate instance structure declarations for each of the 3 supported data types.
simon 0:1014af42efd9 111 *
simon 0:1014af42efd9 112 * \par Initialization Functions
simon 0:1014af42efd9 113 * There is also an associated initialization function for each data type.
simon 0:1014af42efd9 114 * The initialization function performs the following operations:
simon 0:1014af42efd9 115 * - Sets the values of the internal structure fields.
simon 0:1014af42efd9 116 * - Zeros out the values in the state buffer.
simon 0:1014af42efd9 117 * \par
simon 0:1014af42efd9 118 * Instance structure cannot be placed into a const data section and it is recommended to use the initialization function.
simon 0:1014af42efd9 119 * \par Fixed-Point Behavior:
simon 0:1014af42efd9 120 * Care must be taken when using the Q15 and Q31 versions of the normalised LMS filter.
simon 0:1014af42efd9 121 * The following issues must be considered:
simon 0:1014af42efd9 122 * - Scaling of coefficients
simon 0:1014af42efd9 123 * - Overflow and saturation
simon 0:1014af42efd9 124 *
simon 0:1014af42efd9 125 * \par Scaling of Coefficients:
simon 0:1014af42efd9 126 * Filter coefficients are represented as fractional values and
simon 0:1014af42efd9 127 * coefficients are restricted to lie in the range <code>[-1 +1)</code>.
simon 0:1014af42efd9 128 * The fixed-point functions have an additional scaling parameter <code>postShift</code>.
simon 0:1014af42efd9 129 * At the output of the filter's accumulator is a shift register which shifts the result by <code>postShift</code> bits.
simon 0:1014af42efd9 130 * This essentially scales the filter coefficients by <code>2^postShift</code> and
simon 0:1014af42efd9 131 * allows the filter coefficients to exceed the range <code>[+1 -1)</code>.
simon 0:1014af42efd9 132 * The value of <code>postShift</code> is set by the user based on the expected gain through the system being modeled.
simon 0:1014af42efd9 133 *
simon 0:1014af42efd9 134 * \par Overflow and Saturation:
simon 0:1014af42efd9 135 * Overflow and saturation behavior of the fixed-point Q15 and Q31 versions are
simon 0:1014af42efd9 136 * described separately as part of the function specific documentation below.
simon 0:1014af42efd9 137 */
simon 0:1014af42efd9 138
simon 0:1014af42efd9 139
simon 0:1014af42efd9 140 /**
simon 0:1014af42efd9 141 * @addtogroup LMS_NORM
simon 0:1014af42efd9 142 * @{
simon 0:1014af42efd9 143 */
simon 0:1014af42efd9 144
simon 0:1014af42efd9 145
simon 0:1014af42efd9 146 /**
simon 0:1014af42efd9 147 * @brief Processing function for floating-point normalized LMS filter.
simon 0:1014af42efd9 148 * @param[in] *S points to an instance of the floating-point normalized LMS filter structure.
simon 0:1014af42efd9 149 * @param[in] *pSrc points to the block of input data.
simon 0:1014af42efd9 150 * @param[in] *pRef points to the block of reference data.
simon 0:1014af42efd9 151 * @param[out] *pOut points to the block of output data.
simon 0:1014af42efd9 152 * @param[out] *pErr points to the block of error data.
simon 0:1014af42efd9 153 * @param[in] blockSize number of samples to process.
simon 0:1014af42efd9 154 * @return none.
simon 0:1014af42efd9 155 */
simon 0:1014af42efd9 156
simon 0:1014af42efd9 157 void arm_lms_norm_f32(
simon 0:1014af42efd9 158 arm_lms_norm_instance_f32 * S,
simon 0:1014af42efd9 159 float32_t * pSrc,
simon 0:1014af42efd9 160 float32_t * pRef,
simon 0:1014af42efd9 161 float32_t * pOut,
simon 0:1014af42efd9 162 float32_t * pErr,
simon 0:1014af42efd9 163 uint32_t blockSize)
simon 0:1014af42efd9 164 {
simon 0:1014af42efd9 165 float32_t *pState = S->pState; /* State pointer */
simon 0:1014af42efd9 166 float32_t *pCoeffs = S->pCoeffs; /* Coefficient pointer */
simon 0:1014af42efd9 167 float32_t *pStateCurnt; /* Points to the current sample of the state */
simon 0:1014af42efd9 168 float32_t *px, *pb; /* Temporary pointers for state and coefficient buffers */
simon 0:1014af42efd9 169 float32_t mu = S->mu; /* Adaptive factor */
simon 0:1014af42efd9 170 uint32_t numTaps = S->numTaps; /* Number of filter coefficients in the filter */
simon 0:1014af42efd9 171 uint32_t tapCnt, blkCnt; /* Loop counters */
simon 0:1014af42efd9 172 float32_t energy; /* Energy of the input */
simon 0:1014af42efd9 173 float32_t sum, e, d; /* accumulator, error, reference data sample */
simon 0:1014af42efd9 174 float32_t w, x0, in; /* weight factor, temporary variable to hold input sample and state */
simon 0:1014af42efd9 175
simon 0:1014af42efd9 176 /* Initializations of error, difference, Coefficient update */
simon 0:1014af42efd9 177 e = 0.0f;
simon 0:1014af42efd9 178 d = 0.0f;
simon 0:1014af42efd9 179 w = 0.0f;
simon 0:1014af42efd9 180
simon 0:1014af42efd9 181 energy = S->energy;
simon 0:1014af42efd9 182 x0 = S->x0;
simon 0:1014af42efd9 183
simon 0:1014af42efd9 184 /* S->pState points to buffer which contains previous frame (numTaps - 1) samples */
simon 0:1014af42efd9 185 /* pStateCurnt points to the location where the new input data should be written */
simon 0:1014af42efd9 186 pStateCurnt = &(S->pState[(numTaps - 1u)]);
simon 0:1014af42efd9 187
simon 0:1014af42efd9 188 blkCnt = blockSize;
simon 0:1014af42efd9 189
simon 0:1014af42efd9 190 while(blkCnt > 0u)
simon 0:1014af42efd9 191 {
simon 0:1014af42efd9 192 /* Copy the new input sample into the state buffer */
simon 0:1014af42efd9 193 *pStateCurnt++ = *pSrc;
simon 0:1014af42efd9 194
simon 0:1014af42efd9 195 /* Initialize pState pointer */
simon 0:1014af42efd9 196 px = pState;
simon 0:1014af42efd9 197
simon 0:1014af42efd9 198 /* Initialize coeff pointer */
simon 0:1014af42efd9 199 pb = (pCoeffs);
simon 0:1014af42efd9 200
simon 0:1014af42efd9 201 /* Read the sample from input buffer */
simon 0:1014af42efd9 202 in = *pSrc++;
simon 0:1014af42efd9 203
simon 0:1014af42efd9 204 /* Update the energy calculation */
simon 0:1014af42efd9 205 energy -= x0 * x0;
simon 0:1014af42efd9 206 energy += in * in;
simon 0:1014af42efd9 207
simon 0:1014af42efd9 208 /* Set the accumulator to zero */
simon 0:1014af42efd9 209 sum = 0.0f;
simon 0:1014af42efd9 210
simon 0:1014af42efd9 211 /* Loop unrolling. Process 4 taps at a time. */
simon 0:1014af42efd9 212 tapCnt = numTaps >> 2;
simon 0:1014af42efd9 213
simon 0:1014af42efd9 214 while(tapCnt > 0u)
simon 0:1014af42efd9 215 {
simon 0:1014af42efd9 216 /* Perform the multiply-accumulate */
simon 0:1014af42efd9 217 sum += (*px++) * (*pb++);
simon 0:1014af42efd9 218 sum += (*px++) * (*pb++);
simon 0:1014af42efd9 219 sum += (*px++) * (*pb++);
simon 0:1014af42efd9 220 sum += (*px++) * (*pb++);
simon 0:1014af42efd9 221
simon 0:1014af42efd9 222 /* Decrement the loop counter */
simon 0:1014af42efd9 223 tapCnt--;
simon 0:1014af42efd9 224 }
simon 0:1014af42efd9 225
simon 0:1014af42efd9 226 /* If the filter length is not a multiple of 4, compute the remaining filter taps */
simon 0:1014af42efd9 227 tapCnt = numTaps % 0x4u;
simon 0:1014af42efd9 228
simon 0:1014af42efd9 229 while(tapCnt > 0u)
simon 0:1014af42efd9 230 {
simon 0:1014af42efd9 231 /* Perform the multiply-accumulate */
simon 0:1014af42efd9 232 sum += (*px++) * (*pb++);
simon 0:1014af42efd9 233
simon 0:1014af42efd9 234 /* Decrement the loop counter */
simon 0:1014af42efd9 235 tapCnt--;
simon 0:1014af42efd9 236 }
simon 0:1014af42efd9 237
simon 0:1014af42efd9 238 /* The result in the accumulator, store in the destination buffer. */
simon 0:1014af42efd9 239 *pOut++ = sum;
simon 0:1014af42efd9 240
simon 0:1014af42efd9 241 /* Compute and store error */
simon 0:1014af42efd9 242 d = (float32_t) (*pRef++);
simon 0:1014af42efd9 243 e = d - sum;
simon 0:1014af42efd9 244 *pErr++ = e;
simon 0:1014af42efd9 245
simon 0:1014af42efd9 246 /* Calculation of Weighting factor for updating filter coefficients */
simon 0:1014af42efd9 247 /* epsilon value 0.000000119209289f */
simon 0:1014af42efd9 248 w = (e * mu) / (energy + 0.000000119209289f);
simon 0:1014af42efd9 249
simon 0:1014af42efd9 250 /* Initialize pState pointer */
simon 0:1014af42efd9 251 px = pState;
simon 0:1014af42efd9 252
simon 0:1014af42efd9 253 /* Initialize coeff pointer */
simon 0:1014af42efd9 254 pb = (pCoeffs);
simon 0:1014af42efd9 255
simon 0:1014af42efd9 256 /* Loop unrolling. Process 4 taps at a time. */
simon 0:1014af42efd9 257 tapCnt = numTaps >> 2;
simon 0:1014af42efd9 258
simon 0:1014af42efd9 259 /* Update filter coefficients */
simon 0:1014af42efd9 260 while(tapCnt > 0u)
simon 0:1014af42efd9 261 {
simon 0:1014af42efd9 262 /* Perform the multiply-accumulate */
simon 0:1014af42efd9 263 *pb += w * (*px++);
simon 0:1014af42efd9 264 pb++;
simon 0:1014af42efd9 265
simon 0:1014af42efd9 266 *pb += w * (*px++);
simon 0:1014af42efd9 267 pb++;
simon 0:1014af42efd9 268
simon 0:1014af42efd9 269 *pb += w * (*px++);
simon 0:1014af42efd9 270 pb++;
simon 0:1014af42efd9 271
simon 0:1014af42efd9 272 *pb += w * (*px++);
simon 0:1014af42efd9 273 pb++;
simon 0:1014af42efd9 274
simon 0:1014af42efd9 275
simon 0:1014af42efd9 276 /* Decrement the loop counter */
simon 0:1014af42efd9 277 tapCnt--;
simon 0:1014af42efd9 278 }
simon 0:1014af42efd9 279
simon 0:1014af42efd9 280 /* If the filter length is not a multiple of 4, compute the remaining filter taps */
simon 0:1014af42efd9 281 tapCnt = numTaps % 0x4u;
simon 0:1014af42efd9 282
simon 0:1014af42efd9 283 while(tapCnt > 0u)
simon 0:1014af42efd9 284 {
simon 0:1014af42efd9 285 /* Perform the multiply-accumulate */
simon 0:1014af42efd9 286 *pb += w * (*px++);
simon 0:1014af42efd9 287 pb++;
simon 0:1014af42efd9 288
simon 0:1014af42efd9 289 /* Decrement the loop counter */
simon 0:1014af42efd9 290 tapCnt--;
simon 0:1014af42efd9 291 }
simon 0:1014af42efd9 292
simon 0:1014af42efd9 293 x0 = *pState;
simon 0:1014af42efd9 294
simon 0:1014af42efd9 295 /* Advance state pointer by 1 for the next sample */
simon 0:1014af42efd9 296 pState = pState + 1;
simon 0:1014af42efd9 297
simon 0:1014af42efd9 298 /* Decrement the loop counter */
simon 0:1014af42efd9 299 blkCnt--;
simon 0:1014af42efd9 300 }
simon 0:1014af42efd9 301
simon 0:1014af42efd9 302 S->energy = energy;
simon 0:1014af42efd9 303 S->x0 = x0;
simon 0:1014af42efd9 304
simon 0:1014af42efd9 305 /* Processing is complete. Now copy the last numTaps - 1 samples to the
simon 0:1014af42efd9 306 satrt of the state buffer. This prepares the state buffer for the
simon 0:1014af42efd9 307 next function call. */
simon 0:1014af42efd9 308
simon 0:1014af42efd9 309 /* Points to the start of the pState buffer */
simon 0:1014af42efd9 310 pStateCurnt = S->pState;
simon 0:1014af42efd9 311
simon 0:1014af42efd9 312 /* Loop unrolling for (numTaps - 1u)/4 samples copy */
simon 0:1014af42efd9 313 tapCnt = (numTaps - 1u) >> 2u;
simon 0:1014af42efd9 314
simon 0:1014af42efd9 315 /* copy data */
simon 0:1014af42efd9 316 while(tapCnt > 0u)
simon 0:1014af42efd9 317 {
simon 0:1014af42efd9 318 *pStateCurnt++ = *pState++;
simon 0:1014af42efd9 319 *pStateCurnt++ = *pState++;
simon 0:1014af42efd9 320 *pStateCurnt++ = *pState++;
simon 0:1014af42efd9 321 *pStateCurnt++ = *pState++;
simon 0:1014af42efd9 322
simon 0:1014af42efd9 323 /* Decrement the loop counter */
simon 0:1014af42efd9 324 tapCnt--;
simon 0:1014af42efd9 325 }
simon 0:1014af42efd9 326
simon 0:1014af42efd9 327 /* Calculate remaining number of copies */
simon 0:1014af42efd9 328 tapCnt = (numTaps - 1u) % 0x4u;
simon 0:1014af42efd9 329
simon 0:1014af42efd9 330 /* Copy the remaining q31_t data */
simon 0:1014af42efd9 331 while(tapCnt > 0u)
simon 0:1014af42efd9 332 {
simon 0:1014af42efd9 333 *pStateCurnt++ = *pState++;
simon 0:1014af42efd9 334
simon 0:1014af42efd9 335 /* Decrement the loop counter */
simon 0:1014af42efd9 336 tapCnt--;
simon 0:1014af42efd9 337 }
simon 0:1014af42efd9 338
simon 0:1014af42efd9 339
simon 0:1014af42efd9 340 }
simon 0:1014af42efd9 341
simon 0:1014af42efd9 342 /**
simon 0:1014af42efd9 343 * @} end of LMS_NORM group
simon 0:1014af42efd9 344 */