Important changes to repositories hosted on mbed.com
Mbed hosted mercurial repositories are deprecated and are due to be permanently deleted in July 2026.
To keep a copy of this software download the repository Zip archive or clone locally using Mercurial.
It is also possible to export all your personal repositories from the account settings page.
Fork of dsp by
arm_lms_norm_f32.c
00001 /* ---------------------------------------------------------------------- 00002 * Copyright (C) 2010 ARM Limited. All rights reserved. 00003 * 00004 * $Date: 29. November 2010 00005 * $Revision: V1.0.3 00006 * 00007 * Project: CMSIS DSP Library 00008 * Title: arm_lms_norm_f32.c 00009 * 00010 * Description: Processing function for the floating-point Normalised LMS. 00011 * 00012 * Target Processor: Cortex-M4/Cortex-M3 00013 * 00014 * Version 1.0.3 2010/11/29 00015 * Re-organized the CMSIS folders and updated documentation. 00016 * 00017 * Version 1.0.2 2010/11/11 00018 * Documentation updated. 00019 * 00020 * Version 1.0.1 2010/10/05 00021 * Production release and review comments incorporated. 00022 * 00023 * Version 1.0.0 2010/09/20 00024 * Production release and review comments incorporated 00025 * 00026 * Version 0.0.7 2010/06/10 00027 * Misra-C changes done 00028 * -------------------------------------------------------------------- */ 00029 00030 #include "arm_math.h" 00031 00032 /** 00033 * @ingroup groupFilters 00034 */ 00035 00036 /** 00037 * @defgroup LMS_NORM Normalized LMS Filters 00038 * 00039 * This set of functions implements a commonly used adaptive filter. 00040 * It is related to the Least Mean Square (LMS) adaptive filter and includes an additional normalization 00041 * factor which increases the adaptation rate of the filter. 00042 * The CMSIS DSP Library contains normalized LMS filter functions that operate on Q15, Q31, and floating-point data types. 00043 * 00044 * A normalized least mean square (NLMS) filter consists of two components as shown below. 00045 * The first component is a standard transversal or FIR filter. 00046 * The second component is a coefficient update mechanism. 00047 * The NLMS filter has two input signals. 00048 * The "input" feeds the FIR filter while the "reference input" corresponds to the desired output of the FIR filter. 00049 * That is, the FIR filter coefficients are updated so that the output of the FIR filter matches the reference input. 00050 * The filter coefficient update mechanism is based on the difference between the FIR filter output and the reference input. 00051 * This "error signal" tends towards zero as the filter adapts. 00052 * The NLMS processing functions accept the input and reference input signals and generate the filter output and error signal. 00053 * \image html LMS.gif "Internal structure of the NLMS adaptive filter" 00054 * 00055 * The functions operate on blocks of data and each call to the function processes 00056 * <code>blockSize</code> samples through the filter. 00057 * <code>pSrc</code> points to input signal, <code>pRef</code> points to reference signal, 00058 * <code>pOut</code> points to output signal and <code>pErr</code> points to error signal. 00059 * All arrays contain <code>blockSize</code> values. 00060 * 00061 * The API functions operate on a block-by-block basis. 00062 * Internally, the filter coefficients <code>b[n]</code> are updated on a sample-by-sample basis. 00063 * The convergence of the LMS filter is slower compared to the normalized LMS algorithm. 00064 * 00065 * \par Algorithm: 00066 * The output signal <code>y[n]</code> is computed by a standard FIR filter: 00067 * <pre> 00068 * y[n] = b[0] * x[n] + b[1] * x[n-1] + b[2] * x[n-2] + ...+ b[numTaps-1] * x[n-numTaps+1] 00069 * </pre> 00070 * 00071 * \par 00072 * The error signal equals the difference between the reference signal <code>d[n]</code> and the filter output: 00073 * <pre> 00074 * e[n] = d[n] - y[n]. 00075 * </pre> 00076 * 00077 * \par 00078 * After each sample of the error signal is computed the instanteous energy of the filter state variables is calculated: 00079 * <pre> 00080 * E = x[n]^2 + x[n-1]^2 + ... + x[n-numTaps+1]^2. 00081 * </pre> 00082 * The filter coefficients <code>b[k]</code> are then updated on a sample-by-sample basis: 00083 * <pre> 00084 * b[k] = b[k] + e[n] * (mu/E) * x[n-k], for k=0, 1, ..., numTaps-1 00085 * </pre> 00086 * where <code>mu</code> is the step size and controls the rate of coefficient convergence. 00087 *\par 00088 * In the APIs, <code>pCoeffs</code> points to a coefficient array of size <code>numTaps</code>. 00089 * Coefficients are stored in time reversed order. 00090 * \par 00091 * <pre> 00092 * {b[numTaps-1], b[numTaps-2], b[N-2], ..., b[1], b[0]} 00093 * </pre> 00094 * \par 00095 * <code>pState</code> points to a state array of size <code>numTaps + blockSize - 1</code>. 00096 * Samples in the state buffer are stored in the order: 00097 * \par 00098 * <pre> 00099 * {x[n-numTaps+1], x[n-numTaps], x[n-numTaps-1], x[n-numTaps-2]....x[0], x[1], ..., x[blockSize-1]} 00100 * </pre> 00101 * \par 00102 * Note that the length of the state buffer exceeds the length of the coefficient array by <code>blockSize-1</code> samples. 00103 * The increased state buffer length allows circular addressing, which is traditionally used in FIR filters, 00104 * to be avoided and yields a significant speed improvement. 00105 * The state variables are updated after each block of data is processed. 00106 * \par Instance Structure 00107 * The coefficients and state variables for a filter are stored together in an instance data structure. 00108 * A separate instance structure must be defined for each filter and 00109 * coefficient and state arrays cannot be shared among instances. 00110 * There are separate instance structure declarations for each of the 3 supported data types. 00111 * 00112 * \par Initialization Functions 00113 * There is also an associated initialization function for each data type. 00114 * The initialization function performs the following operations: 00115 * - Sets the values of the internal structure fields. 00116 * - Zeros out the values in the state buffer. 00117 * \par 00118 * Instance structure cannot be placed into a const data section and it is recommended to use the initialization function. 00119 * \par Fixed-Point Behavior: 00120 * Care must be taken when using the Q15 and Q31 versions of the normalised LMS filter. 00121 * The following issues must be considered: 00122 * - Scaling of coefficients 00123 * - Overflow and saturation 00124 * 00125 * \par Scaling of Coefficients: 00126 * Filter coefficients are represented as fractional values and 00127 * coefficients are restricted to lie in the range <code>[-1 +1)</code>. 00128 * The fixed-point functions have an additional scaling parameter <code>postShift</code>. 00129 * At the output of the filter's accumulator is a shift register which shifts the result by <code>postShift</code> bits. 00130 * This essentially scales the filter coefficients by <code>2^postShift</code> and 00131 * allows the filter coefficients to exceed the range <code>[+1 -1)</code>. 00132 * The value of <code>postShift</code> is set by the user based on the expected gain through the system being modeled. 00133 * 00134 * \par Overflow and Saturation: 00135 * Overflow and saturation behavior of the fixed-point Q15 and Q31 versions are 00136 * described separately as part of the function specific documentation below. 00137 */ 00138 00139 00140 /** 00141 * @addtogroup LMS_NORM 00142 * @{ 00143 */ 00144 00145 00146 /** 00147 * @brief Processing function for floating-point normalized LMS filter. 00148 * @param[in] *S points to an instance of the floating-point normalized LMS filter structure. 00149 * @param[in] *pSrc points to the block of input data. 00150 * @param[in] *pRef points to the block of reference data. 00151 * @param[out] *pOut points to the block of output data. 00152 * @param[out] *pErr points to the block of error data. 00153 * @param[in] blockSize number of samples to process. 00154 * @return none. 00155 */ 00156 00157 void arm_lms_norm_f32( 00158 arm_lms_norm_instance_f32 * S, 00159 float32_t * pSrc, 00160 float32_t * pRef, 00161 float32_t * pOut, 00162 float32_t * pErr, 00163 uint32_t blockSize) 00164 { 00165 float32_t *pState = S->pState; /* State pointer */ 00166 float32_t *pCoeffs = S->pCoeffs; /* Coefficient pointer */ 00167 float32_t *pStateCurnt; /* Points to the current sample of the state */ 00168 float32_t *px, *pb; /* Temporary pointers for state and coefficient buffers */ 00169 float32_t mu = S->mu; /* Adaptive factor */ 00170 uint32_t numTaps = S->numTaps; /* Number of filter coefficients in the filter */ 00171 uint32_t tapCnt, blkCnt; /* Loop counters */ 00172 float32_t energy; /* Energy of the input */ 00173 float32_t sum, e, d; /* accumulator, error, reference data sample */ 00174 float32_t w, x0, in; /* weight factor, temporary variable to hold input sample and state */ 00175 00176 /* Initializations of error, difference, Coefficient update */ 00177 e = 0.0f; 00178 d = 0.0f; 00179 w = 0.0f; 00180 00181 energy = S->energy; 00182 x0 = S->x0; 00183 00184 /* S->pState points to buffer which contains previous frame (numTaps - 1) samples */ 00185 /* pStateCurnt points to the location where the new input data should be written */ 00186 pStateCurnt = &(S->pState[(numTaps - 1u)]); 00187 00188 blkCnt = blockSize; 00189 00190 while(blkCnt > 0u) 00191 { 00192 /* Copy the new input sample into the state buffer */ 00193 *pStateCurnt++ = *pSrc; 00194 00195 /* Initialize pState pointer */ 00196 px = pState; 00197 00198 /* Initialize coeff pointer */ 00199 pb = (pCoeffs); 00200 00201 /* Read the sample from input buffer */ 00202 in = *pSrc++; 00203 00204 /* Update the energy calculation */ 00205 energy -= x0 * x0; 00206 energy += in * in; 00207 00208 /* Set the accumulator to zero */ 00209 sum = 0.0f; 00210 00211 /* Loop unrolling. Process 4 taps at a time. */ 00212 tapCnt = numTaps >> 2; 00213 00214 while(tapCnt > 0u) 00215 { 00216 /* Perform the multiply-accumulate */ 00217 sum += (*px++) * (*pb++); 00218 sum += (*px++) * (*pb++); 00219 sum += (*px++) * (*pb++); 00220 sum += (*px++) * (*pb++); 00221 00222 /* Decrement the loop counter */ 00223 tapCnt--; 00224 } 00225 00226 /* If the filter length is not a multiple of 4, compute the remaining filter taps */ 00227 tapCnt = numTaps % 0x4u; 00228 00229 while(tapCnt > 0u) 00230 { 00231 /* Perform the multiply-accumulate */ 00232 sum += (*px++) * (*pb++); 00233 00234 /* Decrement the loop counter */ 00235 tapCnt--; 00236 } 00237 00238 /* The result in the accumulator, store in the destination buffer. */ 00239 *pOut++ = sum; 00240 00241 /* Compute and store error */ 00242 d = (float32_t) (*pRef++); 00243 e = d - sum; 00244 *pErr++ = e; 00245 00246 /* Calculation of Weighting factor for updating filter coefficients */ 00247 /* epsilon value 0.000000119209289f */ 00248 w = (e * mu) / (energy + 0.000000119209289f); 00249 00250 /* Initialize pState pointer */ 00251 px = pState; 00252 00253 /* Initialize coeff pointer */ 00254 pb = (pCoeffs); 00255 00256 /* Loop unrolling. Process 4 taps at a time. */ 00257 tapCnt = numTaps >> 2; 00258 00259 /* Update filter coefficients */ 00260 while(tapCnt > 0u) 00261 { 00262 /* Perform the multiply-accumulate */ 00263 *pb += w * (*px++); 00264 pb++; 00265 00266 *pb += w * (*px++); 00267 pb++; 00268 00269 *pb += w * (*px++); 00270 pb++; 00271 00272 *pb += w * (*px++); 00273 pb++; 00274 00275 00276 /* Decrement the loop counter */ 00277 tapCnt--; 00278 } 00279 00280 /* If the filter length is not a multiple of 4, compute the remaining filter taps */ 00281 tapCnt = numTaps % 0x4u; 00282 00283 while(tapCnt > 0u) 00284 { 00285 /* Perform the multiply-accumulate */ 00286 *pb += w * (*px++); 00287 pb++; 00288 00289 /* Decrement the loop counter */ 00290 tapCnt--; 00291 } 00292 00293 x0 = *pState; 00294 00295 /* Advance state pointer by 1 for the next sample */ 00296 pState = pState + 1; 00297 00298 /* Decrement the loop counter */ 00299 blkCnt--; 00300 } 00301 00302 S->energy = energy; 00303 S->x0 = x0; 00304 00305 /* Processing is complete. Now copy the last numTaps - 1 samples to the 00306 satrt of the state buffer. This prepares the state buffer for the 00307 next function call. */ 00308 00309 /* Points to the start of the pState buffer */ 00310 pStateCurnt = S->pState; 00311 00312 /* Loop unrolling for (numTaps - 1u)/4 samples copy */ 00313 tapCnt = (numTaps - 1u) >> 2u; 00314 00315 /* copy data */ 00316 while(tapCnt > 0u) 00317 { 00318 *pStateCurnt++ = *pState++; 00319 *pStateCurnt++ = *pState++; 00320 *pStateCurnt++ = *pState++; 00321 *pStateCurnt++ = *pState++; 00322 00323 /* Decrement the loop counter */ 00324 tapCnt--; 00325 } 00326 00327 /* Calculate remaining number of copies */ 00328 tapCnt = (numTaps - 1u) % 0x4u; 00329 00330 /* Copy the remaining q31_t data */ 00331 while(tapCnt > 0u) 00332 { 00333 *pStateCurnt++ = *pState++; 00334 00335 /* Decrement the loop counter */ 00336 tapCnt--; 00337 } 00338 00339 00340 } 00341 00342 /** 00343 * @} end of LMS_NORM group 00344 */
Generated on Tue Jul 12 2022 19:55:43 by
1.7.2
