CMSIS DSP Library from CMSIS 2.0. See http://www.onarm.com/cmsis/ for full details
Dependents: K22F_DSP_Matrix_least_square BNO055-ELEC3810 1BNO055 ECE4180Project--Slave2 ... more
arm_lms_f32.c
00001 /* ---------------------------------------------------------------------- 00002 * Copyright (C) 2010 ARM Limited. All rights reserved. 00003 * 00004 * $Date: 29. November 2010 00005 * $Revision: V1.0.3 00006 * 00007 * Project: CMSIS DSP Library 00008 * Title: arm_lms_f32.c 00009 * 00010 * Description: Processing function for the floating-point LMS filter. 00011 * 00012 * Target Processor: Cortex-M4/Cortex-M3 00013 * 00014 * Version 1.0.3 2010/11/29 00015 * Re-organized the CMSIS folders and updated documentation. 00016 * 00017 * Version 1.0.2 2010/11/11 00018 * Documentation updated. 00019 * 00020 * Version 1.0.1 2010/10/05 00021 * Production release and review comments incorporated. 00022 * 00023 * Version 1.0.0 2010/09/20 00024 * Production release and review comments incorporated 00025 * 00026 * Version 0.0.7 2010/06/10 00027 * Misra-C changes done 00028 * -------------------------------------------------------------------- */ 00029 00030 #include "arm_math.h" 00031 00032 /** 00033 * @ingroup groupFilters 00034 */ 00035 00036 /** 00037 * @defgroup LMS Least Mean Square (LMS) Filters 00038 * 00039 * LMS filters are a class of adaptive filters that are able to "learn" an unknown transfer functions. 00040 * LMS filters use a gradient descent method in which the filter coefficients are updated based on the instantaneous error signal. 00041 * Adaptive filters are often used in communication systems, equalizers, and noise removal. 00042 * The CMSIS DSP Library contains LMS filter functions that operate on Q15, Q31, and floating-point data types. 00043 * The library also contains normalized LMS filters in which the filter coefficient adaptation is indepedent of the level of the input signal. 00044 * 00045 * An LMS filter consists of two components as shown below. 00046 * The first component is a standard transversal or FIR filter. 00047 * The second component is a coefficient update mechanism. 00048 * The LMS filter has two input signals. 00049 * The "input" feeds the FIR filter while the "reference input" corresponds to the desired output of the FIR filter. 00050 * That is, the FIR filter coefficients are updated so that the output of the FIR filter matches the reference input. 00051 * The filter coefficient update mechanism is based on the difference between the FIR filter output and the reference input. 00052 * This "error signal" tends towards zero as the filter adapts. 00053 * The LMS processing functions accept the input and reference input signals and generate the filter output and error signal. 00054 * \image html LMS.gif "Internal structure of the Least Mean Square filter" 00055 * 00056 * The functions operate on blocks of data and each call to the function processes 00057 * <code>blockSize</code> samples through the filter. 00058 * <code>pSrc</code> points to input signal, <code>pRef</code> points to reference signal, 00059 * <code>pOut</code> points to output signal and <code>pErr</code> points to error signal. 00060 * All arrays contain <code>blockSize</code> values. 00061 * 00062 * The API functions operate on a block-by-block basis. 00063 * Internally, the filter coefficients <code>b[n]</code> are updated on a sample-by-sample basis. 00064 * The convergence of the LMS filter is slower compared to the normalized LMS algorithm. 00065 * 00066 * \par Algorithm: 00067 * The output signal <code>y[n]</code> is computed by a standard FIR filter: 00068 * <pre> 00069 * y[n] = b[0] * x[n] + b[1] * x[n-1] + b[2] * x[n-2] + ...+ b[numTaps-1] * x[n-numTaps+1] 00070 * </pre> 00071 * 00072 * \par 00073 * The error signal equals the difference between the reference signal <code>d[n]</code> and the filter output: 00074 * <pre> 00075 * e[n] = d[n] - y[n]. 00076 * </pre> 00077 * 00078 * \par 00079 * After each sample of the error signal is computed, the filter coefficients <code>b[k]</code> are updated on a sample-by-sample basis: 00080 * <pre> 00081 * b[k] = b[k] + e[n] * mu * x[n-k], for k=0, 1, ..., numTaps-1 00082 * </pre> 00083 * where <code>mu</code> is the step size and controls the rate of coefficient convergence. 00084 *\par 00085 * In the APIs, <code>pCoeffs</code> points to a coefficient array of size <code>numTaps</code>. 00086 * Coefficients are stored in time reversed order. 00087 * \par 00088 * <pre> 00089 * {b[numTaps-1], b[numTaps-2], b[N-2], ..., b[1], b[0]} 00090 * </pre> 00091 * \par 00092 * <code>pState</code> points to a state array of size <code>numTaps + blockSize - 1</code>. 00093 * Samples in the state buffer are stored in the order: 00094 * \par 00095 * <pre> 00096 * {x[n-numTaps+1], x[n-numTaps], x[n-numTaps-1], x[n-numTaps-2]....x[0], x[1], ..., x[blockSize-1]} 00097 * </pre> 00098 * \par 00099 * Note that the length of the state buffer exceeds the length of the coefficient array by <code>blockSize-1</code> samples. 00100 * The increased state buffer length allows circular addressing, which is traditionally used in FIR filters, 00101 * to be avoided and yields a significant speed improvement. 00102 * The state variables are updated after each block of data is processed. 00103 * \par Instance Structure 00104 * The coefficients and state variables for a filter are stored together in an instance data structure. 00105 * A separate instance structure must be defined for each filter and 00106 * coefficient and state arrays cannot be shared among instances. 00107 * There are separate instance structure declarations for each of the 3 supported data types. 00108 * 00109 * \par Initialization Functions 00110 * There is also an associated initialization function for each data type. 00111 * The initialization function performs the following operations: 00112 * - Sets the values of the internal structure fields. 00113 * - Zeros out the values in the state buffer. 00114 * \par 00115 * Use of the initialization function is optional. 00116 * However, if the initialization function is used, then the instance structure cannot be placed into a const data section. 00117 * To place an instance structure into a const data section, the instance structure must be manually initialized. 00118 * Set the values in the state buffer to zeros before static initialization. 00119 * The code below statically initializes each of the 3 different data type filter instance structures 00120 * <pre> 00121 * arm_lms_instance_f32 S = {numTaps, pState, pCoeffs, mu}; 00122 * arm_lms_instance_q31 S = {numTaps, pState, pCoeffs, mu, postShift}; 00123 * arm_lms_instance_q15 S = {numTaps, pState, pCoeffs, mu, postShift}; 00124 * </pre> 00125 * where <code>numTaps</code> is the number of filter coefficients in the filter; <code>pState</code> is the address of the state buffer; 00126 * <code>pCoeffs</code> is the address of the coefficient buffer; <code>mu</code> is the step size parameter; and <code>postShift</code> is the shift applied to coefficients. 00127 * 00128 * \par Fixed-Point Behavior: 00129 * Care must be taken when using the Q15 and Q31 versions of the LMS filter. 00130 * The following issues must be considered: 00131 * - Scaling of coefficients 00132 * - Overflow and saturation 00133 * 00134 * \par Scaling of Coefficients: 00135 * Filter coefficients are represented as fractional values and 00136 * coefficients are restricted to lie in the range <code>[-1 +1)</code>. 00137 * The fixed-point functions have an additional scaling parameter <code>postShift</code>. 00138 * At the output of the filter's accumulator is a shift register which shifts the result by <code>postShift</code> bits. 00139 * This essentially scales the filter coefficients by <code>2^postShift</code> and 00140 * allows the filter coefficients to exceed the range <code>[+1 -1)</code>. 00141 * The value of <code>postShift</code> is set by the user based on the expected gain through the system being modeled. 00142 * 00143 * \par Overflow and Saturation: 00144 * Overflow and saturation behavior of the fixed-point Q15 and Q31 versions are 00145 * described separately as part of the function specific documentation below. 00146 */ 00147 00148 /** 00149 * @addtogroup LMS 00150 * @{ 00151 */ 00152 00153 /** 00154 * @brief Processing function for floating-point LMS filter. 00155 * @param[in] *S points to an instance of the floating-point LMS filter structure. 00156 * @param[in] *pSrc points to the block of input data. 00157 * @param[in] *pRef points to the block of reference data. 00158 * @param[out] *pOut points to the block of output data. 00159 * @param[out] *pErr points to the block of error data. 00160 * @param[in] blockSize number of samples to process. 00161 * @return none. 00162 */ 00163 00164 void arm_lms_f32( 00165 const arm_lms_instance_f32 * S, 00166 float32_t * pSrc, 00167 float32_t * pRef, 00168 float32_t * pOut, 00169 float32_t * pErr, 00170 uint32_t blockSize) 00171 { 00172 float32_t *pState = S->pState; /* State pointer */ 00173 float32_t *pCoeffs = S->pCoeffs; /* Coefficient pointer */ 00174 float32_t *pStateCurnt; /* Points to the current sample of the state */ 00175 float32_t *px, *pb; /* Temporary pointers for state and coefficient buffers */ 00176 float32_t mu = S->mu; /* Adaptive factor */ 00177 uint32_t numTaps = S->numTaps; /* Number of filter coefficients in the filter */ 00178 uint32_t tapCnt, blkCnt; /* Loop counters */ 00179 float32_t sum, e, d; /* accumulator, error, reference data sample */ 00180 float32_t w = 0.0f; /* weight factor */ 00181 00182 e = 0.0f; 00183 d = 0.0f; 00184 00185 /* S->pState points to state array which contains previous frame (numTaps - 1) samples */ 00186 /* pStateCurnt points to the location where the new input data should be written */ 00187 pStateCurnt = &(S->pState[(numTaps - 1u)]); 00188 00189 blkCnt = blockSize; 00190 00191 while(blkCnt > 0u) 00192 { 00193 /* Copy the new input sample into the state buffer */ 00194 *pStateCurnt++ = *pSrc++; 00195 00196 /* Initialize pState pointer */ 00197 px = pState; 00198 00199 /* Initialize coeff pointer */ 00200 pb = (pCoeffs); 00201 00202 /* Set the accumulator to zero */ 00203 sum = 0.0f; 00204 00205 /* Loop unrolling. Process 4 taps at a time. */ 00206 tapCnt = numTaps >> 2; 00207 00208 while(tapCnt > 0u) 00209 { 00210 /* Perform the multiply-accumulate */ 00211 sum += (*px++) * (*pb++); 00212 sum += (*px++) * (*pb++); 00213 sum += (*px++) * (*pb++); 00214 sum += (*px++) * (*pb++); 00215 00216 /* Decrement the loop counter */ 00217 tapCnt--; 00218 } 00219 00220 /* If the filter length is not a multiple of 4, compute the remaining filter taps */ 00221 tapCnt = numTaps % 0x4u; 00222 00223 while(tapCnt > 0u) 00224 { 00225 /* Perform the multiply-accumulate */ 00226 sum += (*px++) * (*pb++); 00227 00228 /* Decrement the loop counter */ 00229 tapCnt--; 00230 } 00231 00232 /* The result in the accumulator, store in the destination buffer. */ 00233 *pOut++ = sum; 00234 00235 /* Compute and store error */ 00236 d = (float32_t) (*pRef++); 00237 e = d - sum; 00238 *pErr++ = e; 00239 00240 /* Calculation of Weighting factor for the updating filter coefficients */ 00241 w = e * mu; 00242 00243 /* Initialize pState pointer */ 00244 px = pState; 00245 00246 /* Initialize coeff pointer */ 00247 pb = (pCoeffs); 00248 00249 /* Loop unrolling. Process 4 taps at a time. */ 00250 tapCnt = numTaps >> 2; 00251 00252 /* Update filter coefficients */ 00253 while(tapCnt > 0u) 00254 { 00255 /* Perform the multiply-accumulate */ 00256 *pb = *pb + (w * (*px++)); 00257 pb++; 00258 00259 *pb = *pb + (w * (*px++)); 00260 pb++; 00261 00262 *pb = *pb + (w * (*px++)); 00263 pb++; 00264 00265 *pb = *pb + (w * (*px++)); 00266 pb++; 00267 00268 /* Decrement the loop counter */ 00269 tapCnt--; 00270 } 00271 00272 /* If the filter length is not a multiple of 4, compute the remaining filter taps */ 00273 tapCnt = numTaps % 0x4u; 00274 00275 while(tapCnt > 0u) 00276 { 00277 /* Perform the multiply-accumulate */ 00278 *pb = *pb + (w * (*px++)); 00279 pb++; 00280 00281 /* Decrement the loop counter */ 00282 tapCnt--; 00283 } 00284 00285 /* Advance state pointer by 1 for the next sample */ 00286 pState = pState + 1; 00287 00288 /* Decrement the loop counter */ 00289 blkCnt--; 00290 } 00291 00292 00293 /* Processing is complete. Now copy the last numTaps - 1 samples to the 00294 satrt of the state buffer. This prepares the state buffer for the 00295 next function call. */ 00296 00297 /* Points to the start of the pState buffer */ 00298 pStateCurnt = S->pState; 00299 00300 /* Loop unrolling for (numTaps - 1u) samples copy */ 00301 tapCnt = (numTaps - 1u) >> 2u; 00302 00303 /* copy data */ 00304 while(tapCnt > 0u) 00305 { 00306 *pStateCurnt++ = *pState++; 00307 *pStateCurnt++ = *pState++; 00308 *pStateCurnt++ = *pState++; 00309 *pStateCurnt++ = *pState++; 00310 00311 /* Decrement the loop counter */ 00312 tapCnt--; 00313 } 00314 00315 /* Calculate remaining number of copies */ 00316 tapCnt = (numTaps - 1u) % 0x4u; 00317 00318 /* Copy the remaining q31_t data */ 00319 while(tapCnt > 0u) 00320 { 00321 *pStateCurnt++ = *pState++; 00322 00323 /* Decrement the loop counter */ 00324 tapCnt--; 00325 } 00326 } 00327 00328 /** 00329 * @} end of LMS group 00330 */
Generated on Tue Jul 12 2022 14:13:53 by 1.7.2