Important changes to repositories hosted on mbed.com
Mbed hosted mercurial repositories are deprecated and are due to be permanently deleted in July 2026.
To keep a copy of this software download the repository Zip archive or clone locally using Mercurial.
It is also possible to export all your personal repositories from the account settings page.
Fork of OmniWheels by
arm_biquad_cascade_df2T_f32.c
00001 /* ---------------------------------------------------------------------- 00002 * Copyright (C) 2010-2014 ARM Limited. All rights reserved. 00003 * 00004 * $Date: 19. March 2015 00005 * $Revision: V.1.4.5 00006 * 00007 * Project: CMSIS DSP Library 00008 * Title: arm_biquad_cascade_df2T_f32.c 00009 * 00010 * Description: Processing function for the floating-point transposed 00011 * direct form II Biquad cascade filter. 00012 * 00013 * Target Processor: Cortex-M4/Cortex-M3/Cortex-M0 00014 * 00015 * Redistribution and use in source and binary forms, with or without 00016 * modification, are permitted provided that the following conditions 00017 * are met: 00018 * - Redistributions of source code must retain the above copyright 00019 * notice, this list of conditions and the following disclaimer. 00020 * - Redistributions in binary form must reproduce the above copyright 00021 * notice, this list of conditions and the following disclaimer in 00022 * the documentation and/or other materials provided with the 00023 * distribution. 00024 * - Neither the name of ARM LIMITED nor the names of its contributors 00025 * may be used to endorse or promote products derived from this 00026 * software without specific prior written permission. 00027 * 00028 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 00029 * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 00030 * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS 00031 * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE 00032 * COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, 00033 * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, 00034 * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; 00035 * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 00036 * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT 00037 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN 00038 * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 00039 * POSSIBILITY OF SUCH DAMAGE. 00040 * -------------------------------------------------------------------- */ 00041 00042 #include "arm_math.h" 00043 00044 /** 00045 * @ingroup groupFilters 00046 */ 00047 00048 /** 00049 * @defgroup BiquadCascadeDF2T Biquad Cascade IIR Filters Using a Direct Form II Transposed Structure 00050 * 00051 * This set of functions implements arbitrary order recursive (IIR) filters using a transposed direct form II structure. 00052 * The filters are implemented as a cascade of second order Biquad sections. 00053 * These functions provide a slight memory savings as compared to the direct form I Biquad filter functions. 00054 * Only floating-point data is supported. 00055 * 00056 * This function operate on blocks of input and output data and each call to the function 00057 * processes <code>blockSize</code> samples through the filter. 00058 * <code>pSrc</code> points to the array of input data and 00059 * <code>pDst</code> points to the array of output data. 00060 * Both arrays contain <code>blockSize</code> values. 00061 * 00062 * \par Algorithm 00063 * Each Biquad stage implements a second order filter using the difference equation: 00064 * <pre> 00065 * y[n] = b0 * x[n] + d1 00066 * d1 = b1 * x[n] + a1 * y[n] + d2 00067 * d2 = b2 * x[n] + a2 * y[n] 00068 * </pre> 00069 * where d1 and d2 represent the two state values. 00070 * 00071 * \par 00072 * A Biquad filter using a transposed Direct Form II structure is shown below. 00073 * \image html BiquadDF2Transposed.gif "Single transposed Direct Form II Biquad" 00074 * Coefficients <code>b0, b1, and b2 </code> multiply the input signal <code>x[n]</code> and are referred to as the feedforward coefficients. 00075 * Coefficients <code>a1</code> and <code>a2</code> multiply the output signal <code>y[n]</code> and are referred to as the feedback coefficients. 00076 * Pay careful attention to the sign of the feedback coefficients. 00077 * Some design tools flip the sign of the feedback coefficients: 00078 * <pre> 00079 * y[n] = b0 * x[n] + d1; 00080 * d1 = b1 * x[n] - a1 * y[n] + d2; 00081 * d2 = b2 * x[n] - a2 * y[n]; 00082 * </pre> 00083 * In this case the feedback coefficients <code>a1</code> and <code>a2</code> must be negated when used with the CMSIS DSP Library. 00084 * 00085 * \par 00086 * Higher order filters are realized as a cascade of second order sections. 00087 * <code>numStages</code> refers to the number of second order stages used. 00088 * For example, an 8th order filter would be realized with <code>numStages=4</code> second order stages. 00089 * A 9th order filter would be realized with <code>numStages=5</code> second order stages with the 00090 * coefficients for one of the stages configured as a first order filter (<code>b2=0</code> and <code>a2=0</code>). 00091 * 00092 * \par 00093 * <code>pState</code> points to the state variable array. 00094 * Each Biquad stage has 2 state variables <code>d1</code> and <code>d2</code>. 00095 * The state variables are arranged in the <code>pState</code> array as: 00096 * <pre> 00097 * {d11, d12, d21, d22, ...} 00098 * </pre> 00099 * where <code>d1x</code> refers to the state variables for the first Biquad and 00100 * <code>d2x</code> refers to the state variables for the second Biquad. 00101 * The state array has a total length of <code>2*numStages</code> values. 00102 * The state variables are updated after each block of data is processed; the coefficients are untouched. 00103 * 00104 * \par 00105 * The CMSIS library contains Biquad filters in both Direct Form I and transposed Direct Form II. 00106 * The advantage of the Direct Form I structure is that it is numerically more robust for fixed-point data types. 00107 * That is why the Direct Form I structure supports Q15 and Q31 data types. 00108 * The transposed Direct Form II structure, on the other hand, requires a wide dynamic range for the state variables <code>d1</code> and <code>d2</code>. 00109 * Because of this, the CMSIS library only has a floating-point version of the Direct Form II Biquad. 00110 * The advantage of the Direct Form II Biquad is that it requires half the number of state variables, 2 rather than 4, per Biquad stage. 00111 * 00112 * \par Instance Structure 00113 * The coefficients and state variables for a filter are stored together in an instance data structure. 00114 * A separate instance structure must be defined for each filter. 00115 * Coefficient arrays may be shared among several instances while state variable arrays cannot be shared. 00116 * 00117 * \par Init Functions 00118 * There is also an associated initialization function. 00119 * The initialization function performs following operations: 00120 * - Sets the values of the internal structure fields. 00121 * - Zeros out the values in the state buffer. 00122 * To do this manually without calling the init function, assign the follow subfields of the instance structure: 00123 * numStages, pCoeffs, pState. Also set all of the values in pState to zero. 00124 * 00125 * \par 00126 * Use of the initialization function is optional. 00127 * However, if the initialization function is used, then the instance structure cannot be placed into a const data section. 00128 * To place an instance structure into a const data section, the instance structure must be manually initialized. 00129 * Set the values in the state buffer to zeros before static initialization. 00130 * For example, to statically initialize the instance structure use 00131 * <pre> 00132 * arm_biquad_cascade_df2T_instance_f32 S1 = {numStages, pState, pCoeffs}; 00133 * </pre> 00134 * where <code>numStages</code> is the number of Biquad stages in the filter; <code>pState</code> is the address of the state buffer. 00135 * <code>pCoeffs</code> is the address of the coefficient buffer; 00136 * 00137 */ 00138 00139 /** 00140 * @addtogroup BiquadCascadeDF2T 00141 * @{ 00142 */ 00143 00144 /** 00145 * @brief Processing function for the floating-point transposed direct form II Biquad cascade filter. 00146 * @param[in] *S points to an instance of the filter data structure. 00147 * @param[in] *pSrc points to the block of input data. 00148 * @param[out] *pDst points to the block of output data 00149 * @param[in] blockSize number of samples to process. 00150 * @return none. 00151 */ 00152 00153 00154 LOW_OPTIMIZATION_ENTER 00155 void arm_biquad_cascade_df2T_f32( 00156 const arm_biquad_cascade_df2T_instance_f32 * S, 00157 float32_t * pSrc, 00158 float32_t * pDst, 00159 uint32_t blockSize) 00160 { 00161 00162 float32_t *pIn = pSrc; /* source pointer */ 00163 float32_t *pOut = pDst; /* destination pointer */ 00164 float32_t *pState = S->pState; /* State pointer */ 00165 float32_t *pCoeffs = S->pCoeffs; /* coefficient pointer */ 00166 float32_t acc1; /* accumulator */ 00167 float32_t b0, b1, b2, a1, a2; /* Filter coefficients */ 00168 float32_t Xn1; /* temporary input */ 00169 float32_t d1, d2; /* state variables */ 00170 uint32_t sample, stage = S->numStages; /* loop counters */ 00171 00172 #if defined(ARM_MATH_CM7) 00173 00174 float32_t Xn2, Xn3, Xn4, Xn5, Xn6, Xn7, Xn8; /* Input State variables */ 00175 float32_t Xn9, Xn10, Xn11, Xn12, Xn13, Xn14, Xn15, Xn16; 00176 float32_t acc2, acc3, acc4, acc5, acc6, acc7; /* Simulates the accumulator */ 00177 float32_t acc8, acc9, acc10, acc11, acc12, acc13, acc14, acc15, acc16; 00178 00179 do 00180 { 00181 /* Reading the coefficients */ 00182 b0 = pCoeffs[0]; 00183 b1 = pCoeffs[1]; 00184 b2 = pCoeffs[2]; 00185 a1 = pCoeffs[3]; 00186 /* Apply loop unrolling and compute 16 output values simultaneously. */ 00187 sample = blockSize >> 4u; 00188 a2 = pCoeffs[4]; 00189 00190 /*Reading the state values */ 00191 d1 = pState[0]; 00192 d2 = pState[1]; 00193 00194 pCoeffs += 5u; 00195 00196 00197 /* First part of the processing with loop unrolling. Compute 16 outputs at a time. 00198 ** a second loop below computes the remaining 1 to 15 samples. */ 00199 while(sample > 0u) { 00200 00201 /* y[n] = b0 * x[n] + d1 */ 00202 /* d1 = b1 * x[n] + a1 * y[n] + d2 */ 00203 /* d2 = b2 * x[n] + a2 * y[n] */ 00204 00205 /* Read the first 2 inputs. 2 cycles */ 00206 Xn1 = pIn[0 ]; 00207 Xn2 = pIn[1 ]; 00208 00209 /* Sample 1. 5 cycles */ 00210 Xn3 = pIn[2 ]; 00211 acc1 = b0 * Xn1 + d1; 00212 00213 Xn4 = pIn[3 ]; 00214 d1 = b1 * Xn1 + d2; 00215 00216 Xn5 = pIn[4 ]; 00217 d2 = b2 * Xn1; 00218 00219 Xn6 = pIn[5 ]; 00220 d1 += a1 * acc1; 00221 00222 Xn7 = pIn[6 ]; 00223 d2 += a2 * acc1; 00224 00225 /* Sample 2. 5 cycles */ 00226 Xn8 = pIn[7 ]; 00227 acc2 = b0 * Xn2 + d1; 00228 00229 Xn9 = pIn[8 ]; 00230 d1 = b1 * Xn2 + d2; 00231 00232 Xn10 = pIn[9 ]; 00233 d2 = b2 * Xn2; 00234 00235 Xn11 = pIn[10]; 00236 d1 += a1 * acc2; 00237 00238 Xn12 = pIn[11]; 00239 d2 += a2 * acc2; 00240 00241 /* Sample 3. 5 cycles */ 00242 Xn13 = pIn[12]; 00243 acc3 = b0 * Xn3 + d1; 00244 00245 Xn14 = pIn[13]; 00246 d1 = b1 * Xn3 + d2; 00247 00248 Xn15 = pIn[14]; 00249 d2 = b2 * Xn3; 00250 00251 Xn16 = pIn[15]; 00252 d1 += a1 * acc3; 00253 00254 pIn += 16; 00255 d2 += a2 * acc3; 00256 00257 /* Sample 4. 5 cycles */ 00258 acc4 = b0 * Xn4 + d1; 00259 d1 = b1 * Xn4 + d2; 00260 d2 = b2 * Xn4; 00261 d1 += a1 * acc4; 00262 d2 += a2 * acc4; 00263 00264 /* Sample 5. 5 cycles */ 00265 acc5 = b0 * Xn5 + d1; 00266 d1 = b1 * Xn5 + d2; 00267 d2 = b2 * Xn5; 00268 d1 += a1 * acc5; 00269 d2 += a2 * acc5; 00270 00271 /* Sample 6. 5 cycles */ 00272 acc6 = b0 * Xn6 + d1; 00273 d1 = b1 * Xn6 + d2; 00274 d2 = b2 * Xn6; 00275 d1 += a1 * acc6; 00276 d2 += a2 * acc6; 00277 00278 /* Sample 7. 5 cycles */ 00279 acc7 = b0 * Xn7 + d1; 00280 d1 = b1 * Xn7 + d2; 00281 d2 = b2 * Xn7; 00282 d1 += a1 * acc7; 00283 d2 += a2 * acc7; 00284 00285 /* Sample 8. 5 cycles */ 00286 acc8 = b0 * Xn8 + d1; 00287 d1 = b1 * Xn8 + d2; 00288 d2 = b2 * Xn8; 00289 d1 += a1 * acc8; 00290 d2 += a2 * acc8; 00291 00292 /* Sample 9. 5 cycles */ 00293 acc9 = b0 * Xn9 + d1; 00294 d1 = b1 * Xn9 + d2; 00295 d2 = b2 * Xn9; 00296 d1 += a1 * acc9; 00297 d2 += a2 * acc9; 00298 00299 /* Sample 10. 5 cycles */ 00300 acc10 = b0 * Xn10 + d1; 00301 d1 = b1 * Xn10 + d2; 00302 d2 = b2 * Xn10; 00303 d1 += a1 * acc10; 00304 d2 += a2 * acc10; 00305 00306 /* Sample 11. 5 cycles */ 00307 acc11 = b0 * Xn11 + d1; 00308 d1 = b1 * Xn11 + d2; 00309 d2 = b2 * Xn11; 00310 d1 += a1 * acc11; 00311 d2 += a2 * acc11; 00312 00313 /* Sample 12. 5 cycles */ 00314 acc12 = b0 * Xn12 + d1; 00315 d1 = b1 * Xn12 + d2; 00316 d2 = b2 * Xn12; 00317 d1 += a1 * acc12; 00318 d2 += a2 * acc12; 00319 00320 /* Sample 13. 5 cycles */ 00321 acc13 = b0 * Xn13 + d1; 00322 d1 = b1 * Xn13 + d2; 00323 d2 = b2 * Xn13; 00324 00325 pOut[0 ] = acc1 ; 00326 d1 += a1 * acc13; 00327 00328 pOut[1 ] = acc2 ; 00329 d2 += a2 * acc13; 00330 00331 /* Sample 14. 5 cycles */ 00332 pOut[2 ] = acc3 ; 00333 acc14 = b0 * Xn14 + d1; 00334 00335 pOut[3 ] = acc4 ; 00336 d1 = b1 * Xn14 + d2; 00337 00338 pOut[4 ] = acc5 ; 00339 d2 = b2 * Xn14; 00340 00341 pOut[5 ] = acc6 ; 00342 d1 += a1 * acc14; 00343 00344 pOut[6 ] = acc7 ; 00345 d2 += a2 * acc14; 00346 00347 /* Sample 15. 5 cycles */ 00348 pOut[7 ] = acc8 ; 00349 pOut[8 ] = acc9 ; 00350 acc15 = b0 * Xn15 + d1; 00351 00352 pOut[9 ] = acc10; 00353 d1 = b1 * Xn15 + d2; 00354 00355 pOut[10] = acc11; 00356 d2 = b2 * Xn15; 00357 00358 pOut[11] = acc12; 00359 d1 += a1 * acc15; 00360 00361 pOut[12] = acc13; 00362 d2 += a2 * acc15; 00363 00364 /* Sample 16. 5 cycles */ 00365 pOut[13] = acc14; 00366 acc16 = b0 * Xn16 + d1; 00367 00368 pOut[14] = acc15; 00369 d1 = b1 * Xn16 + d2; 00370 00371 pOut[15] = acc16; 00372 d2 = b2 * Xn16; 00373 00374 sample--; 00375 d1 += a1 * acc16; 00376 00377 pOut += 16; 00378 d2 += a2 * acc16; 00379 } 00380 00381 sample = blockSize & 0xFu; 00382 while(sample > 0u) { 00383 Xn1 = *pIn; 00384 acc1 = b0 * Xn1 + d1; 00385 00386 pIn++; 00387 d1 = b1 * Xn1 + d2; 00388 00389 *pOut = acc1; 00390 d2 = b2 * Xn1; 00391 00392 pOut++; 00393 d1 += a1 * acc1; 00394 00395 sample--; 00396 d2 += a2 * acc1; 00397 } 00398 00399 /* Store the updated state variables back into the state array */ 00400 pState[0] = d1; 00401 /* The current stage input is given as the output to the next stage */ 00402 pIn = pDst; 00403 00404 pState[1] = d2; 00405 /* decrement the loop counter */ 00406 stage--; 00407 00408 pState += 2u; 00409 00410 /*Reset the output working pointer */ 00411 pOut = pDst; 00412 00413 } while(stage > 0u); 00414 00415 #elif defined(ARM_MATH_CM0_FAMILY) 00416 00417 /* Run the below code for Cortex-M0 */ 00418 00419 do 00420 { 00421 /* Reading the coefficients */ 00422 b0 = *pCoeffs++; 00423 b1 = *pCoeffs++; 00424 b2 = *pCoeffs++; 00425 a1 = *pCoeffs++; 00426 a2 = *pCoeffs++; 00427 00428 /*Reading the state values */ 00429 d1 = pState[0]; 00430 d2 = pState[1]; 00431 00432 00433 sample = blockSize; 00434 00435 while(sample > 0u) 00436 { 00437 /* Read the input */ 00438 Xn1 = *pIn++; 00439 00440 /* y[n] = b0 * x[n] + d1 */ 00441 acc1 = (b0 * Xn1) + d1; 00442 00443 /* Store the result in the accumulator in the destination buffer. */ 00444 *pOut++ = acc1; 00445 00446 /* Every time after the output is computed state should be updated. */ 00447 /* d1 = b1 * x[n] + a1 * y[n] + d2 */ 00448 d1 = ((b1 * Xn1) + (a1 * acc1)) + d2; 00449 00450 /* d2 = b2 * x[n] + a2 * y[n] */ 00451 d2 = (b2 * Xn1) + (a2 * acc1); 00452 00453 /* decrement the loop counter */ 00454 sample--; 00455 } 00456 00457 /* Store the updated state variables back into the state array */ 00458 *pState++ = d1; 00459 *pState++ = d2; 00460 00461 /* The current stage input is given as the output to the next stage */ 00462 pIn = pDst; 00463 00464 /*Reset the output working pointer */ 00465 pOut = pDst; 00466 00467 /* decrement the loop counter */ 00468 stage--; 00469 00470 } while(stage > 0u); 00471 00472 #else 00473 00474 float32_t Xn2, Xn3, Xn4; /* Input State variables */ 00475 float32_t acc2, acc3, acc4; /* accumulator */ 00476 00477 00478 float32_t p0, p1, p2, p3, p4, A1; 00479 00480 /* Run the below code for Cortex-M4 and Cortex-M3 */ 00481 do 00482 { 00483 /* Reading the coefficients */ 00484 b0 = *pCoeffs++; 00485 b1 = *pCoeffs++; 00486 b2 = *pCoeffs++; 00487 a1 = *pCoeffs++; 00488 a2 = *pCoeffs++; 00489 00490 00491 /*Reading the state values */ 00492 d1 = pState[0]; 00493 d2 = pState[1]; 00494 00495 /* Apply loop unrolling and compute 4 output values simultaneously. */ 00496 sample = blockSize >> 2u; 00497 00498 /* First part of the processing with loop unrolling. Compute 4 outputs at a time. 00499 ** a second loop below computes the remaining 1 to 3 samples. */ 00500 while(sample > 0u) { 00501 00502 /* y[n] = b0 * x[n] + d1 */ 00503 /* d1 = b1 * x[n] + a1 * y[n] + d2 */ 00504 /* d2 = b2 * x[n] + a2 * y[n] */ 00505 00506 /* Read the four inputs */ 00507 Xn1 = pIn[0]; 00508 Xn2 = pIn[1]; 00509 Xn3 = pIn[2]; 00510 Xn4 = pIn[3]; 00511 pIn += 4; 00512 00513 p0 = b0 * Xn1; 00514 p1 = b1 * Xn1; 00515 acc1 = p0 + d1; 00516 p0 = b0 * Xn2; 00517 p3 = a1 * acc1; 00518 p2 = b2 * Xn1; 00519 A1 = p1 + p3; 00520 p4 = a2 * acc1; 00521 d1 = A1 + d2; 00522 d2 = p2 + p4; 00523 00524 p1 = b1 * Xn2; 00525 acc2 = p0 + d1; 00526 p0 = b0 * Xn3; 00527 p3 = a1 * acc2; 00528 p2 = b2 * Xn2; 00529 A1 = p1 + p3; 00530 p4 = a2 * acc2; 00531 d1 = A1 + d2; 00532 d2 = p2 + p4; 00533 00534 p1 = b1 * Xn3; 00535 acc3 = p0 + d1; 00536 p0 = b0 * Xn4; 00537 p3 = a1 * acc3; 00538 p2 = b2 * Xn3; 00539 A1 = p1 + p3; 00540 p4 = a2 * acc3; 00541 d1 = A1 + d2; 00542 d2 = p2 + p4; 00543 00544 acc4 = p0 + d1; 00545 p1 = b1 * Xn4; 00546 p3 = a1 * acc4; 00547 p2 = b2 * Xn4; 00548 A1 = p1 + p3; 00549 p4 = a2 * acc4; 00550 d1 = A1 + d2; 00551 d2 = p2 + p4; 00552 00553 pOut[0] = acc1; 00554 pOut[1] = acc2; 00555 pOut[2] = acc3; 00556 pOut[3] = acc4; 00557 pOut += 4; 00558 00559 sample--; 00560 } 00561 00562 sample = blockSize & 0x3u; 00563 while(sample > 0u) { 00564 Xn1 = *pIn++; 00565 00566 p0 = b0 * Xn1; 00567 p1 = b1 * Xn1; 00568 acc1 = p0 + d1; 00569 p3 = a1 * acc1; 00570 p2 = b2 * Xn1; 00571 A1 = p1 + p3; 00572 p4 = a2 * acc1; 00573 d1 = A1 + d2; 00574 d2 = p2 + p4; 00575 00576 *pOut++ = acc1; 00577 00578 sample--; 00579 } 00580 00581 /* Store the updated state variables back into the state array */ 00582 *pState++ = d1; 00583 *pState++ = d2; 00584 00585 /* The current stage input is given as the output to the next stage */ 00586 pIn = pDst; 00587 00588 /*Reset the output working pointer */ 00589 pOut = pDst; 00590 00591 /* decrement the loop counter */ 00592 stage--; 00593 00594 } while(stage > 0u); 00595 00596 #endif 00597 00598 } 00599 LOW_OPTIMIZATION_EXIT 00600 00601 /** 00602 * @} end of BiquadCascadeDF2T group 00603 */
Generated on Fri Jul 22 2022 04:53:43 by
1.7.2
