Robert Lopez / CMSIS5
Embed: (wiki syntax)

« Back to documentation index

Show/hide line numbers arm_biquad_cascade_df1_f32.c Source File

arm_biquad_cascade_df1_f32.c

00001 /* ----------------------------------------------------------------------
00002  * Project:      CMSIS DSP Library
00003  * Title:        arm_biquad_cascade_df1_f32.c
00004  * Description:  Processing function for the floating-point Biquad cascade DirectFormI(DF1) filter
00005  *
00006  * $Date:        27. January 2017
00007  * $Revision:    V.1.5.1
00008  *
00009  * Target Processor: Cortex-M cores
00010  * -------------------------------------------------------------------- */
00011 /*
00012  * Copyright (C) 2010-2017 ARM Limited or its affiliates. All rights reserved.
00013  *
00014  * SPDX-License-Identifier: Apache-2.0
00015  *
00016  * Licensed under the Apache License, Version 2.0 (the License); you may
00017  * not use this file except in compliance with the License.
00018  * You may obtain a copy of the License at
00019  *
00020  * www.apache.org/licenses/LICENSE-2.0
00021  *
00022  * Unless required by applicable law or agreed to in writing, software
00023  * distributed under the License is distributed on an AS IS BASIS, WITHOUT
00024  * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
00025  * See the License for the specific language governing permissions and
00026  * limitations under the License.
00027  */
00028 
00029 #include "arm_math.h"
00030 
00031 /**
00032  * @ingroup groupFilters
00033  */
00034 
00035 /**
00036  * @defgroup BiquadCascadeDF1 Biquad Cascade IIR Filters Using Direct Form I Structure
00037  *
00038  * This set of functions implements arbitrary order recursive (IIR) filters.
00039  * The filters are implemented as a cascade of second order Biquad sections.
00040  * The functions support Q15, Q31 and floating-point data types.
00041  * Fast version of Q15 and Q31 also supported on CortexM4 and Cortex-M3.
00042  *
00043  * The functions operate on blocks of input and output data and each call to the function
00044  * processes <code>blockSize</code> samples through the filter.
00045  * <code>pSrc</code> points to the array of input data and
00046  * <code>pDst</code> points to the array of output data.
00047  * Both arrays contain <code>blockSize</code> values.
00048  *
00049  * \par Algorithm
00050  * Each Biquad stage implements a second order filter using the difference equation:
00051  * <pre>
00052  *     y[n] = b0 * x[n] + b1 * x[n-1] + b2 * x[n-2] + a1 * y[n-1] + a2 * y[n-2]
00053  * </pre>
00054  * A Direct Form I algorithm is used with 5 coefficients and 4 state variables per stage.
00055  * \image html Biquad.gif "Single Biquad filter stage"
00056  * Coefficients <code>b0, b1 and b2 </code> multiply the input signal <code>x[n]</code> and are referred to as the feedforward coefficients.
00057  * Coefficients <code>a1</code> and <code>a2</code> multiply the output signal <code>y[n]</code> and are referred to as the feedback coefficients.
00058  * Pay careful attention to the sign of the feedback coefficients.
00059  * Some design tools use the difference equation
00060  * <pre>
00061  *     y[n] = b0 * x[n] + b1 * x[n-1] + b2 * x[n-2] - a1 * y[n-1] - a2 * y[n-2]
00062  * </pre>
00063  * In this case the feedback coefficients <code>a1</code> and <code>a2</code> must be negated when used with the CMSIS DSP Library.
00064  *
00065  * \par
00066  * Higher order filters are realized as a cascade of second order sections.
00067  * <code>numStages</code> refers to the number of second order stages used.
00068  * For example, an 8th order filter would be realized with <code>numStages=4</code> second order stages.
00069  * \image html BiquadCascade.gif "8th order filter using a cascade of Biquad stages"
00070  * A 9th order filter would be realized with <code>numStages=5</code> second order stages with the coefficients for one of the stages configured as a first order filter (<code>b2=0</code> and <code>a2=0</code>).
00071  *
00072  * \par
00073  * The <code>pState</code> points to state variables array.
00074  * Each Biquad stage has 4 state variables <code>x[n-1], x[n-2], y[n-1],</code> and <code>y[n-2]</code>.
00075  * The state variables are arranged in the <code>pState</code> array as:
00076  * <pre>
00077  *     {x[n-1], x[n-2], y[n-1], y[n-2]}
00078  * </pre>
00079  *
00080  * \par
00081  * The 4 state variables for stage 1 are first, then the 4 state variables for stage 2, and so on.
00082  * The state array has a total length of <code>4*numStages</code> values.
00083  * The state variables are updated after each block of data is processed, the coefficients are untouched.
00084  *
00085  * \par Instance Structure
00086  * The coefficients and state variables for a filter are stored together in an instance data structure.
00087  * A separate instance structure must be defined for each filter.
00088  * Coefficient arrays may be shared among several instances while state variable arrays cannot be shared.
00089  * There are separate instance structure declarations for each of the 3 supported data types.
00090  *
00091  * \par Init Functions
00092  * There is also an associated initialization function for each data type.
00093  * The initialization function performs following operations:
00094  * - Sets the values of the internal structure fields.
00095  * - Zeros out the values in the state buffer.
00096  * To do this manually without calling the init function, assign the follow subfields of the instance structure:
00097  * numStages, pCoeffs, pState. Also set all of the values in pState to zero.
00098  *
00099  * \par
00100  * Use of the initialization function is optional.
00101  * However, if the initialization function is used, then the instance structure cannot be placed into a const data section.
00102  * To place an instance structure into a const data section, the instance structure must be manually initialized.
00103  * Set the values in the state buffer to zeros before static initialization.
00104  * The code below statically initializes each of the 3 different data type filter instance structures
00105  * <pre>
00106  *     arm_biquad_casd_df1_inst_f32 S1 = {numStages, pState, pCoeffs};
00107  *     arm_biquad_casd_df1_inst_q15 S2 = {numStages, pState, pCoeffs, postShift};
00108  *     arm_biquad_casd_df1_inst_q31 S3 = {numStages, pState, pCoeffs, postShift};
00109  * </pre>
00110  * where <code>numStages</code> is the number of Biquad stages in the filter; <code>pState</code> is the address of the state buffer;
00111  * <code>pCoeffs</code> is the address of the coefficient buffer; <code>postShift</code> shift to be applied.
00112  *
00113  * \par Fixed-Point Behavior
00114  * Care must be taken when using the Q15 and Q31 versions of the Biquad Cascade filter functions.
00115  * Following issues must be considered:
00116  * - Scaling of coefficients
00117  * - Filter gain
00118  * - Overflow and saturation
00119  *
00120  * \par
00121  * <b>Scaling of coefficients: </b>
00122  * Filter coefficients are represented as fractional values and
00123  * coefficients are restricted to lie in the range <code>[-1 +1)</code>.
00124  * The fixed-point functions have an additional scaling parameter <code>postShift</code>
00125  * which allow the filter coefficients to exceed the range <code>[+1 -1)</code>.
00126  * At the output of the filter's accumulator is a shift register which shifts the result by <code>postShift</code> bits.
00127  * \image html BiquadPostshift.gif "Fixed-point Biquad with shift by postShift bits after accumulator"
00128  * This essentially scales the filter coefficients by <code>2^postShift</code>.
00129  * For example, to realize the coefficients
00130  * <pre>
00131  *    {1.5, -0.8, 1.2, 1.6, -0.9}
00132  * </pre>
00133  * set the pCoeffs array to:
00134  * <pre>
00135  *    {0.75, -0.4, 0.6, 0.8, -0.45}
00136  * </pre>
00137  * and set <code>postShift=1</code>
00138  *
00139  * \par
00140  * <b>Filter gain: </b>
00141  * The frequency response of a Biquad filter is a function of its coefficients.
00142  * It is possible for the gain through the filter to exceed 1.0 meaning that the filter increases the amplitude of certain frequencies.
00143  * This means that an input signal with amplitude < 1.0 may result in an output > 1.0 and these are saturated or overflowed based on the implementation of the filter.
00144  * To avoid this behavior the filter needs to be scaled down such that its peak gain < 1.0 or the input signal must be scaled down so that the combination of input and filter are never overflowed.
00145  *
00146  * \par
00147  * <b>Overflow and saturation: </b>
00148  * For Q15 and Q31 versions, it is described separately as part of the function specific documentation below.
00149  */
00150 
00151 /**
00152  * @addtogroup BiquadCascadeDF1
00153  * @{
00154  */
00155 
00156 /**
00157  * @param[in]  *S         points to an instance of the floating-point Biquad cascade structure.
00158  * @param[in]  *pSrc      points to the block of input data.
00159  * @param[out] *pDst      points to the block of output data.
00160  * @param[in]  blockSize  number of samples to process per call.
00161  * @return     none.
00162  *
00163  */
00164 
00165 void arm_biquad_cascade_df1_f32(
00166   const arm_biquad_casd_df1_inst_f32 * S,
00167   float32_t * pSrc,
00168   float32_t * pDst,
00169   uint32_t blockSize)
00170 {
00171   float32_t *pIn = pSrc;                         /*  source pointer            */
00172   float32_t *pOut = pDst;                        /*  destination pointer       */
00173   float32_t *pState = S->pState;                 /*  pState pointer            */
00174   float32_t *pCoeffs = S->pCoeffs;               /*  coefficient pointer       */
00175   float32_t acc;                                 /*  Simulates the accumulator */
00176   float32_t b0, b1, b2, a1, a2;                  /*  Filter coefficients       */
00177   float32_t Xn1, Xn2, Yn1, Yn2;                  /*  Filter pState variables   */
00178   float32_t Xn;                                  /*  temporary input           */
00179   uint32_t sample, stage = S->numStages;         /*  loop counters             */
00180 
00181 
00182 #if defined (ARM_MATH_DSP)
00183 
00184   /* Run the below code for Cortex-M4 and Cortex-M3 */
00185 
00186   do
00187   {
00188     /* Reading the coefficients */
00189     b0 = *pCoeffs++;
00190     b1 = *pCoeffs++;
00191     b2 = *pCoeffs++;
00192     a1 = *pCoeffs++;
00193     a2 = *pCoeffs++;
00194 
00195     /* Reading the pState values */
00196     Xn1 = pState[0];
00197     Xn2 = pState[1];
00198     Yn1 = pState[2];
00199     Yn2 = pState[3];
00200 
00201     /* Apply loop unrolling and compute 4 output values simultaneously. */
00202     /*      The variable acc hold output values that are being computed:
00203      *
00204      *    acc =  b0 * x[n] + b1 * x[n-1] + b2 * x[n-2] + a1 * y[n-1]   + a2 * y[n-2]
00205      *    acc =  b0 * x[n] + b1 * x[n-1] + b2 * x[n-2] + a1 * y[n-1]   + a2 * y[n-2]
00206      *    acc =  b0 * x[n] + b1 * x[n-1] + b2 * x[n-2] + a1 * y[n-1]   + a2 * y[n-2]
00207      *    acc =  b0 * x[n] + b1 * x[n-1] + b2 * x[n-2] + a1 * y[n-1]   + a2 * y[n-2]
00208      */
00209 
00210     sample = blockSize >> 2U;
00211 
00212     /* First part of the processing with loop unrolling.  Compute 4 outputs at a time.
00213      ** a second loop below computes the remaining 1 to 3 samples. */
00214     while (sample > 0U)
00215     {
00216       /* Read the first input */
00217       Xn = *pIn++;
00218 
00219       /* acc =  b0 * x[n] + b1 * x[n-1] + b2 * x[n-2] + a1 * y[n-1] + a2 * y[n-2] */
00220       Yn2 = (b0 * Xn) + (b1 * Xn1) + (b2 * Xn2) + (a1 * Yn1) + (a2 * Yn2);
00221 
00222       /* Store the result in the accumulator in the destination buffer. */
00223       *pOut++ = Yn2;
00224 
00225       /* Every time after the output is computed state should be updated. */
00226       /* The states should be updated as:  */
00227       /* Xn2 = Xn1    */
00228       /* Xn1 = Xn     */
00229       /* Yn2 = Yn1    */
00230       /* Yn1 = acc   */
00231 
00232       /* Read the second input */
00233       Xn2 = *pIn++;
00234 
00235       /* acc =  b0 * x[n] + b1 * x[n-1] + b2 * x[n-2] + a1 * y[n-1] + a2 * y[n-2] */
00236       Yn1 = (b0 * Xn2) + (b1 * Xn) + (b2 * Xn1) + (a1 * Yn2) + (a2 * Yn1);
00237 
00238       /* Store the result in the accumulator in the destination buffer. */
00239       *pOut++ = Yn1;
00240 
00241       /* Every time after the output is computed state should be updated. */
00242       /* The states should be updated as:  */
00243       /* Xn2 = Xn1    */
00244       /* Xn1 = Xn     */
00245       /* Yn2 = Yn1    */
00246       /* Yn1 = acc   */
00247 
00248       /* Read the third input */
00249       Xn1 = *pIn++;
00250 
00251       /* acc =  b0 * x[n] + b1 * x[n-1] + b2 * x[n-2] + a1 * y[n-1] + a2 * y[n-2] */
00252       Yn2 = (b0 * Xn1) + (b1 * Xn2) + (b2 * Xn) + (a1 * Yn1) + (a2 * Yn2);
00253 
00254       /* Store the result in the accumulator in the destination buffer. */
00255       *pOut++ = Yn2;
00256 
00257       /* Every time after the output is computed state should be updated. */
00258       /* The states should be updated as: */
00259       /* Xn2 = Xn1    */
00260       /* Xn1 = Xn     */
00261       /* Yn2 = Yn1    */
00262       /* Yn1 = acc   */
00263 
00264       /* Read the forth input */
00265       Xn = *pIn++;
00266 
00267       /* acc =  b0 * x[n] + b1 * x[n-1] + b2 * x[n-2] + a1 * y[n-1] + a2 * y[n-2] */
00268       Yn1 = (b0 * Xn) + (b1 * Xn1) + (b2 * Xn2) + (a1 * Yn2) + (a2 * Yn1);
00269 
00270       /* Store the result in the accumulator in the destination buffer. */
00271       *pOut++ = Yn1;
00272 
00273       /* Every time after the output is computed state should be updated. */
00274       /* The states should be updated as:  */
00275       /* Xn2 = Xn1    */
00276       /* Xn1 = Xn     */
00277       /* Yn2 = Yn1    */
00278       /* Yn1 = acc   */
00279       Xn2 = Xn1;
00280       Xn1 = Xn;
00281 
00282       /* decrement the loop counter */
00283       sample--;
00284 
00285     }
00286 
00287     /* If the blockSize is not a multiple of 4, compute any remaining output samples here.
00288      ** No loop unrolling is used. */
00289     sample = blockSize & 0x3U;
00290 
00291     while (sample > 0U)
00292     {
00293       /* Read the input */
00294       Xn = *pIn++;
00295 
00296       /* acc =  b0 * x[n] + b1 * x[n-1] + b2 * x[n-2] + a1 * y[n-1] + a2 * y[n-2] */
00297       acc = (b0 * Xn) + (b1 * Xn1) + (b2 * Xn2) + (a1 * Yn1) + (a2 * Yn2);
00298 
00299       /* Store the result in the accumulator in the destination buffer. */
00300       *pOut++ = acc;
00301 
00302       /* Every time after the output is computed state should be updated. */
00303       /* The states should be updated as:    */
00304       /* Xn2 = Xn1    */
00305       /* Xn1 = Xn     */
00306       /* Yn2 = Yn1    */
00307       /* Yn1 = acc   */
00308       Xn2 = Xn1;
00309       Xn1 = Xn;
00310       Yn2 = Yn1;
00311       Yn1 = acc;
00312 
00313       /* decrement the loop counter */
00314       sample--;
00315 
00316     }
00317 
00318     /*  Store the updated state variables back into the pState array */
00319     *pState++ = Xn1;
00320     *pState++ = Xn2;
00321     *pState++ = Yn1;
00322     *pState++ = Yn2;
00323 
00324     /*  The first stage goes from the input buffer to the output buffer. */
00325     /*  Subsequent numStages  occur in-place in the output buffer */
00326     pIn = pDst;
00327 
00328     /* Reset the output pointer */
00329     pOut = pDst;
00330 
00331     /* decrement the loop counter */
00332     stage--;
00333 
00334   } while (stage > 0U);
00335 
00336 #else
00337 
00338   /* Run the below code for Cortex-M0 */
00339 
00340   do
00341   {
00342     /* Reading the coefficients */
00343     b0 = *pCoeffs++;
00344     b1 = *pCoeffs++;
00345     b2 = *pCoeffs++;
00346     a1 = *pCoeffs++;
00347     a2 = *pCoeffs++;
00348 
00349     /* Reading the pState values */
00350     Xn1 = pState[0];
00351     Xn2 = pState[1];
00352     Yn1 = pState[2];
00353     Yn2 = pState[3];
00354 
00355     /*      The variables acc holds the output value that is computed:
00356      *    acc =  b0 * x[n] + b1 * x[n-1] + b2 * x[n-2] + a1 * y[n-1]   + a2 * y[n-2]
00357      */
00358 
00359     sample = blockSize;
00360 
00361     while (sample > 0U)
00362     {
00363       /* Read the input */
00364       Xn = *pIn++;
00365 
00366       /* acc =  b0 * x[n] + b1 * x[n-1] + b2 * x[n-2] + a1 * y[n-1] + a2 * y[n-2] */
00367       acc = (b0 * Xn) + (b1 * Xn1) + (b2 * Xn2) + (a1 * Yn1) + (a2 * Yn2);
00368 
00369       /* Store the result in the accumulator in the destination buffer. */
00370       *pOut++ = acc;
00371 
00372       /* Every time after the output is computed state should be updated. */
00373       /* The states should be updated as:    */
00374       /* Xn2 = Xn1    */
00375       /* Xn1 = Xn     */
00376       /* Yn2 = Yn1    */
00377       /* Yn1 = acc   */
00378       Xn2 = Xn1;
00379       Xn1 = Xn;
00380       Yn2 = Yn1;
00381       Yn1 = acc;
00382 
00383       /* decrement the loop counter */
00384       sample--;
00385     }
00386 
00387     /*  Store the updated state variables back into the pState array */
00388     *pState++ = Xn1;
00389     *pState++ = Xn2;
00390     *pState++ = Yn1;
00391     *pState++ = Yn2;
00392 
00393     /*  The first stage goes from the input buffer to the output buffer. */
00394     /*  Subsequent numStages  occur in-place in the output buffer */
00395     pIn = pDst;
00396 
00397     /* Reset the output pointer */
00398     pOut = pDst;
00399 
00400     /* decrement the loop counter */
00401     stage--;
00402 
00403   } while (stage > 0U);
00404 
00405 #endif /* #if defined (ARM_MATH_DSP) */
00406 
00407 }
00408 
00409 
00410   /**
00411    * @} end of BiquadCascadeDF1 group
00412    */
00413