Fork of mbed-dsp. CMSIS-DSP library with NEON support

Dependents:   mbed-os-example-cmsis_dsp_neon

Fork of mbed-dsp by mbed official

CMSIS-DSP with NEON support

What is this?

This is a CMSIS-DSP library with NEON support.
We added NEON support to CMSIS-DSP Ver1.4.3 (CMSIS V4.1), as supplied by ARM, which improves processing speed.
If you are using the mbed-dsp library, you can replace it with this library.
CMSIS-DSP with NEON support is provided as a library.

Library build environment

The CMSIS-DSP library with NEON support was built in the following environment.

  • Compiler
    ARMCC Version 5.03
  • Compiler options [C Compiler]
   -DARM_MATH_MATRIX_CHECK -DARM_MATH_ROUNDING -O3 -Otime --cpu=Cortex-A9 --littleend --arm
   --apcs=/interwork --no_unaligned_access --fpu=vfpv3_fp16 --fpmode=fast --apcs=/hardfp
   --vectorize --asm
  • Compiler options [Assembler]
   --cpreproc --cpu=Cortex-A9 --littleend --arm --apcs=/interwork --no_unaligned_access
   --fpu=vfpv3_fp16 --fpmode=fast --apcs=/hardfp


Effects of NEON support

NEON support is expected to be more effective for large input data than for small input data.
Also, if the data size is a multiple of 16, a speedup can be expected in every CMSIS-DSP function.
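
The multiple-of-16 guideline above can be applied when sizing buffers. The following is a minimal sketch, assuming the guideline refers to the number of samples and that zero padding is acceptable for the operation in question (it is not for every function, so check each case). The helper names round_up_to_16 and copy_with_zero_padding are hypothetical and are not part of CMSIS-DSP.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: round a sample count up to the next multiple of 16. */
static size_t round_up_to_16(size_t count)
{
    return (count + 15u) & ~(size_t)15u;
}

/* Hypothetical helper: copy `count` samples into `dst` and zero-fill the
 * remainder up to the padded length. Returns the padded length.
 * Assumption: trailing zeros do not change the result of the operation
 * the buffer is later passed to. */
static size_t copy_with_zero_padding(float *dst, size_t dst_capacity,
                                     const float *src, size_t count)
{
    size_t padded = round_up_to_16(count);

    assert(padded <= dst_capacity);

    memcpy(dst, src, count * sizeof(float));
    for (size_t i = count; i < padded; i++) {
        dst[i] = 0.0f;
    }
    return padded;
}
```

A caller would allocate its working buffer with round_up_to_16(count) elements and pass the padded length as the block size. Whether this actually yields a speedup should be measured on the target.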


Committer: emilmont
Date: Wed Nov 28 12:30:09 2012 +0000
Revision: 1:fdd22bb7aa52
Child: 2:da51fb522205
DSP library code

/* ----------------------------------------------------------------------
* Copyright (C) 2010 ARM Limited. All rights reserved.
*
* $Date: 15. February 2012
* $Revision: V1.1.0
*
* Project: CMSIS DSP Library
* Title: arm_rfft_f32.c
*
* Description: RFFT & RIFFT Floating point process function
*
* Target Processor: Cortex-M4/Cortex-M3/Cortex-M0
*
* Version 1.1.0 2012/02/15
*    Updated with more optimizations, bug fixes and minor API changes.
*
* Version 1.0.10 2011/7/15
*    Big Endian support added and Merged M0 and M3/M4 Source code.
*
* Version 1.0.3 2010/11/29
*    Re-organized the CMSIS folders and updated documentation.
*
* Version 1.0.2 2010/11/11
*    Documentation updated.
*
* Version 1.0.1 2010/10/05
*    Production release and review comments incorporated.
*
* Version 1.0.0 2010/09/20
*    Production release and review comments incorporated.
*
* Version 0.0.7 2010/06/10
*    Misra-C changes done
* -------------------------------------------------------------------- */

#include "arm_math.h"

/**
 * @ingroup groupTransforms
 */

/**
 * @defgroup RFFT_RIFFT Real FFT Functions
 *
 * \par
 * Complex FFT/IFFT typically assumes complex input and output. However, many applications use real-valued data in the time domain.
 * Real FFT/IFFT algorithms process real-valued sequences efficiently, with lower memory requirements and less complexity.
 *
 * \par
 * This set of functions implements the Real Fast Fourier Transform (RFFT) and the Real Inverse Fast Fourier Transform (RIFFT)
 * for Q15, Q31, and floating-point data types.
 *
 *
 * \par Algorithm:
 *
 * <b>Real Fast Fourier Transform:</b>
 * \par
 * The real FFT of N points is calculated using a CFFT of N/2 points followed by the split RFFT process, as shown in the figure below.
 * \par
 * \image html RFFT.gif "Real Fast Fourier Transform"
 * \par
 * The RFFT functions operate on blocks of input and output data and each call to the function processes
 * <code>fftLenR</code> samples through the transform. <code>pSrc</code> points to the input array containing <code>fftLenR</code> values.
 * <code>pDst</code> points to the output array containing <code>2*fftLenR</code> values. \n
 * Input for the real FFT is in the order
 * <pre>{real[0], real[1], real[2], real[3], ..}</pre>
 * Output of the real FFT is complex and is in the order
 * <pre>{real(0), imag(0), real(1), imag(1), ...}</pre>
 *
 * <b>Real Inverse Fast Fourier Transform:</b>
 * \par
 * The real IFFT of N points is calculated using the split RIFFT process followed by a CFFT of N/2 points, as shown in the figure below.
 * \par
 * \image html RIFFT.gif "Real Inverse Fast Fourier Transform"
 * \par
 * The RIFFT functions operate on blocks of input and output data and each call to the function processes
 * <code>2*fftLenR</code> samples through the transform. <code>pSrc</code> points to the input array containing <code>2*fftLenR</code> values.
 * <code>pDst</code> points to the output array containing <code>fftLenR</code> values. \n
 * Input for the real IFFT is complex and is in the order
 * <pre>{real(0), imag(0), real(1), imag(1), ...}</pre>
 * Output of the real IFFT is real and is in the order
 * <pre>{real[0], real[1], real[2], real[3], ..}</pre>
 *
 * \par Lengths supported by the transform:
 * \par
 * Real FFT/IFFT supports the lengths [128, 512, 2048], as it internally uses CFFT/CIFFT.
 *
 * \par Instance Structure
 * A separate instance structure must be defined for each instance but the twiddle factors can be reused.
 * There are separate instance structure declarations for each of the 3 supported data types.
 *
 * \par Initialization Functions
 * There is also an associated initialization function for each data type.
 * The initialization function performs the following operations:
 * - Sets the values of the internal structure fields.
 * - Initializes twiddle factor tables.
 * - Initializes CFFT data structure fields.
 * \par
 * Use of the initialization function is optional.
 * However, if the initialization function is used, then the instance structure cannot be placed into a const data section.
 * To place an instance structure into a const data section, the instance structure must be manually initialized.
 * Manually initialize the instance structure as follows:
 * <pre>
 *arm_rfft_instance_f32 S = {fftLenReal, fftLenBy2, ifftFlagR, bitReverseFlagR, twidCoefRModifier, pTwiddleAReal, pTwiddleBReal, pCfft};
 *arm_rfft_instance_q31 S = {fftLenReal, fftLenBy2, ifftFlagR, bitReverseFlagR, twidCoefRModifier, pTwiddleAReal, pTwiddleBReal, pCfft};
 *arm_rfft_instance_q15 S = {fftLenReal, fftLenBy2, ifftFlagR, bitReverseFlagR, twidCoefRModifier, pTwiddleAReal, pTwiddleBReal, pCfft};
 * </pre>
 * where <code>fftLenReal</code> is the length of the RFFT/RIFFT; <code>fftLenBy2</code> is the length of the CFFT/CIFFT;
 * <code>ifftFlagR</code> is the flag that selects RFFT or RIFFT (set ifftFlagR to calculate the RIFFT, otherwise the RFFT is calculated);
 * <code>bitReverseFlagR</code> is the flag that selects the output order (set bitReverseFlagR to output in normal order, otherwise the output is in bit-reversed order);
 * <code>twidCoefRModifier</code> is the modifier for the twiddle factor table, which supports the 128, 512, and 2048 RFFT lengths with the same table;
 * <code>pTwiddleAReal</code> points to the A array of twiddle coefficients; <code>pTwiddleBReal</code> points to the B array of twiddle coefficients;
 * <code>pCfft</code> points to the CFFT instance structure. The CFFT structure also needs to be initialized; refer to arm_cfft_radix4_f32() for details regarding
 * static initialization of the CFFT structure.
 *
 * \par Fixed-Point Behavior
 * Care must be taken when using the fixed-point versions of the RFFT/RIFFT function.
 * Refer to the function specific documentation below for usage guidelines.
 */

/*--------------------------------------------------------------------
 *		Internal functions prototypes
 *--------------------------------------------------------------------*/

void arm_split_rfft_f32(
  float32_t * pSrc,
  uint32_t fftLen,
  float32_t * pATable,
  float32_t * pBTable,
  float32_t * pDst,
  uint32_t modifier);
void arm_split_rifft_f32(
  float32_t * pSrc,
  uint32_t fftLen,
  float32_t * pATable,
  float32_t * pBTable,
  float32_t * pDst,
  uint32_t modifier);

/**
 * @addtogroup RFFT_RIFFT
 * @{
 */

/**
 * @brief Processing function for the floating-point RFFT/RIFFT.
 * @param[in]  *S    points to an instance of the floating-point RFFT/RIFFT structure.
 * @param[in]  *pSrc points to the input buffer. For the forward RFFT this buffer is overwritten by the in-place CFFT.
 * @param[out] *pDst points to the output buffer.
 * @return none.
 */

void arm_rfft_f32(
  const arm_rfft_instance_f32 * S,
  float32_t * pSrc,
  float32_t * pDst)
{
  const arm_cfft_radix4_instance_f32 *S_CFFT = S->pCfft;


  /* Calculation of Real IFFT of input */
  if(S->ifftFlagR == 1u)
  {
    /*  Real IFFT core process */
    arm_split_rifft_f32(pSrc, S->fftLenBy2, S->pTwiddleAReal,
                        S->pTwiddleBReal, pDst, S->twidCoefRModifier);


    /* Complex radix-4 IFFT process */
    arm_radix4_butterfly_inverse_f32(pDst, S_CFFT->fftLen,
                                     S_CFFT->pTwiddle,
                                     S_CFFT->twidCoefModifier,
                                     S_CFFT->onebyfftLen);

    /* Bit reversal process */
    if(S->bitReverseFlagR == 1u)
    {
      arm_bitreversal_f32(pDst, S_CFFT->fftLen,
                          S_CFFT->bitRevFactor, S_CFFT->pBitRevTable);
    }
  }
  else
  {

    /* Calculation of RFFT of input */

    /* Complex radix-4 FFT process */
    arm_radix4_butterfly_f32(pSrc, S_CFFT->fftLen,
                             S_CFFT->pTwiddle, S_CFFT->twidCoefModifier);

    /* Bit reversal process */
    if(S->bitReverseFlagR == 1u)
    {
      arm_bitreversal_f32(pSrc, S_CFFT->fftLen,
                          S_CFFT->bitRevFactor, S_CFFT->pBitRevTable);
    }


    /*  Real FFT core process */
    arm_split_rfft_f32(pSrc, S->fftLenBy2, S->pTwiddleAReal,
                       S->pTwiddleBReal, pDst, S->twidCoefRModifier);
  }

}

/**
 * @} end of RFFT_RIFFT group
 */

/**
 * @brief  Core Real FFT process
 * @param[in]   *pSrc     points to the input buffer.
 * @param[in]   fftLen    length of FFT.
 * @param[in]   *pATable  points to the twiddle Coef A buffer.
 * @param[in]   *pBTable  points to the twiddle Coef B buffer.
 * @param[out]  *pDst     points to the output buffer.
 * @param[in]   modifier  twiddle coefficient modifier that supports different size FFTs with the same twiddle factor table.
 * @return none.
 */

void arm_split_rfft_f32(
  float32_t * pSrc,
  uint32_t fftLen,
  float32_t * pATable,
  float32_t * pBTable,
  float32_t * pDst,
  uint32_t modifier)
{
  uint32_t i;                                    /* Loop Counter */
  float32_t outR, outI;                          /* Temporary variables for output */
  float32_t *pCoefA, *pCoefB;                    /* Temporary pointers for twiddle factors */
  float32_t CoefA1, CoefA2, CoefB1;              /* Temporary variables for twiddle coefficients */
  float32_t *pDst1 = &pDst[2], *pDst2 = &pDst[(4u * fftLen) - 1u];      /* temp pointers for output buffer */
  float32_t *pSrc1 = &pSrc[2], *pSrc2 = &pSrc[(2u * fftLen) - 1u];      /* temp pointers for input buffer */

  /* Init coefficient pointers */
  pCoefA = &pATable[modifier * 2u];
  pCoefB = &pBTable[modifier * 2u];

  i = fftLen - 1u;

  while(i > 0u)
  {
    /*
       outR = (pSrc[2 * i] * pATable[2 * i] - pSrc[2 * i + 1] * pATable[2 * i + 1]
       + pSrc[2 * n - 2 * i] * pBTable[2 * i] +
       pSrc[2 * n - 2 * i + 1] * pBTable[2 * i + 1]);
     */

    /* outI = (pIn[2 * i + 1] * pATable[2 * i] + pIn[2 * i] * pATable[2 * i + 1] +
       pIn[2 * n - 2 * i] * pBTable[2 * i + 1] -
       pIn[2 * n - 2 * i + 1] * pBTable[2 * i]); */

    /* read pATable[2 * i] */
    CoefA1 = *pCoefA++;
    /* pATable[2 * i + 1] */
    CoefA2 = *pCoefA;

    /* pSrc[2 * i] * pATable[2 * i] */
    outR = *pSrc1 * CoefA1;
    /* pSrc[2 * i] * CoefA2 */
    outI = *pSrc1++ * CoefA2;

    /* (pSrc[2 * i + 1] + pSrc[2 * fftLen - 2 * i + 1]) * CoefA2 */
    outR -= (*pSrc1 + *pSrc2) * CoefA2;
    /* pSrc[2 * i + 1] * CoefA1 */
    outI += *pSrc1++ * CoefA1;

    CoefB1 = *pCoefB;

    /* pSrc[2 * fftLen - 2 * i + 1] * CoefB1 */
    outI -= *pSrc2-- * CoefB1;
    /* pSrc[2 * fftLen - 2 * i] * CoefA2 */
    outI -= *pSrc2 * CoefA2;

    /* pSrc[2 * fftLen - 2 * i] * CoefB1 */
    outR += *pSrc2-- * CoefB1;

    /* write output */
    *pDst1++ = outR;
    *pDst1++ = outI;

    /* write complex conjugate output */
    *pDst2-- = -outI;
    *pDst2-- = outR;

    /* update coefficient pointer */
    pCoefB = pCoefB + (modifier * 2u);
    pCoefA = pCoefA + ((modifier * 2u) - 1u);

    i--;

  }

  pDst[2u * fftLen] = pSrc[0] - pSrc[1];
  pDst[(2u * fftLen) + 1u] = 0.0f;

  pDst[0] = pSrc[0] + pSrc[1];
  pDst[1] = 0.0f;

}


/**
 * @brief  Core Real IFFT process
 * @param[in]   *pSrc     points to the input buffer.
 * @param[in]   fftLen    length of FFT.
 * @param[in]   *pATable  points to the twiddle Coef A buffer.
 * @param[in]   *pBTable  points to the twiddle Coef B buffer.
 * @param[out]  *pDst     points to the output buffer.
 * @param[in]   modifier  twiddle coefficient modifier that supports different size FFTs with the same twiddle factor table.
 * @return none.
 */

void arm_split_rifft_f32(
  float32_t * pSrc,
  uint32_t fftLen,
  float32_t * pATable,
  float32_t * pBTable,
  float32_t * pDst,
  uint32_t modifier)
{
  float32_t outR, outI;                          /* Temporary variables for output */
  float32_t *pCoefA, *pCoefB;                    /* Temporary pointers for twiddle factors */
  float32_t CoefA1, CoefA2, CoefB1;              /* Temporary variables for twiddle coefficients */
  float32_t *pSrc1 = &pSrc[0], *pSrc2 = &pSrc[(2u * fftLen) + 1u];

  pCoefA = &pATable[0];
  pCoefB = &pBTable[0];

  while(fftLen > 0u)
  {
    /*
       outR = (pIn[2 * i] * pATable[2 * i] + pIn[2 * i + 1] * pATable[2 * i + 1] +
       pIn[2 * n - 2 * i] * pBTable[2 * i] -
       pIn[2 * n - 2 * i + 1] * pBTable[2 * i + 1]);

       outI = (pIn[2 * i + 1] * pATable[2 * i] - pIn[2 * i] * pATable[2 * i + 1] -
       pIn[2 * n - 2 * i] * pBTable[2 * i + 1] -
       pIn[2 * n - 2 * i + 1] * pBTable[2 * i]);

     */

    CoefA1 = *pCoefA++;
    CoefA2 = *pCoefA;

    /* outR = (pSrc[2 * i] * CoefA1 */
    outR = *pSrc1 * CoefA1;

    /* - pSrc[2 * i] * CoefA2 */
    outI = -(*pSrc1++) * CoefA2;

    /* (pSrc[2 * i + 1] + pSrc[2 * fftLen - 2 * i + 1]) * CoefA2 */
    outR += (*pSrc1 + *pSrc2) * CoefA2;

    /* pSrc[2 * i + 1] * CoefA1 */
    outI += (*pSrc1++) * CoefA1;

    CoefB1 = *pCoefB;

    /* - pSrc[2 * fftLen - 2 * i + 1] * CoefB1 */
    outI -= *pSrc2-- * CoefB1;

    /* pSrc[2 * fftLen - 2 * i] * CoefB1 */
    outR += *pSrc2 * CoefB1;

    /* pSrc[2 * fftLen - 2 * i] * CoefA2 */
    outI += *pSrc2-- * CoefA2;

    /* write output */
    *pDst++ = outR;
    *pDst++ = outI;

    /* update coefficient pointer */
    pCoefB = pCoefB + (modifier * 2u);
    pCoefA = pCoefA + ((modifier * 2u) - 1u);

    /* Decrement loop count */
    fftLen--;
  }

}