Check my port of two DSP libs: FFT. There is also going to be a lib from NXP, but it's not public yet.
As for multiply speed, the Cortex-M3 TRM says:
Multiply: 1 or 2 cycles.
MUL, MLA, and MLS. MUL is one cycle and MLA and MLS are two cycles.
Multiply with 64-bit result: 3-7 cycles. Cycle count based on input sizes. That is, ABS(inputs) < 64K terminates early.
UMULL/SMULL/UMLAL/SMLAL use early termination depending on the size of source values. These are interruptible (abandoned/restarted), with worst case latency of one cycle. MLAL versions take four to seven cycles and MULL versions take three to five cycles. For MLAL, the signed version is one cycle longer than the unsigned.
"A" instructions are accumulating, i.e. a = b*c + d.
BTW, why stop at "coprocessor"? I'm quite sure a Cortex-M3 can do everything that Propeller can, only better :)
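For what it's worth, here is a minimal sketch of how the 64-bit accumulating multiply quoted above shows up in a FIR filter. The function name fir_q31 and the Q31 fixed-point format are my own choices for illustration, and whether a given compiler actually emits SMLAL for the loop body is an assumption, not something I have checked for every toolchain.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: Q31 FIR dot product with a 64-bit accumulator.
 * Each iteration is one 32x32 -> 64-bit multiply-accumulate, which is the
 * operation SMLAL performs (assumption: the compiler maps it that way).
 * Assumes the sum of |coeff[i]| does not exceed 1.0 in Q31, so the
 * 64-bit accumulator cannot overflow. */
int32_t fir_q31(const int32_t *coeff, const int32_t *samples, size_t ntaps)
{
    int64_t acc = 0;

    for (size_t i = 0; i < ntaps; i++) {
        acc += (int64_t)coeff[i] * samples[i];
    }

    /* Q31 * Q31 gives Q62; shift back down to Q31.
     * Right-shifting a negative value is implementation-defined in C,
     * but it is an arithmetic shift on the usual ARM compilers. */
    return (int32_t)(acc >> 31);
}
```

With two coefficients of 0.5 (0x40000000 in Q31) the function simply averages its two input samples, which is a quick way to sanity-check it on the board.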
Just got the board - very nice! I wonder if it can be used as a poor man's DSP? What is the speed of a 32x32bit multiply?
In our current project, we use a Propeller (www.parallax.com) to read data from an audio A/D at 80 kHz. The decimation to lower sample rates is done... poorly... inside the Propeller right now, and I wonder if the mbed can be used as a sort of DSP co-processor? In the approx. 12 µs between samples, how many FIR filter operations, for example, can be performed?
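One rough way to frame that question is as a cycle budget per sample. The sketch below is only back-of-envelope arithmetic: the function names are made up, and the clock frequency, the cycles per tap and the per-sample overhead you feed in are all assumptions to be replaced with measured figures.

```c
#include <stdint.h>

/* Hypothetical helper: CPU cycles available between two input samples. */
uint32_t cycles_per_sample(uint32_t cpu_hz, uint32_t sample_hz)
{
    return cpu_hz / sample_hz;
}

/* Hypothetical helper: upper bound on FIR taps that fit in one sample period.
 * cycles_per_tap covers the multiply-accumulate plus loads and loop overhead;
 * overhead_cycles covers interrupt entry, I/O and so on. Both are guesses
 * supplied by the caller, not values taken from a datasheet. */
uint32_t max_taps(uint32_t cpu_hz, uint32_t sample_hz,
                  uint32_t cycles_per_tap, uint32_t overhead_cycles)
{
    uint32_t budget = cycles_per_sample(cpu_hz, sample_hz);

    if (cycles_per_tap == 0 || budget <= overhead_cycles) {
        return 0;
    }
    return (budget - overhead_cycles) / cycles_per_tap;
}
```

For example, assuming a 96 MHz core clock, 80 kHz sampling leaves 1200 cycles per sample; with an assumed 10 cycles per tap and 100 cycles of overhead, max_taps gives 110 taps. A decimating filter only has to compute the outputs it keeps, so the per-input cost would be lower still.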
We are considering going to a "proper" DSP chip but this device looks promising as it can do other things as well...
Thanks for your help, and I look forward to your comments.
Nice job on the board, again!
Sincerely,
Sridhar