fix: bugs in dsps_fird_f32_aes3 (DSP-167) #110
Schuwi wants to merge 1 commit into espressif:master from
Conversation
Hi @Schuwi, thank you very much, I will merge next week. Dmitry

@dmitry1945 Does this apply to dsps_fir_f32_aes3? I'm using it for a high-pass filter. Today I updated to esp-dsp 1.7.0 and also made hardware changes, and now I'm getting distorted sound and I'm not sure where the problem is.

@Kristian8606 this update is for fird only. Fir should work. Dmitry

@Schuwi Thanks |
That's okay. As file contributor it is fine if you simply put |
Description
This PR fixes critical bugs in the `dsps_fird_f32_aes3` optimized assembly implementation that were causing digital noise/corruption in the FIR filter output.

Issues Fixed:
- Incorrect register usage in offset_3 branch: The final accumulation was overwriting `f4` instead of using `f6`, causing partial loss of filter results.
- Incorrect instruction ordering in non-offset_0 branches: The order of `madd.s` and `EE.LDF.128.IP` instructions in the tight FIR loops was leading to out-of-bounds data being added to FIR results.
- Potential out-of-bounds memory access: The optimized implementation could read 4 floats beyond the delay line buffer, which could crash if the buffer was placed at a memory boundary.
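
The instruction-ordering issue can be illustrated with a scalar C model. This is only a sketch of the general pattern (a value preloaded into a register, then one multiply-accumulate and one post-increment load per iteration); it is not a transcription of the assembly, and the function names are hypothetical. It assumes `x` holds at least `n + 1` readable floats, because the load runs one element ahead of the multiply.

```c
#include <stddef.h>

/* Intended order: consume the preloaded value, then fetch the next one.
 * The final load reads x[n], which is fetched but never accumulated. */
float dot_use_then_load(const float *x, const float *c, size_t n)
{
    float acc = 0.0f;
    float reg = x[0];              /* preload before entering the loop */
    for (size_t k = 0; k < n; k++) {
        acc += c[k] * reg;         /* multiply-accumulate with the current value */
        reg = x[k + 1];            /* load the value for the next iteration */
    }
    return acc;
}

/* Swapped order: the load overwrites the register before it is used,
 * so x[0] is skipped and the out-of-range x[n] is added to the result. */
float dot_load_then_use(const float *x, const float *c, size_t n)
{
    float acc = 0.0f;
    float reg = x[0];              /* preloaded value is never consumed */
    for (size_t k = 0; k < n; k++) {
        reg = x[k + 1];            /* load happens first */
        acc += c[k] * reg;         /* accumulates data shifted by one element */
    }
    return acc;
}
```

Even the correctly ordered loop reads one element past the data it uses, which is why a run-ahead load scheme needs padding after the buffer.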
Root Cause:
The optimized AES3 implementation handles different memory alignments (offset_0, offset_1, offset_2, offset_3) but contained copy-paste errors and subtle timing issues in the non-aligned cases. The ANSI implementation worked correctly, but the optimized version would intermittently produce digital noise when the delay line pointer wasn't 16-byte aligned.
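
When chasing this kind of intermittent noise, a plain scalar reference is handy for comparing against the optimized output sample by sample. The following is a minimal sketch of a decimating FIR with a circular delay line. The struct layout and the names `ref_fird_t` and `ref_fird_process` are hypothetical and do not mirror the esp-dsp `fir_f32_t` type or its ANSI implementation.

```c
#include <stddef.h>

/* Hypothetical reference state; NOT the esp-dsp fir_f32_t layout. */
typedef struct {
    const float *coeffs; /* n taps; coeffs[0] applies to the newest sample */
    float *delay;        /* n floats of history, used as a circular buffer */
    size_t n;            /* number of taps */
    size_t pos;          /* index where the next input sample is written */
    size_t decim;        /* decimation factor, must be >= 1 */
    size_t phase;        /* inputs consumed since the last output */
} ref_fird_t;

/* Consumes in_len input samples and returns the number of outputs written.
 * One output is produced for every `decim` inputs. */
size_t ref_fird_process(ref_fird_t *f, const float *in, float *out, size_t in_len)
{
    size_t produced = 0;
    for (size_t i = 0; i < in_len; i++) {
        f->delay[f->pos] = in[i];
        f->pos = (f->pos + 1) % f->n;
        if (++f->phase < f->decim) {
            continue;              /* not yet time to emit an output */
        }
        f->phase = 0;

        float acc = 0.0f;
        size_t idx = f->pos;       /* one past the newest sample */
        for (size_t k = 0; k < f->n; k++) {
            idx = (idx == 0) ? f->n - 1 : idx - 1;  /* walk back, newest first */
            acc += f->coeffs[k] * f->delay[idx];
        }
        out[produced++] = acc;
    }
    return produced;
}
```

Feeding the same input to this model and to the optimized routine, then diffing the outputs, should expose alignment-dependent errors, since the scalar loop has no alignment cases at all.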
Changes Made:
- Register fix in the `.offset_3` section (line 204: `madd.s f4, f3, f14` → `madd.s f6, f3, f14`)
- Instruction reordering in the `.offset_1`, `.offset_2`, and `.offset_3` loops for proper coefficient-data alignment

Documentation Update Needed:
The requirement for the delay line buffer to be `N + 4` floats (not just `N`) should be documented for `dsps_fird_f32_init`, as it's currently only mentioned for `dsps_fir_f32_init`.

Testing
The fix has been tested in a real-world PDM audio processing application where the digital noise was clearly audible before the fix and completely eliminated after.
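
For the `N + 4` delay line requirement noted under Documentation Update Needed, caller-side allocation could look roughly like the sketch below. The helper names are hypothetical and not part of esp-dsp. The 16-byte base alignment is an assumption made here because the optimized code uses 128-bit loads; the PR itself only states the `N + 4` size. Standard C11 `aligned_alloc` stands in for whatever aligned allocator the target platform provides.

```c
#include <stdlib.h>
#include <string.h>

/* Extra floats after the N-tap history, per the N + 4 requirement. */
#define FIRD_DELAY_PAD 4

/* Bytes needed for an (n_taps + 4)-float delay line, rounded up to a
 * multiple of 16 because aligned_alloc requires size % alignment == 0. */
size_t fird_delay_bytes(size_t n_taps)
{
    size_t bytes = (n_taps + FIRD_DELAY_PAD) * sizeof(float);
    return (bytes + 15u) & ~(size_t)15u;
}

/* Allocates a zeroed delay line. The 16-byte alignment is an assumption
 * of this sketch. The caller releases the buffer with free(). */
float *fird_delay_alloc(size_t n_taps)
{
    size_t bytes = fird_delay_bytes(n_taps);
    float *buf = aligned_alloc(16, bytes);
    if (buf != NULL) {
        memset(buf, 0, bytes);     /* start with silent history and padding */
    }
    return buf;
}
```

The returned pointer would then be passed as the delay line argument of the init function, with the tap count still given as `N` rather than `N + 4`.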