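Hello, and thank you very much for open-sourcing this work. While testing it, I noticed a potential issue: when the two input audio samples are swapped, the final result does not change. Audio B is still preferred, with nearly the same scores, which suggests the verdict may depend on input position rather than on the audio itself. I have tested this multiple times and consistently observed the same behavior.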
Sample 1:
[Final Result] B
Average Score of Audio A: 5.9, Average Score of Audio B: 8.7
Sample 2 (inputs swapped):
[Final Result] B
Average Score of Audio A: 5.0, Average Score of Audio B: 8.6
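For reference, here is a minimal sketch of the consistency check I have in mind, using only the average scores reported above. The helper names `winner` and `is_swap_consistent` are hypothetical and not part of this repository. The sketch assumes a position-independent judge should prefer the opposite slot once the inputs are swapped.

```python
def winner(score_a, score_b):
    """Return the slot label with the higher average score (ties go to 'B')."""
    return "A" if score_a > score_b else "B"


def is_swap_consistent(first_run, swapped_run):
    """Check that the preferred slot flips when the two inputs are swapped.

    Each run is a tuple (average score of Audio A, average score of Audio B).
    If the judge follows the audio rather than the position, the winning
    slot label in the swapped run should differ from the first run.
    """
    return winner(*first_run) != winner(*swapped_run)


# Average scores from the two runs reported above, as (Audio A, Audio B).
run_1 = (5.9, 8.7)
run_2 = (5.0, 8.6)  # same audio files, inputs swapped

print(winner(*run_1), winner(*run_2))
print(is_swap_consistent(run_1, run_2))
```

With the numbers above, both runs pick slot B, so the check reports an inconsistency.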