fix: display original format name for ffmpeg files (#46) by jaimefgdev · Pull Request #62 · systemblueteam/sounddiff

jaimefgdev · 2026-03-27T10:18:42Z

What changed

Added a mapping dictionary to translate original file extensions into readable display names. Updated load_audio to use this mapped name for the metadata instead of the default sf.info format when a file has been temporarily transcoded to WAV via ffmpeg.

Why

Fixes an issue where compressed files (like MP3s) processed by the new ffmpeg fallback were incorrectly reported as 'WAV' in the output since the metadata was being read from the temporary file.

Closes

Closes #46

coderabbitai · 2026-03-27T10:18:54Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Repository UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: deb49634-29cf-4bb9-a551-98f207251883

📥 Commits

Reviewing files that changed from the base of the PR and between ab69fb9 and dda935b.

📒 Files selected for processing (2)

src/sounddiff/formats.py
tests/test_formats.py

📝 Walkthrough

Walkthrough

Adds FORMAT_DISPLAY_NAMES and updates load_audio() so AudioMetadata.format_name is taken from the original file suffix (via the mapping or uppercased fallback) when the read path differs from the original; otherwise keeps using info.format. No other control-flow changes.
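The selection rule described in this walkthrough can be sketched as a standalone function. This is a minimal illustration, not the merged implementation: `resolve_format_name` is a hypothetical helper name, the mapping entries are copied from the proposed fix quoted later in this thread, and the real `load_audio()` may structure this differently.

```python
from pathlib import Path

# Entries taken from the mapping proposed in this review thread;
# the merged constant may contain more or different formats.
FORMAT_DISPLAY_NAMES: dict[str, str] = {
    ".mp3": "MP3",
    ".aac": "AAC",
    ".m4a": "AAC",
    ".wma": "WMA",
    ".opus": "Opus",
}


def resolve_format_name(
    original_filepath: Path, read_filepath: Path, info_format: str
) -> str:
    """Hypothetical helper: choose the format name to report for a loaded file."""
    if original_filepath != read_filepath:
        # The file was transcoded to a temporary WAV, so the reader's
        # format would be "WAV". Derive the name from the original suffix.
        ext = original_filepath.suffix.lower()
        return FORMAT_DISPLAY_NAMES.get(ext, ext.lstrip(".").upper())
    # Native read: the reader's reported format is already correct.
    return info_format
```

With this rule, an `.mp3` input that was read through a temporary WAV reports "MP3", an unmapped suffix falls back to its uppercased extension, and a natively read file keeps whatever the reader reported.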

Changes

Cohort / File(s)	Summary
Format Name Preservation `src/sounddiff/formats.py`	Added module constant `FORMAT_DISPLAY_NAMES: dict[str, str]` and changed `load_audio()` to prefer the original input file suffix (mapped or uppercased) for `AudioMetadata.format_name` when `original_filepath != read_filepath`; falls back to `info.format` when paths match or suffix unmapped.
Tests `tests/test_formats.py`	Added `test_transcoded_format_name_is_correct` which mocks `shutil.which`, `subprocess.run`, and `soundfile` (`sf.info`/`sf.read`) to verify `load_audio()` reports the original format name (e.g., "MP3") even when the file is transcoded and read as WAV.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

feat: add ffmpeg fallback for compressed audio formats (#19) #45: Earlier changes introducing ffmpeg temp-file/read_filepath logic that this change builds upon by preserving/deriving original format metadata.

Suggested reviewers

systemblueteam

Poem

"I’m a rabbit who sniffs file tails and trails,
I hop through suffixes, follow audio trails.
No longer fooled by a temp WAV disguise,
I call MP3s, AACs by their true little names. 🐇🎶"

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main change: fixing the display of original format names for ffmpeg-transcoded files, matching the core objective.
Description check	✅ Passed	The description is directly related to the changeset, explaining what was added (mapping dictionary), why (fixing incorrect WAV format reporting), and the issue it closes.
Linked Issues check	✅ Passed	The PR successfully implements all requirements from `#46`: mapping original file suffixes to display names and using those names for format_name in metadata when files are ffmpeg-transcoded.
Out of Scope Changes check	✅ Passed	All changes are scoped to the issue `#46` objective: FORMAT_DISPLAY_NAMES mapping, load_audio() updates to use original format names, and corresponding test coverage with no unrelated modifications.

src/sounddiff/formats.py

coderabbitai

Actionable comments posted: 3

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Repository UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: d7fa166c-9adb-4f84-b8f4-0fa07b9690ea

📥 Commits

Reviewing files that changed from the base of the PR and between 1bb6ac9 and 7a1f640.

📒 Files selected for processing (1)

src/sounddiff/formats.py

src/sounddiff/formats.py

coderabbitai

♻️ Duplicate comments (2)

src/sounddiff/formats.py (2)
69-74: ⚠️ Potential issue | 🔴 Critical

Undefined variables in format-name branch cause runtime failure.

At Line 69, original_filepath and read_filepath are not defined in load_audio(), so native files will hit a NameError before metadata creation. Also, Line 50 still rejects ffmpeg formats, so this branch cannot currently fulfill ffmpeg format-name preservation.
Proposed minimal fix
-    if original_filepath != read_filepath:
-        ext = original_filepath.suffix.lower()
-        display_format = FORMAT_DISPLAY_NAMES.get(ext, ext.lstrip('.').upper())
-    else:
-        display_format = info.format
+    display_format = info.format
18-25: 🛠️ Refactor suggestion | 🟠 Major

Add an explicit type annotation for FORMAT_DISPLAY_NAMES.

This exported constant should be explicitly typed for strict mypy compliance.
Proposed fix
-FORMAT_DISPLAY_NAMES = {
+FORMAT_DISPLAY_NAMES: dict[str, str] = {
     ".mp3": "MP3",
     ".aac": "AAC",
     ".m4a": "AAC",
     ".wma": "WMA",
     ".opus": "Opus",
 }
As per coding guidelines, "Use mypy strict mode with type hints everywhere".

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Repository UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: c7d09fdd-5a77-431d-928a-1b7360be5160

📥 Commits

Reviewing files that changed from the base of the PR and between 7a1f640 and 062e766.

📒 Files selected for processing (1)

src/sounddiff/formats.py

jaimefgdev · 2026-03-27T10:47:22Z

@systemblueteam All tests are passing now! CodeRabbit has a minor refactor suggestion, but the core issue is fixed and the CI is green. Ready for human review! 🚀

@coderabbitai resolve

coderabbitai · 2026-03-27T10:47:36Z

@jaimefgdev Great to hear that all tests are passing and CI is green! I'll resolve all my review comments now. 🎉

(ﾉ◕ヮ◕)ﾉ*:･ﾟ✧

[resolve]

systemblueteam

@jaimefgdev Good fix. Two things before merge: add a test that verifies an MP3 or M4A file shows the correct format name instead of "WAV", and add a type annotation to FORMAT_DISPLAY_NAMES.

coderabbitai

Actionable comments posted: 2

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Repository UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: 1ca67c69-47c2-438a-9869-eb0d90349100

📥 Commits

Reviewing files that changed from the base of the PR and between 062e766 and c003cd7.

📒 Files selected for processing (2)

src/sounddiff/formats.py
tests/test_formats.py

coderabbitai · 2026-03-27T19:07:13Z

tests/test_formats.py

+    def test_transcoded_format_name_is_correct(self, tmp_path: Path) -> None:
+        """Verifies that transcoded files show their original format name, not WAV."""
+        from unittest.mock import patch
+
+        # 1. Creamos un archivo MP3 falso para pasar la validación de path.exists()
+        fake_mp3 = tmp_path / "test.mp3"
+        fake_mp3.write_text("fake audio content")
+
+        # 2. Simulamos que ffmpeg existe y que la lectura del WAV temporal funciona
+        with patch("sounddiff.formats.shutil.which", return_value="ffmpeg"), \
+             patch("sounddiff.formats.subprocess.run"), \
+             patch("sounddiff.formats.sf.info") as mock_info, \
+             patch("sounddiff.formats.sf.read") as mock_read:
+
+            # Configuramos el mock para simular lo que devolvería el WAV temporal
+            class MockInfo:
+                format = "WAV"
+                subtype = "PCM_16"
+
+            mock_info.return_value = MockInfo()
+            mock_read.return_value = (np.zeros((100, 2), dtype=np.float64), 44100)
+
+            # 3. Llamamos a tu función
+            _, meta = load_audio(fake_mp3)
+
+            # 4. LA COMPROBACIÓN FINAL: Debe decir MP3 y no WAV
+            assert meta.format_name == "MP3"


🧹 Nitpick | 🔵 Trivial

Test logic is sound; consider translating comments to English.

The test correctly validates the core fix by mocking the ffmpeg transcode flow and asserting that format_name reflects the original format ("MP3") rather than the intermediate WAV. Good isolation with tmp_path and proper mock setup.

The inline comments are in Spanish; for consistency with the English codebase, consider translating them:

📝 Suggested comment translations

-        # 1. Creamos un archivo MP3 falso para pasar la validación de path.exists()
+        # 1. Create a fake MP3 file to pass path.exists() validation
         fake_mp3 = tmp_path / "test.mp3"
         fake_mp3.write_text("fake audio content")

-        # 2. Simulamos que ffmpeg existe y que la lectura del WAV temporal funciona
+        # 2. Simulate ffmpeg existing and temp WAV read succeeding
         with patch("sounddiff.formats.shutil.which", return_value="ffmpeg"), \
 ...
-            # Configuramos el mock para simular lo que devolvería el WAV temporal
+            # Configure mock to simulate what the temp WAV would return
 ...
-            # 3. Llamamos a tu función
+            # 3. Call the function under test
             _, meta = load_audio(fake_mp3)

-            # 4. LA COMPROBACIÓN FINAL: Debe decir MP3 y no WAV
+            # 4. Final assertion: must report MP3, not WAV
             assert meta.format_name == "MP3"

@jaimefgdev

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

tests/test_formats.py

coderabbitai

♻️ Duplicate comments (1)

tests/test_formats.py (1)
108-108: ⚠️ Potential issue | 🟡 Minor

Fix trailing whitespace and restore final newline at EOF.

Line 108 still contains whitespace-only content, and the file is missing the trailing newline (W293, W292).
Suggested diff
@@
-        
+
@@
-        assert format_channels(6) == "6ch"
+        assert format_channels(6) == "6ch"
+
As per coding guidelines, **/*.py: Use Ruff for linting and formatting. Do not use black, isort, or flake8.
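For readers unfamiliar with the two rule codes, the conditions they flag can be approximated in a few lines of Python. This is only an illustrative sketch of what W293 (whitespace on a blank line) and W292 (no newline at end of file) describe; `find_whitespace_issues` is a hypothetical name and this is not how Ruff itself is implemented.

```python
def find_whitespace_issues(text: str) -> list[str]:
    """Report whitespace-only lines and a missing final newline (illustrative only)."""
    issues: list[str] = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        # A non-empty line made up solely of spaces or tabs corresponds to W293.
        if line and not line.strip():
            issues.append(f"line {lineno}: whitespace-only line (W293)")
    # A non-empty file that does not end in a newline corresponds to W292.
    if text and not text.endswith("\n"):
        issues.append("missing newline at end of file (W292)")
    return issues
```

In practice the project's linter should be used for the actual fix; the sketch only shows why a line containing eight spaces and a file ending without a newline both fail the check.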

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Repository UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: a7d0364c-e443-49b3-9a9c-be0167e639a6

📥 Commits

Reviewing files that changed from the base of the PR and between c003cd7 and ab69fb9.

📒 Files selected for processing (1)

tests/test_formats.py

systemblueteam · 2026-03-27T19:29:31Z

CI is failing on a lint check: two files are missing a trailing newline at the end. Run `ruff check --fix .` to auto-fix both, then push. Once that's clean the PR looks good.

jaimefgdev · 2026-03-27T19:43:54Z

Finally got the CI green! Fixed the trailing newlines manually. The Labeler action is still failing due to repo permissions, but the code is ready to merge.

systemblueteam

All requested changes addressed. Tests cover the transcoded format name case, type annotation is in place, CI green. Merging.

systemblueteam · 2026-03-27T19:21:29Z

tests/test_formats.py


@jaimefgdev

systemblueteam · 2026-03-27T21:25:23Z

@jaimefgdev Yeah, sorry about the CI check. I'm consolidating everything into a single check. Hopefully that'll keep things simple in the future. Also, what do you think about the CodeRabbit/Sentry reviews? Do they feel overwhelming for you personally, or are you okay with them?

jaimefgdev · 2026-03-27T21:33:19Z

Thanks for the merge! Regarding the AI reviews, they can be a bit tedious sometimes (especially with the strict formatting), but I think they are necessary to make sure everything works correctly before the final review. It’s great news that you're consolidating the CI checks, that will definitely make things smoother in the future!

jaimefgdev · 2026-03-27T21:34:58Z

Also, is there anything else you'd like me to help with that isn't currently in the issues? I'm free for any other tasks or features you might have in mind.

systemblueteam · 2026-03-27T21:46:11Z

@jaimefgdev feel free to jump in anytime on what we have, but yeah I am actually working on another project right now that I'm contemplating open sourcing. If you want to talk about it feel free to shoot me an email: shelton@systemblue.io. I'm actively looking for some collaborators that would be willing to get their hands dirty so if you want just ping me in an email.

also...for the record, I'm an Arsenal fan. So we'll be seeing Real Madrid soon in CL lol. Assuming you're a Madrid fan.

fix: display original format name for ffmpeg files (systemblueteam#46)

7a1f640

jaimefgdev requested a review from systemblueteam as a code owner March 27, 2026 10:18

coderabbitai bot added the audio-core Core audio analysis modules label Mar 27, 2026

sentry bot reviewed Mar 27, 2026

src/sounddiff/formats.py

coderabbitai bot requested changes Mar 27, 2026

src/sounddiff/formats.py (outdated)

src/sounddiff/formats.py

src/sounddiff/formats.py (outdated)

style: add trailing newline

062e766

coderabbitai bot added the enhancement New feature or request label Mar 27, 2026

style: apply ruff formatting

ede69af

coderabbitai bot reviewed Mar 27, 2026

jaimefgdev added 3 commits March 27, 2026 11:33

fix: restore missing ffmpeg block and resolve undefined variables

47d69ce

Merge branch 'main' into fix/format-name

34c423d

style: fix trailing newline for good

d50f8dc

This comment was marked as duplicate.

systemblueteam requested changes Mar 27, 2026

test: add format name test and fix typing

c003cd7

coderabbitai bot requested changes Mar 27, 2026

style: add missing trailing newline

ab69fb9

coderabbitai bot reviewed Mar 27, 2026

jaimefgdev added 4 commits March 27, 2026 20:16

style: remove trailing whitespace at EOF

f398dce

style: remove trailing whitespace at EOF

a6da74e

style: fix all formatting and linting issues

9660a0e

Merge branch 'main' into fix/format-name

ca757e0

fix: manually add trailing newlines

dda935b

systemblueteam merged commit dff05ee into systemblueteam:main Mar 27, 2026
3 of 4 checks passed

systemblueteam approved these changes Mar 27, 2026

systemblueteam mentioned this pull request Mar 27, 2026

Release v0.2.1 #68

Merged
