
Conversation

dvrogozh
Contributor

@dvrogozh dvrogozh commented Aug 26, 2025

Changes:

  1. Added a DeviceInterface::initializeFiltersContext() API which returns device-specific FilterGraph initialization settings
  2. Enabled filter graphs at the SingleStreamDecoder level
  3. Switched the CPU device interface to use the above changes
  4. Enabled filter graphs in the CUDA device interface to handle non-NV12 decoder output (10/12-bit videos) through scale_cuda
    • The filter graph converts 10/12-bit videos to NV12 (as scale_cuda does not currently support YUV-to-RGB conversion), and then the CUDA device interface converts NV12 to RGB (via the existing NPP path)

The basic idea behind this change is the following. Let the device interface perform trivial, performance-optimized conversions in the convertAVFrameToFrameOutput() method. If the device interface cannot handle a conversion in convertAVFrameToFrameOutput(), it can set up an FFmpeg filters pipeline by returning a valid description from initializeFiltersContext().

I tested the pipeline on 10-bit videos, h264 and h265. However, this setup should also be valid for 12-bit videos, which I did not try.

Note:

  1. On ffmpeg-n4.4 the filters pipeline falls back to CPU, as scale_cuda does not support format conversion in this ffmpeg version (it was added in n5.0)
  2. More work might be needed to align the scale_cuda converted outputs with the CPU implementation. I do see differences in the outputs, likely due to different handling of color standards. This is something I have not been able to overcome at the moment.

CC: @scotts @NicolasHug @eromomon

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Aug 26, 2025
@dvrogozh
Contributor Author

I forgot to mention that I have not yet changed the CPU device interface in a way similar to CUDA (initializeFiltersContext() is not implemented for CPU). This would be a next step if this design is accepted.

@dvrogozh
Contributor Author

I've added one more commit to update the CPU device interface to use the filters path introduced in SingleStreamDecoder and DeviceInterface. Still a draft, as #831 needs to be reviewed first.

@dvrogozh dvrogozh marked this pull request as ready for review September 2, 2025 19:18
@dvrogozh
Contributor Author

dvrogozh commented Sep 2, 2025

I've rebased the PR on top of the recently landed #831 prerequisite. The change is ready for review. @scotts @NicolasHug

@dvrogozh
Contributor Author

dvrogozh commented Sep 4, 2025

I pushed an update to fix a linter issue which I had overlooked.

Error: Unable to download artifact(s): Artifact not found for name: pytorch_torchcodec__3.10_cu129_x86_64

This error in the CI test seems unrelated to the change.

@dvrogozh
Contributor Author

dvrogozh commented Sep 5, 2025

Pushed an update to address another lint issue in the last commit.

@dvrogozh
Contributor Author

dvrogozh commented Sep 5, 2025

Rebased on top of:

Contributor


I'm not sure what we're enabling by exposing DeviceInterface::initializeFiltersContext() and then calling it here in SingleStreamDecoder. That is, if DeviceInterface just didn't expose the concept of filter graphs at all, each device implementation could still decide to use a filter graph on its own.

We already have the ability to ask a device to do device-specific frame conversions with DeviceInterface::convertAVFrameToFrameOutput(). Pulling the filter graph capabilities out of that, and then forcing them to be orchestrated at the level of SingleStreamDecoder, does not, as far as I can tell, buy us any new capabilities. But it does come at a cost, because CpuDeviceInterface::convertAVFrameToFrameOutput() is now (to me) quite strange: it does swscale-based conversion and filter graph cleanup.

The logic in this PR is effectively:

if (stream.kind == AUDIO) {
    convertAudio(inFrame, outFrame);
}
else if (stream.kind == VIDEO && device != nullptr) {
    filterGraph = device->getFilterGraph();
    if (filterGraph != nullptr) {
        inFrame = filterGraph->convert(inFrame);
    }
    device->convertVideo(inFrame, outFrame);
}
return outFrame;

I don't see how the above buys us anything over:

if (stream.kind == AUDIO) {
    convertAudio(inFrame, outFrame);
}
else if (stream.kind == VIDEO && device != nullptr) {
     // device is free to use filter graph if it wants; it's
     // entirely an implementation detail for the device
     device->convertVideo(inFrame, outFrame);
}
return outFrame;

I do understand that in the future, we'll likely want to tell a device "Hey, here are the particular filters we want you to apply." But I don't think these interfaces actually buy us that capability. I think we can just as easily do that through the VideoStreamOptions we currently pass to DeviceInterface::convertAVFrameToFrameOutput().

Contributor Author


This PR moves ownership and management of the filter graph to SingleStreamDecoder. Otherwise this would be done in each device interface separately, while doing it in SingleStreamDecoder makes the behavior uniform and shared across all devices. There are further considerations around that which I highlight in #853 (comment).

@scotts
Contributor

scotts commented Sep 6, 2025

@dvrogozh, thank you for this work! I left a detailed comment where I'm trying to understand the value we're getting out of the new abstractions - please help me understand that better. At a higher level, I wonder if we could take the following approach:

  1. DeviceInterface is left unchanged; we don't add the concept of filter graph initialization to it.
  2. SingleStreamDecoder is left unchanged.
  3. CpuDeviceInterface is left unchanged.
  4. CudaDeviceInterface is changed to add the new Cuda-specific filter graph implementation.

I'm sorry if that's the direction you were already going in - I know I pushed to see what the full generalization would look like. One quirk of the approach above is that each device may end up implementing the same patterns for their filter graph implementation. In that case, we could potentially add some member functions to DeviceInterface, but they would be private. They wouldn't be called by SingleStreamDecoder, but just guides for how best to implement filter graph support for a device.

@dvrogozh
Contributor Author

dvrogozh commented Sep 8, 2025

@scotts : you have highlighted the alternate design approach to take. I think we have 2 options on the plate:

  1. Keep decoder output conversions (filter graphs, sws calls, npp calls, etc.) within the device interface classes
  2. Define output conversions separately from the device interface classes. In this case the device interface's convertAVFrameToFrameOutput() becomes responsible only for a trivial operation: wrapping an AVFrame into a Tensor without any conversions.

I am proposing to consider the 2nd direction, which gives the project a more modular structure and simplifies the device interface to a set of trivial operations. I am trying to reduce the complexity of writing a device interface for non-CUDA GPUs by reusing as much as possible.

In the sense of the 2nd direction, this PR is the first step. As you saw, it moves ownership and management of the FilterGraph out of the device interface (to SingleStreamDecoder). If we take this direction further, I suggest considering:

  1. Abstracting FilterGraph to allow multiple implementations, i.e. understanding under "filter graph" any conversion, not only via FFmpeg filters, but also via sws, NPP or other libraries.
  2. Implementing SwsGraph and NppGraph, i.e. moving these out of CpuDeviceInterface and CudaDeviceInterface
  3. The above steps leave the device interface supporting only the trivial wrapping/copying of an AVFrame into a Tensor, which is a backend-specific operation

Please share your thoughts on the above. If you want, we can start by going with the 1st direction (keep conversions in device interfaces) and continue weighing the 2nd approach. I think enabling encoders might contribute to its justification, as we will again consider filter operations, this time before encoding. If we decide to go with the 1st approach for now, then I will submit another PR focused on enabling 10-bit support in the CUDA device interface.

@scotts
Contributor

scotts commented Sep 12, 2025

@dvrogozh, thanks for explaining the direction you're going in! Based on where we are, I still prefer the first option: keeping all decoder output conversions within the device implementation.

I do think that in the future, we'll want better abstractions around filtergraph, swscale and NPP. That is, these are all ways to convert a raw decoded frame into our desired color space and potentially perform some native transforms. But exactly what abstractions we should build is not obvious to me right now. In particular, this PR leaves us in an in-between state where some of the logic remains in the device implementation, and some of it is outside.

By pursuing option 1, I think we'll eventually get to a state where we have three device implementations: CPU and CUDA in this repo, and XPU in an Intel repo. At that point, it will be more clear to us what the patterns are and what abstractions to build.

For: pytorch#776

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
@dvrogozh dvrogozh changed the title Use cuda filters to support 10-bit videos Abstract filters support Sep 12, 2025
@dvrogozh
Contributor Author

By pursuing option 1, I think we'll eventually get to a state...

@scotts : I have submitted a PR to follow option 1. Please help to review:

I will probably keep #853 open and rebase it on top of #899 to continue working on option 2.

@dvrogozh dvrogozh marked this pull request as draft September 12, 2025 23:49