3 changes: 3 additions & 0 deletions runner/app/live/pipelines/streamdiffusion/pipeline.py
@@ -97,6 +97,9 @@ def process_tensor_sync(self, img_tensor: torch.Tensor):
if _is_bchw_format(out_tensor):
out_tensor = out_tensor.permute(0, 2, 3, 1)

if out_tensor.dim() != 4 or out_tensor.shape[0] != 1 or out_tensor.shape[-1] != 3:
logging.error(f"[StreamDiffusion] Invalid output tensor shape for encoder: {out_tensor.shape}. Expected (1, H, W, 3). Params: {self.params}")
Contributor:
Btw you can also check the H, W dimensions by checking self.params.get_output_resolution(). Although if we do that, we should probably cache the output resolution along with the params so we're not re-computing the output dims every frame 🤔
WDYT?
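The caching idea suggested above could be sketched roughly like this. This is a hypothetical simplification: `Params`, `update_params`, and the assumption that `get_output_resolution()` returns `(height, width)` are stand-ins for the real pipeline code, not its actual API.

```python
# Hypothetical sketch: cache the output resolution once per params update,
# so the per-frame shape check doesn't recompute the output dims.

class Params:
    """Stand-in for the real params object."""
    def __init__(self, width: int, height: int):
        self.width = width
        self.height = height

    def get_output_resolution(self) -> tuple[int, int]:
        # Assumed to return (height, width); the real computation may differ.
        return self.height, self.width


class PipelineSketch:
    def update_params(self, params: Params) -> None:
        self.params = params
        # Cached here, once per params update, instead of once per frame.
        self._output_hw = params.get_output_resolution()

    def is_valid_output(self, shape: tuple) -> bool:
        # Per-frame check reuses the cached (H, W) instead of recomputing it.
        h, w = self._output_hw
        return tuple(shape) == (1, h, w, 3)
```

The trade-off is just keeping the cached value in sync: it must be refreshed whenever the params change, which is why it lives in `update_params`.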

Contributor:
Also, when the output is not in the right format, we should probably return None and skip the output generation in put_video_frame. Otherwise we're still going to break the encoder in the end.
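The skip behavior suggested in this comment could look roughly like the sketch below. It is a simplified illustration that validates plain shape tuples instead of real tensors; `validate_output_shape` and the `sink` parameter are hypothetical names, not the pipeline's actual API.

```python
# Hypothetical sketch of "return None and skip": an invalid output shape
# causes the frame to be dropped rather than passed to the encoder.
from typing import Optional


def validate_output_shape(shape: tuple) -> Optional[tuple]:
    """Return the shape if it matches (1, H, W, 3), else None."""
    if len(shape) != 4 or shape[0] != 1 or shape[-1] != 3:
        return None
    return shape


def put_video_frame(shape: tuple, sink: list) -> None:
    # Skip output generation entirely when validation fails, so the
    # encoder never receives a malformed frame.
    if validate_output_shape(shape) is None:
        return
    sink.append(shape)
```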

Member Author:
I didn't return None or skip because it might cause an issue with the timestamps on the frames, since the put and get are async.

I just wanted to understand what params cause it, so that we can replicate it offline and fix it in streamdiffusion instead?

Contributor:
Skipping shouldn't cause any timestamp issues! Only if we don't output monotonic timestamps.
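The point above can be illustrated with a minimal sketch: dropping frames leaves the output timestamps monotonic as long as each emitted frame keeps its source timestamp. `emit_frames` and its `(timestamp, is_valid)` tuples are hypothetical, just for illustration.

```python
# Hypothetical illustration: skipping invalid frames does not break
# timestamp monotonicity, because skipped frames emit no timestamp at all.

def emit_frames(frames: list[tuple[int, bool]]) -> list[int]:
    """frames: (timestamp, is_valid) pairs in arrival order.
    Returns the timestamps of the frames actually emitted."""
    out = []
    for ts, valid in frames:
        if not valid:
            continue  # frame skipped; nothing emitted for this timestamp
        out.append(ts)
    return out
```

Since the surviving timestamps are a subsequence of an increasing input sequence, they stay strictly increasing.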


return out_tensor

async def get_processed_video_frame(self) -> VideoOutput: