Skip to content

decoder: effort to remove some VVL issues#198

Open
dabrain34 wants to merge 19 commits intoKhronosGroup:mainfrom
dabrain34:dab_fix_vvl_issues
Open

decoder: effort to remove some VVL issues#198
dabrain34 wants to merge 19 commits intoKhronosGroup:mainfrom
dabrain34:dab_fix_vvl_issues

Conversation

@dabrain34
Copy link
Copy Markdown
Contributor

Description

Backport a set of patches from NVPro to address VVL issues in VVS

Type of change

bug fix

Issue (optional)

Tests

NVIDIA GeForce RTX 3050 Ti Laptop GPU / NVIDIA 595.44.00 / Ubuntu 24.04.4 LTS

Total Tests: 73
Passed: 58
Crashed: 0
Failed: 0
Not Supported: 14
Skipped: 1 (in skip list)
Success Rate: 100.0%

Intel(R) Iris(R) Xe Graphics (ADL GT2) / Intel open-source Mesa driver Mesa 26.2.0-devel (git-e6064cf077) / Ubuntu 24.04.4 LTS

Total Tests: 73
Passed: 51
Crashed: 0
Failed: 0
Not Supported: 13
Skipped: 9 (in skip list)
Success Rate: 100.0%

Additional Details (optional)

dabrain34 and others added 2 commits April 24, 2026 10:26
Add VkPhysicalDeviceSamplerYcbcrConversionFeatures to the device
feature chain to fix validation error VUID-vkCreateSamplerYcbcrConversion-None-01648.
Add g_ignoredValidationMessageIds[] array to VulkanDeviceContext.cpp,
matching the pattern from nvpro_core2/nvvk/context.cpp. Filter known
validation layer false positives by messageIdNumber in the debug report
callback before printing to stderr.

Suppressed VUIDs (all VVL false positives, not application bugs):

1. VUID-VkDeviceCreateInfo-pNext-pNext (0x901f59ec):
   Private/provisional extension struct type 1000552004 not recognized
   by VVL 1.4.313. Harmlessly skipped by the driver's pNext traversal.
   Resolves when VVL headers are updated.

2. VUID-VkImageViewCreateInfo-image-01762 (0x6516b437):
   VVL does not track VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT for
   video-profile-bound images (VkVideoProfileListInfoKHR in pNext).
   DPB images ARE created with MUTABLE_FORMAT_BIT, per-plane views
   use PLANE_0_BIT/PLANE_1_BIT aspects (not COLOR_BIT). Neither
   clause of the VUID condition applies.

3. VUID-vkCmdBeginVideoCodingKHR-slotIndex-07239 (0xc36d9e29):
   Cascading from VUID-01762. DPB slots are correctly activated via
   pSetupReferenceSlot with codec-specific DPB slot info pNext.
   VVL's internal state tracking is confused by the image false
   positives on the same video session.

Note: VVL 1.4.313 uses VK_EXT_debug_utils internally for message
output. The decoder's VK_EXT_debug_report callback filters our own
stderr output but cannot suppress VVL's direct output. Full
suppression requires either upgrading to VK_EXT_debug_utils or
waiting for the VVL false positives to be fixed upstream.

Ref: KhronosGroup/Vulkan-ValidationLayers#11531
Ref: nvpro-samples/vk_video_samples#183
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
@dabrain34 dabrain34 force-pushed the dab_fix_vvl_issues branch 2 times, most recently from fde254f to 68e29c4 Compare April 27, 2026 12:18
zzoon and others added 12 commits April 27, 2026 14:30
… if necessary

This fixes 06415:

   If the image view requires a sampler Y'CBCR conversion and usage contains
   VK_IMAGE_USAGE_SAMPLED_BIT, then the pNext chain must include a
   VkSamplerYcbcrConversionInfo structure with a conversion value other
   than VK_NULL_HANDLE

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Fix srcBufferOffset and srcBufferRange alignment to satisfy Vulkan spec
requirements for vkCmdDecodeVideoKHR (VUID-07131, VUID-07139).

Problem
-------
The parser's bitstreamDataOffset and bitstreamDataLen values were passed
directly into VkVideoDecodeInfoKHR without any alignment, causing
validation errors on H.264, H.265, and AV1 (VP9 already handled this).

Parser Buffer Architecture
--------------------------
The NvVideoParser manages bitstream buffers as follows:

1. Buffers are allocated via GetBitstreamBuffer() with size rounded up
   to minBitstreamBufferSizeAlignment (typically 256 bytes).

2. The parser fills the buffer with compressed frame data sequentially.
   When a frame boundary is detected (end_of_picture), the parser
   reports bitstreamDataOffset (where frame data starts in the buffer)
   and bitstreamDataLen (exact byte count of the frame's NAL units).

3. The buffer often contains BOTH the current frame's data AND the
   beginning of the next frame's data (residual). After the decode
   command is submitted, swapBitstreamBuffer() copies this residual
   data to a new aligned buffer for the next frame.

4. For H.264/H.265 (NAL-based codecs via VulkanVideoDecoder::
   end_of_picture), bitstreamDataOffset is always 0 -- the frame data
   starts at the buffer beginning.

5. For VP9, the parser explicitly handles alignment in
   VulkanVP9Decoder::ParseFrameHeader (line 251-261): offset is
   aligned down, internal offsets are adjusted, and bitstreamDataLen
   is aligned up -- all at the parser level.

6. For AV1, bitstreamDataOffset is 0 (set in VulkanAV1Decoder::
   end_of_picture).

srcBufferOffset Fix
-------------------
For H.264/H.265/AV1: Assert that bitstreamDataOffset is 0 (enforced
by the parser architecture). Force to 0 as a safety net if violated.

For VP9: Trust the parser's alignment (already correct).

srcBufferRange Fix (per-codec)
------------------------------
H.265, AV1, VP9: Round up bitstreamDataLen to minBitstreamBufferSizeAlignment.
  These codecs use explicit slice segment offsets (pSliceSegmentOffsets)
  or tile sizes (pTileSizes) for decode boundaries. NVDEC ignores bytes
  beyond the last slice/tile, so the residual data in the alignment
  padding area is harmless.

H.264: Pass exact bitstreamDataLen WITHOUT rounding up.
  NVDEC's H.264 decoder uses srcBufferRange to bound its start-code
  scan (searching for 00 00 01 patterns). The buffer's residual area
  beyond bitstreamDataLen contains the next frame's data, which starts
  with a valid start code. Rounding up exposes this start code to the
  NAL scanner, causing decode corruption. Suppress VUID-07139 for H.264.
  The proper fix requires handling alignment in the H.264 parser
  (like VP9 does), but that is a larger change to NvVideoParser's
  ByteStreamParser buffer management.

IMPORTANT: The bytes beyond bitstreamDataLen must NOT be zero-filled.
  They contain the next frame's residual data that swapBitstreamBuffer()
  copies after the decode returns. Zero-filling destroys this data and
  corrupts all subsequent frames.

Also fix VulkanBitstreamBufferImpl::GetSizeAlignment() which incorrectly
returned VkMemoryRequirements::alignment instead of m_bufferSizeAlignment
(the minBitstreamBufferSizeAlignment from VkVideoCapabilitiesKHR).

Fixes: VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07131 (srcBufferOffset)
Fixes: VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07139 (srcBufferRange, H.265/AV1/VP9)
Suppresses: VUID-07139 for H.264 (requires parser-level fix)
Ref: KhronosGroup/Vulkan-ValidationLayers#11531
Ref: nvpro-samples/vk_video_samples#183
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
The Vulkan spec requires that vkUpdateVideoSessionParametersKHR's
pUpdateInfo->updateSequenceCount must equal the current update sequence
counter of videoSessionParameters plus one. The counter starts at 0
after vkCreateVideoSessionParametersKHR and increments after each
successful update.

Previously, the code used GetUpdateSequenceCount() from the picture
parameters set, which starts at 0, resulting in the first update
passing updateSequenceCount=0 instead of the required 1.

Fix by tracking the update counter (m_updateCount) in
VkParserVideoPictureParameters and using ++m_updateCount for each
vkUpdateVideoSessionParametersKHR call. On failure, the counter is
rolled back so the next attempt uses the same value.

Fixes: VUID-vkUpdateVideoSessionParametersKHR-pUpdateInfo-07215
Ref: nvpro-samples/vk_video_samples#183
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
OPAQUE_FD_BIT is Linux-only and fails validation on Windows with
VUID-VkExportSemaphoreCreateInfo-handleTypes-01124. Use platform-
conditional handle type selection.

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
In coincide mode (DPB_AND_OUTPUT_COINCIDE), decoded images are in
VIDEO_DECODE_DPB_KHR layout, not VIDEO_DECODE_DST_KHR. The display
pipeline was hardcoding DST layout, causing layout mismatch at draw
time (VUID-vkCmdDraw-None-09600).

Add outputImageLayout to VulkanDisplayFrame, populate it from the
frame buffer's tracked layout in DequeueDecodedPicture, and use
it in DrawFrame/RecordCommandBuffer for correct barrier transitions.

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
When timeline semaphores handle consumer-done sync, the fence reset
was skipped (m_useConsummerSignalSemaphore guard). But the fence is
still always passed to the graphics queue submit via
DequeueDecodedPicture, causing VUID-vkQueueSubmit2-fence-04894
(fence submitted in SIGNALED state).

Remove the m_useConsummerSignalSemaphore condition — the fence must
always be reset before reuse.

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
VIDEO_DECODE_BIT is not valid on graphics queue families.
Use ALL_COMMANDS_BIT for the timeline semaphore wait stage
when submitting to the graphics queue.

Fixes: UNASSIGNED-CoreChecks-unhandled-queue-capabilities
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
ALL_GRAPHICS_BIT and COMPUTE_SHADER_BIT are not valid on video
decode queue families. Use ALL_COMMANDS_BIT for the consumer-done
timeline semaphore wait when submitting to the decode queue.

Fixes: UNASSIGNED-CoreChecks-unhandled-queue-capabilities
Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
The stack-local VulkanSamplerYcbcrConversion created in
VkImageResourceView::Create() was destroyed when the function returned,
taking the VkSamplerYcbcrConversion handle with it. The VkImageView(s)
created with that conversion in their pNext chain then referenced a
destroyed handle, violating
VUID-vkDestroySamplerYcbcrConversion-samplerYcbcrConversion-01597.

Own the conversion on the heap, transfer ownership to VkImageResourceView
via the constructor, and destroy it in the view's destructor after all
image views that reference it have been destroyed.

Also drop a dead #if 0 block that had accumulated around the conversion
creation site.

VulkanDeviceContext: warn when samplerYcbcrConversion is unsupported

Commit 152d9ec added VkPhysicalDeviceSamplerYcbcrConversionFeatures to
the feature chain but did not check whether the device actually exposes
samplerYcbcrConversion. On a device lacking the feature the later
vkCreateSamplerYcbcrConversion call would fail with a cryptic VkResult.

Add a CHECK_VULKAN_FEATURE so the missing support is surfaced as a
clear warning at device-init time.
VulkanPerDrawContext::RecordCommandBuffer hardcoded
VK_IMAGE_LAYOUT_VIDEO_DECODE_DST_KHR as both the oldLayout for the
decode->shader barrier and the newLayout for the shader->decode
restore barrier. In DPB coincide mode the decoded image is actually
in VIDEO_DECODE_DPB_KHR, so the barrier's oldLayout did not match the
real layout (contents become undefined per spec) and the restore
transitioned to DST instead of DPB, leaving the image in the wrong
layout for the next decode submit.

Use inputImageToDrawFrom->imageLayout at all four barrier call sites
(non-planar/planar, forward/restore). The field is already propagated
from VulkanDisplayFrame::outputImageLayout in VulkanFrame::DrawFrame.

Fixes: VUID-vkCmdDraw-None-09600 (the layout-propagation work in
VulkanVideoFrameBuffer was dormant until this consumer was updated).
g_ignoredValidationMessageIds[] silently discarded matching debug
report messages, so a legitimate regression that happened to hit one
of the suppressed VUIDs (notably VUID-07139 for H.264 range alignment)
would stay completely invisible.

Keep the suppression but print each suppressed message id once,
guarded by a mutex so it is safe against concurrent driver threads.
Further occurrences of the same id remain silent, so the signal-to-
noise ratio is unchanged for normal runs.
@dabrain34 dabrain34 force-pushed the dab_fix_vvl_issues branch from 68e29c4 to 72f8d71 Compare April 27, 2026 12:30
…me-semaphore

Commit da44351 ("VL: use OPAQUE_WIN32_BIT for semaphore export on Windows")
restructured NvPerFrameDecodeImageSet::init() to add a VkExportSemaphoreCreateInfo
pNext chain, but in doing so nested the m_consumerCompleteSemaphore creation
inside the m_frameCompleteSemaphore == VK_NULL_HANDLE block.

If init() runs again with m_frameCompleteSemaphore already non-NULL while
m_consumerCompleteSemaphore is NULL (partial-failure / partial-teardown path),
the consumer semaphore is silently never created. Downstream
QueuePictureForDecode then hands VK_NULL_HANDLE to VkSemaphoreSubmitInfo.

Restore the original two-sibling create structure and hoist the export /
timeline / sem create-info structs to the outer scope so both Create calls
share the same configuration.
…ueue

Commit 3af40a6 ("VL: reset consumer-done fence regardless of semaphore usage")
restricted output-layout propagation to VIDEO_DECODE_DPB_KHR / DST_KHR. Any
other tracked layout is silently swallowed and outputImageLayout falls back
to the constructor-default DST_KHR.

If the tracker ever reports e.g. GENERAL or UNDEFINED, the downstream display
barrier issues an oldLayout=DST_KHR transition on an image actually in another
layout - undefined behaviour.
Commit 8ffb8b0 ("decoder: Fix bitstream buffer alignment for all codecs")
added a runtime warning + force-to-zero fallback when bitstreamDataOffset is
non-zero for H.264 / H.265 / AV1. The fallback rescues release builds, but
the parser invariant says this can never legitimately happen, so silently
rewriting the offset would mask a real parser regression.

Add an assert() on top of the warning so debug builds fail loudly the first
time the invariant is violated, while release builds keep the safe
recovery path.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants