decoder: effort to remove some VVL issues#198
Open
dabrain34 wants to merge 19 commits intoKhronosGroup:mainfrom
Open
decoder: effort to remove some VVL issues#198dabrain34 wants to merge 19 commits intoKhronosGroup:mainfrom
dabrain34 wants to merge 19 commits intoKhronosGroup:mainfrom
Conversation
Add VkPhysicalDeviceSamplerYcbcrConversionFeatures to the device feature chain to fix validation error VUID-vkCreateSamplerYcbcrConversion-None-01648.
Add g_ignoredValidationMessageIds[] array to VulkanDeviceContext.cpp, matching the pattern from nvpro_core2/nvvk/context.cpp. Filter known validation layer false positives by messageIdNumber in the debug report callback before printing to stderr. Suppressed VUIDs (all VVL false positives, not application bugs): 1. VUID-VkDeviceCreateInfo-pNext-pNext (0x901f59ec): Private/provisional extension struct type 1000552004 not recognized by VVL 1.4.313. Harmlessly skipped by the driver's pNext traversal. Resolves when VVL headers are updated. 2. VUID-VkImageViewCreateInfo-image-01762 (0x6516b437): VVL does not track VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT for video-profile-bound images (VkVideoProfileListInfoKHR in pNext). DPB images ARE created with MUTABLE_FORMAT_BIT, per-plane views use PLANE_0_BIT/PLANE_1_BIT aspects (not COLOR_BIT). Neither clause of the VUID condition applies. 3. VUID-vkCmdBeginVideoCodingKHR-slotIndex-07239 (0xc36d9e29): Cascading from VUID-01762. DPB slots are correctly activated via pSetupReferenceSlot with codec-specific DPB slot info pNext. VVL's internal state tracking is confused by the image false positives on the same video session. Note: VVL 1.4.313 uses VK_EXT_debug_utils internally for message output. The decoder's VK_EXT_debug_report callback filters our own stderr output but cannot suppress VVL's direct output. Full suppression requires either upgrading to VK_EXT_debug_utils or waiting for the VVL false positives to be fixed upstream. Ref: KhronosGroup/Vulkan-ValidationLayers#11531 Ref: nvpro-samples/vk_video_samples#183 Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
fde254f to
68e29c4
Compare
… if necessary This fixes 06415: If the image view requires a sampler Y'CBCR conversion and usage contains VK_IMAGE_USAGE_SAMPLED_BIT, then the pNext chain must include a VkSamplerYcbcrConversionInfo structure with a conversion value other than VK_NULL_HANDLE Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Fix srcBufferOffset and srcBufferRange alignment to satisfy Vulkan spec requirements for vkCmdDecodeVideoKHR (VUID-07131, VUID-07139). Problem ------- The parser's bitstreamDataOffset and bitstreamDataLen values were passed directly into VkVideoDecodeInfoKHR without any alignment, causing validation errors on H.264, H.265, and AV1 (VP9 already handled this). Parser Buffer Architecture -------------------------- The NvVideoParser manages bitstream buffers as follows: 1. Buffers are allocated via GetBitstreamBuffer() with size rounded up to minBitstreamBufferSizeAlignment (typically 256 bytes). 2. The parser fills the buffer with compressed frame data sequentially. When a frame boundary is detected (end_of_picture), the parser reports bitstreamDataOffset (where frame data starts in the buffer) and bitstreamDataLen (exact byte count of the frame's NAL units). 3. The buffer often contains BOTH the current frame's data AND the beginning of the next frame's data (residual). After the decode command is submitted, swapBitstreamBuffer() copies this residual data to a new aligned buffer for the next frame. 4. For H.264/H.265 (NAL-based codecs via VulkanVideoDecoder:: end_of_picture), bitstreamDataOffset is always 0 -- the frame data starts at the buffer beginning. 5. For VP9, the parser explicitly handles alignment in VulkanVP9Decoder::ParseFrameHeader (line 251-261): offset is aligned down, internal offsets are adjusted, and bitstreamDataLen is aligned up -- all at the parser level. 6. For AV1, bitstreamDataOffset is 0 (set in VulkanAV1Decoder:: end_of_picture). srcBufferOffset Fix ------------------- For H.264/H.265/AV1: Assert that bitstreamDataOffset is 0 (enforced by the parser architecture). Force to 0 as a safety net if violated. For VP9: Trust the parser's alignment (already correct). srcBufferRange Fix (per-codec) ------------------------------ H.265, AV1, VP9: Round up bitstreamDataLen to minBitstreamBufferSizeAlignment. These codecs use explicit slice segment offsets (pSliceSegmentOffsets) or tile sizes (pTileSizes) for decode boundaries. NVDEC ignores bytes beyond the last slice/tile, so the residual data in the alignment padding area is harmless. H.264: Pass exact bitstreamDataLen WITHOUT rounding up. NVDEC's H.264 decoder uses srcBufferRange to bound its start-code scan (searching for 00 00 01 patterns). The buffer's residual area beyond bitstreamDataLen contains the next frame's data, which starts with a valid start code. Rounding up exposes this start code to the NAL scanner, causing decode corruption. Suppress VUID-07139 for H.264. The proper fix requires handling alignment in the H.264 parser (like VP9 does), but that is a larger change to NvVideoParser's ByteStreamParser buffer management. IMPORTANT: The bytes beyond bitstreamDataLen must NOT be zero-filled. They contain the next frame's residual data that swapBitstreamBuffer() copies after the decode returns. Zero-filling destroys this data and corrupts all subsequent frames. Also fix VulkanBitstreamBufferImpl::GetSizeAlignment() which incorrectly returned VkMemoryRequirements::alignment instead of m_bufferSizeAlignment (the minBitstreamBufferSizeAlignment from VkVideoCapabilitiesKHR). Fixes: VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07131 (srcBufferOffset) Fixes: VUID-vkCmdDecodeVideoKHR-pDecodeInfo-07139 (srcBufferRange, H.265/AV1/VP9) Suppresses: VUID-07139 for H.264 (requires parser-level fix) Ref: KhronosGroup/Vulkan-ValidationLayers#11531 Ref: nvpro-samples/vk_video_samples#183 Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
The Vulkan spec requires that vkUpdateVideoSessionParametersKHR's pUpdateInfo->updateSequenceCount must equal the current update sequence counter of videoSessionParameters plus one. The counter starts at 0 after vkCreateVideoSessionParametersKHR and increments after each successful update. Previously, the code used GetUpdateSequenceCount() from the picture parameters set, which starts at 0, resulting in the first update passing updateSequenceCount=0 instead of the required 1. Fix by tracking the update counter (m_updateCount) in VkParserVideoPictureParameters and using ++m_updateCount for each vkUpdateVideoSessionParametersKHR call. On failure, the counter is rolled back so the next attempt uses the same value. Fixes: VUID-vkUpdateVideoSessionParametersKHR-pUpdateInfo-07215 Ref: nvpro-samples/vk_video_samples#183 Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
OPAQUE_FD_BIT is Linux-only and fails validation on Windows with VUID-VkExportSemaphoreCreateInfo-handleTypes-01124. Use platform- conditional handle type selection. Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
In coincide mode (DPB_AND_OUTPUT_COINCIDE), decoded images are in VIDEO_DECODE_DPB_KHR layout, not VIDEO_DECODE_DST_KHR. The display pipeline was hardcoding DST layout, causing layout mismatch at draw time (VUID-vkCmdDraw-None-09600). Add outputImageLayout to VulkanDisplayFrame, populate it from the frame buffer's tracked layout in DequeueDecodedPicture, and use it in DrawFrame/RecordCommandBuffer for correct barrier transitions. Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
When timeline semaphores handle consumer-done sync, the fence reset was skipped (m_useConsummerSignalSemaphore guard). But the fence is still always passed to the graphics queue submit via DequeueDecodedPicture, causing VUID-vkQueueSubmit2-fence-04894 (fence submitted in SIGNALED state). Remove the m_useConsummerSignalSemaphore condition — the fence must always be reset before reuse. Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
VIDEO_DECODE_BIT is not valid on graphics queue families. Use ALL_COMMANDS_BIT for the timeline semaphore wait stage when submitting to the graphics queue. Fixes: UNASSIGNED-CoreChecks-unhandled-queue-capabilities Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
ALL_GRAPHICS_BIT and COMPUTE_SHADER_BIT are not valid on video decode queue families. Use ALL_COMMANDS_BIT for the consumer-done timeline semaphore wait when submitting to the decode queue. Fixes: UNASSIGNED-CoreChecks-unhandled-queue-capabilities Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
The stack-local VulkanSamplerYcbcrConversion created in VkImageResourceView::Create() was destroyed when the function returned, taking the VkSamplerYcbcrConversion handle with it. The VkImageView(s) created with that conversion in their pNext chain then referenced a destroyed handle, violating VUID-vkDestroySamplerYcbcrConversion-samplerYcbcrConversion-01597. Own the conversion on the heap, transfer ownership to VkImageResourceView via the constructor, and destroy it in the view's destructor after all image views that reference it have been destroyed. Also drop a dead #if 0 block that had accumulated around the conversion creation site. VulkanDeviceContext: warn when samplerYcbcrConversion is unsupported Commit 152d9ec added VkPhysicalDeviceSamplerYcbcrConversionFeatures to the feature chain but did not check whether the device actually exposes samplerYcbcrConversion. On a device lacking the feature the later vkCreateSamplerYcbcrConversion call would fail with a cryptic VkResult. Add a CHECK_VULKAN_FEATURE so the missing support is surfaced as a clear warning at device-init time.
VulkanPerDrawContext::RecordCommandBuffer hardcoded VK_IMAGE_LAYOUT_VIDEO_DECODE_DST_KHR as both the oldLayout for the decode->shader barrier and the newLayout for the shader->decode restore barrier. In DPB coincide mode the decoded image is actually in VIDEO_DECODE_DPB_KHR, so the barrier's oldLayout did not match the real layout (contents become undefined per spec) and the restore transitioned to DST instead of DPB, leaving the image in the wrong layout for the next decode submit. Use inputImageToDrawFrom->imageLayout at all four barrier call sites (non-planar/planar, forward/restore). The field is already propagated from VulkanDisplayFrame::outputImageLayout in VulkanFrame::DrawFrame. Fixes: VUID-vkCmdDraw-None-09600 (the layout-propagation work in VulkanVideoFrameBuffer was dormant until this consumer was updated).
g_ignoredValidationMessageIds[] silently discarded matching debug report messages, so a legitimate regression that happened to hit one of the suppressed VUIDs (notably VUID-07139 for H.264 range alignment) would stay completely invisible. Keep the suppression but print each suppressed message id once, guarded by a mutex so it is safe against concurrent driver threads. Further occurrences of the same id remain silent, so the signal-to- noise ratio is unchanged for normal runs.
68e29c4 to
72f8d71
Compare
…me-semaphore Commit da44351 ("VL: use OPAQUE_WIN32_BIT for semaphore export on Windows") restructured NvPerFrameDecodeImageSet::init() to add a VkExportSemaphoreCreateInfo pNext chain, but in doing so nested the m_consumerCompleteSemaphore creation inside the m_frameCompleteSemaphore == VK_NULL_HANDLE block. If init() runs again with m_frameCompleteSemaphore already non-NULL while m_consumerCompleteSemaphore is NULL (partial-failure / partial-teardown path), the consumer semaphore is silently never created. Downstream QueuePictureForDecode then hands VK_NULL_HANDLE to VkSemaphoreSubmitInfo. Restore the original two-sibling create structure and hoist the export / timeline / sem create-info structs to the outer scope so both Create calls share the same configuration.
…ueue Commit 3af40a6 ("VL: reset consumer-done fence regardless of semaphore usage") restricted output-layout propagation to VIDEO_DECODE_DPB_KHR / DST_KHR. Any other tracked layout is silently swallowed and outputImageLayout falls back to the constructor-default DST_KHR. If the tracker ever reports e.g. GENERAL or UNDEFINED, the downstream display barrier issues an oldLayout=DST_KHR transition on an image actually in another layout - undefined behaviour.
Commit 8ffb8b0 ("decoder: Fix bitstream buffer alignment for all codecs") added a runtime warning + force-to-zero fallback when bitstreamDataOffset is non-zero for H.264 / H.265 / AV1. The fallback rescues release builds, but the parser invariant says this can never legitimately happen, so silently rewriting the offset would mask a real parser regression. Add an assert() on top of the warning so debug builds fail loudly the first time the invariant is violated, while release builds keep the safe recovery path.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Backport a set of patches from NVPro to address VVL issues in VVS
Type of change
bug fix
Issue (optional)
Tests
NVIDIA GeForce RTX 3050 Ti Laptop GPU / NVIDIA 595.44.00 / Ubuntu 24.04.4 LTS
Total Tests: 73
Passed: 58
Crashed: 0
Failed: 0
Not Supported: 14
Skipped: 1 (in skip list)
Success Rate: 100.0%
Intel(R) Iris(R) Xe Graphics (ADL GT2) / Intel open-source Mesa driver Mesa 26.2.0-devel (git-e6064cf077) / Ubuntu 24.04.4 LTS
Total Tests: 73
Passed: 51
Crashed: 0
Failed: 0
Not Supported: 13
Skipped: 9 (in skip list)
Success Rate: 100.0%
Additional Details (optional)