…owMaskPass 6.7ms->5ms)
…ryPass 50% performance improvement (2.3ms->1.2ms)
…adowMaskPass 90% performance improvement

Next TODO:
Shadows
Code review results [AI]
Correctness bug (recommended to fix before committing)
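The shader's instance culling logic is wrong: [FrustumCull.comp.hlsl:103-125](vscode-file://vscode-app/c:/Program Files/Microsoft VS Code/cfbea10c5f/resources/app/out/vs/code/electron-browser/workbench/workbench.html)
When a primitive has multiple instances and only some of them are visible, the shader only sets `instance_cnt` to `visible_count` and does not reorder the instance buffer. `DrawIndexedIndirect` then draws the first `visible_count` instances starting at `first_instance`, not the instances that are actually visible. For example, with 5 instances of which instances 0, 2, and 4 are visible: `visible_count=3`, so the GPU draws instances 0, 1, and 2. Instance 1 should not be drawn, and the visible instance 4 is skipped.

Why there is no performance gain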
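Three reasons:
- `RestoreDrawCommands()`
- `vkCmdDrawIndexedIndirect` rather than `vkCmdDrawIndexedIndirectCount`
- A draw with `instance_cnt=0` is not actually free

Recommended optimization directions
- Use `CopyFrom(BufferView, BufferView)` for a GPU-side copy, so the data does not go through the staging buffer every frame.
- `vkCmdDrawIndexedIndirectCount`: the engine already implements this API (`DrawIndexedIndirectCnt`). The culling shader writes the visible draw count to a separate count buffer, so the GPU can skip culled draws entirely.

Would you like me to implement these optimizations?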
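For reference, here is a minimal CPU-side sketch of the two ideas in this review: compacting the visible instances so that the first `visible_count` slots really are the visible ones, and compacting the draw list so that only non-empty draws are counted. This is not the engine's code. `DrawCmd`, `CompactVisibleInstances`, and `CompactDraws` are hypothetical names introduced for illustration; `DrawCmd` mirrors the field order of `VkDrawIndexedIndirectCommand` with the `instance_cnt` / `first_instance` naming used above. In an actual compute shader the sequential counters would most likely become atomic adds, which is an assumption about how the shader would be written.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical mirror of VkDrawIndexedIndirectCommand (same field order).
struct DrawCmd {
    uint32_t index_cnt;
    uint32_t instance_cnt;
    uint32_t first_index;
    int32_t  vertex_offset;
    uint32_t first_instance;
};

// Copies the visible instances of one draw into dst_instances so that they
// occupy consecutive slots starting at cmd.first_instance.
// Returns visible_count, the value that should be written to instance_cnt.
uint32_t CompactVisibleInstances(const DrawCmd& cmd,
                                 const std::vector<uint32_t>& src_instances,
                                 const std::vector<bool>& visible,
                                 std::vector<uint32_t>& dst_instances) {
    assert(src_instances.size() == visible.size());
    assert(dst_instances.size() == src_instances.size());
    assert(cmd.first_instance + cmd.instance_cnt <= src_instances.size());

    uint32_t visible_count = 0;
    for (uint32_t i = 0; i < cmd.instance_cnt; ++i) {
        const uint32_t slot = cmd.first_instance + i;
        if (visible[slot]) {
            // Pack the survivor into the next free slot of this draw's range.
            dst_instances[cmd.first_instance + visible_count] = src_instances[slot];
            ++visible_count;
        }
    }
    return visible_count;
}

// Keeps only draws that still have instances. The returned value is what a
// culling pass would write to the separate count buffer.
uint32_t CompactDraws(const std::vector<DrawCmd>& in, std::vector<DrawCmd>& out) {
    out.clear();
    for (const DrawCmd& cmd : in) {
        if (cmd.instance_cnt > 0) {
            out.push_back(cmd);
        }
    }
    return static_cast<uint32_t>(out.size());
}
```

With the example above (5 instances, 0, 2, and 4 visible), `CompactVisibleInstances` packs instances 0, 2, and 4 into the first three slots, so drawing `visible_count=3` instances from `first_instance` now draws the right ones. `CompactDraws` then drops any draw whose `instance_cnt` ended up at 0.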