Skip to content

Comments

Add support for UVD and VCE status#140

Closed
danielzgtg wants to merge 5 commits intoclbr:masterfrom
danielzgtg:uvdvce
Closed

Add support for UVD and VCE status#140
danielzgtg wants to merge 5 commits intoclbr:masterfrom
danielzgtg:uvdvce

Conversation

@danielzgtg
Copy link
Contributor

@danielzgtg danielzgtg commented Oct 7, 2022

This add a bar for UVD and VCE. This has been long-requested by more than 10 people #96 (comment) .

Tested on RX 460 (POLARIS). google-chrome-stable flagged and non-Vulkan with W3Schools video and celluloid with H.264 .mp4 successful make the UVD bar go up to around 10 or 5%. I am using Linux hostname 5.19.0-18-generic #18-Ubuntu SMP PREEMPT_DYNAMIC Wed Sep 21 15:44:03 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux on Ubuntu 22.10 Kwin X11 Konsole. 3 codepaths were tested, namely non-setuid root non-memory, setuid root non-memory, and setuid root memory.

The GTT and amdgpu were already in master a long time ago so this PR does not need to readd them. Testing was not done on CARRIZO, STONEY, or radeon. I can test on KAVERI if requested.

This was a difficult and tedious port of changes whose underlying code had changed extensively over the 5 intervening years. It was hard to track down where one method got scattered across multiple files. The original is at #45 , but this port did not contact the author. It seems like the patch was already broken a year after its writing as mentioned by #45 (comment)

Fixes #29
Fixes #96
Simpler than #93
Supersedes #45

This add a bar for UVD and VCE.

Tested on RX 460 (POLARIS).

This was a difficult and tedious port of changes whose underlying code had
changed extensively over the 5 intervening years.
The original is at clbr#45 , but this
port did not contact the author.

Fixes clbr#29. Tested on CARRIZO, STONEY and KAVERI.

Added GTT memory usage details

Co-authored-by: Chandu Babu Namburu <chandu@amd.com>
@clbr
Copy link
Owner

clbr commented Oct 7, 2022

Confusion in the MMAP sizes, mmap and munmap must use the same size. The multiply in ui.c should use k like the existing ones.

@clbr
Copy link
Owner

clbr commented Oct 7, 2022

Oh, and dump support is missing, and there should be checks for which families have those. I'm not sure if any supported card lacks UVD, but certainly lots of cards lack VCE.

@danielzgtg
Copy link
Contributor Author

Confusion in the MMAP sizes

Nice catch! I had the second bug with the wrong SRBM_MMAP_SIZE that cancelled out the memory leak. It probably worked because we don't use the entire buffer but yeah it's better to use the correct size.

The multiply in ui.c

5 years is a long time :)

dump support is missing

Hmm. The original PR had this feature. I'm not sure how I managed to forgot that file.

checks for which families have those

I tried to find this information on Wikipedia. The families before Kaveri are confusing, especially UVD1. I guess UVD is RV610-GCN and VCE is GCN only. I only have GCN1 and newer, and someone needs to test cards before that.

I will assume SRBM is always present whenever GRBM is. https://github.com/torvalds/linux/blob/4c86114194e644b6da9107d75910635c9e87179e/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c#L56 I don't know the AMD specifications but SRBM is 0x8000 before GRBM so it's probably always present.

@clbr
Copy link
Owner

clbr commented Oct 8, 2022

Applied, thanks.

On my radeon card, uvd is reported correctly in both mem and non-mem.

I should note that the mem path for SRBM seems broken on newer cards, at least according to kernel sources and cnamburu's PR the SRBM_STATUS and *2 are in different locations. So it's curious that your testing on your amdgpu card worked in mem?

@clbr clbr closed this Oct 8, 2022
@danielzgtg
Copy link
Contributor Author

Here is a video of what I used to get sudo ./radeontop -m to show UVD:

radeontop_mem-2022-10-08_02.21.33.mp4

@danielzgtg danielzgtg mentioned this pull request Oct 8, 2022
@bdaase
Copy link

bdaase commented Oct 29, 2022

Great work everyone for getting this in!

@clbr Could you please create a tag again so that distributions feel "safe" to pick up these improvements?

@clbr
Copy link
Owner

clbr commented Oct 30, 2022

I don't feel comfortable doing that yet - Daniel reported that on his card it shows wrong in some cases (encoding as decoding IIRC).

@danielzgtg
Copy link
Contributor Author

The fix for encoding as decoding thing might not be so minor, and it might be impossible. The code I ported in this PR came from an official AMD engineer. What this means is that things may have changed in the meantime since they designed it. This may have been the trend things were going because VCN is one name instead of the two of UVD/VCE. Perhaps H265 encoding might actually be done on the UVD chip and that chip would later be rebranded as VCN.

Unmixing one number into two is impossible. "x + y = 42", what are x and y? See, once H264/H265 decode and H265 encode are mixed into the SRBM variable, it is impossible to unmix them. The only other way is to find another register, and that is only if such a register exists.

What I need to do is read the documentation, or have someone else find an expert on AMD cards like on some mailing list. If I'm forced to dig through documentation it will be several months to years if nobody else contributes the knowledge. If I don't have to then I might have time next week when I expect to set up my VCN computer.

@Habraten
Copy link

Habraten commented Mar 3, 2023

Software
CentOS 7.9 (kernel3.10.0-1160.83.1.el7.x86_64)
Radeon™ Software for Linux® version 18.50 installed only AMDGPU All-Open

Primary ILO on board video Secondary AMD RX 550

RadeonTop 1.4

radeonto -m
no UVD and VCE indicators

@danielzgtg
Copy link
Contributor Author

RadeonTop 1.4

Compile for git master instead

@Umio-Yasuno
Copy link

I asked AMD developer Alex Deucher about the mmSRBM_STATUS/mmSRBM_STATUS2 register offsets on GFX9 and later, it doesn't seem to exist.

https://gitlab.freedesktop.org/drm/amd/-/issues/2442#note_1807261

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

question: which metric shows H264 decoding? Video encode/decode utilization (enhancement?)

5 participants