initial Turin CPU platform #9043
.set_extended_processor_and_feature_identifiers(Some(leaf))
.expect("can set leaf 8000_0001h");
In a truly distressing case of the ankle bone being connected to the wrist bone, if PerfCtrExtCore is set and TopologyExtensions is not, Windows Server 2022 sits in a loop at boot. I noticed this while checking out a fix for oxidecomputer/propolis#959, an initial version of which just cleared the TopologyExtensions bit to match discarding leaf 0x8000_001E. Both bits together are fine. Having topology extensions and not six perf counters (as we've had on Milan for a while) is fine. Having neither is fine. Having six perf counters and no topology extensions loops at boot.
I'm a little suspicious there's some relationship between this and the incomplete representation of SMT, so I'm going to set this to a more Milan-like situation where we hide perf counter extensions for now, and omit topology extensions, and then see how this looks with issues like oxidecomputer/propolis#940 sorted out.
edit: these bits are now both cleared, and boy will I feel silly if I've overlooked something here
How does this tie into the definition of 8000_0022 %ecx NumPerfCtrCore?
8000_0022 eax is zero so guests shouldn't care, but if we were filling it in I'd pick 4 without PerfCtrExtCore and 6 with.
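For anyone following along, here is a minimal sketch of the two rules in this thread. The helper names are hypothetical and not part of this PR; the bit positions (TopologyExtensions at ECX bit 22, PerfCtrExtCore at ECX bit 23 of leaf 8000_0001) are the ones documented in the AMD APM.

```rust
// CPUID Fn8000_0001 ECX feature bits, as documented in the AMD APM.
const TOPOLOGY_EXTENSIONS: u32 = 1 << 22;
const PERF_CTR_EXT_CORE: u32 = 1 << 23;

/// Hypothetical helper: true for the one combination reported above to loop
/// at boot (PerfCtrExtCore advertised without TopologyExtensions).
fn is_boot_loop_combination(leaf_8000_0001_ecx: u32) -> bool {
    let perf = (leaf_8000_0001_ecx & PERF_CTR_EXT_CORE) != 0;
    let topo = (leaf_8000_0001_ecx & TOPOLOGY_EXTENSIONS) != 0;
    perf && !topo
}

/// Hypothetical helper: the NumPerfCtrCore value suggested in the reply above,
/// were leaf 8000_0022 ever filled in.
fn num_perf_ctr_core(leaf_8000_0001_ecx: u32) -> u32 {
    if (leaf_8000_0001_ecx & PERF_CTR_EXT_CORE) != 0 {
        6
    } else {
        4
    }
}
```

With both bits cleared, as in the current revision, the first helper returns false and the second returns 4.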
let mut leaf =
    cpuid.get_feature_info().expect("baseline Milan defines leaf 1");

// Set up EAX: Family 1Ah model 2h stepping 1.
Probably useful to indicate why we're setting it to show up as a C1 processor. I'm guessing this is because that's the production stepping.
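As a cross-check on the comment in the diff, a small sketch of how leaf 1 EAX decodes into family, model, and stepping. The struct and function names are made up for illustration; the field layout is the standard CPUID one, with the extended fields folded in when the base family is 0xF.

```rust
/// Family/model/stepping decoded from CPUID leaf 1 EAX.
#[derive(Debug, PartialEq, Eq)]
struct Fms {
    family: u32,
    model: u32,
    stepping: u32,
}

/// Hypothetical helper, not part of this PR.
fn decode_fms(eax: u32) -> Fms {
    let stepping = eax & 0xF;
    let base_model = (eax >> 4) & 0xF;
    let base_family = (eax >> 8) & 0xF;
    let ext_model = (eax >> 16) & 0xF;
    let ext_family = (eax >> 20) & 0xFF;
    if base_family == 0xF {
        // Extended family is added, extended model becomes the high nibble.
        Fms {
            family: base_family + ext_family,
            model: (ext_model << 4) | base_model,
            stepping,
        }
    } else {
        Fms { family: base_family, model: base_model, stepping }
    }
}
```

Feeding in the 0x00B00F21 value from the profile gives family 0x1A, model 0x2, stepping 1, matching the comment.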
// guests.
const TURIN_V1_CPUID: [CpuidEntry; 25] = [
    cpuid_leaf!(0x0, 0x0000000D, 0x68747541, 0x444D4163, 0x69746E65),
    cpuid_leaf!(0x1, 0x00B00F21, 0x00000800, 0xF6D83203, 0x078BFBFF),
%ecx bit 31 is the bit to indicate hypervisor leaves are present, right?
yeah, funnily this has more words in the APM than PPR..
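A quick sketch of that check, assuming the cpuid_leaf! arguments are ordered leaf, eax, ebx, ecx, edx (which the vendor string in leaf 0 suggests). The constant and function names are hypothetical.

```rust
/// CPUID leaf 1 ECX bit 31: set when running under a hypervisor.
const LEAF1_ECX_HYPERVISOR: u32 = 1 << 31;

/// Hypothetical helper, not part of this PR.
fn hypervisor_present(leaf1_ecx: u32) -> bool {
    (leaf1_ecx & LEAF1_ECX_HYPERVISOR) != 0
}
```

The ECX value in the profile above, 0xF6D83203, has the bit set.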
const TURIN_V1_CPUID: [CpuidEntry; 25] = [
    cpuid_leaf!(0x0, 0x0000000D, 0x68747541, 0x444D4163, 0x69746E65),
    cpuid_leaf!(0x1, 0x00B00F21, 0x00000800, 0xF6D83203, 0x078BFBFF),
    cpuid_leaf!(0x5, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
Is leaf 5 zero here because we don't actually indicate support for monitor / mwait?
yeah, specifically it's zero and in this list because I want to confirm the assembled profile has this leaf zeroed in addition to leaf 1 ECX monitor being clear.
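Roughly, the property being confirmed looks like the sketch below. The names are hypothetical; MONITOR is leaf 1 ECX bit 3, and the leaf 5 registers are passed as [eax, ebx, ecx, edx].

```rust
/// CPUID leaf 1 ECX bit 3: MONITOR/MWAIT supported.
const LEAF1_ECX_MONITOR: u32 = 1 << 3;

/// Hypothetical check: neither leaf 1 nor leaf 5 advertises MONITOR/MWAIT.
fn monitor_mwait_hidden(leaf1_ecx: u32, leaf5: [u32; 4]) -> bool {
    let monitor_clear = (leaf1_ecx & LEAF1_ECX_MONITOR) == 0;
    let leaf5_zeroed = leaf5.iter().all(|&reg| reg == 0);
    monitor_clear && leaf5_zeroed
}
```

For the values in this profile (leaf 1 ECX 0xF6D83203 and an all-zero leaf 5) the check holds.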
cpuid_subleaf!(
    0x7, 0x1, 0x00000030, 0x00000000, 0x00000000, 0x00000000
),
cpuid_subleaf!(
I assume leaf B is left out here because it's dynamically generated?
that's right, Propolis will fill it in with a level 0 and 1 that look the same as Milan (https://github.com/oxidecomputer/propolis/blob/ff52055/lib/propolis/src/cpuid.rs#L370-L378)
    0x7, 0x1, 0x00000030, 0x00000000, 0x00000000, 0x00000000
),
cpuid_subleaf!(
    0xD, 0x0, 0x000000E7, 0x00000980, 0x00000980, 0x00000000
How do we get 980 in %ebx at this state without feeding in the value of %xcr0?
here and D.1 ebx are managed on the read in bhyve, so the value here doesn't have any bearing on a VM. So I arbitrarily picked the largest (and I think most likely) values we'd see in these leaves at runtime.
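To illustrate why D.0 ebx depends on %xcr0 at all, here is a sketch of the size calculation for the standard-format XSAVE area: it ends at the highest offset plus size among the enabled components. Everything named here is hypothetical, and the component offsets and sizes in ASSUMED_LAYOUT are an assumption on my part (what I'd expect the leaf D subleaves to report on this kind of part), not something taken from this PR or from bhyve.

```rust
/// One extended XSAVE state component: its XCR0 bit, plus the offset and
/// size that CPUID leaf D would report in the subleaf of the same number.
struct XsaveComponent {
    bit: u32,
    offset: u32,
    size: u32,
}

/// Legacy x87/SSE region (512 bytes) plus the XSAVE header (64 bytes).
const LEGACY_AND_HEADER: u32 = 576;

/// Size of the standard-format XSAVE area for the features enabled in `xcr0`.
fn xsave_area_size(xcr0: u64, components: &[XsaveComponent]) -> u32 {
    components
        .iter()
        .filter(|c| (xcr0 & (1u64 << c.bit)) != 0)
        .map(|c| c.offset + c.size)
        .fold(LEGACY_AND_HEADER, |end, candidate| end.max(candidate))
}

/// ASSUMED layout, for illustration only.
const ASSUMED_LAYOUT: [XsaveComponent; 4] = [
    XsaveComponent { bit: 2, offset: 576, size: 256 },   // AVX (YMM high halves)
    XsaveComponent { bit: 5, offset: 832, size: 64 },    // AVX-512 opmask
    XsaveComponent { bit: 6, offset: 896, size: 512 },   // ZMM_Hi256
    XsaveComponent { bit: 7, offset: 1408, size: 1024 }, // Hi16_ZMM
];
```

Under that assumed layout, an %xcr0 of 0xE7 gives 0x980, while a guest that has only enabled x87/SSE/AVX (0x7) would see 0x340, which is why a static value here can't be right for every guest state.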
cpuid_leaf!(0x80000003, 0x2D6E6972, 0x656B696C, 0x6F725020, 0x73736563),
cpuid_leaf!(0x80000004, 0x2020726F, 0x20202020, 0x20202020, 0x00202020),
cpuid_leaf!(0x80000007, 0x00000000, 0x00000000, 0x00000000, 0x00000100),
cpuid_leaf!(0x80000008, 0x00003030, 0x20000005, 0x00000000, 0x00000000),
Just confirming, we cap %eax at 0x30/0t48 because we don't support virtualizing 5 level paging, right?
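For reference, a sketch of how that %eax value splits into address widths. The function name is hypothetical; the field positions (physical address bits in 7:0, linear address bits in 15:8) are the documented layout of leaf 8000_0008 EAX.

```rust
/// Returns (physical address bits, linear address bits) from
/// CPUID Fn8000_0008 EAX. Hypothetical helper, not part of this PR.
fn address_bits(eax: u32) -> (u32, u32) {
    let physical = eax & 0xFF;
    let linear = (eax >> 8) & 0xFF;
    (physical, linear)
}
```

The 0x00003030 in the profile decodes to 48 physical and 48 linear bits. A guest using 5-level paging would expect 57 linear bits here.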
cpuid_leaf!(0x8000000A, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
cpuid_leaf!(0x8000001A, 0x0000000A, 0x00000000, 0x00000000, 0x00000000),
cpuid_leaf!(0x8000001B, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
cpuid_leaf!(0x8000001C, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
I assume leaf 8000_001e is filled dynamically.
I actually omit 8000_001E entirely (and clear TopologyExtensions). This goes to the kind of awkward semi-SMT situation I want to fix with an early virtual platform change (oxidecomputer/propolis#940), because I think we want to disallow VM shapes with odd vCPU counts. Otherwise Linux, for example, will assume the 8000_001E leaf is bogus if it indicates an SMT sibling that doesn't exist. Not somewhere I'd love to rely on the grace of guest OSes..
8000_001E with ThreadsPerCore = 0 would be fine even now, but there's no API surface to not have Propolis indicate SMT when filling in CPU topology, so.. out with this leaf for now.
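The constraint being described could be sketched like this. It is a hypothetical check, not anything Propolis has today; `threads_per_core` is the actual thread count, so the ThreadsPerCore = 0 field value mentioned above corresponds to `threads_per_core = 1`.

```rust
/// Hypothetical check: if the topology leaf advertises `threads_per_core`
/// threads on each core, every core needs all of its siblings to exist.
fn smt_shape_is_consistent(vcpus: u32, threads_per_core: u32) -> bool {
    threads_per_core != 0 && vcpus % threads_per_core == 0
}
```

A 3-vCPU VM advertised with two threads per core fails the check, which is the shape that would leave one vCPU pointing at a sibling that doesn't exist.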
cpuid_leaf!(0x8000001B, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
cpuid_leaf!(0x8000001C, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
cpuid_leaf!(0x8000001F, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
cpuid_leaf!(0x80000021, 0x000D8C47, 0x00000000, 0x00000000, 0x00000000),
This has some new features here in %eax versus what is in RFD 314:
- How did you make the decision for UpperAddressIgnore? Basically it's not part of our EFER support?
- I'm assuming we don't include AutomaticIBRS because we need work on all mitigation passthrough.
- I guess because we don't support SMM there's no point in bit 9.
- PMC2PreciseRetire is not there due to the existing lack of support?
- PrefetchCtlMsr is not there due to not virtualizing the MSR?
- L2TlbSizeX32 is not there because we don't pass through any of the TLB stuff.
- GpOnUserCpuid I'm guessing because we don't virtualize this.
- Why no PREFETCHI support? This doesn't seem to require hypervisor support (bit 19).
- No FP512_DOWNGRADE seems reasonable for now without virtualization.
- Given the pass through of security facts that don't require specific enablement, why not ERAPS on bit 24?
- Similar question on bit 30, SRSO_USER_KERNEL_NO. Given the other NO stuff we have added, seems something worth asking.
UpperAddressIgnore and AutomaticIBRS theoretically should both "just work". If a guest sets them, the bit gets set in the guest's EFER and that's that. I have them clear because I didn't find any OS that would try using UAI (and from the LKML conversations it did not seem like that would change, at least there). AutoIBRS is probably fine. I just feel itchy advertising this before at least mentioning it in specialreg.h. And since there'll be a rev to include mitigation MSRs and bits, yeah, I figure it's not that bad to keep it clear at first.
Otherwise yeah, missing bits are because there's no support/no MSR/nothing productive for guests. Except PREFETCHI, ERAPS, and SRSO_USER_KERNEL_NO, which probably should be set. I'll make sure guests look reasonable with them (though prefetchi I'd tested last week and just didn't set when I'd set movdir).
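For anyone auditing the %eax value by hand, a small sketch that lists which bit positions are set. The helper name is hypothetical; the two named constants use the bit numbers quoted in the review above (ERAPS on bit 24, SRSO_USER_KERNEL_NO on bit 30).

```rust
/// Bit positions in leaf 8000_0021 EAX, as quoted in the review above.
const ERAPS_BIT: u32 = 24;
const SRSO_USER_KERNEL_NO_BIT: u32 = 30;

/// Hypothetical helper: the positions of all set bits, lowest first.
fn set_bits(value: u32) -> Vec<u32> {
    (0..32u32).filter(|&bit| (value >> bit) & 1 == 1).collect()
}
```

Applied to the 0x000D8C47 in this revision of the profile, the set bits are 0, 1, 2, 6, 10, 11, 15, 16, 18, and 19, so neither ERAPS nor SRSO_USER_KERNEL_NO is advertised yet.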
    cpuid_leaf!(0x8000001C, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
    cpuid_leaf!(0x8000001F, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
    cpuid_leaf!(0x80000021, 0x000D8C47, 0x00000000, 0x00000000, 0x00000000),
];
I assume that right now we're not including the extended leaf 8000_0026 bits.
right. that would need a smidge of Propolis work and more importantly won't be particularly interesting beyond 8000_001E for the time being.
this is so much simpler than codifying all of the bits describing all of the CPU surface area! what a breath of fresh air!
The feature selection here is the intersection of "PPR says it's there", "useful for guests", "supported in bhyve/propolis", and "doesn't seem like we're painted into a corner if a future platform changes it." The bits here are also a subset of what I'd seen on a 9365 in a Cosmo.
While bhyve/Propolis would let guests turn on AutoIBRS, I haven't looked at it in the context of guest OSes much at all (though they do boot when told they're allowed to use AutoIBRS). UAI is in a similar boat but I don't think anyone uses it. So both EFER features are hidden for the time being.
Otherwise, as-is, I've booted Linux, Windows, OmniOS, and FreeBSD with this profile and they seem fine. Linux, for example, omits mentioning caches in /proc/cpuinfo, which makes sense since I've avoided as much cache topology information as I can here. The reasoning for that is discussed more in RFD 314.