Conversation

iximeow
Member

@iximeow iximeow commented Sep 18, 2025

this is so much simpler than codifying all of the bits describing all of the CPU surface area! what a breath of fresh air!

The feature selection here is the intersection of "PPR says it's there", "useful for guests", "supported in bhyve/Propolis", and "doesn't seem like we're painted into a corner if a future platform changes it." The bits here are also a subset of what I'd seen on a 9365 in a Cosmo.
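As a rough illustration of that selection rule, the sketch below treats each criterion as a bitmask over one CPUID register and keeps only the bits present in all of them, then checks that the result is a subset of what a host reports. The function names are hypothetical and are not taken from the PR.

```rust
/// Keep only the feature bits that are set in every input mask.
/// An empty slice yields all ones, since nothing has excluded any bit.
pub fn intersect_features(masks: &[u32]) -> u32 {
    masks.iter().fold(u32::MAX, |acc, &m| acc & m)
}

/// True when every bit set in `profile` is also set in `host`.
pub fn is_subset(profile: u32, host: u32) -> bool {
    (profile & !host) == 0
}
```

A profile built this way can never advertise a bit that any one criterion rejects, which is the property the description relies on.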

While bhyve/Propolis would let guests turn on AutoIBRS, I haven't looked at it in the context of guest OSes much at all (though they do boot when told they're allowed to use AutoIBRS). UAI is in a similar boat but I don't think anyone uses it. So both EFER features are hidden for the time being.

Otherwise, as-is, I've booted Linux, Windows, OmniOS, and FreeBSD with this profile and they seem fine. Linux, for example, omits mentioning caches in /proc/cpuinfo, which makes sense since I've avoided as much cache topology information as I can here. The reasoning for that is discussed more in RFD 314.

@iximeow iximeow marked this pull request as draft September 18, 2025 02:21
@iximeow iximeow force-pushed the ixi/turin-cpu-platform branch from 0f8bd29 to f34e225 Compare October 9, 2025 22:29
Comment on lines +680 to +681
.set_extended_processor_and_feature_identifiers(Some(leaf))
.expect("can set leaf 8000_0001h");
Member Author

@iximeow iximeow Oct 10, 2025

in a truly distressing case of the ankle bone being connected to the wrist bone, if PerfCtrExtCore is set and TopologyExtensions is not, Windows Server 2022 sits in a loop at boot. I noticed this while checking out a fix for oxidecomputer/propolis#959, an initial version of which just cleared the TopologyExtensions bit to match discarding leaf 0x8000_001E. Both bits together are fine. Having topology extensions and not six perf counters (as we've had on Milan for a while) is fine. Having neither is fine. Having six perf counters and no topology extensions loops at boot.

I'm a little suspicious there's some relationship between this and the incomplete representation of SMT, so I'm going to set this to a more Milan-like situation where we hide perf counter extensions for now, and omit topology extensions, and then see how this looks with issues like oxidecomputer/propolis#940 sorted out.

edit: these bits are now both cleared, and boy will I feel silly if I've overlooked something here
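The four combinations described above reduce to a single predicate. This is a minimal sketch, assuming the two features live at bits 22 (TopologyExtensions) and 23 (PerfCtrExtCore) of leaf 8000_0001 ECX as laid out in AMD's APM; the function name is hypothetical and the bit positions should be checked against the APM/PPR before use.

```rust
// Assumed bit positions in CPUID Fn8000_0001 ECX (verify against the APM).
pub const TOPOLOGY_EXTENSIONS: u32 = 1 << 22;
pub const PERF_CTR_EXT_CORE: u32 = 1 << 23;

/// True for the one combination reported to loop at boot on Windows
/// Server 2022: PerfCtrExtCore set while TopologyExtensions is clear.
pub fn hits_reported_boot_loop(ecx: u32) -> bool {
    (ecx & PERF_CTR_EXT_CORE) != 0 && (ecx & TOPOLOGY_EXTENSIONS) == 0
}
```

With both bits cleared, as in the edit above, the predicate is false.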

How does this tie into the definition of 8000_0022 %ecx NumPerfCtrCore?

Member Author

@iximeow iximeow Oct 16, 2025

8000_0022 eax is zero so guests shouldn't care, but if we were filling it in I'd pick 4 without PerfCtrExtCore and 6 with.

@iximeow iximeow marked this pull request as ready for review October 11, 2025 02:02
let mut leaf =
cpuid.get_feature_info().expect("baseline Milan defines leaf 1");

// Set up EAX: Family 1Ah model 2h stepping 1.

Probably useful to indicate why we're setting it to show up as a C1 processor. I'm guessing this is because that's the production stepping.
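For reference, the family/model/stepping in that comment maps onto the leaf 1 EAX value in the table (0x00B00F21) through the standard extended-family encoding. The helper below is a sketch with a hypothetical name, not code from the PR, and it only targets families at or above 0xF, where the extended fields are meaningful.

```rust
/// Encode family/model/stepping into the CPUID leaf 1 EAX layout.
/// Families >= 0xF saturate the base family field and carry the
/// remainder in the extended family field (bits 27:20).
pub fn encode_fms(family: u32, model: u32, stepping: u32) -> u32 {
    let (base_family, ext_family) = if family >= 0xF {
        (0xF, family - 0xF)
    } else {
        (family, 0)
    };
    let base_model = model & 0xF; // bits 7:4
    let ext_model = (model >> 4) & 0xF; // bits 19:16
    (ext_family << 20)
        | (ext_model << 16)
        | (base_family << 8)
        | (base_model << 4)
        | (stepping & 0xF)
}
```

Calling `encode_fms(0x1A, 0x02, 0x1)` gives 0x00B00F21, matching the leaf 1 entry in `TURIN_V1_CPUID`.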

// guests.
const TURIN_V1_CPUID: [CpuidEntry; 25] = [
cpuid_leaf!(0x0, 0x0000000D, 0x68747541, 0x444D4163, 0x69746E65),
cpuid_leaf!(0x1, 0x00B00F21, 0x00000800, 0xF6D83203, 0x078BFBFF),

%ecx bit 31 is the bit that indicates hypervisor leaves are present, right?

Member Author

yeah, funnily this has more words in the APM than PPR..
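A quick way to confirm the bit against the table value is a one-line mask test. This is an illustrative sketch; the constant is copied from the leaf 1 entry above (assuming the macro arguments are in EAX, EBX, ECX, EDX order) and the function name is hypothetical.

```rust
/// Leaf 1 ECX as it appears in the `TURIN_V1_CPUID` table.
pub const TURIN_V1_LEAF1_ECX: u32 = 0xF6D8_3203;

/// Leaf 1 ECX bit 31 is reserved for hypervisors to signal their presence.
pub fn hypervisor_present(leaf1_ecx: u32) -> bool {
    (leaf1_ecx & (1u32 << 31)) != 0
}
```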

const TURIN_V1_CPUID: [CpuidEntry; 25] = [
cpuid_leaf!(0x0, 0x0000000D, 0x68747541, 0x444D4163, 0x69746E65),
cpuid_leaf!(0x1, 0x00B00F21, 0x00000800, 0xF6D83203, 0x078BFBFF),
cpuid_leaf!(0x5, 0x00000000, 0x00000000, 0x00000000, 0x00000000),

Is leaf 5 zero here because we don't actually indicate support for monitor / mwait?

Member Author

yeah, specifically it's zero and in this list because I want to confirm the assembled profile has this leaf zeroed in addition to leaf 1 ECX monitor being clear.
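The cross-leaf invariant being confirmed here can be written as a small check: if the MONITOR bit (leaf 1 ECX bit 3) is clear, all four registers of leaf 5 should be zero. This is a sketch with a hypothetical function name, not the test the PR uses.

```rust
/// Leaf 1 ECX bit 3 advertises MONITOR/MWAIT.
pub const MONITOR: u32 = 1 << 3;

/// True when leaf 5 is consistent with the MONITOR bit: a profile that
/// hides MONITOR/MWAIT should also report an all-zero leaf 5.
/// `leaf5` holds EAX, EBX, ECX, EDX in that order.
pub fn leaf5_consistent_with_monitor(leaf1_ecx: u32, leaf5: [u32; 4]) -> bool {
    let monitor_advertised = (leaf1_ecx & MONITOR) != 0;
    monitor_advertised || leaf5.iter().all(|&reg| reg == 0)
}
```

For the table values above (leaf 1 ECX 0xF6D83203, leaf 5 all zero) the check passes.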

cpuid_subleaf!(
0x7, 0x1, 0x00000030, 0x00000000, 0x00000000, 0x00000000
),
cpuid_subleaf!(

I assume leaf B is left out here because it's dynamically generated?

Member Author

that's right, Propolis will fill it in with a level 0 and 1 that look the same as Milan (https://github.com/oxidecomputer/propolis/blob/ff52055/lib/propolis/src/cpuid.rs#L370-L378)

0x7, 0x1, 0x00000030, 0x00000000, 0x00000000, 0x00000000
),
cpuid_subleaf!(
0xD, 0x0, 0x000000E7, 0x00000980, 0x00000980, 0x00000000

How do we get 980 in %ebx at this state without feeding in the value of %xcr0?

Member Author

here and D.1 ebx are managed on the read in bhyve, so the value here doesn't have any bearing on a VM. so I arbitrarily picked the largest (and I think most likely) values we'd see in these leaves at runtime.
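To see where 0x980 comes from for the XCR0 mask 0xE7, the sketch below sums the 512-byte legacy area, the 64-byte XSAVE header, and the architectural sizes of the enabled extended components. It assumes the components are laid out contiguously with no gaps, which is an assumption of this example rather than something stated in the thread; the function name is hypothetical.

```rust
/// Estimate the XSAVE area size for a given XCR0, assuming the extended
/// components are packed back to back after the legacy area and header.
pub fn xsave_area_size(xcr0: u64) -> u32 {
    // 512-byte legacy x87/SSE region plus the 64-byte XSAVE header.
    const LEGACY_AND_HEADER: u32 = 512 + 64;
    // (XCR0 bit, size in bytes): AVX, opmask, ZMM_Hi256, Hi16_ZMM.
    const EXTENDED: [(u32, u32); 4] = [(2, 256), (5, 64), (6, 512), (7, 1024)];

    let mut size = LEGACY_AND_HEADER;
    for &(bit, len) in EXTENDED.iter() {
        if (xcr0 & (1u64 << bit)) != 0 {
            size += len;
        }
    }
    size
}
```

With every bit of 0xE7 enabled this gives 576 + 256 + 64 + 512 + 1024 = 2432 bytes, which is 0x980.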

cpuid_leaf!(0x80000003, 0x2D6E6972, 0x656B696C, 0x6F725020, 0x73736563),
cpuid_leaf!(0x80000004, 0x2020726F, 0x20202020, 0x20202020, 0x00202020),
cpuid_leaf!(0x80000007, 0x00000000, 0x00000000, 0x00000000, 0x00000100),
cpuid_leaf!(0x80000008, 0x00003030, 0x20000005, 0x00000000, 0x00000000),

Just confirming, we cap %eax at 0x30/0t48 because we don't support virtualizing 5 level paging, right?
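For readers decoding the value, leaf 8000_0008 EAX packs the physical address width in bits 7:0 and the linear address width in bits 15:8. The helper below is an illustrative sketch with a hypothetical name.

```rust
/// Split CPUID Fn8000_0008 EAX into (physical, linear) address widths in bits.
pub fn address_sizes(eax: u32) -> (u8, u8) {
    let phys_bits = (eax & 0xFF) as u8;
    let linear_bits = ((eax >> 8) & 0xFF) as u8;
    (phys_bits, linear_bits)
}
```

For the table value 0x00003030 both widths are 48; a guest offered 5-level paging would instead see a 57-bit linear width.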

cpuid_leaf!(0x8000000A, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
cpuid_leaf!(0x8000001A, 0x0000000A, 0x00000000, 0x00000000, 0x00000000),
cpuid_leaf!(0x8000001B, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
cpuid_leaf!(0x8000001C, 0x00000000, 0x00000000, 0x00000000, 0x00000000),

I assume leaf 8000_001e is filled dynamically.

Member Author

I actually omit 8000_001E entirely (and clear TopologyExtensions), this goes to the kind of awkward semi-SMT situation I want to fix with an early virtual platform change (oxidecomputer/propolis#940), because I think we want to disallow VM shapes with odd vCPU counts. Otherwise Linux for example will assume the 8000_001E leaf is bogus if it indicates an SMT sibling that doesn't exist. Not somewhere I'd love to rely on the grace of guest OSes..

8000_001E with ThreadsPerCore = 0 would be fine even now, but there's no API surface to not have Propolis indicate SMT when filling in CPU topology, so.. out with this leaf for now.
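The odd-vCPU problem can be stated as a small check. This sketch assumes the APM convention that the ThreadsPerCore field stores the thread count minus one (so a field value of 0 means one thread per core); the function names are hypothetical and not part of Propolis.

```rust
/// Convert a threads-per-core count into the ThreadsPerCore field value,
/// which is encoded as the count minus one. A count of 0 is clamped to 0.
pub fn threads_per_core_field(threads_per_core: u32) -> u32 {
    threads_per_core.saturating_sub(1)
}

/// True when a VM shape would advertise an SMT sibling that does not exist,
/// e.g. 3 vCPUs presented as 2 threads per core.
pub fn has_dangling_sibling(vcpus: u32, threads_per_core: u32) -> bool {
    threads_per_core > 1 && vcpus % threads_per_core != 0
}
```

Rejecting shapes where `has_dangling_sibling` is true is one way to express "disallow VM shapes with odd vCPU counts" when two threads per core are advertised.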

cpuid_leaf!(0x8000001B, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
cpuid_leaf!(0x8000001C, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
cpuid_leaf!(0x8000001F, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
cpuid_leaf!(0x80000021, 0x000D8C47, 0x00000000, 0x00000000, 0x00000000),

This has some new features here in %eax versus what is in RFD 314:

  • How did you make the decision for UpperAddressIgnore? Basically it's not part of our EFER support?
  • I'm assuming we don't include AutomaticIBRS because we need work on all mitigation passthrough.
  • I guess because we don't support SMM there's no point in bit 9.
  • PMC2PreciseRetire is not there due to the existing lack of support?
  • PrefetchCtlMsr is not there due to not virtualizing the MSR?
  • L2TlbSizeX32 is not there because we don't pass through any of the TLB stuff.
  • GpOnUserCpuid I'm guessing because we don't virtualize this.
  • Why no PREFETCHI support? This doesn't seem to require hypervisor support (bit 19).
  • No FP512_DOWNGRADE seems reasonable for now without virtualization.
  • Given the pass through of security facts that don't require specific enablement, why not ERAPS on bit 24?
  • Similar question on bit 30, SRSO_USER_KERNEL_NO. Given the other NO stuff we have added, seems something worth asking.

Member Author

UpperAddressIgnore and AutomaticIBRS theoretically should both "just work". If a guest sets them, the bit gets set in the guest's EFER and that's that. I have them clear because I didn't find any OS that would try using UAI (and from the LKML conversations it did not seem like that would change at least there). AutoIBRS is probably fine. I just feel itchy advertising this before at least mentioning it in specialreg.h. And since there'll be a rev to include mitigation MSRs and bits, yeah, I figure it's not that bad to keep it clear at first.

otherwise yeah, missing bits are because there's no support/no MSR/nothing productive for guests. except PREFETCHI, ERAPS, and SRSO_USER_KERNEL_NO, which probably should be set. I'll make sure guests look reasonable with them (though prefetchi I'd tested last week and just didn't set when I'd set movdir)
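When cross-checking a list like the one above against the table, it can help to enumerate which bits are actually set in the leaf 8000_0021 EAX value shown in the diff (0x000D8C47). The sketch below does only that and deliberately attaches no feature names; the function name is hypothetical.

```rust
/// Return the positions of all set bits in `value`, lowest first.
pub fn set_bits(value: u32) -> Vec<u32> {
    (0..32u32)
        .filter(|&bit| (value & (1u32 << bit)) != 0)
        .collect()
}
```

For 0x000D8C47 the set bits are 0, 1, 2, 6, 10, 11, 15, 16, 18, and 19, so bits 9, 24, and 30 mentioned in the review are clear in that value.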

cpuid_leaf!(0x8000001C, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
cpuid_leaf!(0x8000001F, 0x00000000, 0x00000000, 0x00000000, 0x00000000),
cpuid_leaf!(0x80000021, 0x000D8C47, 0x00000000, 0x00000000, 0x00000000),
];

I assume that right now we're not including the extended leaf 8000_0026 bits.

Member Author

right. that would need a smidge of Propolis work and more importantly won't be particularly interesting beyond 8000_001E for the time being.
