sgx: add automated DCAP registration using in-cluster PCCS caching #2121
WORKDIR /opt/intel

ARG SGX_SDK_URL=https://download.01.org/intel-sgx/sgx-linux/2.26/distro/ubuntu24.04-server/sgx_linux_x64_sdk_2.26.100.0.bin

RUN curl -sSLfO ${SGX_SDK_URL} \
    && export SGX_SDK_INSTALLER=$(basename $SGX_SDK_URL) \
    && chmod +x $SGX_SDK_INSTALLER \
    && echo "yes" | ./$SGX_SDK_INSTALLER \
All props to @ScottR-Intel for: sudo ./sgx_linux_x64_sdk_2.26.100.0.bin --prefix /opt/intel
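A minimal sketch of how the suggested `--prefix` invocation could replace the `echo "yes" |` pipe in the `RUN` step, written as a plain shell function. The function names `sgx_installer_name` and `install_sgx_sdk` are hypothetical; the URL and the `--prefix /opt/intel` argument are taken from the diff and the comment above.

```shell
# URL as given by the ARG in the Dockerfile under review.
SGX_SDK_URL="https://download.01.org/intel-sgx/sgx-linux/2.26/distro/ubuntu24.04-server/sgx_linux_x64_sdk_2.26.100.0.bin"

# Derive the installer file name from the download URL.
sgx_installer_name() {
    basename "$1"
}

# Download the installer and run it non-interactively with an explicit prefix.
# Hypothetical helper: the caller chooses the URL and the install prefix.
install_sgx_sdk() {
    url=$1
    prefix=$2
    installer=$(sgx_installer_name "$url")
    curl -sSLfO "$url" \
        && chmod +x "$installer" \
        && "./$installer" --prefix "$prefix"
}

# Example call (left commented out because it downloads the installer):
# install_sgx_sdk "$SGX_SDK_URL" /opt/intel
```

Passing the prefix explicitly avoids relying on the installer's interactive prompt, which is what the `echo "yes"` pipe works around.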

# self-signed TLS certs for pccs-tls:
# openssl req -x509 -sha256 -nodes -days 365 -newkey rsa:2048 -keyout private.pem -out file.crt -subj "/C=US/ST=Denial/L=Springfield/O=Dis/CN=www.example.com"
# token hashesh follow (with 'hellworld' changed to the desired secret tokens):
- # token hashesh follow (with 'hellworld' changed to the desired secret tokens):
+ # token hashesh follow (with 'helloworld' changed to the desired secret tokens):
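For readers regenerating the token hashes mentioned in that comment, here is a small sketch. It assumes the PCCS configuration expects the lowercase hex SHA-512 digest of each token, which is how the upstream PCCS documentation describes its token hash settings; the helper name `hash_token` is hypothetical.

```shell
# Print the hex SHA-512 digest of a token, without a trailing newline in the input.
# Assumption: PCCS token hashes are SHA-512 hex digests of the plain-text token.
hash_token() {
    printf '%s' "$1" | sha512sum | cut -d' ' -f1
}

# Example: hash the placeholder token from the comment; replace it with a real secret.
hash_token 'helloworld'
```

Using `printf '%s'` rather than `echo` keeps a stray newline out of the hashed input, which would otherwise produce a different digest.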
name: pccs-credentials
securityContext:
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
I had to change this to make it work:
-          allowPrivilegeEscalation: false
+          privileged: true
+          allowPrivilegeEscalation: true
I think that for socket device plugins the container that exposes the unix socket shall be privileged (see the pr-helper example).
| I tried, and the unix socket is not visible from the virt-launcher. This means that we still need something like a socket device plugin in the virt-handler to mount it. I do not think this PR is the place for that, though. | 
| I observe that when I remove the qgs pod, the new instance fails because the unix socket still exists. I think the unix socket should be removed when the pod is removed; otherwise it has to be removed manually. | 
| 
 I saw this too and reported a bug to QGS about it. I need to see if I can work around that in the meantime. | 
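One possible interim workaround, sketched under the assumption that a wrapper script runs before QGS starts and that no other QGS instance is using the same path: remove a leftover socket file, but refuse to touch anything that is not a socket. The function name `remove_stale_socket` is hypothetical, and this is not the fix reported upstream.

```shell
# Remove a leftover unix socket so that a new instance can bind to the path.
# Returns 0 if the path is absent or was removed, 1 if it exists but is not a socket.
remove_stale_socket() {
    sock=$1
    if [ -S "$sock" ]; then
        rm -f -- "$sock"
    elif [ -e "$sock" ]; then
        echo "refusing to remove non-socket path: $sock" >&2
        return 1
    fi
    return 0
}

# Example call, using the socket path from the PR description:
# remove_stale_socket /var/run/tdx-qgs/qgs.socket
```

The `-S` test limits the cleanup to socket files, so a misconfigured path cannot delete regular files or directories.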
Signed-off-by: Mikko Ylinen <mikko.ylinen@intel.com>
| Hi I've been experimenting with this PR and had much of the same issues as @MatiasVara on OpenShift. Most of these issues seem to be directly related to SELinux. I wanted to share a case of SELinux denial which had me scratching my head for a while involving the mounting failures of the EFI variables. When the  But then there appears to be a second more fundamental conflict with SELinux here. Even if you turn on permissive mode ( The fix to this was to modify  to  | 
| 
 @Aseeef thanks for the detailed investigation. This approach does not work with containerd because it sets  | 
| 
 @Aseeef Does it mean the container runtime cannot read the file the plugin wrote? | 
| Based on this part   | 
| 
 We have other plugins doing the same. So far we have not heard this would be a problem on OCP but something we need to double check @tkatila | 
| 
 @Aseeef it looks like OCI (runc) knows about mount labels and that they can also be set via mount options. That looks like something worth trying at some point. | 
| 
 Thanks, I'll look into it! | 
| 
 There are two plugins that use CDI in some way: FPGA and GPU. FPGA is not supported in OCP. GPU is supported in OCP, but its functionality doesn't depend on CDI; it just uses it alongside the generic device plugin API. It could be that CDI fails there too, but since that doesn't affect execution, no one has noticed. | 
| 
 Getting a bit off-topic, but I wonder if DRA drivers are impacted by the same issue? | 
| 
 I believe DRA is being enabled in the upcoming OCP 4.20. Hopefully they are detecting these SELinux restrictions. FYI @byako | 
| 
 Were you implying here to try to mount to efivarfs using the same selinux labels as the host? Could you clarify what you mean here? | 
| 
 SELinux is not my area but, yes, that was my thinking: but before testing anything we'd need to check if the container (and/or if some added labeling would be needed for that  | 
| @mythi I tried that and it seems to result in a new error: We are probably going to need an SELinux expert for this one. | 
| One more thing: The QGS socket is created with the owning user as root and permissions set to: Therefore, by default, the qemu user is not going to be able to access it. Not sure if this is something that should be addressed here in this PR though... | 
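To make the access problem concrete, the sketch below inspects a path's owner and mode and, as one possible workaround, grants read/write access to a chosen group. It assumes GNU `stat` and that the QEMU process runs with a group the operator can name; the helper names `socket_perms` and `grant_group_access` and the group `qemu` in the example call are hypothetical.

```shell
# Print "owner:group mode" for a path (GNU stat format string).
socket_perms() {
    stat -c '%U:%G %a' "$1"
}

# Hand the path to a group and allow that group to read and write it.
# Changing to a group the caller does not belong to requires root.
grant_group_access() {
    path=$1
    group=$2
    chgrp "$group" "$path" && chmod g+rw "$path"
}

# Example calls on the node (group name is an assumption):
# socket_perms /var/run/tdx-qgs/qgs.socket
# grant_group_access /var/run/tdx-qgs/qgs.socket qemu
```

Adjusting the mode after the fact is only a stopgap, since the socket is recreated with its default ownership whenever QGS restarts.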
| 
 I'll try to play with this a bit since I now have an OCP cluster. 
 Thanks! Red Hat's build of QGS adds mode controls. I believe with  | 
This setup gives an automated "online, multi-platform, PCCS based Indirect Registration" and TDX QGS deployment for Kubernetes based clusters.
Building blocks:
Pre-conditions:
Read the basics of Intel TDX remote attestation infrastructure setup and get an Intel PCS API Key. The node(s) have TDX and SGX enabled. The following also assumes that a user has cloned this PR and has a bare-metal cluster available.
Installation:
kubectl apply -k deployments/sgx_plugin/overlays/dcap-infra-resources
NB: if a proxy setting is needed, edit pccs.yaml
NB: add nodeSelector to filter SGX/TDX enabled nodes if run in a multi-node cluster
The node should have /var/run/tdx-qgs/qgs.socket available for QEMU to connect.
Notes:
The PCCS database is stored in a RAM-based EmptyDir volume and is currently not backed up (a backup mechanism will be added later). Keep the PCCS deployment up. If quoting errors occur, a full re-install after an SGX Factory Reset might be required.
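After the installation step, a node-side check along these lines could confirm that the QGS socket has appeared. This is a sketch only: the helper name `wait_for_socket` and the timeout values are assumptions, while the `kubectl` command and the socket path come from the description above.

```shell
# Wait until a unix socket exists at the given path, polling once per second.
# Returns 0 when the socket is present, 1 if the limit (in seconds) is reached first.
wait_for_socket() {
    sock=$1
    limit=${2:-60}
    elapsed=0
    while [ ! -S "$sock" ]; do
        if [ "$elapsed" -ge "$limit" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Example flow (kubectl runs wherever cluster access is configured,
# the socket check runs on the SGX/TDX node itself):
# kubectl apply -k deployments/sgx_plugin/overlays/dcap-infra-resources
# wait_for_socket /var/run/tdx-qgs/qgs.socket 120 || echo "QGS socket not found" >&2
```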