
Commit a7309f8

docs: and examples for OCI artifact support on LMEval (#81)
* docs: and examples for OCI artifact support on LMEval

Signed-off-by: tarilabs <matteo.mortari@gmail.com>

* feedback from coderabbit

Signed-off-by: tarilabs <matteo.mortari@gmail.com>

---------

Signed-off-by: tarilabs <matteo.mortari@gmail.com>
1 parent 1fcd62d commit a7309f8

1 file changed: +206 -0 lines changed

docs/modules/ROOT/pages/lm-eval-tutorial.adoc

Lines changed: 206 additions & 0 deletions
@@ -291,6 +291,18 @@ Specify extra information for the lm-eval job's pod.
291291
|`outputs.pvcName`
292292
|Binds an existing PVC to a job by specifying its name. The PVC must be created separately and must already exist when creating the job.
293293

294+
|`outputs.oci`
295+
|Persists the job's traces and results as an OCI Artifact. Supports the following fields:
296+
297+
* `registry`: A secret reference to the OCI registry.
298+
* `repository`: The repository to be used in the OCI registry to persist traces and results.
299+
* `tag`: Optionally specify the tag to be used for the OCI Artifact. Defaults to the job's name.
300+
* `subject`: Optional OCI reference to be used as the Subject for the OCI Artifact. OCI specification constraints apply (i.e. the Subject must be in the same registry and repository).
301+
* `username`: A secret reference to the username used to push to the OCI registry.
302+
* `password`: A secret reference to the password used to push to the OCI registry.
303+
* `dockerConfigJson`: A secret reference to the docker config.json used to push to the OCI registry, as an alternative to `username` and `password`.
304+
* `verifySSL`: Optional. Set to `false` to use a non-encrypted connection and skip any TLS verification, which is helpful for local testing. Defaults to `true`.
305+
294306
|`allowOnline`
295307
|If set to `true`, the LMEval job will download artifacts as needed (e.g. models, datasets or tokenizers). If set to `false`, these will not be downloaded and will be used from local storage. See `offline`.
296308

@@ -843,6 +855,200 @@ In this case, the PVC is not managed by the TrustyAI operator, so it will be ava
843855

844856
In the case where both managed and existing PVCs are referenced in `outputs`, the TrustyAI operator will prefer the managed PVC and ignore the existing one.
845857

858+
859+
=== Persist traces and results as OCI Artifact
860+
861+
Traces and results can be persisted as an OCI Artifact.
862+
Compared with a PVC, an OCI Artifact provides longer-term, portable storage for traces and results, which can be especially helpful for auditing and traceability scenarios.
863+
864+
To enable OCI Artifact persistent storage, configure the `LMEvalJob` using the `spec.outputs.oci` field; for example:
865+
866+
[source,yaml]
----
apiVersion: trustyai.opendatahub.io/v1alpha1
kind: LMEvalJob
metadata:
  name: evaljob-sample
spec:
  # other fields omitted ...
  outputs:
    oci:
      registry: <1>
        name: my-oci-credentials
        key: OCI_HOST
      repository: my-org/my-repository <2>
      tag: results <3>
      dockerConfigJson: <4>
        name: my-oci-credentials
        key: .dockerconfigjson
----
885+
<1> `registry` is a secret reference to the OCI registry to be used
886+
<2> `repository` is the OCI repository to be used to push the OCI Artifact to
887+
<3> `tag` is the tag to be used for the OCI Artifact; it defaults to the Job's name if not specified
888+
<4> `dockerConfigJson` is a secret reference to the docker config.json to be used to push the OCI Artifact to the OCI registry
889+
890+
The following is an example of the `my-oci-credentials` secret. As usual in Kubernetes, the secret must be in the same Namespace as the `LMEvalJob`:
891+
892+
[source,yaml]
----
kind: Secret
apiVersion: v1
metadata:
  name: my-oci-credentials
  labels:
    opendatahub.io/dashboard: 'true'
  annotations:
    opendatahub.io/connection-type-ref: oci-v1
    openshift.io/description: ''
    openshift.io/display-name: my-oci-credentials
type: kubernetes.io/dockerconfigjson
stringData:
  .dockerconfigjson: | <1>
    {
      "auths": {
        "quay.io": {
          "auth": "bW1...g==",
          "email": ""
        }
      }
    }
  ACCESS_TYPE: '["Push","Pull"]'
  OCI_HOST: quay.io <2>
----
918+
<1> `.dockerconfigjson` is the actual docker config.json to be used for the related OCI registry
919+
<2> `OCI_HOST` is the OCI registry to be used
920+
921+
This configuration pushes the OCI Artifact `quay.io/my-org/my-repository:results` to the OCI registry. The artifact contains the traces and the results in compressed format, one file per layer.
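
The `auth` value in the `.dockerconfigjson` of the secret shown earlier follows the usual docker config.json convention: it is the Base64 encoding of `username:password`. The following Python sketch shows one way to produce that JSON. The function names and the credentials are hypothetical and used for illustration only.

```python
import base64
import json


def docker_auth_entry(username: str, password: str) -> dict:
    """Build one `auths` entry; `auth` is Base64 of "username:password"."""
    raw = f"{username}:{password}".encode("utf-8")
    return {"auth": base64.b64encode(raw).decode("ascii"), "email": ""}


def docker_config_json(registry: str, username: str, password: str) -> str:
    """Return the JSON text to place under the `.dockerconfigjson` key."""
    config = {"auths": {registry: docker_auth_entry(username, password)}}
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Hypothetical credentials, for illustration only.
    print(docker_config_json("quay.io", "my-user", "my-password"))
```

The printed JSON can be pasted as the value of `.dockerconfigjson` under `stringData`. With `stringData`, Kubernetes encodes the secret value itself, so the JSON is provided as plain text.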
922+
923+
[NOTE]
924+
====
925+
The persisted Artifact uses the OCI Image media-type so that it can also be mounted as an ImageVolume, including in CRI-O based Kubernetes distributions.
926+
====
927+
928+
The following is an example manifest showing the OCI Artifact structure:
929+
930+
[source]
----
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json", <1>
  ...
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "size": 2837,
      "digest": "sha256:467e2c648cf98ab5fb2f2089cf7489258e5cbce2fb6cdee724c57136c4df99c0", <2>
      "annotations": {
        "olot.title": "results_2025-11-27T15-35-14.843328.json"
      }
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "size": 2583,
      "digest": "sha256:9a856f87762174283ed9470b7708d71b9820510a79ba2bde17e590700c5348a5", <3>
      "annotations": {
        "olot.title": "samples_unfair_tos_2025-11-27T15-35-14.843328.jsonl"
      }
    }
  ]
}
----
956+
<1> using the Image media-type, in addition to being an artifact, allows for ImageVolume mount in CRI-O based Kubernetes distributions
957+
<2> results file
958+
<3> traces file
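
Because each file is stored in its own layer and named through the `olot.title` annotation, a consumer can list the persisted files directly from the manifest. The following Python sketch does this against a trimmed copy of the example manifest above. The helper name `layer_titles` is hypothetical, and the elided manifest fields are left out.

```python
import json

# Trimmed copy of the example manifest (elided fields omitted).
MANIFEST = """
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "size": 2837,
      "digest": "sha256:467e2c648cf98ab5fb2f2089cf7489258e5cbce2fb6cdee724c57136c4df99c0",
      "annotations": {"olot.title": "results_2025-11-27T15-35-14.843328.json"}
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "size": 2583,
      "digest": "sha256:9a856f87762174283ed9470b7708d71b9820510a79ba2bde17e590700c5348a5",
      "annotations": {"olot.title": "samples_unfair_tos_2025-11-27T15-35-14.843328.jsonl"}
    }
  ]
}
"""


def layer_titles(manifest: dict, annotation: str = "olot.title") -> dict:
    """Map each layer digest to the file name stored in its annotation."""
    return {
        layer["digest"]: layer.get("annotations", {}).get(annotation, "")
        for layer in manifest.get("layers", [])
    }


if __name__ == "__main__":
    for digest, title in layer_titles(json.loads(MANIFEST)).items():
        print(digest, title)
```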
959+
960+
==== Pointing from the OCI Artifact to a known reference
961+
962+
The OCI Artifact can reference a specific Subject, that is, "point" to an already existing OCI reference in the same repository.
963+
This is particularly helpful for expressing relationships; for example, from a Model on OCI to its LMEval results, or from a given LMEval result to a subsequent result.
964+
965+
Use the `subject` field to specify the OCI reference to be used:
966+
967+
[source,yaml]
----
apiVersion: trustyai.opendatahub.io/v1alpha1
kind: LMEvalJob
metadata:
  name: evaljob-sample
spec:
  # other fields omitted ...
  outputs:
    oci:
      registry:
        name: my-oci-credentials
        key: OCI_HOST
      repository: my-org/my-repository
      tag: evaluations
      subject: "quay.io/my-org/my-repository:modelcar" <1>
      dockerConfigJson:
        name: my-oci-credentials
        key: .dockerconfigjson
----
987+
<1> `subject` is an optional OCI reference to be used as the Subject for the OCI Artifact being pushed
988+
989+
[NOTE]
990+
====
991+
According to the OCI Specification, the Subject must reside in the same OCI registry and repository.
992+
====
993+
994+
[TIP]
995+
====
996+
For the Subject, prefer a digest-based reference over a tag-based reference.
997+
That is, prefer `quay.io/my-org/my-repository@sha256:...` over `quay.io/my-org/my-repository:tag`.
998+
The code example above uses a tag-based reference for illustrative purposes only.
999+
====
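
The two constraints above (same registry and repository, and preferably a digest-based reference) can be checked before submitting a job. The following Python sketch is a simplified illustration that assumes fully qualified references with the registry host as the first path segment. It is not the operator's own validation, and all names in it are hypothetical.

```python
from typing import NamedTuple


class OciRef(NamedTuple):
    registry: str
    repository: str
    suffix: str  # ":tag", "@sha256:...", or "" when neither is given


def parse_ref(ref: str) -> OciRef:
    """Split a fully qualified reference such as 'quay.io/org/repo:tag'."""
    name, at, digest = ref.partition("@")
    suffix = at + digest
    # A ':' only marks a tag when it comes after the last '/';
    # otherwise it is a registry port (for example 'localhost:5000/repo').
    colon = name.rfind(":")
    if colon > name.rfind("/"):
        suffix = suffix or name[colon:]
        name = name[:colon]
    registry, _, repository = name.partition("/")
    return OciRef(registry, repository, suffix)


def same_repository(subject: str, registry: str, repository: str) -> bool:
    """True when the Subject lives in the given registry and repository."""
    parsed = parse_ref(subject)
    return (parsed.registry, parsed.repository) == (registry, repository)


def is_digest_reference(subject: str) -> bool:
    """True when the Subject is pinned by digest rather than by tag."""
    return parse_ref(subject).suffix.startswith("@")


if __name__ == "__main__":
    subject = "quay.io/my-org/my-repository:modelcar"
    print(same_repository(subject, "quay.io", "my-org/my-repository"))
    print(is_digest_reference(subject))
```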
1000+
1001+
This configuration pushes the OCI Artifact `quay.io/my-org/my-repository:evaluations` to the OCI registry. The artifact contains the traces and the results in compressed format, one file per layer, and references `quay.io/my-org/my-repository:modelcar` as its Subject.
1002+
1003+
The following is an example manifest showing the OCI Artifact structure:
1004+
1005+
[source]
----
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "subject": { <1>
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "size": 870,
    "digest": "sha256:d7fe7ac73bd294141563993dcbc344512804048941128ab996b00f37df8b8daf"
  },
  "layers": [
    ... as illustrated previously ...
  ]
}
----
1020+
<1> descriptor pointing to `quay.io/my-org/my-repository:modelcar`
1021+
1022+
This allows referrer discovery on OCI registries that support the OCI Distribution Referrers API.
1023+
1024+
For example, using `oras discover`:
1025+
1026+
[source]
----
% oras discover quay.io/my-org/my-repository:modelcar --format json
{
  "manifests": [
    {
      "reference": "quay.io/my-org/my-repository@sha256:e2f103d916bade657cbd3cfa616dbbd6a5e4488e143e75143cb24b3ef79ff376",
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:e2f103d916bade657cbd3cfa616dbbd6a5e4488e143e75143cb24b3ef79ff376", <1>
      "size": 1058
    }
  ]
}

% oras discover quay.io/my-org/my-repository:modelcar --format table
Discovered 1 artifact referencing quay.io/my-org/my-repository:modelcar
Digest: sha256:d7fe7ac73bd294141563993dcbc344512804048941128ab996b00f37df8b8daf

Artifact Type   Digest
                sha256:e2f103d916bade657cbd3cfa616dbbd6a5e4488e143e75143cb24b3ef79ff376 <1>
----
1047+
<1> digest reference of `quay.io/my-org/my-repository:evaluations`
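
The JSON output can also be consumed programmatically. The following Python sketch extracts the referrer digests, assuming the output has the shape shown in the example above (a top-level `manifests` list). The helper name `referrer_digests` is hypothetical, and other `oras` versions may emit a different structure.

```python
import json

# Copy of the example `oras discover --format json` output shown above.
DISCOVER_OUTPUT = """
{
  "manifests": [
    {
      "reference": "quay.io/my-org/my-repository@sha256:e2f103d916bade657cbd3cfa616dbbd6a5e4488e143e75143cb24b3ef79ff376",
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:e2f103d916bade657cbd3cfa616dbbd6a5e4488e143e75143cb24b3ef79ff376",
      "size": 1058
    }
  ]
}
"""


def referrer_digests(discover_json: str) -> list:
    """Return the digest of every artifact that refers to the Subject."""
    data = json.loads(discover_json)
    return [manifest["digest"] for manifest in data.get("manifests", [])]


if __name__ == "__main__":
    for digest in referrer_digests(DISCOVER_OUTPUT):
        print(digest)
```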
1048+
1049+
This provides a foundation for building a reference chain using the OCI Referrers API.
1050+
1051+
8461052
=== Using an `InferenceService`
8471053

8481054
[NOTE]
