update application profiles to include compute requirements based on requirements in edge cloud #15
Changes from all commits
d6d6d74
1a41c8d
d829234
fef54a5
bf2ba30
71d7a73
dcabf5a
82c0b3d
1d1b64b
d4639b9
bd44e2e
45a93cf
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -7,14 +7,15 @@ info: | |
| name: Apache 2.0 | ||
| url: https://www.apache.org/licenses/LICENSE-2.0.html | ||
| description: | | ||
| Application profiles allows application developers to share all the | ||
| information about their application which would be relevant for network/ | ||
| CAMARA APIs related decision making. | ||
| Application profiles allow developers to specify all relevant information about their application for both network and compute resource | ||
| requirements, supporting CAMARA APIs and network decision making. | ||
|
|
||
| To start with the API will provide operations to define, read and manage | ||
| an application's thresholds for network quality (latency, jitter, loss, | ||
| throughput). This scope will be expanded further based on addtional | ||
| requirements from other applicable CAMARA APIs | ||
| This API enables defining, reading, and managing application requirements, including: | ||
| - Network quality thresholds (latency, jitter, loss, throughput) | ||
| - Compute resource thresholds (CPU, GPU, memory, storage) | ||
|
|
||
| The information captured as part of application profiles can be used in different usecases for decision making. Please refer connectivity insights and session insights for more details as a reference to see how the information in application profiles is used for decision making. | ||
| The scope will expand as new requirements from CAMARA APIs emerge. | ||
|
|
||
| ## Errors | ||
|
|
||
|
|
@@ -25,23 +26,35 @@ info: | |
| with an explanation. | ||
|
|
||
| ### Additional CAMARA error responses | ||
| The list of error codes in this API specification is not exhaustive. Therefore | ||
| the API specification may not document some non-mandatory error statuses as | ||
| indicated in `CAMARA API Design Guidelines`. | ||
| The list of error codes in this API specification is not exhaustive. | ||
| Therefore the API specification may not document some non-mandatory error | ||
| statuses as indicated in `CAMARA API Design Guidelines`. | ||
|
|
||
| Please refer to the `CAMARA_common.yaml` of the Commonalities Release associated | ||
| to this API version for a complete list of error responses. | ||
| Please refer to the `CAMARA_common.yaml` of the Commonalities Release | ||
| associated to this API version for a complete list of error responses. | ||
|
|
||
| As a specific rule, error `501 - NOT_IMPLEMENTED` can be only a possible error | ||
| response if it is explicitly documented in the API. | ||
| As a specific rule, error `501 - NOT_IMPLEMENTED` can be only a possible | ||
| error response if it is explicitly documented in the API. | ||
|
|
||
| # Authorization and authentication | ||
|
|
||
| The "Camara Security and Interoperability Profile" provides details of how an API consumer requests an access token. Please refer to Identity and Consent Management (https://github.com/camaraproject/IdentityAndConsentManagement/) for the released version of the profile. | ||
| The "Camara Security and Interoperability Profile" provides details of how | ||
| an API consumer requests an access token. Please refer to Identity and | ||
| Consent Management | ||
| (https://github.com/camaraproject/IdentityAndConsentManagement/) for the | ||
| released version of the profile. | ||
|
|
||
| The specific authorization flows to be used will be agreed upon during the onboarding process, happening between the API consumer and the API provider, taking into account the declared purpose for accessing the API, whilst also being subject to the prevailing legal framework dictated by local legislation. | ||
| The specific authorization flows to be used will be agreed upon during the | ||
| onboarding process, happening between the API consumer and the API provider, | ||
| taking into account the declared purpose for accessing the API, whilst also | ||
| being subject to the prevailing legal framework dictated by local | ||
| legislation. | ||
|
|
||
| In cases where personal data is processed by the API and users can exercise their rights through mechanisms such as opt-in and/or opt-out, the use of three-legged access tokens is mandatory. This ensures that the API remains in compliance with privacy regulations, upholding the principles of transparency and user-centric privacy-by-design. | ||
| In cases where personal data is processed by the API and users can exercise | ||
| their rights through mechanisms such as opt-in and/or opt-out, the use of | ||
| three-legged access tokens is mandatory. This ensures that the API remains | ||
| in compliance with privacy regulations, upholding the principles of | ||
| transparency and user-centric privacy-by-design. | ||
|
|
||
| contact: | ||
| email: sp-edc@lists.camaraproject.org | ||
|
|
@@ -287,12 +300,32 @@ components: | |
| RateUnitEnum: | ||
| type: string | ||
| enum: | ||
| - bps | ||
| - kbps | ||
| - Bps | ||
| - Kbps | ||
| - Mbps | ||
| - Gbps | ||
| - Tbps | ||
|
|
||
| Compute: | ||
| type: object | ||
| properties: | ||
| value: | ||
| type: integer | ||
| example: 10 | ||
| format: int32 | ||
| minimum: 0 | ||
| maximum: 1024 | ||
| unit: | ||
| $ref: "#/components/schemas/ComputeUnitEnum" | ||
|
|
||
| ComputeUnitEnum: | ||
| type: string | ||
| enum: | ||
| - Kb | ||
| - Mb | ||
| - Gb | ||
| - Tb | ||
|
|
||
| PacketDelayBudget: | ||
| description: | | ||
| The packet delay budget is the maximum allowable one-way latency | ||
|
|
@@ -307,7 +340,7 @@ components: | |
|
|
||
| PacketErrorLossRate: | ||
| type: integer | ||
| description: | | ||
| description: | ||
| The exponential power of the allowable error loss rate 10^(-N). | ||
| For instance 3 would be an error loss rate of 10 to the power of -3 | ||
| (0.001) | ||
|
|
@@ -329,7 +362,7 @@ components: | |
| example: 3 | ||
|
|
||
| Jitter: | ||
| description: | | ||
| description: | ||
| The jitter requirement aims to limit the maximum variation in | ||
| round-trip packet delay for the 99th percentile of traffic, following | ||
| ITU Y.1540 standards. It considers only acknowledged packets in a | ||
|
|
@@ -352,25 +385,75 @@ components: | |
| allOf: | ||
| - $ref: "#/components/schemas/Rate" | ||
|
|
||
| targetMinCPU: | ||
| type: number | ||
| description: | ||
| Number of vCPUs required for the application. Fractional values are allowed (e.g., 0.5 = half a vCPU). The value represents the minimum amount of CPU resources to be allocated to the application instance. | ||
| example: 0.5 | ||
|
|
||
| targetMinGPU: | ||
| description: | | ||
| This is the target minimum number of GPUs required by the application | ||
| type: integer | ||
| example: 1 | ||
|
|
||
| gpuVendorType: | ||
| type: string | ||
| enum: | ||
| - Nvidia | ||
| - AMD | ||
| description: GPU vendor name e.g. NVIDIA, AMD etc. | ||
| example: Nvidia | ||
|
|
||
| gpuModelName: | ||
| type: string | ||
| description: Model name corresponding to vendorType may include info e.g. for NVIDIA, model name could be “Tesla M60”, “Tesla V100” etc. | ||
|
|
||
| targetMinMemory: | ||
|
Attributes like targetMinMemory seem to apply at the application level. The applications defined so far may use packaging formats such as Helm charts or Compose files, and those descriptors or charts can contain one or more containers. How would targetMinMemory then be applied to multiple components? I think it needs to be clarified when such attributes are to be used, to avoid any ambiguity.

The approach we have taken is that the specified resource requirements are the total for the whole application, regardless of how many containers or VMs the application instance may spawn. I think that's also the approach here, but I agree it would be good to spell it out.

A related comment: while we generally specify resources as the total an application needs, for Kubernetes we also allow specification of a per-node minimum. For example, the total memory your application needs may be 30Gb. Based on totals, a 3-node cluster with 10Gb per node would work. However, if a single container in the application requires 15Gb, then application deployment will fail. I think Mahesh stated that he's ignoring this case for now, but it is relevant in this conversation. For reference, here are our resource definitions, which we developed with Telefonica: https://github.com/edgexr/edge-cloud-platform/blob/main/api/edgeproto/resources.proto
Contributor
Author
I don't think the goal here is to capture the level of detail and granularity that EAM does. EAM requires all of those details from an orchestration perspective.

I see, I understand the intent now. I'm a little worried about the duplication of data and work, though. Does this mean the application provider needs to maintain separate but potentially partially redundant application profiles (definitions) both here and in EAM, and potentially in other places?

Also, I understand the intent is to be more general here, but depending on what you intend to use it for, you will need more specific information for something like optimal edge placement, if you actually want an answer that agrees with what EAM will do or allow. For example, an edge site may support only the ARM architecture, or only containerized workloads and not VMs, or may not support QoS (because it's running on a public cloud instead of in-network).

I think it would be OK if the intent did not overlap with other API functionality. I would like to understand the potential use cases, how a user or client would interact with this API, and how that would flow into calling the other CAMARA Traffic/EAM/etc. APIs. Especially since these are all CAMARA APIs, I feel they should work together without us having to maintain duplicate schemas or requiring the user to maintain duplicate profiles. Should the application profiles here be a common base definition that application profiles in other APIs can incorporate, import, or extend without duplication? I'm not sure.
But I'm worried that going forward without a plan and saying we'll just optimize it later realistically means it's unlikely to ever get optimized.
Contributor
Author
Here are some of the use cases, but please consider this as an exhaustive list.
Your point is valid that, if a given operator supports all CAMARA APIs, application providers will have to provide partially redundant data in different APIs. But please also consider scenarios where operators do not support all the CAMARA APIs and only support a subset.

In my view, it would help to add hints for the various QoS attributes that answer the questions a developer is likely to have, so they can supply the right parameter values. For example, for a composite multi-container app, the API user may sum up the aggregate CPU, memory, etc. to get the optimal outcome from the API. Please correct me if this is not needed or is explained to the API consumer in some other way.
Contributor
Author
For QoS attributes, the schema and descriptions were reused from Quality on Demand as much as possible. In terms of compute requirements, application profiles currently don't get into the details of whether the app is a single container or multiple containers; they just capture the compute resource requirements for the application as a whole. |
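
The difference between the total-based view and the per-node view raised in this thread can be sketched as follows. This is a minimal illustration, not part of the PR: the function names, the 15/10/5 container split, and the first-fit-decreasing placement heuristic are assumptions made for the example; only the 30Gb total, the three 10Gb nodes, and the 15Gb container come from the comment above.

```python
def fits_by_total(required_total_gb, node_capacities_gb):
    """Total-based check: compare the application total with the cluster total."""
    return required_total_gb <= sum(node_capacities_gb)


def fits_per_container(container_sizes_gb, node_capacities_gb):
    """Per-node check: place each container on a node with enough free memory.

    Uses first-fit decreasing, a simple heuristic (hypothetical, not what any
    real orchestrator is claimed to do).
    """
    free = list(node_capacities_gb)
    for size in sorted(container_sizes_gb, reverse=True):
        for i, capacity in enumerate(free):
            if size <= capacity:
                free[i] -= size
                break
        else:
            return False  # no single node can host this container
    return True


nodes = [10, 10, 10]      # three nodes with 10 Gb of memory each
containers = [15, 10, 5]  # hypothetical split of a 30 Gb application

print(fits_by_total(sum(containers), nodes))  # True
print(fits_per_container(containers, nodes))  # False
```

A profile that records only the application total passes the first check and still fails at deployment, which is the case the reviewer describes.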
||
| description: | | ||
| This is the target minimum memory required by the application | ||
| allOf: | ||
| - $ref: "#/components/schemas/Compute" | ||
|
|
||
| targetMinEphemeralStorage: | ||
| description: | | ||
| This is the target minimum ephemeral storage required by the application | ||
| allOf: | ||
| - $ref: "#/components/schemas/Compute" | ||
|
It looks like the "Compute" schema is more of a value descriptor than compute itself. Should we change the name to a more appropriate one?
Contributor
Author
Compute is being reused for parameters like memory, storage, etc. Any recommendation for a different name?

It could be something like MemoryValueUnit or similar, whatever feels more user friendly or better explains the parameter's intent.
Contributor
Author
When the YAML is viewed in Swagger, here is how it looks, with each resource having a unit. |
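
Because `Compute` is a value/unit pair, two requirements can only be compared after conversion to a common unit. The sketch below shows one way to do that. The schema does not say whether `Kb`, `Mb`, `Gb`, and `Tb` are binary or decimal multiples, so the factor of 1024 per step is an assumption, as are the names `UNIT_TO_KB` and `to_kb`. The 0 to 1024 range check mirrors the `minimum` and `maximum` declared on `value`.

```python
# Assumed conversion factors: each unit is 1024 times the previous one.
UNIT_TO_KB = {"Kb": 1, "Mb": 1024, "Gb": 1024**2, "Tb": 1024**3}


def to_kb(compute):
    """Convert a Compute-style dict ({"value": int, "unit": str}) to Kb."""
    unit = compute["unit"]
    if unit not in UNIT_TO_KB:
        raise ValueError(f"unknown unit: {unit}")
    value = compute["value"]
    if not 0 <= value <= 1024:
        raise ValueError(f"value out of range 0..1024: {value}")
    return value * UNIT_TO_KB[unit]


# 1024 Mb and 1 Gb normalise to the same number under the assumed factors.
print(to_kb({"value": 1024, "unit": "Mb"}) == to_kb({"value": 1, "unit": "Gb"}))
```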
||
|
|
||
| targetMinPersistentStorage: | ||
| description: | | ||
| This is the target minimum persistent storage required by the | ||
| application | ||
| allOf: | ||
| - $ref: "#/components/schemas/Compute" | ||
|
|
||
| ApplicationProfile: | ||
| type: object | ||
| required: | ||
| - applicationProfileId | ||
| - networkQualityThresholds | ||
| properties: | ||
| applicationProfileId: | ||
| type: string | ||
| format: uuid | ||
| networkQualityThresholds: | ||
| $ref: "#/components/schemas/NetworkQualityThresholds" | ||
| computeResources: | ||
| $ref: "#/components/schemas/ComputeResourcesThresholds" | ||
| anyOf: | ||
| - required: [networkQualityThresholds] | ||
| - required: [computeResources] | ||
|
Comment on lines +441 to +445
I see that either networkQualityThresholds or computeResources is required here, but in ApplicationProfileRequest, networkQualityThresholds is always required. ApplicationProfileRequest probably needs to be updated?
Contributor
Modified "ApplicationProfileRequest" to require either networkQualityThresholds or computeResources.
Contributor
Author
@gainsley could you confirm you are good with this?
Contributor
Author
If you could confirm this, I can go ahead and merge this PR and submit the release candidate PR.

Yes, that looks good.
Contributor
Thank you @gainsley |
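
The rule agreed in this thread (at least one of `networkQualityThresholds` or `computeResources` must be present) can be sketched as a hand-written check. This is an illustration only, not a JSON Schema validator and not code from the PR; the function name `validate_profile_request` is hypothetical. The non-empty check reflects the `minProperties: 1` declared on both threshold objects in the diff.

```python
def validate_profile_request(body):
    """Return a list of problems; an empty list means the body passes this check."""
    problems = []
    sections = ("networkQualityThresholds", "computeResources")
    present = [name for name in sections if name in body]
    if not present:
        problems.append(
            "one of networkQualityThresholds or computeResources is required"
        )
    for name in present:
        value = body[name]
        # Both threshold objects declare minProperties: 1 in the schema.
        if not isinstance(value, dict) or len(value) < 1:
            problems.append(f"{name} must be an object with at least one property")
    return problems


# A compute-only request is accepted once networkQualityThresholds is optional.
print(validate_profile_request({"computeResources": {"targetMinCPU": 0.5}}))  # []
print(validate_profile_request({}))
```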
||
|
|
||
| ApplicationProfileRequest: | ||
| type: object | ||
| required: | ||
| - networkQualityThresholds | ||
| anyOf: | ||
| - required: [networkQualityThresholds] | ||
| - required: [computeResources] | ||
| properties: | ||
| networkQualityThresholds: | ||
| $ref: "#/components/schemas/NetworkQualityThresholds" | ||
| computeResources: | ||
| $ref: "#/components/schemas/ComputeResourcesThresholds" | ||
|
|
||
| NetworkQualityThresholds: | ||
| type: object | ||
|
|
@@ -385,6 +468,29 @@ components: | |
| $ref: "#/components/schemas/PacketErrorLossRate" | ||
| jitter: | ||
| $ref: "#/components/schemas/Jitter" | ||
| minProperties: 1 | ||
|
|
||
| ComputeResourcesThresholds: | ||
| type: object | ||
| properties: | ||
| targetMinCPU: | ||
| $ref: "#/components/schemas/targetMinCPU" | ||
| targetMinMemory: | ||
| $ref: "#/components/schemas/targetMinMemory" | ||
|
Comment on lines +476 to +479
Should these be min values or max values? If they are min values, does that mean we are allowing infinite over-provisioning of resources? Without a max value, we can't limit the amount of resources each application uses, and we can't calculate a total max value for multiple applications in case they run in a shared environment (multiple applications on a single Kubernetes cluster). From the viewpoint of managing resource allocation, it is better to require the max values that an application needs, rather than the min. In our platform, we treat any resource values as max values (resource limits, in Kubernetes terms).
Contributor
Author
Here, minimum is used to identify which edge sites are able to meet the application's minimum resource requirements; hence minimum. |
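
The site-selection use of minimums described in the reply can be sketched as a simple filter. Everything here is assumed for illustration: the function name, the flat `cpu` and `memory_gb` keys, and the site data are hypothetical and do not come from the API or from any operator's implementation.

```python
def sites_meeting_minimums(minimums, sites):
    """Return the names of sites whose available resources meet every minimum."""
    eligible = []
    for name, available in sites.items():
        # A resource the site does not report is treated as unavailable (0).
        if all(available.get(key, 0) >= needed for key, needed in minimums.items()):
            eligible.append(name)
    return eligible


minimums = {"cpu": 0.5, "memory_gb": 4}
sites = {
    "edge-a": {"cpu": 8, "memory_gb": 16},
    "edge-b": {"cpu": 0.25, "memory_gb": 32},  # too little CPU
    "edge-c": {"cpu": 2},                      # memory not reported
}

print(sites_meeting_minimums(minimums, sites))  # ['edge-a']
```

A filter like this only answers whether a site can host the application. It says nothing about capping usage, which is the separate concern raised by the reviewer.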
||
| gpuVendorType: | ||
| $ref: "#/components/schemas/gpuVendorType" | ||
| gpuModelName: | ||
| $ref: "#/components/schemas/gpuModelName" | ||
| targetMinGPU: | ||
| $ref: "#/components/schemas/targetMinGPU" | ||
| targetMinGPUMemory: | ||
|
For GPUs, would providing only the quantity be enough, or might the API also need some kind of GPU model information that the application depends on? As I understand it, a vendor offers many types or architectures within a given GPU family. Given that, an application may work only on selected GPU architectures.

Agree. For reference, see the definition of GpuInfo on the Federation API interface:

Agree as well. We have also adopted a GPU spec based on the EWBI APIs (this is in protobuf format):
Contributor
Author
Right now I have GPU number and memory. Is the recommendation to add the vendor and model?

I suggest keeping vendor and model, as a generic definition may not work due to the disparate capabilities across vendors and models.
Contributor
Author
The recommended schema for gpuVendorType and gpuModelName has been incorporated in the latest changes. |
||
| $ref: "#/components/schemas/targetMinMemory" | ||
| targetMinEphemeralStorage: | ||
| $ref: "#/components/schemas/targetMinEphemeralStorage" | ||
| targetMinPersistentStorage: | ||
| $ref: "#/components/schemas/targetMinPersistentStorage" | ||
| description: Compute resources of an Application Profile | ||
| minProperties: 1 | ||
|
I am not sure about the urgency of this PR for Fall25. In general, I think that this element describes the computing resources required. However, it would be good to try to align with the definition in the E/WBI API interface for Federation, which has a method to reserve computing resources on a partner network. They are using this object there:

As you see, it is not exactly the same. One thing that is missing is the CPU architecture type. Other missing fields might be considered optional at the moment in this API. |
||
|
|
||
| ErrorInfo: | ||
| type: object | ||
|
|
@@ -430,9 +536,12 @@ components: | |
| value: | ||
| status: 400 | ||
| code: INVALID_ARGUMENT | ||
| message: Client specified an invalid argument, request body or query param. | ||
| message: Client specified an invalid argument, request body | ||
| or query param. | ||
| GENERIC_400_OUT_OF_RANGE: | ||
| description: Out of Range. Specific Syntax Exception used when a given field has a pre-defined range or a invalid filter criteria combination is requested | ||
| description: Out of Range. Specific Syntax Exception used when | ||
| a given field has a pre-defined range or a invalid filter | ||
| criteria combination is requested | ||
| value: | ||
| status: 400 | ||
| code: OUT_OF_RANGE | ||
|
|
@@ -488,13 +597,17 @@ components: | |
| - INVALID_TOKEN_CONTEXT | ||
| examples: | ||
| GENERIC_403_PERMISSION_DENIED: | ||
| description: Permission denied. OAuth2 token access does not have the required scope or when the user fails operational security | ||
| description: Permission denied. OAuth2 token access does not | ||
| have the required scope or when the user fails operational | ||
| security | ||
| value: | ||
| status: 403 | ||
| code: PERMISSION_DENIED | ||
| message: Client does not have sufficient permissions to perform this action. | ||
| message: Client does not have sufficient permissions to | ||
| perform this action. | ||
| GENERIC_403_INVALID_TOKEN_CONTEXT: | ||
| description: Reflect some inconsistency between information in some field of the API and the related OAuth2 Token | ||
| description: Reflect some inconsistency between information in | ||
| some field of the API and the related OAuth2 Token | ||
| value: | ||
| status: 403 | ||
| code: INVALID_TOKEN_CONTEXT | ||
|
|
@@ -552,13 +665,15 @@ components: | |
| - TOO_MANY_REQUESTS | ||
| examples: | ||
| GENERIC_429_QUOTA_EXCEEDED: | ||
| description: Request is rejected due to exceeding a business quota limit | ||
| description: Request is rejected due to exceeding a business | ||
| quota limit | ||
| value: | ||
| status: 429 | ||
| code: QUOTA_EXCEEDED | ||
| message: Out of resource quota. | ||
| GENERIC_429_TOO_MANY_REQUESTS: | ||
| description: Access to the API has been temporarily blocked due to rate or spike arrest limits being reached | ||
| description: Access to the API has been temporarily blocked due | ||
| to rate or spike arrest limits being reached | ||
| value: | ||
| status: 429 | ||
| code: TOO_MANY_REQUESTS | ||
|
|
||
From a general application developer's perspective, would terms like PacketDelayBudget or targetMinUpstreamRate be relatable to what they typically use when working with other cloud-like technologies? I have typically seen these terms in 3GPP specifications, but not much in public cloud or on-prem environments. It may be a challenge to define these terms crisply in alignment with simpler ones. Or could we find more generic and widely accepted terms to represent these parameters?

These network KPI terms have been used to align with Quality on Demand. The expectation is that application developers, based on their testing, know how their app performs under various conditions and use application profiles to set up the metadata in terms of the minimum values they need.

I think I missed the alignment with the QoD profiles API. This looks fine.