update application profiles to include compute requirements based on requirements in edge cloud by maheshc01 · Pull Request #15 · camaraproject/ApplicationProfiles

maheshc01 · 2025-06-26T17:42:47Z

What type of PR is this?

enhancement/feature

What this PR does / why we need it:

These enhancements add support for capturing an application's compute resource requirements, which can be used for extended use cases in other CAMARA APIs (Edge Cloud as well as additional APIs).

The PR addresses the review comments received during the walkthrough on the Edge Cloud call.

Which issue(s) this PR fixes:

Fixes #14

Special notes for reviewers:

Changelog input

 release-note

Additional documentation

This section can be blank.

docs


maheshc01 · 2025-06-26T18:35:52Z

@gunjald @gainsley @JoseMConde Requesting your review. I have incorporated all the feedback I recorded during the walkthrough I gave on the Edge Cloud call, such as using persistent and ephemeral instead of internal and external storage.

I went over the EAM YAML, specifically PR #280, as a reference so that we use the same standards as much as possible. At the same time, I want to keep this first iteration very simple, so I did not bring in any of the CPUPool, GPUPool, etc.

gunjald · 2025-06-27T10:01:17Z

code/API_definitions/application-profiles.yaml

-    CAMARA APIs related decision making.
+    Application profiles allow developers to specify all relevant information about
+    their application for both network and compute resource requirements, supporting
+    CAMARA APIs and network decision making.


Should we also indicate which other APIs would use the information submitted through this API, when that information will be used, and whether that involves any other interaction from the developer, such as another API invocation?

My opinion is that multiple CAMARA APIs can be related to this API, such as Quality on Demand, Connectivity Insights, Session Insights, Edge Cloud, etc. For now I am keeping the description generic to the capabilities of this API and not getting into details of how and where it will be used.

I may be wrong, but on the face of it an application profile can be created, viewed, updated or deleted just like any API resource, and it is not evident what I, as an API user, am supposed to do with the applicationProfileId that the POST call has created. In my understanding, we need to explain to the API consumer how they can use the applicationProfileId created with this API, because the set of methods does not clearly indicate the use case in the summary section of the API. Or am I missing the point here?

"The information captured as part of application profiles can be used in different use cases for decision making. Please refer to Connectivity Insights and Session Insights for more details, as a reference for how the information in application profiles is used for decision making."
Added the above text to explain more on this. Hope this clarifies.

jgarciatovar · 2025-06-27T10:32:46Z

code/API_definitions/application-profiles.yaml


+    targetMinCPU:
+      type: number
+      description: >


Typo: ">" should be "|".

addressed in latest commit

jgarciatovar · 2025-06-27T10:36:37Z

code/API_definitions/application-profiles.yaml

+        targetMinPersistentStorage:
+          $ref: "#/components/schemas/targetMinPersistentStorage"
+      description: Compute resources of a Application Profile
+      minProperties: 1


I am not sure about the urgency of this PR for Fall25. In general, I think that this element describes computing resources required. However, it would be good to try to align with the definition at E/WBI API interface for Federation, which has a method to reserve computing resources on a partner network. They are using this object there:

ComputeResourceInfo:
  type: object
  required:
    - cpuArchType
    - numCPU
    - memory
  properties:
    cpuArchType:
      type: string
      enum:
        - ISA_X86_64
        - ISA_ARM_64
      description: CPU Instruction Set Architecture (ISA) E.g., Intel, Arm etc.
    numCPU:
      $ref: '#/components/schemas/Vcpu'
    memory:
      type: integer
      format: int64
      description: Amount of RAM in Mbytes
    diskStorage:
      type: integer
      format: int32
      description: Amount of disk storage in Gbytes for a given ISA type
    gpu:
      type: array
      items:
        $ref: '#/components/schemas/GpuInfo'
    vpu:
      type: integer
      description: Number of Intel VPUs available for a given ISA type
    fpga:
      type: integer
      description: Number of FPGAs available for a given ISA type
    hugepages:
      type: array
      items:
        $ref: '#/components/schemas/HugePage'
    cpuExclusivity:
      type: boolean
      description: Support for exclusive CPUs

As you see, it is not exactly the same. One thing that is missing is the CPU Arch type. Other missing fields might be considered optional at the moment in this API.

jgarciatovar · 2025-06-27T10:37:48Z

code/API_definitions/application-profiles.yaml

+    ComputeUnitEnum:
+      type: string
+      enum:
+        - kb


Is it correct that all the other enum values start with a capital letter, while kb is lower case?

addressed in latest commit

gunjald · 2025-06-27T10:40:20Z

code/API_definitions/application-profiles.yaml

-    an application's thresholds for network quality (latency, jitter, loss,
-    throughput). This scope will be expanded further based on addtional
-    requirements from other applicable CAMARA APIs
+    This API enables defining, reading, and managing application requirements, including:


Is this for applications in general, or do we want to say "Edge Application requirements"? Just for better clarity, as we also have edge-specific application APIs.

Application profiles are generic metadata about an application. Not all application profiles are necessarily specific to edge applications.

This part is fine. But then the question is: which applications does the application profile point to? Does it point to an application managed by a telco on behalf of the API consumer, or is the telco or API platform unaware of the application referred to by the application profile? I think some documentation may help from the API consumer's perspective.

An application profile is metadata associated with an application. It is not tied to a specific deployed instance of the application. Once created, an application profile can be referenced via its application profile ID in other CAMARA APIs where the metadata is required for decision making.

gunjald · 2025-06-27T10:51:50Z

code/API_definitions/application-profiles.yaml

+        - Gb
+        - Tb
+
    PacketDelayBudget:


From a general application developer's perspective, would terms like PacketDelayBudget or targetMinUpstreamRate be relatable to what they typically use when working with other cloud-like technologies? I have typically seen these terms in 3GPP specifications, but not much in public cloud or other on-prem environments. In my view it may be challenging to define these terms crisply in alignment with simpler ones. Or could we find more generic and widely accepted terms to represent these parameters?

These network KPI terms are used to align with Quality on Demand. The expectation is that application developers know, from their testing, how their app performs under various conditions, and use application profiles to set up the metadata in terms of the minimum values they need.

I think I missed the alignment with the QoD profiles API. This looks fine.

gunjald · 2025-06-27T10:56:38Z

code/API_definitions/application-profiles.yaml

+      format: integer
+      example: 1
+
+    targetMinMemory:


Attributes like targetMinMemory seem to be applicable at the application level. The applications defined so far may have packaging formats such as Helm charts or Compose files, and those descriptors or charts can contain one or more containers. How would targetMinMemory then be applied to multiple components in those circumstances? In my view this needs to be clarified, or it should be defined when such attributes are to be used, to avoid any ambiguity.

The approach we have taken is that the specified resource requirements are the total for the whole application, regardless of how many containers or VMs the application instance may spawn. I think that's also the approach here, but I agree it would be good to spell it out.

A related comment here, while we generally specify resources as the total an application needs, for Kubernetes we also allow specification of a per-node minimum. For example, the total mem resources your application needs may be 30Gb. Based on totals, a 3-node cluster that has 10Gb each would work. However, if a single container in the application requires 15Gb, then application deployment will fail. I think Mahesh stated that he's ignoring this case for now, but it is relevant in this conversation. For reference, here are our resource definitions which we developed with Telefonica: https://github.com/edgexr/edge-cloud-platform/blob/main/api/edgeproto/resources.proto
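To make the 30Gb example above concrete, here is a minimal Python sketch of the difference between a total-only check and a check that also applies a per-node minimum. The `Cluster` class, function names and units are illustrative assumptions, not part of this PR or of the edge-cloud-platform definitions.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical cluster description: free memory per node, in Gb."""
    node_free_mem: list[float]


def fits_by_total(cluster: Cluster, total_mem: float) -> bool:
    """Total-only check: free memory summed over all nodes covers the request."""
    return sum(cluster.node_free_mem) >= total_mem


def fits_with_per_node_min(cluster: Cluster, total_mem: float,
                           per_node_min: float) -> bool:
    """Total check, plus at least one node must be able to host the
    largest single container (the per-node minimum)."""
    largest_node = max(cluster.node_free_mem, default=0)
    return fits_by_total(cluster, total_mem) and largest_node >= per_node_min


# Three nodes with 10Gb each, application needs 30Gb in total,
# but one container alone needs 15Gb.
cluster = Cluster([10, 10, 10])
print(fits_by_total(cluster, 30))               # prints True
print(fits_with_per_node_min(cluster, 30, 15))  # prints False
```

The total-only check accepts the cluster, while the per-node check rejects it, which is the deployment failure described in the comment.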

I don't think the goal here is to capture the same level of detail and granularity as in EAM. All of those details are required in EAM from an orchestration perspective.
Here we are only capturing basic metadata about the app to help in certain decision making. For example, with the compute resource requirements, Edge Cloud APIs should be able to find an optimal edge cloud based on the platform's capabilities and resource availability. For the actual application deployment, users will still have to use the EAM APIs.
In future we can always look at optimizing this so that developers do not have to give similar information in multiple places. The vision is that application profiles are generic, high-level metadata that will be used in multiple CAMARA APIs, while for more specific intents developers will have to use the respective APIs, such as EAM.

I see, I understand the intent now. I'm a little worried about the duplication of data/work, though. Does this mean the application provider needs to maintain separate but potentially partially redundant application profiles (definitions) both here and in EAM and potentially other places?

Also I understand the intent is to be more general here, but depending on what you intend to utilize it for, you will need more specific information for something like optimal edge placement, if you actually want to get an answer that agrees with what EAM will do/allow. For example, if an edge site only supports the ARM architecture, or only supports containerized workloads and not VMs, or doesn't support QoS (because it's running on a public cloud instead of in-network), etc. I think it would be ok if the intent was not overlapping with other API functionality.

I guess I would like to understand potential use cases and how a user/client would interact with this API and how that would flow to calling the other Camara Traffic/EAM/etc APIs. Especially since these are all Camara APIs I feel like they should work together without us having to maintain duplicate schemas, or require the user to maintain duplicate profiles. Should these application profiles here be a common base definition on which application profiles in other APIs can incorporate/import/extend, without having to duplicate? I'm not sure. But I'm worried that going forward without a plan and saying we'll just optimize it later, realistically means it's unlikely to ever get optimized.

Here are some of the use cases, but please do not consider this an exhaustive list.

Identify how network performance compares with what the application needs, and flag it to the application developer if the network cannot meet the minimum thresholds defined. Connectivity Insights supports this use case.
Specific to Edge Cloud:

Identify the optimal application endpoint to connect to for a given UE. Based on the latency and other thresholds defined in the application profile, operators can return a list of edge clouds where the application is deployed and the requirements are met.

Similarly, for application deployment, the metadata available in the application profile can be used to make a determination based on capabilities and resource availability across the edge cloud.
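As a rough illustration of the endpoint-selection use case, the Python sketch below filters candidate edge clouds by a latency threshold taken from an application profile. The field names (`packetDelayBudgetMs`, `latencyMs`, `edgeCloud`) and the sample data are hypothetical and are not taken from the CAMARA schemas.

```python
from __future__ import annotations


def select_endpoints(profile: dict, endpoints: list[dict]) -> list[dict]:
    """Return the endpoints whose measured latency is within the profile's
    latency threshold, lowest latency first."""
    budget_ms = profile["packetDelayBudgetMs"]  # hypothetical field name
    eligible = [e for e in endpoints if e["latencyMs"] <= budget_ms]
    return sorted(eligible, key=lambda e: e["latencyMs"])


# Hypothetical profile and measurements for one UE.
profile = {"packetDelayBudgetMs": 20}
endpoints = [
    {"edgeCloud": "zone-a", "latencyMs": 35},
    {"edgeCloud": "zone-b", "latencyMs": 12},
    {"edgeCloud": "zone-c", "latencyMs": 18},
]

print([e["edgeCloud"] for e in select_endpoints(profile, endpoints)])
# prints ['zone-b', 'zone-c']
```

A real operator implementation would likely weigh several thresholds at once; this sketch only shows the filtering idea for a single one.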

Your point is valid that if a given operator supports all CAMARA APIs, application providers will have to provide partially redundant data in different APIs. But please also consider scenarios where operators support only a subset of the CAMARA APIs.
For example, if an operator has a partnership-based approach with hyperscalers for edge cloud, it might want to support only decision making, while actual application deployment uses hyperscaler-provided tools.
My goal is to capture enough detail about the application in the application profile to support that decision making.

So in my view it would help if we could add hints for the various QoS attributes, answering some of the questions a developer may have, so that they can provide the right value for each parameter. For example, for a composite multi-container app, the API user may sum up the aggregate CPU, memory, etc. to get the optimal outcome from the API. But please correct me if this is not needed or is explained to the API consumer in some other way.

For QoS attributes, the schema and descriptions were reused from Quality on Demand as much as possible. In terms of compute requirements, application profiles currently do not get into whether the app is a single container or multiple containers; they just capture the compute resource requirements for the application as a whole.
This could be a continued discussion for any future enhancements, to be planned as needed.

gunjald · 2025-06-27T10:58:39Z

code/API_definitions/application-profiles.yaml

+      description: |
+        This is the target minimum ephemeral storage required by the application
+      allOf:
+        - $ref: "#/components/schemas/Compute"


It looks like the "Compute" schema is more of a value descriptor than compute itself. Should we change the name to a more appropriate one?

Compute is being reused for parameters like memory, storage, etc. Any recommendation for a different name?

It could be something like MemoryValueUnit, or whatever similar name feels more user-friendly or better explains the parameter's intent.

When the YAML is viewed in Swagger, here is how it looks, with each resource having a unit:
"targetMinGPUMemory": {
  "value": 10,
  "unit": "Kb"
},
"targetMinEphemeralStorage": {
  "value": 10,
  "unit": "Kb"
},
"targetMinPersistentStorage": {
  "value": 10,
  "unit": "Kb"
}

The same compute unit is used across memory, ephemeral storage, persistent storage and GPU memory. If I created MemoryValueUnit, I would need duplicate entries in the schema, which in my opinion this approach avoids.
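One consequence of a shared value/unit shape is that a consumer can normalize any of these resources with a single helper before comparing them. The Python sketch below shows that idea. The unit list (including `Mb`) and the 1024-based multipliers are assumptions made for illustration; the PR does not state how the units relate to each other.

```python
from __future__ import annotations

# Assumed multipliers relative to "Kb"; binary (1024) steps are an assumption.
UNIT_TO_KB = {"Kb": 1, "Mb": 1024, "Gb": 1024 ** 2, "Tb": 1024 ** 3}


def to_kb(resource: dict) -> float:
    """Convert a {"value": ..., "unit": ...} object to a number of Kb."""
    return resource["value"] * UNIT_TO_KB[resource["unit"]]


def meets_minimum(available: dict, required: dict) -> bool:
    """True if the available amount covers the required minimum,
    regardless of which units each side was expressed in."""
    return to_kb(available) >= to_kb(required)


# The same helper works for memory, GPU memory and both storage types.
required_storage = {"value": 10, "unit": "Kb"}
print(to_kb(required_storage))                                     # prints 10
print(meets_minimum({"value": 1, "unit": "Gb"},
                    {"value": 512, "unit": "Mb"}))                 # prints True
```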

gunjald · 2025-06-27T11:06:54Z

code/API_definitions/application-profiles.yaml

+          $ref: "#/components/schemas/targetMinMemory"
+        targetMinGPU:
+          $ref: "#/components/schemas/targetMinGPU"
+        targetMinGPUMemory:


For GPUs, is providing only the quantity enough, or might we also need some kind of GPU model information that the application depends on? As I understand it, a vendor can have many types or architectures within a given GPU family, so a given application may work only on selected GPU architectures.
So do we need to let the developer express GPU-related information by defining a GPU model? So far, though, I have not seen any standardization of GPU flavors that we could refer to.

Agree. For reference, here is the definition of GpuInfo on the Federation API interface:

GpuInfo:
  type: object
  required:
    - gpuVendorType
    - gpuModeName
    - gpuMemory
    - numGPU
  properties:
    gpuVendorType:
      type: string
      enum:
        - GPU_PROVIDER_NVIDIA
        - GPU_PROVIDER_AMD
      description: GPU vendor name e.g. NVIDIA, AMD etc.
      example: Nvidia
    gpuModeName:
      type: string
      description: Model name corresponding to vendorType may include info e.g. for NVIDIA, model name could be “Tesla M60”, “Tesla V100” etc.
    gpuMemory:
      type: integer
      description: GPU memory in Mbytes
    numGPU:
      type: integer
      description: Number of GPUs

Agree as well. We have also adopted a GPU spec based on the EWBI APIs (this is a protobuf format):

message GPUResource {
  // GPU model unique identifier
  string model_id = 1;
  // Count of how many of this GPU are required/present
  uint32 count = 2;
  // GPU vendor (nvidia, amd, etc)
  string vendor = 3;
  // Memory in GB
  uint64 memory = 4;
}

Right now I have the GPU count and memory. Is the recommendation to add the vendor and model?

I suggest keeping vendor and model, as a generic definition may not work given the disparate capabilities across vendors and models.

The recommended schema for gpuVendorType and gpuModelName has been incorporated in the latest changes.
I do have some questions around this, which I will raise as separate discussion points that can be taken up as enhancements.

gainsley · 2025-06-27T15:57:49Z

code/API_definitions/application-profiles.yaml

+      format: integer
+      example: 1
+
+    targetMinMemory:


The approach we have taken is that the specified resource requirements are the total for the whole application, regardless of how many containers or VMs the application instance may spawn. I think that's also the approach here, but I agree it would be good to spell it out.

gainsley · 2025-06-27T15:59:45Z

code/API_definitions/application-profiles.yaml

+          $ref: "#/components/schemas/targetMinMemory"
+        targetMinGPU:
+          $ref: "#/components/schemas/targetMinGPU"
+        targetMinGPUMemory:


Agree as well. We have also adopted a GPU spec based on the EWBI APIs (this is a protobuf format):

message GPUResource {
  // GPU model unique identifier
  string model_id = 1;
  // Count of how many of this GPU are required/present
  uint32 count = 2;
  // GPU vendor (nvidia, amd, etc)
  string vendor = 3;
  // Memory in GB
  uint64 memory = 4;
}

gainsley · 2025-06-27T16:13:02Z

code/API_definitions/application-profiles.yaml

+        targetMinCPU:
+          $ref: "#/components/schemas/targetMinCPU"
+        targetMinMemory:
+          $ref: "#/components/schemas/targetMinMemory"


Should these be min values or max values? If they are min values, that means we are allowing infinite over-provisioning of resources? Without a max value, we can't limit the amount of resources each application uses, and we can't calculate a total max value for multiple applications in case they run in a shared environment (multiple applications on a single Kubernetes cluster). From the viewpoint of managing resource allocation, it is better to require the max values that an application requires, rather than the min. In our platform, we treat any resource values as max values (resource limits in Kubernetes speak).

Here the minimum is used to identify which edge sites can meet the application's minimum resource requirements. Hence minimum.
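The "minimum as a site filter" reading can be sketched in Python as follows. The capacity keys (`cpu`, `gpu`), the site names and the mapping table are hypothetical; only `targetMinCPU` and `targetMinGPU` come from the schema under review, and the sketch treats both as plain numbers.

```python
from __future__ import annotations

# Hypothetical mapping from profile attributes to site capacity keys.
REQUIREMENT_TO_CAPACITY = {"targetMinCPU": "cpu", "targetMinGPU": "gpu"}


def site_meets_minimums(site: dict, compute_resources: dict) -> bool:
    """True if the site offers at least the minimum for every requirement
    present in the profile; requirements that are absent are ignored."""
    for requirement, capacity_key in REQUIREMENT_TO_CAPACITY.items():
        if requirement not in compute_resources:
            continue
        if site.get(capacity_key, 0) < compute_resources[requirement]:
            return False
    return True


def candidate_sites(sites: list[dict], compute_resources: dict) -> list[str]:
    """Names of the edge sites that satisfy the profile's minimums."""
    return [s["name"] for s in sites if site_meets_minimums(s, compute_resources)]


sites = [
    {"name": "edge-1", "cpu": 8, "gpu": 0},
    {"name": "edge-2", "cpu": 16, "gpu": 2},
    {"name": "edge-3", "cpu": 2, "gpu": 4},
]

print(candidate_sites(sites, {"targetMinCPU": 4, "targetMinGPU": 1}))
# prints ['edge-2']
print(candidate_sites(sites, {"targetMinCPU": 4}))
# prints ['edge-1', 'edge-2']
```

Note that this only answers "which sites could host the application"; it does not cap what an application may consume, which is the separate concern raised about max values.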

gainsley · 2025-06-27T16:21:48Z

code/API_definitions/application-profiles.yaml

+      format: integer
+      example: 1
+
+    targetMinMemory:


A related comment here, while we generally specify resources as the total an application needs, for Kubernetes we also allow specification of a per-node minimum. For example, the total mem resources your application needs may be 30Gb. Based on totals, a 3-node cluster that has 10Gb each would work. However, if a single container in the application requires 15Gb, then application deployment will fail. I think Mahesh stated that he's ignoring this case for now, but it is relevant in this conversation. For reference, here are our resource definitions which we developed with Telefonica: https://github.com/edgexr/edge-cloud-platform/blob/main/api/edgeproto/resources.proto

maheshc01 · 2025-07-02T23:50:03Z

I have addressed a number of review comments. For the rest, I have given an explanation of how I see it.
I will reiterate: don't expect application profiles to carry all the information that is captured as part of EAM. The intents of the APIs are different.
In EAM, the intent is to know all the information required for deploying the application.
In application profiles, the intent is to capture metadata about the application to enable specific decision making, such as: is the network able to meet the application's requirements for optimal performance, and what capabilities and minimum resources might we need in the edge cloud if we want to deploy it at the edge.

Kevsy · 2025-07-07T09:30:59Z

@jgarciatovar @gainsley @gunjald are you happy to proceed with the changes made by @maheshc01 , or does this require further discussion? We need to resolve urgently if this is to be included in Fall 25 :)

jgarciatovar · 2025-07-07T13:48:36Z

@jgarciatovar @gainsley @gunjald are you happy to proceed with the changes made by @maheshc01 , or does this require further discussion? We need to resolve urgently if this is to be included in Fall 25 :)

I am fine going with this version for Fall 25. Trying to discuss and address all these aspects is not realistic given the dates. For the benefit of the Optimal Edge Discovery API use case, I think it is better to just go ahead with the current version in Fall 25. At the same time, we can create an issue here to start a discussion about the open points.

I understand that the intent of ApplicationProfiles and EAM is not the same. However, I think that it is important to align components defined by CAMARA API with EWBI Federation API interface. Otherwise, integrating APIs on federated cases will be complex (i.e. Optimal Edge Zone API request for an app that is federated with several partner OPs).

maheshc01 · 2025-07-08T21:18:36Z

I have addressed the review comments raised by @gunjald. Once he gives his go-ahead, as agreed during the call, I will merge this PR to create a release candidate for Fall 25.

maheshc01 · 2025-07-09T13:24:19Z

@gunjald @JoseMConde please review and share your comments. This is holding up the release PR.
If you think this enhancement is not ready, I can also take it out of scope and go ahead with the rest of the changes in application profiles, which are more specific to aligning it with the latest Commonalities guidelines.
Please let me know your thoughts at the earliest.

JoseMConde · 2025-07-09T14:05:30Z

@maheshc01 from my side it looks good; let's see what @gunjald thinks.

gunjald · 2025-07-09T18:56:48Z

@maheshc01 from my side it looks good; let's see what @gunjald thinks.

In general the changes look good to me. However, I still think that, from this API's perspective, a link to or description of its association with the other connectivity APIs would have been useful for the API user. If the operations defined here were part of the other connectivity APIs, correlating the usage of applicationProfileId would have been implicit. And although the other connectivity insight APIs might reference this API, a reverse description here, using those APIs as an example, would have helped to visualize the correlation between applicationProfileId and the application defined in other APIs.
But this comment is mostly about API documentation. We may reconsider taking it up in the next cycle, and may open a discussion to see if there are more inputs from other members.

gainsley · 2025-07-09T21:30:37Z

code/API_definitions/application-profiles.yaml

+        computeResources:
+          $ref: "#/components/schemas/ComputeResourcesThresholds"
+      anyOf:
+        - required: [networkQualityThresholds]
+        - required: [computeResources]


I see either networkQualityThresholds or computeResources are required here, but in ApplicationProfileRequest, networkQualityThresholds is always required. Probably ApplicationProfileRequest needs to be updated?

Modified ApplicationProfileRequest to require either networkQualityThresholds or computeResources.

@gainsley could you confirm you are good with this?

If you could confirm this, I can go ahead and merge this PR and submit the release candidate PR.

yes that looks good

Thank you @gainsley
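For readers following this thread, the `anyOf` rule agreed above (at least one of the two blocks must be present, and both together are also valid) can be sketched as a plain Python check. The function name and error message are illustrative assumptions; an actual implementation would more likely rely on an OpenAPI or JSON Schema validator.

```python
from __future__ import annotations

# Property names taken from the schema snippet discussed in this thread.
REQUIRED_ANY_OF = ("networkQualityThresholds", "computeResources")


def validate_profile_request(body: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the body
    satisfies the anyOf rule (at least one of the two properties is present)."""
    errors = []
    if not any(name in body for name in REQUIRED_ANY_OF):
        errors.append(
            "at least one of networkQualityThresholds or computeResources is required"
        )
    return errors


print(validate_profile_request({"computeResources": {"targetMinCPU": 2}}))
# prints []
print(len(validate_profile_request({})))
# prints 1
```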

maheshc01 and others added 4 commits June 24, 2025 19:59

added compute resource requirements relared meta data in application …

d6d6d74

…profiles yaml

Merge branch 'camaraproject:main' into edgeCloudRequirements

1a41c8d

update to capture compute resource requirements based on requirements…

d829234

… in edge cloud

limited columns to 80 and addtional changes for description

fef54a5

maheshc01 requested review from Kevsy and urvika-v as code owners June 26, 2025 17:42

maheshc01 added 4 commits June 26, 2025 23:17

fixing linting errors

bf2ba30

linting errors

71d7a73

fixing linting error

dcabf5a

fixing linting errors

82c0b3d

maheshc01 requested review from JoseMConde, gainsley and gunjald June 26, 2025 18:27

gunjald reviewed Jun 27, 2025

jgarciatovar reviewed Jun 27, 2025

gunjald reviewed Jun 27, 2025

gainsley reviewed Jun 27, 2025

address the review feedback

1d1b64b

maheshc01 mentioned this pull request Jul 3, 2025

Prepare r1.1 v0.5.0-rc.1 #16

Merged

maheshc01 added 2 commits July 9, 2025 01:49

incorporated review comments

d4639b9

addressing linting errors

bd44e2e

maheshc01 requested a review from gunjald July 9, 2025 10:51

JoseMConde approved these changes Jul 9, 2025

gainsley reviewed Jul 9, 2025

Update application-profiles.yaml

45a93cf

urvika-v requested a review from gainsley July 10, 2025 09:44

urvika-v approved these changes Jul 11, 2025

urvika-v merged commit a7f99c9 into camaraproject:main Jul 11, 2025
2 checks passed
