update application profiles to include compute requirements based on requirements in edge cloud#15

Merged
urvika-v merged 12 commits into camaraproject:main from maheshc01:edgeCloudRequirements
Jul 11, 2025

Conversation

@maheshc01
Contributor

What type of PR is this?

  • enhancement/feature

What this PR does / why we need it:

The enhancements add support for capturing the compute resource requirements of an application. This information can be used for extended use cases in other CAMARA APIs (Edge Cloud as well as additional APIs).

The PR addresses the review comments received during the walkthrough given as part of the Edge Cloud call.

Which issue(s) this PR fixes:

Fixes #14

Special notes for reviewers:

Changelog input

 release-note

Additional documentation

This section can be blank.

docs

@maheshc01 maheshc01 requested review from Kevsy and urvika-v as code owners June 26, 2025 17:42
@maheshc01
Contributor Author

@gunjald @gainsley @JoseMConde Requesting your review. I have incorporated all the feedback I recorded during the walkthrough I gave in the Edge Cloud call, such as using persistent and ephemeral instead of internal and external storage.

I went over the EAM YAML, specifically PR #280, for reference so as to use the same standards as much as possible. At the same time, I want to keep this first iteration very simple and did not want to bring in CPUPool, GPUPool, etc.

CAMARA APIs related decision making.
Application profiles allow developers to specify all relevant information about
their application for both network and compute resource requirements, supporting
CAMARA APIs and network decision making.

Should we also indicate which other APIs would use the information submitted through this API, when this information will be used, and whether that involves any other interaction from the developer, such as another API invocation?

Contributor Author

My opinion is that multiple CAMARA APIs can be related to this API, such as Quality on Demand, Connectivity Insights, Session Insights, Edge Cloud, etc. For now I am keeping the description generic to the capabilities of this API and not getting into details of how and where it will be used.

I may be wrong, but on the face of it an application profile can be created, viewed, updated, or deleted just like any API resource, yet it is not evident what I, as an API user, am supposed to do with the applicationProfileId that the POST call has created. In my understanding, the API consumer needs an explanation of how to use the applicationProfileId created with this API, since the set of methods does not clearly indicate the use case in the summary section of the API. Or am I missing the point here?

Contributor Author

"The information captured as part of application profiles can we used in different usecases for decision making. Please refer connectivity insights and session insights for more details as a reference to see how the information in application profiles is used for decision making."
Added the above text to explain this further. Hope this clarifies.


targetMinCPU:
type: number
description: >

Typo error > --> |

Contributor Author

addressed in latest commit

targetMinPersistentStorage:
$ref: "#/components/schemas/targetMinPersistentStorage"
description: Compute resources of a Application Profile
minProperties: 1

I am not sure about the urgency of this PR for Fall25. In general, I think that this element describes computing resources required. However, it would be good to try to align with the definition at E/WBI API interface for Federation, which has a method to reserve computing resources on a partner network. They are using this object there:

    ComputeResourceInfo:
      type: object
      required:
        - cpuArchType
        - numCPU
        - memory
      properties:
        cpuArchType:
          type: string
          enum:
            - ISA_X86_64
            - ISA_ARM_64
          description: CPU Instruction Set Architecture (ISA) E.g., Intel, Arm etc.
        numCPU:
          $ref: '#/components/schemas/Vcpu'
        memory:
          type: integer
          format: int64
          description: Amount of RAM in Mbytes
        diskStorage:
          type: integer
          format: int32
          description: Amount of disk storage in Gbytes for a given ISA type
        gpu:
          type: array
          items:
            $ref: '#/components/schemas/GpuInfo'
        vpu:
          type: integer
          description: Number of Intel VPUs available for a given ISA type
        fpga:
          type: integer
          description: Number of FPGAs available for a given ISA type
        hugepages:
          type: array
          items:
            $ref: '#/components/schemas/HugePage'
        cpuExclusivity:
          type: boolean
          description: Support for exclusive CPUs

As you see, it is not exactly the same. One thing that is missing is the CPU Arch type. Other missing fields might be considered optional at the moment in this API.

ComputeUnitEnum:
type: string
enum:
- kb

Is it correct that all the other enum values start with a capital letter, while kb is lower case?

Contributor Author

addressed in latest commit

an application's thresholds for network quality (latency, jitter, loss,
throughput). This scope will be expanded further based on addtional
requirements from other applicable CAMARA APIs
This API enables defining, reading, and managing application requirements, including:

Is this for applications in general, or do we want to say "Edge Application requirements"? Asking for clarity, since we already have Edge-specific application APIs.

Contributor Author

Application profiles are generic metadata about an application. Not all application profiles are necessarily specific to edge applications.

This part is fine. But then the question is which applications the application profile points to. Does it point to an application managed by a telco on behalf of the API consumer, or is the telco or API platform unaware of the application the profile refers to? I think some documentation would help from the API consumer's perspective.

Contributor Author

An application profile is metadata associated with an application; it is not tied to a specific deployed instance of the application. Once created, an application profile can be referenced via its application profile ID in other CAMARA APIs where the metadata is required for decision making.

- Gb
- Tb

PacketDelayBudget:

From a general application developer's perspective, would terms like PacketDelayBudget or targetMinUpstreamRate be relatable to what they typically use with other cloud-like technologies? I have typically seen these terms in 3GPP specifications but not much in public cloud or on-prem environments. In my view it may be a challenge to define these terms crisply in alignment with simpler ones. Could we find more generic and widely accepted terms to represent these parameters?

Contributor Author

These network KPI terms are used to align with Quality on Demand. The expectation is that application developers know from their testing how their app performs under various conditions, and use application profiles to set up the metadata in terms of the minimum values they need.

I think I missed the alignment with the QoD profiles API. This looks fine.

format: integer
example: 1

targetMinMemory:

Attributes like targetMinMemory seem to apply at the application level. The applications defined so far may use packaging formats such as Helm charts or Compose files, and those descriptors or charts can contain one or more containers. How would targetMinMemory then be applied to multiple components in those circumstances? In my view, it should be clarified or defined when such attributes are to be used, to avoid ambiguity.

The approach we have taken is that the specified resource requirements are the total for the whole application, regardless of how many containers or VMs the application instance may spawn. I think that's also the approach here, but I agree it would be good to spell it out.

A related comment here, while we generally specify resources as the total an application needs, for Kubernetes we also allow specification of a per-node minimum. For example, the total mem resources your application needs may be 30Gb. Based on totals, a 3-node cluster that has 10Gb each would work. However, if a single container in the application requires 15Gb, then application deployment will fail. I think Mahesh stated that he's ignoring this case for now, but it is relevant in this conversation. For reference, here are our resource definitions which we developed with Telefonica: https://github.com/edgexr/edge-cloud-platform/blob/main/api/edgeproto/resources.proto
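
The 30Gb example above can be sketched in a few lines of Python. This is only an illustration of the two checks being contrasted, not code from either platform; the function names and the flat list of per-node free memory are hypothetical.

```python
from typing import List


def fits_by_total(node_free_gb: List[float], total_required_gb: float) -> bool:
    """Aggregate check: is there enough free memory across the whole cluster?"""
    return sum(node_free_gb) >= total_required_gb


def fits_per_node(node_free_gb: List[float], largest_container_gb: float) -> bool:
    """Per-node check: can at least one node host the largest single container?"""
    return any(free >= largest_container_gb for free in node_free_gb)


# Three nodes with 10Gb free each (hypothetical cluster from the comment above).
nodes = [10.0, 10.0, 10.0]
print(fits_by_total(nodes, 30.0))  # aggregate requirement of 30Gb
print(fits_per_node(nodes, 15.0))  # one container needing 15Gb
```

A profile that carries only totals would pass the first check for this cluster even though the second check fails, which is the gap the per-node minimum is meant to close.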

Contributor Author

I don't think the goal here is to capture the same level of detail and granularity as in EAM. Those details are required in EAM from an orchestration perspective.
Here we are only capturing basic metadata about the app to help with certain decisions. For example, with the compute resource requirements, Edge Cloud APIs should be able to find an optimal edge cloud based on the platform's capabilities and resource availability. For actual application deployment, users will have to use the EAM APIs.
In future we can always look at optimizing this so that developers do not give similar information in multiple places. The vision is that application profiles are generic, high-level metadata used across multiple CAMARA APIs, while for more specific intents developers use the respective APIs, such as EAM.

I see, I understand the intent now. I'm a little worried about the duplication of data/work, though. Does this mean the application provider needs to maintain separate but potentially partially redundant application profiles (definitions) both here and in EAM and potentially other places?

Also I understand the intent is to be more general here, but depending on what you intend to utilize it for, you will need more specific information for something like optimal edge placement, if you actually want to get an answer that agrees with what EAM will do/allow. For example, if an edge site only supports the ARM architecture, or only supports containerized workloads and not VMs, or doesn't support QoS (because it's running on a public cloud instead of in-network), etc. I think it would be ok if the intent was not overlapping with other API functionality.

I guess I would like to understand potential use cases and how a user/client would interact with this API and how that would flow to calling the other Camara Traffic/EAM/etc APIs. Especially since these are all Camara APIs I feel like they should work together without us having to maintain duplicate schemas, or require the user to maintain duplicate profiles. Should these application profiles here be a common base definition on which application profiles in other APIs can incorporate/import/extend, without having to duplicate? I'm not sure. But I'm worried that going forward without a plan and saying we'll just optimize it later, realistically means it's unlikely to ever get optimized.

Contributor Author

Here are some of the use cases, but please do not consider this an exhaustive list.

  1. Identify network performance as compared to what the application needs, and flag it to the application developer if the network is not able to meet the minimum threshold defined. Connectivity Insights supports this use case.
    Specific to Edge Cloud:
  2. Identify the optimal application endpoint to connect to for a given UE. Based on the latency and other thresholds defined in the application profile, operators can return a list of edge clouds where the application is deployed and the requirements are met.
  3. Similarly, for application deployment, the metadata available in the application profile can be used to make a determination based on capabilities and resource availability across the edge cloud.

While your point is valid that application providers will have to provide partially redundant data in different APIs if a given operator supports all CAMARA APIs, please also consider scenarios where operators support only a subset of the CAMARA APIs.
For example, if an operator has a partnership-based approach with hyperscalers for edge cloud, it might want to support only decision making, while actual application deployment uses hyperscaler-provided tools.
My goal is to capture enough details about the application in the application profile to support the decision making.

In my view, it would help to put some hints on the various QoS attributes to answer questions the developer may have, so that they can supply the right value for each parameter. For example, for a composite multi-container app, the API user may sum up the aggregate CPU, memory, etc. to get the optimal outcome from the API. But please correct me if this is not needed or is explained to the API consumer in some other way.

Contributor Author

For QoS attributes, the schema and description were reused from Quality on Demand as much as possible. In terms of compute requirements, application profiles currently do not get into whether the app is a single container or multiple containers; they just capture the compute resource requirements for the application as a whole.
This could be a continuing discussion for any future enhancements, to be planned as needed.

description: |
This is the target minimum ephemeral storage required by the application
allOf:
- $ref: "#/components/schemas/Compute"

Looks like the schema in "Compute" is more of a value descriptor rather than compute itself. Should we change the parameter name to more appropriate one?

Contributor Author

Compute is being reused for parameters like memory, storage, etc. Any recommendation for a different name?

It could be something like MemoryValueUnit, or whatever similar name feels friendlier to use or better explains the parameter's intent.

Contributor Author

@maheshc01 Jul 8, 2025

When the YAML is viewed in Swagger, here is how it looks, with each resource having a unit:
"targetMinGPUMemory": {
"value": 10,
"unit": "Kb"
},
"targetMinEphemeralStorage": {
"value": 10,
"unit": "Kb"
},
"targetMinPersistentStorage": {
"value": 10,
"unit": "Kb"
}

The same compute unit schema is used across memory, GPU memory, ephemeral storage, and persistent storage.
If I created MemoryValueUnit, I would need duplicate entries in the schema, which in my opinion can be avoided with this approach.
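
As a rough illustration of why one shared value/unit shape is convenient for consumers, here is a minimal Python sketch. The `Compute` class mirrors the schema name used in the thread, but the unit list (Kb, Mb, Gb, Tb) and the decimal multipliers are assumptions; the thread does not state which units the enum finally contains or how they convert.

```python
from dataclasses import dataclass

# Assumed multipliers relative to Kb (decimal steps); not taken from the spec.
UNIT_TO_KB = {"Kb": 1, "Mb": 1_000, "Gb": 1_000_000, "Tb": 1_000_000_000}


@dataclass(frozen=True)
class Compute:
    """One value/unit pair, reused for every memory and storage field."""
    value: int
    unit: str

    def in_kb(self) -> int:
        # Normalise to a single unit so different fields can be compared.
        return self.value * UNIT_TO_KB[self.unit]


# The same type serves all three fields shown in the Swagger output.
profile = {
    "targetMinGPUMemory": Compute(10, "Kb"),
    "targetMinEphemeralStorage": Compute(10, "Kb"),
    "targetMinPersistentStorage": Compute(2, "Gb"),
}

for name, requirement in profile.items():
    print(name, requirement.in_kb())
```

With a single type, one conversion helper covers every field; a separate MemoryValueUnit, StorageValueUnit, and so on would each need the same logic.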

$ref: "#/components/schemas/targetMinMemory"
targetMinGPU:
$ref: "#/components/schemas/targetMinGPU"
targetMinGPUMemory:

For GPU, would providing only the quantity be enough, or might it also need some GPU model information that the application depends on? As I understand it, a vendor can have many types or architectures within a given GPU family, so a given application may work only on selected GPU architectures.
So do we need to let the developer express GPU-related information by defining a GPU model? So far, though, I have not seen any standardization of GPU flavors that could be referred to.

Agree. Find for reference the definition of GpuInfo on the Federation API interface:

    GpuInfo:
      type: object
      required:
        - gpuVendorType
        - gpuModeName
        - gpuMemory
        - numGPU
      properties:
        gpuVendorType:
          type: string
          enum:
            - GPU_PROVIDER_NVIDIA
            - GPU_PROVIDER_AMD
          description: GPU vendor name e.g. NVIDIA, AMD etc.
          example: Nvidia
        gpuModeName:
          type: string
          description: Model name corresponding to vendorType may include info e.g. for NVIDIA, model name could be “Tesla M60”, “Tesla V100” etc.
        gpuMemory:
          type: integer
          description: GPU memory in Mbytes
        numGPU:
          type: integer
          description: Number of GPUs

Agree as well. We have also adopted a GPU spec based on the EWBI APIs (this is a protobuf format):

message GPUResource {
  // GPU model unique identifier
  string model_id = 1;
  // Count of how many of this GPU are required/present
  uint32 count = 2;
  // GPU vendor (nvidia, amd, etc)
  string vendor = 3;
  // Memory in GB
  uint64 memory = 4;
}

Contributor Author

Right now I have GPU count and memory. Is the recommendation to add the vendor and model?

I suggest including vendor and model, as a generic definition may not work given the disparate capabilities across vendors and models.

Contributor Author

@maheshc01 Jul 8, 2025

The recommended schema for gpuVendorType and gpuModelName has been incorporated in the latest changes.
I do have some questions around this, for which I will create separate discussion points that can be taken up as enhancements.

Comment on lines +465 to +468
targetMinCPU:
$ref: "#/components/schemas/targetMinCPU"
targetMinMemory:
$ref: "#/components/schemas/targetMinMemory"

Should these be min values or max values? If they are min values, that means we are allowing infinite over-provisioning of resources? Without a max value, we can't limit the amount of resources each application uses, and we can't calculate a total max value for multiple applications in case they run in a shared environment (multiple applications on a single Kubernetes cluster). From the viewpoint of managing resource allocation, it is better to require the max values that an application requires, rather than the min. In our platform, we treat any resource values as max values (resource limits in Kubernetes speak).

Contributor Author

Here, minimum is used to identify which edge sites are able to meet the minimum resource requirements of the application, hence minimum.
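
A minimal sketch of that filtering idea, assuming a simple mapping of site name to available resources. The site names, the `sites_meeting_minimums` helper, and the use of plain numbers in a single unit are all hypothetical; only the targetMinCPU and targetMinMemory field names come from the thread.

```python
from typing import Dict, List


def sites_meeting_minimums(
    sites: Dict[str, Dict[str, float]],
    minimums: Dict[str, float],
) -> List[str]:
    """Return the sites whose available resources cover every stated minimum."""
    return [
        name
        for name, available in sites.items()
        # A resource the site does not report is treated as 0 available.
        if all(available.get(res, 0) >= need for res, need in minimums.items())
    ]


# Hypothetical availability per edge site (memory in one assumed common unit).
edge_sites = {
    "site-a": {"targetMinCPU": 4, "targetMinMemory": 16},
    "site-b": {"targetMinCPU": 2, "targetMinMemory": 32},
    "site-c": {"targetMinCPU": 8, "targetMinMemory": 64},
}
app_minimums = {"targetMinCPU": 4, "targetMinMemory": 16}

print(sites_meeting_minimums(edge_sites, app_minimums))
```

Used this way the values act as a selection floor rather than an allocation cap, so they say nothing about the upper limit raised in the question above.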

@maheshc01
Contributor Author

I have addressed a number of review comments. For the rest, I have given an explanation of how I see it.
I will reiterate: do not expect application profiles to hold all the information captured as part of EAM. The intents of the APIs are different.
In EAM, the intent is to know all the information required for deploying the application.
In application profiles, the intent is to capture metadata about the application to enable specific decision making, such as: is the network able to meet the application's requirements for optimal performance, and what capabilities and minimum resources might we need in the edge cloud if we want to deploy it at the edge.

@maheshc01 maheshc01 mentioned this pull request Jul 3, 2025
@Kevsy
Contributor

Kevsy commented Jul 7, 2025

@jgarciatovar @gainsley @gunjald are you happy to proceed with the changes made by @maheshc01 , or does this require further discussion? We need to resolve urgently if this is to be included in Fall 25 :)

@jgarciatovar

> @jgarciatovar @gainsley @gunjald are you happy to proceed with the changes made by @maheshc01 , or does this require further discussion? We need to resolve urgently if this is to be included in Fall 25 :)

I am fine going with this version for Fall 25. Trying to discuss and address all these aspects is not realistic given the dates. For the benefit of the Optimal Edge Discovery API use case, I think it is better to go ahead with the current version in Fall25. At the same time, we can create an issue here to start discussion about the open points.

I understand that the intent of ApplicationProfiles and EAM is not the same. However, I think that it is important to align components defined by CAMARA API with EWBI Federation API interface. Otherwise, integrating APIs on federated cases will be complex (i.e. Optimal Edge Zone API request for an app that is federated with several partner OPs).

@maheshc01
Contributor Author

I have addressed the review comments raised by @gunjald. Once he gives his go-ahead, as agreed during the call, I will go ahead and merge this PR to create a release candidate for Fall 25.

@maheshc01 maheshc01 requested a review from gunjald July 9, 2025 10:51
@maheshc01
Contributor Author

@gunjald @JoseMConde Please review and share your comments; this is holding up the release PR.
If you think this enhancement is not ready, I can also take it out of scope and go ahead with the rest of the changes in application profiles, which are more specific to aligning with the latest Commonalities guidelines.
Please let me know your thoughts at the earliest.

@JoseMConde

@maheshc01 from my side it looks good; let's see what @gunjald thinks.

@gunjald

gunjald commented Jul 9, 2025

> @maheshc01 from my side it looks good; let's see what @gunjald thinks.

In general the changes look good to me. However, I still think that a link to, or description of, this API's association with other connectivity APIs would have been useful from the API user's perspective. If the operations defined here were part of the other connectivity APIs, correlating the usage of applicationProfileId would have been implicit. And although the other Connectivity Insights APIs might reference this API, a reverse description in this API, pointing to those APIs as an example, would have helped to visualize the correlation between applicationProfileId and the application defined in other APIs.
Since the comment is mostly about API documentation, we may take it up in the next cycle and open a discussion to see whether other members have more input.

Comment on lines +441 to +445
computeResources:
$ref: "#/components/schemas/ComputeResourcesThresholds"
anyOf:
- required: [networkQualityThresholds]
- required: [computeResources]

I see either networkQualityThresholds or computeResources are required here, but in ApplicationProfileRequest, networkQualityThresholds is always required. Probably ApplicationProfileRequest needs to be updated?

Contributor

Modified "ApplicationProfileRequest" with requirements of either networkQualityThresholds or computeResources.
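
For readers less familiar with `anyOf` over `required`, the rule being applied can be sketched as a presence check in Python. This is only an approximation of what a schema validator would do for this one constraint; the helper name `satisfies_any_of` is hypothetical, and the payload contents are placeholders rather than real profile data.

```python
from typing import Any, Dict

# The two blocks named in the anyOf clause quoted above.
REQUIRED_ALTERNATIVES = ("networkQualityThresholds", "computeResources")


def satisfies_any_of(body: Dict[str, Any]) -> bool:
    """True when at least one of the alternative blocks is present in the body."""
    return any(key in body for key in REQUIRED_ALTERNATIVES)


# Placeholder payloads: only the presence of the top-level keys matters here.
network_only = {"networkQualityThresholds": {}}
compute_only = {"computeResources": {}}
neither = {"someOtherField": 1}

print(satisfies_any_of(network_only))
print(satisfies_any_of(compute_only))
print(satisfies_any_of(neither))
```

A request carrying both blocks also passes, since `anyOf` only requires that at least one alternative holds.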

Contributor Author

@gainsley could you confirm you are good with this?

Contributor Author

If you could confirm this, I can go ahead and merge this PR and submit the release candidate PR.

yes that looks good

Contributor

Thank you @gainsley

@urvika-v urvika-v requested a review from gainsley July 10, 2025 09:44
@urvika-v urvika-v merged commit a7f99c9 into camaraproject:main Jul 11, 2025
2 checks passed

Development

Successfully merging this pull request may close these issues.

update application profiles to support meta data about the application's compute resource requirements

7 participants