feat: Extract reward_lambda_arn from Nova recipes to training job hyperparameters #5316
Draft
malav-shastri wants to merge 264 commits into aws:master from malav-shastri:master
+22,494
−3,919
…ws#5002) * fix: fix ValueError when updating a data quality monitoring schedule * Add unit test * black formatting --------- Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com> Co-authored-by: parknate@ <parknate@amazon.com>
Co-authored-by: Keshav Chandak <chakesh@amazon.com>
* Add cleanup logic to model builder integ tests for endpoints * Fix endpoint api call
…lly (aws#5014) * fix: bug in get latest version was getting the max sorted alphabetically instead of sem-ver * handle invalid sev ver and incompatible sagemaker versions --------- Co-authored-by: Eli Davidson <elleedee@amazon.com> Co-authored-by: parknate@ <parknate@amazon.com>
Co-authored-by: pintaoz <pintaoz@amazon.com>
* Fix sourcedir.tar.gz filenames in docstrings * Fix pylint --------- Co-authored-by: pintaoz <pintaoz@amazon.com>
Co-authored-by: pintaoz <pintaoz@amazon.com>
Co-authored-by: pintaoz <pintaoz@amazon.com>
Co-authored-by: pintaoz <pintaoz@amazon.com>
Co-authored-by: pintaoz <pintaoz@amazon.com>
* Fix all type hint and docstrings for callable * Fix codestyle --------- Co-authored-by: pintaoz <pintaoz@amazon.com>
* fix: keep sagemaker_session from being overridden to None, add unit/integ tests * remove commented code * fix styling issues --------- Co-authored-by: Zhaoqi <jzhaoqwa@amazon.com>
… and sagemaker.deserialzers (aws#5037) * Move RecordSerializer and RecordDeserializer to sagemaker.serializers and sagemaker.deserializers * fix codestyle * fix test --------- Co-authored-by: pintaoz <pintaoz@amazon.com>
* Add framework_version to all TensorFlowModel examples * update framework_version to x.x.x --------- Co-authored-by: pintaoz <pintaoz@amazon.com>
…5043) * fix: pass in inference_ami_version to model_based endpoint type * documentation: update contributing.md w/ venv instructions and pip install fixes --------- Co-authored-by: Zhaoqi <jzhaoqwa@amazon.com>
* Add warning about not supporting * update wording --------- Co-authored-by: pintaoz <pintaoz@amazon.com>
* added ap-southeast-7 and mx-central-1 for Jumpstart * added BKK dlc to djl-neuronx --------- Co-authored-by: Isha Chidrawar <ishachid@amazon.com>
…DK (aws#5050) Co-authored-by: malavhs <malavhs@amazon.com>
* add TEI 1.8.2 * add test
* tei * tests --------- Co-authored-by: pintaoz-aws <167920275+pintaoz-aws@users.noreply.github.com> Co-authored-by: Molly He <mollyhe@amazon.com>
* Add numpy 2.0 support * Add numpy 2.0 support * Add numpy 2.0 support * Add numpy 2.0 support * Add numpy 2.0 support * Fix incompatible_dependecies test * Fix incompatible_dependecies test * Fix incompatible_dependecies test * Fix incompatible_dependecies test * Fix incompatible_dependecies test * update tensorflow artifacts * update tensorflow artifacts * update tensorflow artifacts * testfile codestyle fixes * testfile codestyle fixes * update SKLearn image URI config * update SKLearn image URI config * docstyle fixes * docstyle fixes * numpy fixes * numpy fixes * numpy fixes * numpy fixes * numpy fixes * numpy fixes * numpy fixes * numpy fixes * numpy fix for slow test * numpy fix for slow test * numpy fix for slow test * numpy fix for slow test * Revert 'Add numpy 2.0 support' * Revert 'Add numpy 2.0 support' * Revert 'Add numpy 2.0 support' --------- Co-authored-by: Roja Reddy Sareddy <rsareddy@amazon.com> Co-authored-by: parknate@ <parknate@amazon.com> Co-authored-by: Gokul Anantha Narayanan <166456257+nargokul@users.noreply.github.com>
* image * tests --------- Co-authored-by: Gokul Anantha Narayanan <166456257+nargokul@users.noreply.github.com>
* new image * Update src/sagemaker/image_uri_config/huggingface.json removed missing CPU image * add cpu back --------- Co-authored-by: Molly He <mollyhe@amazon.com>
* add image * inf on dlc * neuron tgi dlcs * fix test --------- Co-authored-by: Zhaoqi <52220743+zhaoqizqwang@users.noreply.github.com>
* Add numpy 2.0 support * Add numpy 2.0 support * Add numpy 2.0 support * Add numpy 2.0 support * Add numpy 2.0 support * Fix incompatible_dependecies test * Fix incompatible_dependecies test * Fix incompatible_dependecies test * Fix incompatible_dependecies test * Fix incompatible_dependecies test * update tensorflow artifacts * update tensorflow artifacts * update tensorflow artifacts * testfile codestyle fixes * testfile codestyle fixes * update SKLearn image URI config * update SKLearn image URI config * docstyle fixes * docstyle fixes * numpy fixes * numpy fixes * numpy fixes * numpy fixes * numpy fixes * numpy fixes * numpy fixes * numpy fixes * numpy fix for slow test * numpy fix for slow test * numpy fix for slow test * numpy fix for slow test * Revert 'Add numpy 2.0 support' * Revert 'Add numpy 2.0 support' * Revert 'Add numpy 2.0 support' * Add numpy 2.0 support * Add numpy 2.0 support * Add numpy 2.0 support * Add numpy 2.0 support * Add numpy 2.0 support * Add numpy 2.0 support * Add numpy 2.0 support * Add numpy 2.0 support * Add numpy 2.0 support --------- Co-authored-by: Roja Reddy Sareddy <rsareddy@amazon.com> Co-authored-by: parknate@ <parknate@amazon.com> Co-authored-by: Gokul Anantha Narayanan <166456257+nargokul@users.noreply.github.com>
* image * add py312 * fix * test fix * typo --------- Co-authored-by: Molly He <mollyhe@amazon.com>
…n if it presents in the resource metadata file (aws#5315) Co-authored-by: Jun Lyu <junlyu@amazon.com>
Contributor
Hi, as part of the upcoming PySDK V3 release, we request that you change the target branch to master-v2. The master branch will only contain the incoming V3 source code.
…erparameters
Issue #, if available:
Description of changes:
This PR adds support for extracting `reward_lambda_arn` from Nova recipe files and passing it as a hyperparameter to SageMaker training jobs.

Changes made:
- `src/sagemaker/modules/train/sm_recipes/utils.py`: Added `reward_lambda_arn` extraction in `_get_args_from_nova_recipe()` for the ModelTrainer code path
- `src/sagemaker/pytorch/estimator.py`: Added `reward_lambda_arn` extraction in `_setup_for_nova_recipe()` for the PyTorch Estimator code path
- `tests/unit/test_pytorch_nova.py`: Added comprehensive test coverage following the existing `eval_lambda_arn` test patterns
- `tests/unit/sagemaker/modules/train/sm_recipes/test_utils.py`: Existing test coverage for the utils.py path

Testing done:
Unit tests added:
- `test_setup_for_nova_recipe_with_reward_lambda()` - verifies successful extraction in the PyTorch estimator
- `test_setup_for_nova_recipe_without_reward_lambda()` - verifies no extraction when the parameter is absent
- `test_get_args_from_nova_recipe_with_reward_lambda()` - verifies the ModelTrainer path

Merge Checklist
Put an `x` in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

General
Tests
`unique_name_from_base` to create resource names in integ tests (if appropriate)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
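
For reviewers who want a concrete picture of the extraction described under "Description of changes", the following is a minimal sketch, not the code in this PR. It assumes the recipe has already been parsed into a plain dict and that `reward_lambda_arn` sits under a `run` section; that location, the helper name `add_reward_lambda_arn`, and the example ARN are all hypothetical and chosen only for illustration.

```python
from typing import Any, Dict


def add_reward_lambda_arn(
    recipe: Dict[str, Any], hyperparameters: Dict[str, Any]
) -> Dict[str, Any]:
    """Copy reward_lambda_arn from a parsed recipe into hyperparameters, if present.

    Hypothetical helper: the real logic lives in _get_args_from_nova_recipe()
    and _setup_for_nova_recipe(), whose bodies are not shown in this page.
    """
    # Assumption: the ARN is stored under the recipe's "run" section.
    run_config = recipe.get("run") or {}
    reward_lambda_arn = run_config.get("reward_lambda_arn")

    # Only set the hyperparameter when the recipe actually provides a value,
    # so recipes without a reward Lambda leave the hyperparameters untouched.
    if reward_lambda_arn:
        hyperparameters["reward_lambda_arn"] = reward_lambda_arn
    return hyperparameters


# Example recipes (made-up values) covering the "with" and "without" cases
# that the unit tests listed above are described as checking.
recipe_with_arn = {
    "run": {
        "reward_lambda_arn": "arn:aws:lambda:us-west-2:123456789012:function:reward-fn"
    }
}
recipe_without_arn = {"run": {"name": "example-run"}}

hp_with = add_reward_lambda_arn(recipe_with_arn, {})
hp_without = add_reward_lambda_arn(recipe_without_arn, {})
```

Under these assumptions, `hp_with` carries the ARN under the `reward_lambda_arn` key, while `hp_without` stays empty, mirroring the "no extraction when the parameter is absent" behavior named in the test list.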