Feat/download input #1116

Draft
larrygao001 wants to merge 3 commits into aws-deadline:mainline from larrygao001:feat/download-input

@larrygao001

Fixes:

What was the problem/requirement? (What/Why)

What was the solution? (How)

What is the impact of this change?

How was this change tested?

See DEVELOPMENT.md for information on running tests.

  • Have you run the unit tests?
  • Have you run the integration tests?
  • Have you made changes to the download or asset_sync modules? If so, then it is highly recommended
    that you ensure that the docker-based unit tests pass.

Was this change documented?

  • Are relevant docstrings in the code base updated?
  • Has the README.md been updated? If you modified CLI arguments, for instance.

Does this PR introduce new dependencies?

This library is designed to be integrated into third-party applications that have bespoke and customized deployment environments. Adding dependencies will increase the chance of library version conflicts and incompatibilities. Please evaluate the addition of new dependencies. See the Dependencies section of DEVELOPMENT.md for more details.

  • This PR adds one or more new dependency Python packages. I acknowledge I have reviewed the considerations for adding dependencies in DEVELOPMENT.md.
  • This PR does not add any new dependencies.

Is this a breaking change?

A breaking change is one that modifies a public contract in a way that is not backwards compatible. See the Public Contracts section of DEVELOPMENT.md for more information on the public contracts.

If so, then please describe the changes that users of this package must make to update their scripts, or Python applications.

Does this change impact security?

  • Does the change need to be threat modeled? For example, does it create or modify files/directories that must only be readable by the process owner?
    • If so, then please label this pull request with the "security" label. We'll work with you to analyze the threats.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Add selective download support for job output attachments:

- Add --include-path and --include-path-stdin options to download-output
  for downloading specific files or directory prefixes
- Add _filter_paths() and _matches_any_filter() in download.py
- Add path_filters parameter to OutputDownloader
- Add _validate_and_normalize_include_paths() with path traversal
  rejection, backslash-to-forward-slash conversion, and normalization
- Fix bug in get_job_input_paths_by_asset_root() where S3 root prefix
  was not prepended to input manifest keys
- Extract _run_download_ux(), _get_job_download_context(),
  _parse_filters_and_config(), _handle_download_error() shared helpers
- Update job_attachments_guide.md to document --include-path
- Add unit tests for path filtering and validation

Signed-off-by: larrygao <larrygao@amazon.com>
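
To illustrate the include-path handling described in the commit above, here is a minimal sketch of validation, normalization, and prefix filtering. It is not the code from this PR: the function names `normalize_include_paths`, `matches_any_filter`, and `filter_paths` are hypothetical stand-ins, and the matching rule (exact file match, or directory prefix followed by `/`) is an assumption about what "specific files or directory prefixes" means.

```python
import posixpath
from typing import Iterable, List


def normalize_include_paths(raw_paths: Iterable[str]) -> List[str]:
    """Normalize user-supplied include paths and reject traversal attempts.

    Hypothetical helper; the PR's own validation may differ in detail.
    """
    normalized: List[str] = []
    for raw in raw_paths:
        # Convert backslashes so Windows-style input matches forward-slash paths.
        candidate = raw.strip().replace("\\", "/")
        if not candidate:
            continue  # Skip blank entries, e.g. empty lines read from stdin.
        # Reject absolute paths and any ".." component before normalizing,
        # so "a/../b" is refused rather than silently collapsed to "b".
        if candidate.startswith("/") or ".." in candidate.split("/"):
            raise ValueError(f"Invalid include path: {raw!r}")
        # Collapse "./" and repeated or trailing slashes.
        cleaned = posixpath.normpath(candidate)
        if cleaned == ".":
            raise ValueError(f"Invalid include path: {raw!r}")
        normalized.append(cleaned)
    return normalized


def matches_any_filter(path: str, filters: List[str]) -> bool:
    """Return True if path equals a filter or lies under it as a directory."""
    return any(path == f or path.startswith(f + "/") for f in filters)


def filter_paths(paths: Iterable[str], filters: List[str]) -> List[str]:
    """Keep only matching paths; an empty filter list keeps everything."""
    if not filters:
        return list(paths)
    return [p for p in paths if matches_any_filter(p, filters)]
```

Requiring a `/` after a directory prefix keeps `renders/shot01` from also selecting `renders/shot010/...`, which a plain `startswith` check would allow.
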
Remove remaining download-input references from docstrings and comments.
download-input command will be added in a follow-up PR.

Signed-off-by: larrygao <larrygao@amazon.com>
Add new download-input command for downloading input attachment files,
using the same --include-path filtering and shared UX flow established
in the download-output PR.

- Add InputDownloader class (inherits from OutputDownloader)
- Add _download_job_input() and download-input CLI command
- Update job_attachments_guide.md to document download-input

Signed-off-by: larrygao <larrygao@amazon.com>
github-actions bot added the waiting-on-maintainers label (Waiting on the maintainers to review.) on Apr 17, 2026

Job attachments uses your configured S3 bucket as a [content-addressable storage](https://en.wikipedia.org/wiki/Content-addressable_storage), which creates a snapshot of the files used in your job submission in [asset manifests](#asset-manifests), only uploading files that aren't already in S3. This saves you time and bandwidth when iterating on jobs. When an [AWS Deadline Cloud worker agent][worker-agent] starts working on a job with job attachments, it recreates the file system snapshot in the worker agent session directory, and uploads any outputs back to your S3 bucket.

- You can then easily download your outputs with the [deadline job download-output] command, or using the [protocol handler](#protocol-handler) to download from a click of a button in the [AWS Deadline Cloud monitor][monitor].
+ You can then easily download your outputs with the [deadline job download-output] command, or your inputs with the [deadline job download-input] command. You can also use the [protocol handler](#protocol-handler) to download from a click of a button in the [AWS Deadline Cloud monitor][monitor]. Both commands support `--include-path` for downloading specific files or directories.

This PR looks like a duplicate of https://github.com/aws-deadline/deadline-cloud/pull/1108/changes. Should we review that one instead?
