EC2 Training → SageMaker Endpoint → Lambda Inference (AWS)

Author: Ibrahim Malik

This project demonstrates a practical AWS machine learning deployment workflow:

  • Train a PyTorch image classifier on a GPU-backed EC2 instance
  • Deploy the trained model to an Amazon SageMaker real-time endpoint
  • Invoke the endpoint via an AWS Lambda function
  • Configure provisioned concurrency (Lambda) and auto-scaling (endpoint)

The repository is structured as a portfolio-quality project with production-style code, examples, and supporting artefacts.


Architecture Overview

  1. Dataset stored in S3
  2. Training runs on EC2 (GPU) and produces model artefacts
  3. Model is deployed to a SageMaker real-time endpoint
  4. Lambda receives requests and forwards payloads to the endpoint
  5. Endpoint scaling and Lambda concurrency are configured for latency and cost control
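A minimal sketch of the Lambda-side integration in step 4, assuming a JSON request/response contract and the SAGEMAKER_ENDPOINT_NAME environment variable used later in this README. The helper and handler names here are illustrative, not the repository's src/lambda_handler/handler.py:

```python
import json
import os


def build_invoke_args(endpoint_name, payload):
    """Build keyword arguments for sagemaker-runtime invoke_endpoint.

    Pure helper, so the request shape can be checked without AWS access.
    """
    return {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",
        "Body": json.dumps(payload),
    }


def lambda_handler(event, context):
    """Forward the incoming event payload to the SageMaker endpoint."""
    import boto3  # imported lazily so the module loads without boto3 installed

    endpoint_name = os.environ["SAGEMAKER_ENDPOINT_NAME"]
    # API Gateway wraps the payload in a JSON string under "body";
    # direct invocations pass the payload as the event itself.
    body = event.get("body")
    payload = json.loads(body) if isinstance(body, str) else event

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(**build_invoke_args(endpoint_name, payload))
    prediction = json.loads(response["Body"].read())

    return {"statusCode": 200, "body": json.dumps(prediction)}
```

Keeping the endpoint name in an environment variable (rather than hard-coding it) is what lets the same handler serve any deployed endpoint.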

Repository Structure

.
├── examples/
│   ├── invoke_endpoint_example.py    # Local endpoint invocation script
│   └── lambda_event.json             # Example Lambda invocation payload
├── figures/
│   └── *.png                         # EC2, SageMaker, Lambda, and IAM screenshots
├── notebooks/
│   └── train_and_deploy.ipynb        # End-to-end orchestration notebook
├── reports/
│   └── writeup.md                    # Technical write-up
├── src/
│   ├── ec2_training/
│   │   └── train.py                  # EC2 training entrypoint
│   └── lambda_handler/
│       └── handler.py                # Lambda inference handler
├── legacy/
│   ├── ec2train1.py                  # Original submission files (archived)
│   ├── lambdafunction.py
│   └── test-output.txt
├── .gitignore
├── LICENSE
├── pyproject.toml
└── requirements.txt

How to Run

Option 1 — View results (no AWS required)

Open notebooks/train_and_deploy.ipynb to inspect the end-to-end workflow, review screenshots in figures/ for EC2, SageMaker, Lambda, and IAM setup evidence, and read the technical write-up in reports/writeup.md.

requirements.txt is provided for optional local development and notebook viewing. Training and deployment are intended to run in AWS.


Option 2 — Run end-to-end (AWS required)

⚠️ Requires valid AWS credentials and will incur charges.

  1. Upload the dataset to S3
  2. Train on EC2 using src/ec2_training/train.py
  3. Run the orchestration notebook: notebooks/train_and_deploy.ipynb
  4. Deploy a SageMaker endpoint
  5. Configure Lambda with the following environment variable:
     SAGEMAKER_ENDPOINT_NAME=<your-endpoint-name>
  6. Invoke the endpoint locally via examples/invoke_endpoint_example.py, or via Lambda using examples/lambda_event.json
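Step 4 above (deploying the endpoint) can be sketched with the low-level boto3 SageMaker API. The names, instance type, and container details below are placeholder assumptions; the orchestration notebook may instead use the higher-level SageMaker Python SDK:

```python
def endpoint_config_args(config_name, model_name, instance_type="ml.m5.large", count=1):
    """Build arguments for sagemaker.create_endpoint_config (pure, testable offline)."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                "InstanceType": instance_type,
                "InitialInstanceCount": count,
            }
        ],
    }


def deploy(model_name, endpoint_name, image_uri, model_data_url, role_arn):
    """Register the trained artefact and stand up a real-time endpoint."""
    import boto3  # lazy import so the config helper above runs without boto3

    sm = boto3.client("sagemaker")
    # 1. Register the model artefact (s3://.../model.tar.gz) with its serving container.
    sm.create_model(
        ModelName=model_name,
        PrimaryContainer={"Image": image_uri, "ModelDataUrl": model_data_url},
        ExecutionRoleArn=role_arn,
    )
    # 2. Describe the fleet that will host it.
    sm.create_endpoint_config(**endpoint_config_args(endpoint_name + "-config", model_name))
    # 3. Create the endpoint (provisioning takes several minutes).
    sm.create_endpoint(EndpointName=endpoint_name, EndpointConfigName=endpoint_name + "-config")
```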

Key Concepts Demonstrated

  • GPU-backed model training on EC2
  • SageMaker real-time endpoint deployment
  • Lambda-based inference integration using boto3 invoke_endpoint
  • Environment-based configuration with no hard-coded infrastructure values
  • Least-privilege IAM design and security review
  • Provisioned concurrency and endpoint auto-scaling considerations
  • Production-style repository organisation and modular code structure

Attribution

Originally completed as part of the Udacity AWS Machine Learning Engineer Nanodegree. Refactored and documented for professional portfolio presentation.


Licence

MIT Licence.
