
Conversation


@Daayim Daayim commented Jan 13, 2025

Related Task

Changes

  • Removed code related to the traditional model inference method that used SSM and the backend
  • Added the GitHub tokens URL and other .env variables for the FastAPI deployment on EC2; added them to the Confluence page for .env variables
  • Refactored the EC2 script to pull Python 3.8 and deploy the FastAPI server during machine creation
  • The FastAPI server can now serve inference requests and stream data back to the user over the public IP
  • The FastAPI server exposes options for temperature, max_token, and streaming (see the sketch below)
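
A minimal sketch of what such a streaming endpoint could look like is below; the endpoint path, request fields, and the `run_model` placeholder are illustrative assumptions, not the actual code added in this PR.

```python
# Minimal sketch of a FastAPI streaming inference endpoint.
# The path, request fields, and run_model placeholder are assumptions
# for illustration, not the code added in this PR.
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

app = FastAPI()

class InferenceRequest(BaseModel):
    prompt: str
    temperature: float = 0.7  # sampling temperature option
    max_token: int = 256      # max_token option from the PR description
    stream: bool = True       # stream tokens back or return the full text

def run_model(prompt: str, temperature: float, max_token: int):
    """Placeholder generator standing in for the real model call."""
    for token in ("Hello", ", ", "world"):
        yield token

@app.post("/inference")
def inference(req: InferenceRequest):
    gen = run_model(req.prompt, req.temperature, req.max_token)
    if req.stream:
        # Send tokens to the client as they are produced.
        return StreamingResponse(gen, media_type="text/plain")
    # Non-streaming: collect everything and return it at once.
    return {"text": "".join(gen)}
```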

Documentation

FastAPI Doc

Model Streaming Testing

[Screenshots: streamed model output]

Get Inference URL Endpoint

[Screenshot: inference URL endpoint response]

Local Testing

Terminal log confirming the automated FastAPI deployment

[Screenshot: terminal log of the automated FastAPI deployment]

Example query prompt

[Screenshot: example query prompt]
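
Since the screenshot is not reproduced here, a hypothetical client query against the deployed server might look like the sketch below; the IP, port, path, and payload values are placeholders rather than the actual prompt shown in the screenshot.

```python
# Hypothetical client query against the deployed server; the IP, port,
# path, and payload values are placeholders, not the PR's actual values.
import requests

url = "http://<ec2-public-ip>:8000/inference"
payload = {
    "prompt": "Explain FPGAs in one sentence.",
    "temperature": 0.7,
    "max_token": 128,
    "stream": True,
}

with requests.post(url, json=payload, stream=True, timeout=60) as resp:
    resp.raise_for_status()
    # Print tokens as they arrive instead of waiting for the full response.
    for chunk in resp.iter_content(chunk_size=None, decode_unicode=True):
        print(chunk, end="", flush=True)
```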

@Daayim Daayim added the enhancement, help wanted, and dev labels Jan 13, 2025
@Daayim Daayim self-assigned this Jan 13, 2025
@Daayim Daayim changed the title Da/model streaming model streaming implementation Jan 13, 2025
@Daayim Daayim changed the title model streaming implementation Model streaming implementation Jan 13, 2025
@cmatthews20 cmatthews20 changed the title Model streaming implementation FPGA Model Streaming Jan 13, 2025
@cmatthews20 cmatthews20 changed the title FPGA Model Streaming New FPGA Model Streaming Jan 13, 2025

@cmatthews20 cmatthews20 left a comment

@Daayim merge main to your branch and resolve conflicts - looks like umama merged before you and you didn't have those changes


@K-rolls K-rolls left a comment

Jawesome


@umama-rahman1 umama-rahman1 left a comment

Overall this looks like great work. Make sure to create a pull request on Config Engine as well so that code is visible for review too; I had a peek at the branch.

Just make sure to add the get_fpga_inference_url endpoint and it should be good.
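
For reference, a rough sketch of what a get_fpga_inference_url endpoint could look like; the boto3 lookup, tag filter, and port are assumptions for illustration, not the config-engine implementation.

```python
# Rough sketch of a possible get_fpga_inference_url endpoint.
# The boto3 lookup, tag filter, and port are assumptions for illustration.
import boto3
from fastapi import FastAPI, HTTPException

app = FastAPI()
ec2 = boto3.client("ec2")

@app.get("/get_fpga_inference_url")
def get_fpga_inference_url():
    # Look up the running FPGA instance by a (hypothetical) Name tag.
    resp = ec2.describe_instances(
        Filters=[
            {"Name": "tag:Name", "Values": ["fpga-inference"]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )
    for reservation in resp["Reservations"]:
        for instance in reservation["Instances"]:
            ip = instance.get("PublicIpAddress")
            if ip:
                return {"inference_url": f"http://{ip}:8000/inference"}
    raise HTTPException(status_code=404, detail="No running FPGA instance found")
```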


@K-rolls K-rolls left a comment

Great work getting those edits in sir!


@umama-rahman1 umama-rahman1 left a comment

Looks good. Thanks for adding in the get_inference_url for FPGA.
Approving now.
The only thing to check is the frontend (@cmatthews20) using the FPGA inference URL to do the chat. If there are any issues, we should be able to update the config-engine repo. (Might need to use JSON.)

@Daayim Daayim merged commit e927a80 into main Jan 16, 2025
1 check passed
