
Sprint 2

Matthew Altenburg edited this page May 16, 2025 · 9 revisions

All issues related to Sprint 2 (April 16, 2025 - April 30, 2025)

📝 Stand-Up Meeting Minutes 1

Date: April 18th, 2025
Time: 1:00 PM
Location: Internal Team Check-In

🎯 North Star Goal for the Sprint

The key objective for this sprint: fully set up the GPT-2 model environment, ensure all team members understand the model's structure, and resolve Docker-related issues so that the model runs reliably on the client's devices.

🛠️ Discussion Points

🚩 Milestone Planning for the Sprint

  • Established sprint milestones with a three-day execution window.
  • Focus on achieving both technical setup and shared understanding across the team.

📆 Rough Timeline Overview

| Date | Focus Areas | Key Deliverables |
|------|-------------|------------------|
| April 18 | Architecture familiarization + code review | GPT-2 readings |
| April 23 | Testing + documentation + successful run on client's Docker | Architecture summary |
| April 30 | Client compatibility + knowledge solidification | Model run confirmation, internal guide |

✅ Acceptance Criteria

  • Dockerized model runs successfully on at least two different system types.
  • Each member demonstrates understanding of GPT-2 core components.
  • All planned tasks are assigned and tracked through GitLab Issues.
  • Internal model guide and flowchart are uploaded and accessible.

Next Check-In: April 23rd, 3PM (Stand-up)

📝 Stand-Up Meeting Minutes 2

Date: April 23rd, 2025
Time: 1:00 PM
Location: Internal (All Members Present)
Facilitator: [Team Lead or rotating member]


✅ Summary of Discussion

📌 General Updates

  • Reviewed updated rubric guidelines (#33 - Rubric Rewind) and confirmed that all members now understand the evaluation criteria and how they affect our deliverables.

✅ Completed Tasks (Last Week)

  • Successfully filtered Australian-related corpus from Hugging Face datasets.
    (Issue #39 - Closed)
  • Held productive meetings with Alex and Andrew to align our direction and clarify open questions.
    (Issue #36 - Closed)
  • Completed preprocessing and loading of the Project Gutenberg Dataset.
    (Issue #26 - Closed)
  • Initiated implementation of the GPT-2 model architecture.
    (Issue #25 - Closed)
  • Researched GPT-2 architecture to enhance internal understanding and support peer learning.
    (Issue #24 - Closed)
  • Attended exemplar presentations and took detailed notes.
    (Issue #23 - Closed)
  • Contributed to community growth through ongoing interaction and support.
    (Issue #20 - Closed)
  • Practiced Agile workflow improvements and applied learnings to current sprint.
    (Issue #6 - Closed)

📈 Planning and Next Steps

  • Defined clear next steps for today's development sprint.
  • Assigned new tasks with estimated time and priority (reflected in the issue board).
  • Emphasized continued focus on documentation quality and clarity, with more granular and evenly distributed task assignments.

✅ Acceptance Criteria

This stand-up discussion will be considered successfully documented when:

  • Progress and tasks completed are clearly outlined.
  • All team members have reviewed and acknowledged the summary.
  • At least one team member has confirmed the accuracy and clarity of the minutes.

Acknowledged by:
✅ [Unesh] ✅ [Pranav] ✅ [Erica] ✅ [Ashley] ✅ [Michael] ✅ [Chenyu] ✅ [Haoqing] ✅ [Juncheng]


JoeyLLM Client Meeting Minutes 1

Date: Wednesday, 16 April 2025
Time: 11:00 AM
Location: Discord voice channel


1. Attendees

  • JoeyLLM Core Team (all members present)
  • Community Programmers (≈10 participants)

2. Agenda

  1. Walk-through for community programmers:
    • Environment setup
    • Codebase orientation
    • Preparing to start model training
  2. Internal prep (morning of 16 April):
    • Docker setup
    • GitHub code organization
    • Weights & Biases integration
    • Hugging Face data structure
  3. Goal alignment: commence training by week’s end

3. Discussion Summary

Environment Setup

  • Reviewed system requirements (OS, RAM, GPU drivers).
  • Shared base Docker image and mounting instructions.

Codebase Familiarization

  • Navigated key directories:
    • src/ for core modules
    • data/ for preprocessing scripts
    • scripts/ for training launchers
  • Highlighted naming conventions and branch strategy.

Training Preparation

  • Outlined high-level training pipeline: data ingestion → preprocessing → model launch.
  • Confirmed availability of sample dataset on Hugging Face Hub.
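The ingestion → preprocessing → launch pipeline above can be sketched as a chain of stages. This is a toy illustration only; the function names and the in-memory "corpus" are hypothetical stand-ins, not the team's actual code:

```python
# Toy sketch of the pipeline stages: ingestion -> preprocessing -> launch.
# All names here are illustrative, not the project's real modules.

def ingest(corpus):
    """Yield raw text documents (stand-in for loading from Hugging Face Hub)."""
    yield from corpus

def preprocess(docs, max_len=16):
    """Lowercase, whitespace-tokenize, and truncate each document."""
    return [doc.lower().split()[:max_len] for doc in docs]

def launch_training(batches):
    """Stand-in for the model launch step: report how much data arrived."""
    return {"num_examples": len(batches)}

corpus = ["Hello JoeyLLM", "Training starts this week"]
stats = launch_training(preprocess(ingest(corpus)))
print(stats)  # {'num_examples': 2}
```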

Internal Workflow Prep

  • Docker: Use docker-compose with defined service files.
  • GitHub: Repo layout (monorepo vs. multi-repo discussion).
  • W&B: Experiment naming schema and access controls.
  • Hugging Face: datasets library structure and metadata fields.
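The minutes record that a W&B experiment naming schema was agreed on, but not its exact format. A minimal sketch of one possible convention (the `<model>-<dataset>-<date>-<run>` pattern is an assumption, not the schema the team adopted):

```python
from datetime import date

def experiment_name(model: str, dataset: str, run_id: int, when: date) -> str:
    """Build a W&B-style experiment name: <model>-<dataset>-<YYYYMMDD>-runNNN.

    The format is illustrative only; the minutes note that a schema was
    agreed on without recording it.
    """
    return f"{model}-{dataset}-{when:%Y%m%d}-run{run_id:03d}"

name = experiment_name("gpt2-small", "gutenberg", 1, date(2025, 4, 16))
print(name)  # gpt2-small-gutenberg-20250416-run001
```

A deterministic schema like this keeps runs sortable in the W&B dashboard and makes access control per project straightforward.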

4. Action Items

| Task | Week | Assigned to |
|------|------|-------------|
| Prepare and share finalized Docker & docker-compose files | This week | Pranav |
| Refactor/organize GitHub repo according to agreed layout | This week | Unesh |
| Implement W&B experiment tracking boilerplate in training script | This week | Michael |
| Draft Hugging Face dataset config and upload sample manifests | This week | Erica, Ashley |
| Develop initial training script and verify end-to-end run | Next week | Haoqing, Pranav |
| Run first training experiment and report core metrics | Next week | Chenyu, Juncheng |
| Schedule follow-up check-in meeting | Next week | Ashley, Erica |

5. Next Steps & Follow-Up

  • Goal: complete the model setup so a successful end-to-end run can begin by week's end.
  • Next meeting: Wednesday 23 April at 10 AM AEST to review progress and troubleshoot issues.

Minutes prepared by: [Ashley]


📝 Sprint Retrospective Meeting Minutes

Date: April 30th, 2025
Time: 2:00 PM
Location: Hive
Facilitator: Erica (Scrum Lead this week)


✅ Summary of Discussion

📌 General Updates

  • The team reviewed overall progress for Sprint 2, which was divided into three milestone phases.
  • Client recently requested a complete pivot in project direction, discarding previous implementation work to refocus on learning GPT-3.0 from scratch.
  • We now have restricted repository access: team members can push to main but require client approval to merge or comment.
  • Due to this change, we agreed that Sprint 3 will begin with a re-scoping and replanning meeting.

✅ Completed Tasks by Sprint Milestones

🧠 Sprint 2.1 – Getting Familiar with GPT-2

Status: ✅ 11/11 complete (100%)

  • Conducted foundational research on GPT-2 architecture and capabilities.
  • Ran tokenizer examples and decoded token → ID behavior.
  • Shared and summarized a key research paper ("Language Models are Few-Shot Learners").
  • Created glossaries and beginner-friendly walkthroughs for internal learning.
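The token → ID round-tripping explored above can be illustrated with a toy vocabulary. The real GPT-2 tokenizer uses byte-pair encoding over a ~50k-entry vocabulary; this tiny whitespace-based sketch only demonstrates the encode/decode idea:

```python
# Toy illustration of token <-> ID round-tripping, analogous to what the
# GPT-2 tokenizer does. The vocabulary and splitting rule are illustrative;
# real GPT-2 uses byte-pair encoding, not whitespace splitting.

vocab = {"hello": 0, "world": 1, "<unk>": 2}
id_to_token = {i: t for t, i in vocab.items()}

def encode(text):
    """Map each whitespace token to its ID, falling back to <unk>."""
    return [vocab.get(tok, vocab["<unk>"]) for tok in text.lower().split()]

def decode(ids):
    """Map IDs back to tokens and rejoin them."""
    return " ".join(id_to_token[i] for i in ids)

ids = encode("Hello world")
print(ids)          # [0, 1]
print(decode(ids))  # hello world
```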

🧱 Sprint 2.2 – Model Structure Definition and Implementation

Status: ✅ 21/21 complete (100%)

  • Built core transformer block (attention, feed-forward, residuals).
  • Implemented scaled dot-product attention and causal masking.
  • Developed training config system with model presets (tiny–xl).
  • Supported single-GPU training workflow and checkpoint saving system.
  • Added WandB integration and centralized YAML-based config support.
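The scaled dot-product attention and causal masking listed above can be sketched in a few lines of NumPy. This is a minimal single-head illustration, not the team's actual implementation (which also includes multi-head projections, residuals, and feed-forward layers):

```python
import numpy as np

def causal_attention(q, k, v):
    """Scaled dot-product attention with a causal mask.

    q, k, v: (seq_len, d) arrays. Position i may attend only to positions
    <= i, as in GPT-2's decoder-only transformer.
    """
    seq_len, d = q.shape
    scores = q @ k.T / np.sqrt(d)                    # (seq_len, seq_len)
    mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), 1)
    scores = np.where(mask, -np.inf, scores)         # block future positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v

rng = np.random.default_rng(0)
q = k = v = rng.standard_normal((4, 8))
out = causal_attention(q, k, v)
print(out.shape)                  # (4, 8)
print(np.allclose(out[0], v[0]))  # True: position 0 attends only to itself
```

The `-np.inf` fill makes the masked positions contribute exactly zero after the softmax, which is why the first output row equals `v[0]`.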

⚙️ Sprint 2.3 – Training and Evaluation

Status: 🟡 12/18 complete (67%, still in progress)

  • Completed data loading and implemented gradient-based training loop.
  • Conducted evaluation using loss trend tracking.
  • Began setting up multi-GPU support and partial testing.
  • Tasks paused mid-sprint due to client resetting project scope.
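The gradient-based training loop with loss-trend tracking described above follows a standard pattern. A minimal sketch using linear regression as a stand-in for the actual GPT-2 model (all data and hyperparameters here are illustrative):

```python
import numpy as np

# Minimal sketch of a gradient-based training loop with loss-trend tracking.
# Linear regression stands in for the real model; in the project, the loop
# wraps the GPT-2 forward/backward pass instead.

rng = np.random.default_rng(42)
X = rng.standard_normal((64, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w

w = np.zeros(3)      # model parameters, initialized to zero
lr = 0.1             # learning rate
losses = []          # loss trend, tracked per step (as in the evaluation work)
for step in range(200):
    pred = X @ w
    err = pred - y
    losses.append(float(np.mean(err ** 2)))  # MSE
    w -= lr * (2 / len(X)) * (X.T @ err)     # gradient descent update

print(losses[0] > losses[-1])  # True: the loss trends downward
```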

📈 Planning and Next Steps (Sprint 3)

  • Host a Sprint 3 Kickoff Meeting on Day 1 to:
    • Define a new project direction aligned with GPT-3.0 learning goals.
    • Reassign team roles based on upcoming focus areas.
    • Discuss how to manage dual documentation requirements (GitHub + internal).
    • Clarify what work can continue independently before client merges.
  • Update the issue board and roadmap to reflect the new learning-first direction.
  • Increase transparency on push-merge workflows due to limited repository permissions.

✅ Acceptance Criteria

This retrospective will be considered complete when:

  • Completed sprint milestones are clearly recorded.
  • Strategic pivot and rationale are documented.
  • Next sprint direction and planning needs are outlined.
  • All team members acknowledge the summary.

Acknowledged by:
✅ [Unesh] ✅ [Pranav] ✅ [Erica] ✅ [Ashley] ✅ [Michael] ✅ [Chenyu] ✅ [Haoqing] ✅ [Juncheng]
