BigBax™ - AI-Powered Calorie Tracker

An AI-powered calorie tracking app that uses Google Gemini with structured outputs to parse natural language and calculate net calories.

Based on the article about testing LLM applications the right way - no mocks, just structure.

Why This Exists

This app demonstrates how to:

  • ✅ Use structured outputs with LLMs (Gemini)
  • ✅ Write meaningful tests for non-deterministic systems
  • ✅ Keep logic deterministic and testable
  • ❌ Avoid the false confidence of mocked LLM tests

Features

  • Natural language calorie parsing
  • Intelligent number parsing ("3k" → 3000)
  • Structured JSON output via Gemini
  • Real tests that actually catch regressions
  • Deterministic math functions

Setup

1. Get a Gemini API Key

  1. Go to Google AI Studio (https://aistudio.google.com)
  2. Create a new API key
  3. Copy it

2. Install Dependencies

npm install

3. Configure Environment

cp .env.example .env
# Edit .env and add your GEMINI_API_KEY
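After copying, .env only needs the key itself (placeholder value shown; use your own key):

GEMINI_API_KEY=your-api-key-here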

Usage

Run the Web App (Next.js)

npm run dev

Then open http://localhost:3000 in your browser!

The app features:

  • 🎨 Beautiful modern UI with animated logo
  • 🍔 Cute cherub mascot holding a burger
  • 🔥 Real-time AI calorie analysis
  • 📊 Visual breakdown of consumed vs burned calories

Run the CLI

npm run cli "I ate 10 donuts and binged Netflix"

Example Inputs

# Explicit calories
npm run cli "I ate 3k calories but burned 500 running"

# Natural language
npm run cli "Had pizza and wings, then hit the gym for an hour"

# The classic BigBax user
npm run cli "Yesterday I ate 10 donuts, a few bags of hot chips, a sweet pickle and binged Netflix. Oh, and I also threw my remote at the TV a few times during the big game!"

Testing

Run All Tests

npm test

Run Only AI Tests

npm run test:ai

What Gets Tested

✅ Structure-based tests (with real API calls; see the sketch below):

  • Schema validation (arrays contain numbers)
  • Behavior (k notation, consumed vs burned)
  • Sanity checks (nonsense → empty arrays)

✅ Deterministic tests (fast, free):

  • Math calculations
  • Edge cases (empty arrays, negatives)

❌ What we DON'T test:

  • Exact wording of responses
  • Mocked LLM outputs
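
A sketch of what the structure-based tests above might look like, assuming Vitest and a calorieAgent export that returns consumedCalories/burnedCalories arrays (the real src/tests/calorieAgent.test.ts may differ):

import { describe, expect, test } from 'vitest';
import { calorieAgent } from '../lib/calorieAgent';

describe('Calorie Agent', () => {
	test('returns arrays of numbers', async () => {
		const result = await calorieAgent('I ate a burger and went for a run');
		expect(Array.isArray(result.consumedCalories)).toBe(true);
		expect(Array.isArray(result.burnedCalories)).toBe(true);
		result.consumedCalories.forEach((n) => expect(typeof n).toBe('number'));
	});

	test('nonsense input yields empty arrays', async () => {
		const result = await calorieAgent('asdf qwerty zxcv');
		expect(result.consumedCalories).toEqual([]);
		expect(result.burnedCalories).toEqual([]);
	});
});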

Project Structure

bigbax/
├── app/
│   ├── api/
│   │   └── track/
│   │       └── route.ts         # Next.js API route
│   ├── layout.tsx               # Root layout
│   ├── page.tsx                 # Main page component
│   └── globals.css              # Global styles
├── src/
│   ├── lib/
│   │   └── calorieAgent.ts      # Gemini integration with structured outputs
│   ├── utils/
│   │   └── calorieUtils.ts      # Deterministic math functions
│   ├── tests/
│   │   └── calorieAgent.test.ts # Real tests, no mocks!
│   └── index.ts                 # CLI interface
├── public/
│   └── logo.svg                 # BigBax cherub logo
├── package.json
├── tsconfig.json
├── next.config.js
├── vitest.config.ts
└── README.md

Key Concepts

Structured Outputs

Instead of parsing free-form text:

// ❌ Bad: Free-form text
const response = await ai('parse this food');
const calories = parseInt(response.match(/\d+/)[0]); // Brittle!

Use schemas:

// ✅ Good: Structured output
const CalorieSchema = z.object({
	consumedCalories: z.array(z.number()),
	burnedCalories: z.array(z.number()),
});

const response = await gemini.generate({
	responseSchema: zodToJsonSchema(CalorieSchema),
});
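
For reference, a minimal end-to-end sketch of the agent side, assuming the @google/generative-ai SDK and Zod validation (the model name, prompt, and export shape are illustrative; the actual src/lib/calorieAgent.ts may be wired differently):

import { GoogleGenerativeAI, SchemaType } from '@google/generative-ai';
import { z } from 'zod';

const CalorieSchema = z.object({
	consumedCalories: z.array(z.number()),
	burnedCalories: z.array(z.number()),
});

export type CalorieData = z.infer<typeof CalorieSchema>;

export async function calorieAgent(input: string): Promise<CalorieData> {
	const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
	const model = genAI.getGenerativeModel({
		model: 'gemini-1.5-flash', // illustrative model name
		generationConfig: {
			responseMimeType: 'application/json',
			responseSchema: {
				type: SchemaType.OBJECT,
				properties: {
					consumedCalories: { type: SchemaType.ARRAY, items: { type: SchemaType.NUMBER } },
					burnedCalories: { type: SchemaType.ARRAY, items: { type: SchemaType.NUMBER } },
				},
				required: ['consumedCalories', 'burnedCalories'],
			},
		},
	});

	const result = await model.generateContent(
		`Extract calories consumed and burned from: "${input}"`
	);

	// Validate the model's JSON against the Zod schema before trusting it
	return CalorieSchema.parse(JSON.parse(result.response.text()));
}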

Why No Mocks?

// ❌ This test is useless
jest.mock('./calorieAgent', () => ({
	calorieAgent: () => ({ consumed: [3000], burned: [500] }),
}));

test('calculates calories', async () => {
	const result = await calorieAgent('anything');
	expect(result.consumed).toEqual([3000]); // Always passes!
});

Real tests catch real problems:

// ✅ This test catches regressions
test('parses k notation', async () => {
	const parsed = await calorieAgent('I ate 3k calories');
	expect(parsed.consumedCalories).toContain(3000); // Real API call
});

Deterministic Logic

Keep math outside the LLM:

// ✅ Testable, predictable, fast
export function calculateNetCalories(data: CalorieData): number {
	const consumed = data.consumedCalories.reduce((a, b) => a + b, 0);
	const burned = data.burnedCalories.reduce((a, b) => a + b, 0);
	return consumed - burned;
}
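
Because the math is pure, it can be covered by fast, free unit tests; a sketch, assuming Vitest and the import path from the project structure above:

import { expect, test } from 'vitest';
import { calculateNetCalories } from '../utils/calorieUtils';

test('sums consumed minus burned', () => {
	expect(
		calculateNetCalories({ consumedCalories: [3000], burnedCalories: [500] })
	).toBe(2500);
});

test('handles empty arrays and negative nets', () => {
	expect(calculateNetCalories({ consumedCalories: [], burnedCalories: [] })).toBe(0);
	expect(calculateNetCalories({ consumedCalories: [200], burnedCalories: [500] })).toBe(-300);
});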

Cost Management

LLM tests cost money. Run them strategically:

  • ✅ After prompt changes
  • ✅ Before production deploys
  • ✅ Nightly schedules
  • ❌ NOT on every PR commit

Use test filters:

# Only AI tests
npm run test:ai

# Everything else
npm test -- --testNamePattern="^(?!.*Calorie Agent).*"
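
One possible way to gate the expensive tests at the config level (a sketch of a vitest.config.ts using a hypothetical RUN_AI_TESTS flag; the repo may rely on the npm script filters above instead):

import { defineConfig } from 'vitest/config';

// Hypothetical RUN_AI_TESTS flag: the Gemini test file is only picked up when it is set
export default defineConfig({
	test: {
		exclude: process.env.RUN_AI_TESTS
			? ['**/node_modules/**']
			: ['**/node_modules/**', 'src/tests/calorieAgent.test.ts'],
	},
});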

The Article

This project implements concepts from the article about testing LLM applications properly:

Key Takeaways:

  • ❌ Don't mock LLM outputs - they lie to you
  • ✅ Force structured responses with schemas
  • ✅ Keep logic deterministic
  • ✅ Test behavior and structure, not phrasing
  • ✅ Run tests selectively to control cost

License

MIT


Built with Gemini, Zod, and zero tolerance for mocked tests.

BigBax™ - Because your 0.5% equity depends on it.
