NoBullFit AI

NoBullFit AI is a custom-trained language model specialized in fitness, health, nutrition, and mathematical conversions. It collects its own training data from the web automatically, and training can be started, stopped, and resumed at any time.

About

NoBullFit AI is part of the NoBullFit ecosystem, providing domain-specific AI capabilities for fitness and health-related tasks. Unlike general-purpose AI models, this model is trained from scratch specifically on fitness, health, nutrition, and mathematical conversion data to provide accurate, context-aware responses in these domains.

The model uses a GPT-2 architecture and is trained entirely on domain-specific data, ensuring that responses are tailored to fitness and health contexts. Training data is automatically collected from the web, eliminating the need for manual data preparation.

What It Does

Automatic Data Collection

The AI automatically searches the web for fitness, health, nutrition, and math-related content, extracting Q&A pairs and generating synthetic data for training. This ensures the model has access to current, relevant information without manual data curation.
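One way such Q&A extraction might work, assuming pages with simple heading/paragraph markup (the pattern and the quality filter below are illustrative, not the project's actual scraper):

```python
# Minimal sketch of Q&A extraction from a fetched page. The markup
# structure and the crude quality filter are assumptions for illustration.
import re

def extract_qa_pairs(html: str) -> list:
    """Pull (question, answer) pairs from simple <h3>/<p> page structure."""
    pattern = re.compile(r"<h3>(.*?)</h3>\s*<p>(.*?)</p>", re.DOTALL)
    pairs = []
    for q, a in pattern.findall(html):
        q, a = q.strip(), a.strip()
        if q.endswith("?") and len(a) > 20:  # keep only plausible Q&A pairs
            pairs.append((q, a))
    return pairs

page = ("<h3>How much protein per day?</h3>"
        "<p>A common guideline is 1.6-2.2 g per kg of body weight "
        "for people doing resistance training.</p>")
print(extract_qa_pairs(page))
```

In practice the project lists DuckDuckGo Search and BeautifulSoup in its stack, so the real collector likely parses a DOM rather than applying a regex, but the filtering idea is the same.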

Domain-Specific Training

The model is trained from scratch on fitness, health, nutrition, and mathematical conversion tasks, including:

  • Fitness Guidance: Workout recommendations, exercise form advice, training principles
  • Health Information: Evidence-based health information and wellness tips
  • Nutrition Planning: Meal planning, macro calculations, dietary advice
  • Mathematical Conversions: Weight conversions (lbs to kg), volume conversions (ml to L), and other fitness-related calculations
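The conversion tasks reduce to fixed factors; a quick sketch (the helper names are illustrative, the constants are standard):

```python
# Standard conversion factors; 1 kg is defined as ~2.2046226218 lb.
LBS_PER_KG = 2.2046226218

def lbs_to_kg(lbs: float) -> float:
    """Convert pounds to kilograms."""
    return lbs / LBS_PER_KG

def ml_to_l(ml: float) -> float:
    """Convert milliliters to liters."""
    return ml / 1000.0

print(round(lbs_to_kg(220.462), 2))  # ~100.0 kg
print(ml_to_l(500))                  # 0.5 L
```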

Flexible Training

Training can be started, stopped, and resumed at any time. Checkpoints are automatically saved, allowing you to pause training and continue later without losing progress.
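The checkpoint bookkeeping can be illustrated with a standard-library sketch. The real project presumably serializes model and optimizer state as well (e.g. via torch.save); the path and field names below are assumptions.

```python
# Sketch of save/resume bookkeeping using only the standard library.
import json
import os
import tempfile

CKPT_PATH = os.path.join(tempfile.gettempdir(), "nbf_checkpoint.json")

def save_checkpoint(epoch: int, best_val_loss: float) -> None:
    # The real checkpoint would also carry model/optimizer state.
    with open(CKPT_PATH, "w") as f:
        json.dump({"epoch": epoch, "best_val_loss": best_val_loss}, f)

def load_checkpoint() -> dict:
    # Resume from the last saved state, or start fresh if none exists.
    if os.path.exists(CKPT_PATH):
        with open(CKPT_PATH) as f:
            return json.load(f)
    return {"epoch": 0, "best_val_loss": float("inf")}

save_checkpoint(3, 1.72)
print(load_checkpoint())
```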

Self-Learning Mode (Default)

By default, the model trains continuously with self-learning enabled. It explores and generates its own training data by asking questions, generating answers, evaluating quality, and adding good examples back to its training set, creating a self-improving learning loop.
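The loop described above can be sketched as follows; generate_answer and score_answer are hypothetical placeholders for the model's generation and quality-filtering steps, not the project's actual API.

```python
import random

random.seed(0)  # only for reproducibility of this sketch

SEED_QUESTIONS = [
    "Convert 2 lbs to kg.",
    "How many ml are in 1.5 L?",
]

def generate_answer(question: str) -> str:
    # Placeholder: the real system would call the trained model here.
    return f"(model output for: {question})"

def score_answer(question: str, answer: str) -> float:
    # Placeholder quality heuristic; the real filter is unspecified.
    return random.random()

def self_learning_round(training_set: list, threshold: float = 0.7) -> None:
    """Ask questions, answer them, and keep only high-scoring examples."""
    for q in SEED_QUESTIONS:
        a = generate_answer(q)
        if score_answer(q, a) >= threshold:
            training_set.append((q, a))

data = []
self_learning_round(data)
print(f"{len(data)} examples added this round")
```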

How It Works

The training process is fully automated:

  1. Data Collection: If training data doesn't exist, the system automatically searches the web for relevant Q&A pairs covering fitness, health, nutrition, and math topics
  2. Data Processing: Collected data is formatted and split into training and validation sets
  3. Model Training: The GPT-2 based model is trained from scratch on the collected data
  4. Self-Learning (continuous mode): The model generates its own questions and answers, evaluates them, and adds quality examples back to the training set
  5. Checkpointing: Checkpoints are saved after each epoch, allowing training to be resumed
  6. Model Saving: The best model (lowest validation loss) is automatically saved

Training can be stopped gracefully at any time using Ctrl+C, and the current checkpoint will be saved automatically.
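Steps 2 and 6 above, the train/validation split and best-model tracking, can be sketched as follows; the ratio, seed, and names are illustrative, not the project's actual values.

```python
import random

def train_val_split(pairs, val_ratio: float = 0.1, seed: int = 42):
    """Shuffle collected Q&A pairs and split off a validation set."""
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_ratio))
    return shuffled[n_val:], shuffled[:n_val]

pairs = [(f"q{i}", f"a{i}") for i in range(20)]
train, val = train_val_split(pairs)

# Best-model tracking: save whenever validation loss improves.
best_val_loss = float("inf")
for val_loss in [2.1, 1.8, 1.9]:   # example per-epoch validation losses
    if val_loss < best_val_loss:
        best_val_loss = val_loss    # here the model would be saved
print(len(train), len(val), best_val_loss)
```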

Quick Start

  1. Install dependencies

    pip install -r requirements.txt
  2. Start training

    python train.py

    By default, training runs continuously with self-learning enabled. The model will automatically collect data from the web if needed, then train indefinitely while generating its own training data.

  3. Resume training

    python train.py --resume

    Resume from the latest checkpoint and continue training.

  4. Fixed epochs training

    python train.py --epochs 10

    Train for a specific number of epochs instead of continuously.

  5. Stop training

    Press Ctrl+C to stop gracefully. The current checkpoint will be saved automatically.
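The graceful-stop behavior can be sketched by wrapping the training loop in a KeyboardInterrupt handler. save_checkpoint is a hypothetical stand-in for the project's checkpoint writer, and the interrupt is simulated so the sketch runs on its own.

```python
def save_checkpoint(epoch: int) -> None:
    # Stand-in for the real checkpoint writer.
    print(f"checkpoint saved at epoch {epoch}")

def train_loop(max_epochs: int = 1000) -> int:
    epoch = 0
    try:
        while epoch < max_epochs:
            epoch += 1                   # one epoch of training goes here
            if epoch == 3:
                raise KeyboardInterrupt  # simulate Ctrl+C for this sketch
    except KeyboardInterrupt:
        save_checkpoint(epoch)           # mirror the documented behavior
    return epoch

print(train_loop())
```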

Usage

After training, load and use the model:

from nobullfit_ai.model import NoBullFitModel

model = NoBullFitModel.from_pretrained("./models/best_model")
response = model.generate("Question: Convert 500 grams to kilograms\nAnswer:")
print(response[0])

Configuration

All settings are configured via the .env file:

  • Model: VOCAB_SIZE, MAX_SEQ_LENGTH, EMBED_DIM, NUM_HEADS, NUM_LAYERS
  • Training: BATCH_SIZE, LEARNING_RATE, NUM_EPOCHS, WARMUP_STEPS
  • Data: DATA_DIR, MIN_QA_PAIRS (minimum Q&A pairs to collect)
  • Self-Learning: SELF_LEARNING_INTERVAL (generate new data every N epochs, default: 5)
  • Checkpoints: KEEP_CHECKPOINTS (number of checkpoints to keep, default: 5)
  • Device: DEVICE (use cuda for GPU, cpu for CPU)
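A hypothetical .env using these keys might look like the following; every value below is an illustrative guess, not the project's shipped defaults.

```ini
# Model
VOCAB_SIZE=8000
MAX_SEQ_LENGTH=512
EMBED_DIM=256
NUM_HEADS=8
NUM_LAYERS=6

# Training
BATCH_SIZE=16
LEARNING_RATE=3e-4
NUM_EPOCHS=10
WARMUP_STEPS=500

# Data
DATA_DIR=./data
MIN_QA_PAIRS=1000

# Self-learning, checkpoints, device
SELF_LEARNING_INTERVAL=5
KEEP_CHECKPOINTS=5
DEVICE=cuda
```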

Technology Stack

NoBullFit AI is built with Python and modern machine learning libraries:

  • Deep Learning: PyTorch
  • Model Architecture: Transformers (GPT-2 based)
  • Data Collection: DuckDuckGo Search, BeautifulSoup, Requests
  • Training: Custom training loop with checkpointing and resume support

Privacy Commitment

NoBullFit AI collects training data from publicly available web sources. The model is trained locally on your machine, ensuring that your training process and model remain private. No data is sent to external services during training.

License

This project is part of the NoBullFit ecosystem. See the LICENSE file for details.

For commercial licensing inquiries, please contact us at https://nobull.fit.
