SnapClass.AI is a fully offline, intelligent educational platform that combines multiple AI models to process, analyze, and generate educational content from input sources including audio, PDFs, and plain text.
SnapClass/
├── 📁 server/                  # Main application backend
│   ├── 🚀 desktop_app.py       # Main GUI application (customtkinter)
│   ├── 🌐 app.py               # Flask web server
│   ├── 🌐 chat.py              # Chat module
│   ├── 🔄 trans.py             # Main processing orchestrator
│   ├── 🎤 stt.py               # Speech-to-Text (Whisper)
│   ├── 📄 pdf_reader.py        # PDF processing (Nougat + BLIP)
│   ├── ❓ question_gen.py      # Question generation (LLaMA)
│   ├── 📊 slm_analyse.py       # Student analysis (LLaMA)
│   ├── 🛠️ utils.py             # Utility functions
│   ├── ⚙️ setup.py             # Model downloader
│   ├── 📁 templates/           # HTML templates
│   ├── 🎨 static/              # CSS/JS assets
│   ├── 📁 uploads/             # User file uploads
│   ├── 📁 output/              # Processed results
│   ├── 📁 llama3/              # LLaMA 3.2 3B model
│   │   ├── genie-t2t-run.exe   # Genie inference engine
│   │   ├── genie_config.json   # Model configuration
│   │   ├── *.bin               # Model weights (3 parts)
│   │   └── *.dll               # Windows dependencies
│   ├── 📁 whisper/             # OpenAI Whisper model
│   │   ├── model.safetensors   # Speech recognition model
│   │   ├── tokenizer.json      # Tokenizer
│   │   └── config.json         # Model configuration
│   └── 📁 poppler/             # PDF processing utilities
└── 📋 requirements.txt         # Python dependencies
- Audio Processing: Speech-to-text conversion using Whisper
- Document Processing: PDF text and image extraction using Nougat and BLIP
- Text Generation: Question generation and analysis using LLaMA 3.2
- Desktop Application: Native GUI built with customtkinter
- Web Interface: Flask-based web server with React frontend
- Input Processing: Audio files, PDFs, or text input
- AI Analysis: Multi-model AI processing pipeline
- Content Generation: Questions, summaries, and insights
- Output Delivery: Structured results via GUI or web interface
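The four-stage flow above can be sketched as a simple Python pipeline. All function names here are illustrative stand-ins, not SnapClass's actual API (the real orchestration lives in `trans.py`):

```python
# Minimal sketch of the four-stage pipeline: input -> analysis -> generation -> delivery.
# Every function name below is a hypothetical stand-in, not SnapClass's real API.

def process_input(source: str) -> str:
    """Normalize any input (audio transcript, PDF text, raw text) to plain text."""
    return source.strip()

def analyze(text: str) -> dict:
    """Stand-in for the multi-model AI analysis step."""
    return {"text": text, "word_count": len(text.split())}

def generate_content(analysis: dict) -> dict:
    """Stand-in for question/summary generation."""
    return {
        "summary": analysis["text"][:60],
        "questions": [f"Question {i + 1} about the text?"
                      for i in range(min(2, analysis["word_count"]))],
    }

def deliver(output: dict) -> dict:
    """Structured results, as the GUI or web interface would receive them."""
    return {"status": "ok", **output}

result = deliver(generate_content(analyze(
    process_input("Photosynthesis converts light into energy.")
)))
print(result["status"])  # ok
```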
- Framework: Flask (web server)
- GUI: customtkinter (desktop app)
- AI/ML: PyTorch, Transformers, Whisper, LLaMA 3.2
- Audio: librosa, soundfile
- PDF: pytesseract, pypdf
- Data: numpy, PIL, PyYAML
- LLaMA 3.2 3B: Text generation and analysis
- Whisper: Speech-to-text transcription
- OS: Windows 10/11, macOS, or Linux
- Processor: Snapdragon X Elite (for NPU-accelerated inference); other CPUs work with CPU-only inference
- RAM: Minimum 8GB, Recommended 16GB+
- Storage: 10GB+ free space for models
- GPU: Optional but recommended for faster inference
- Python: 3.8 - 3.11
- Git: For version control
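A quick way to confirm the interpreter falls in the supported 3.8 - 3.11 range before installing anything (the helper below is ours, not part of SnapClass):

```python
import sys

# Guard for the supported Python range (3.8 - 3.11).
# This helper is illustrative, not part of the SnapClass codebase.
def supported(version=None):
    v = version or sys.version_info[:2]
    return (3, 8) <= v <= (3, 11)

print(supported((3, 10)))  # True
```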
git clone https://github.com/YOUR_USERNAME/SnapClass.git
cd SnapClass

# Create virtual environment
python -m venv venv
# Activate virtual environment
# Windows
venv\Scripts\activate
# macOS/Linux
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt

# Navigate to server directory
cd server
# Download and setup AI models
python setup.py

- Download the NPU-optimised LLaMA 3.2 model: click here
- Download Poppler: here

Note: This will download ~5GB of AI models. Ensure a stable internet connection.
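After setup completes, it is worth checking that the key model files landed where the directory tree above expects them. The checker below is a hypothetical helper (the file list mirrors the tree, not an official manifest):

```python
from pathlib import Path

# Hypothetical post-setup checker; the file list mirrors the project tree
# above and is not an official manifest.
REQUIRED = {
    "llama3": ["genie_config.json"],
    "whisper": ["model.safetensors", "tokenizer.json", "config.json"],
}

def missing_model_files(server_dir="server"):
    root = Path(server_dir)
    return [str(root / sub / name)
            for sub, names in REQUIRED.items()
            for name in names
            if not (root / sub / name).is_file()]

# Any non-empty result means setup did not complete.
print(missing_model_files())
```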
cd server
python desktop_app.py

- LLaMA: Edit server/llama3/genie_config.json
- Whisper: Configure in server/whisper/config.json
- Nougat: Settings in server/nougat/config.json
- BLIP: Options in server/blip/config.json
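Since these are plain JSON files, edits can be scripted. The snippet below works on a throwaway sample file; `"max_tokens"` is an assumed field name, so inspect the real `genie_config.json` (and keep a backup) before changing anything:

```python
import json
import tempfile
from pathlib import Path

# Demo on a throwaway copy; "max_tokens" is an assumed field name,
# not a documented genie_config.json key. Check the real file first.
cfg_dir = Path(tempfile.mkdtemp())          # stand-in for server/llama3/
cfg_path = cfg_dir / "genie_config.json"
cfg_path.write_text(json.dumps({"max_tokens": 512}))  # sample content

cfg = json.loads(cfg_path.read_text())
cfg["max_tokens"] = 256                     # e.g. lower for faster responses
cfg_path.write_text(json.dumps(cfg, indent=2))
```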
# Use MSIX build script
.\build.ps1 -Version "1.0.0.0" -Publisher "CN=SnapClass.A"

# Verify model files exist
ls -la server/llama3/
ls -la server/whisper/

- Reduce batch sizes in model configurations
- Use CPU-only inference for lower memory usage
- Close other applications to free RAM
- Check the availability of Snapdragon NPU
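One low-effort way to force CPU-only inference (assuming the models are loaded through PyTorch or a similar framework) is to hide the GPUs before any framework is imported:

```python
import os

# Hide all GPUs so frameworks such as PyTorch fall back to CPU-only
# inference (lower memory pressure, at the cost of speed). Must be set
# BEFORE importing torch/transformers.
os.environ["CUDA_VISIBLE_DEVICES"] = ""
```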
# Reinstall dependencies
pip uninstall -y -r requirements.txt
pip install -r requirements.txt

- GPU Acceleration: Install CUDA-enabled PyTorch
- Model Quantization: Use quantized models for faster inference
- Batch Processing: Process multiple files simultaneously
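Batch processing can be as simple as fanning files out over a thread pool. `process_file` below is a hypothetical stand-in for whatever per-file work `trans.py` performs (transcribe, extract, generate):

```python
from concurrent.futures import ThreadPoolExecutor

# process_file is a hypothetical stand-in for SnapClass's per-file pipeline.
def process_file(path: str) -> str:
    return f"processed:{path}"

files = ["lecture1.mp3", "notes.pdf", "essay.txt"]
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(process_file, files))
print(results)
```

Threads work well when the heavy lifting happens inside native code (model inference); for pure-Python CPU-bound work, `ProcessPoolExecutor` is the usual swap-in.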
- Fork the repository
- Create a feature branch: git checkout -b feature-name
- Make changes and test thoroughly
- Commit: git commit -m 'Add feature'
- Push: git push origin feature-name
- Create a Pull Request
- Python: Follow PEP 8 guidelines
- Documentation: Update README for new features
This project is licensed under the MIT License - see the LICENSE file for details.
- Meta AI: LLaMA 3.2 model
- OpenAI: Whisper speech recognition
- Open Source Community: Various libraries and tools
- A T Abbilaash (abbilaashat@gmail.com)
Made with ❤️ for the educational community