---
title: Question Level Prediction
emoji: 🎓
colorFrom: purple
colorTo: blue
sdk: docker
app_port: 7860
app_file: app.py
python_version: 3.9
pinned: false
---
A machine learning project that classifies educational questions into Bloom's Taxonomy levels and Difficulty categories using Logistic Regression. The project implements a custom NLP pipeline with Sentence Transformers and provides a beautified Streamlit interface for real-time predictions.
This project implements an end-to-end machine learning pipeline for analyzing the cognitive complexity and difficulty of educational questions. It covers:
- Data Exploration — Analyzing question text distributions and student performance metadata
- Feature Engineering — Generating semantic embeddings and normalizing student success metrics
- Model Training & Evaluation — Developing calibrated Logistic Regression models for multi-class classification
- Deployment — Creating a modern, interactive dashboard for production-ready inference
> [!NOTE]
> This dataset was synthesized using a Large Language Model (LLM) due to the scarcity of publicly available datasets for multi-class Bloom's Taxonomy classification on specific educational content.
| Property | Details |
|---|---|
| File | final.csv |
| Rows | 5,500 |
| Columns | ~11 predictive columns |
| Target Variables | bloom_level, difficulty |
The model uses a total of 11 features (8 base features from the dataset + 3 engineered features):
| Feature | Type | Source | Description |
|---|---|---|---|
| Question Text | string | Base | The raw text of the question (Vectorized via NLP) |
| Subject | object | Base | Broad subject category (e.g., Science, Maths) |
| Topic | object | Base | Specific topic within the subject |
| Avg Score | float64 | Base | Average score achieved by students (0.0 - 1.0) |
| Correct % | float64 | Base | Percentage of students who got the question right |
| Students Attempted | int64 | Base | Total count of students who answered the question |
| Students Correct | int64 | Base | Total count of students who answered correctly |
| Time Taken | float64 | Base | Average time spent on the question (minutes) |
| Success Rate | float64 | Engineered | Calculated as (Correct / Attempted) |
| Log Attempts | float64 | Engineered | Log-transformed attempt count for better scaling |
| Question Length | int64 | Engineered | Total word count of the question text |
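The three engineered features above can be sketched in a few lines of pandas. The column names here are hypothetical stand-ins for whatever `final.csv` actually uses, and `log1p` is one reasonable choice for the log transform:

```python
import numpy as np
import pandas as pd

# Toy rows with hypothetical column names (the real dataset may differ)
df = pd.DataFrame({
    "question_text": ["What is photosynthesis?", "Evaluate the integral of x^2."],
    "students_attempted": [120, 45],
    "students_correct": [90, 9],
})

# Success Rate: fraction of attempting students who answered correctly
df["success_rate"] = df["students_correct"] / df["students_attempted"]

# Log Attempts: log1p keeps zero-attempt rows finite and compresses large counts
df["log_attempts"] = np.log1p(df["students_attempted"])

# Question Length: simple whitespace word count of the question text
df["question_length"] = df["question_text"].str.split().str.len()
```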
- NLP Processing: Text is vectorized with `SentenceTransformer('all-MiniLM-L6-v2')` to capture semantic intent.
- Categorical Encoding: One-Hot Encoding is applied to the Subject and Topic features.
- Scaling: Numerical metrics are standardized with `StandardScaler` for model stability.
- Standalone Module: All logic is encapsulated in the `BloomModelDeployer` class for modular usage.
- Balanced Weights: `class_weight='balanced'` handles the imbalanced levels in Bloom's Taxonomy.
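A minimal scikit-learn sketch of this preprocessing-plus-classifier stack. The `emb_0`/`emb_1` columns are toy stand-ins for the 384-dimensional `all-MiniLM-L6-v2` embeddings, and all column names are assumptions, not the project's actual schema:

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

rng = np.random.default_rng(0)
# In the real pipeline these numeric columns would come from
# SentenceTransformer('all-MiniLM-L6-v2').encode(question_texts)
df = pd.DataFrame({
    "emb_0": rng.normal(size=40),
    "emb_1": rng.normal(size=40),
    "subject": rng.choice(["Science", "Maths"], size=40),
    "avg_score": rng.uniform(0, 1, size=40),
})
y = rng.choice(["Remember", "Apply", "Create"], size=40)

# One-hot encode categoricals, standardize numeric metrics
pre = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["subject"]),
    ("num", StandardScaler(), ["emb_0", "emb_1", "avg_score"]),
])

# class_weight='balanced' reweights classes inversely to their frequency
clf = Pipeline([
    ("pre", pre),
    ("lr", LogisticRegression(class_weight="balanced", max_iter=1000)),
])
clf.fit(df, y)
preds = clf.predict(df)
```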
- Splitting: Data is split into 80% training and 20% testing sets.
- Model Selection: XGBoost and Random Forest were also tested during experimentation, but neither significantly improved accuracy on this task, so Logistic Regression was selected for its better generalization and interpretability.
- Persistence: All artifacts (models, encoders, scalers) are saved to the `models/` directory.
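The split-and-persist steps can be sketched as follows; the features are toy stand-ins and the artifact filename is hypothetical (the real names live inside `logistic_regression_deployment.py`):

```python
import numpy as np
import joblib
from pathlib import Path
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy stand-in features; the real ones come from the feature table above
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 5))
y = rng.choice(["Easy", "Medium", "Hard"], size=100)

# 80/20 split, stratified so every class appears in both halves
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = LogisticRegression(class_weight="balanced", max_iter=1000)
model.fit(X_train, y_train)

# Persist the fitted artifact to models/ and reload it for inference
models_dir = Path("models")
models_dir.mkdir(exist_ok=True)
joblib.dump(model, models_dir / "difficulty_model.pkl")  # hypothetical name
restored = joblib.load(models_dir / "difficulty_model.pkl")
```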
| Metric | Bloom Level | Difficulty |
|---|---|---|
| Accuracy Score | 0.33 | 0.40 |
| F1 Score (Macro) | 0.33 | 0.39 |
> [!IMPORTANT]
> The current accuracy levels are primarily limited by the synthesized nature of the dataset. LLM-generated data, while useful for bootstrapping, often lacks the subtle nuances of real-world educational assessments, which affects the model's ability to reach higher precision.
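For reference, the two reported metrics can be computed with scikit-learn; the labels below are toy examples, not the project's data:

```python
from sklearn.metrics import accuracy_score, f1_score

# Toy ground truth and predictions to illustrate the two reported metrics
y_true = ["Remember", "Apply", "Create", "Apply", "Remember", "Create"]
y_pred = ["Remember", "Apply", "Remember", "Apply", "Create", "Create"]

acc = accuracy_score(y_true, y_pred)             # fraction of exact matches
f1 = f1_score(y_true, y_pred, average="macro")   # unweighted mean of per-class F1
```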
| Library | Purpose |
|---|---|
| Python 3.9 | Programming language |
| Streamlit | UI Framework & Dashboard |
| Sentence-Transformers | NLP & Semantic Embeddings |
| scikit-learn | ML Models, Preprocessing, and Metrics |
| Pandas / NumPy | Data manipulation and Numerical logic |
| Docker | Containerization |
| Hugging Face | Deployment |
```
capstone_genai/
├── models/                              # Saved .pkl joblib artifacts
├── final.csv                            # Main dataset
├── milestone1.ipynb                     # Research and development notebook
├── logistic_regression_deployment.py    # Core ML class
├── app.py                               # Beautified Streamlit frontend
├── requirements.txt                     # Deployment dependencies
└── README.md                            # Project documentation
```
- Python 3.9+
- Pip (Python Package Manager)
```bash
# Install dependencies
pip install -r requirements.txt

# Train the models (if the .pkl files are missing)
python3 logistic_regression_deployment.py --train

# Launch the dashboard
streamlit run app.py
```

```bash
# Build the Docker image
docker build -t question-classifier .

# Run the container
docker run -p 7860:7860 question-classifier
```

This project is part of the GenAI Capstone project.