AI Assistant Usage in Student Life — Classification Project

This project analyzes a synthetic dataset of AI assistant usage among students. It covers exploratory data analysis (EDA), feature engineering, and preprocessing, then builds a classification model to predict whether a student will use the AI assistant again.

📂 Dataset

The dataset is publicly available on Kaggle:

AI Assistant Usage in Student Life (Synthetic)

It contains 10,000 records with the following columns:

SessionID — Unique identifier for each AI assistant usage session
StudentLevel — Education level (e.g., Undergraduate, Graduate, High School)
Discipline — Field of study (e.g., Computer Science, Psychology)
SessionDate — Date of the AI assistant session
SessionLengthMin — Duration of session in minutes
TotalPrompts — Number of prompts sent during the session
TaskType — Type of task performed (e.g., Studying, Coding)
AI_AssistanceLevel — Level of AI assistance (1 to 5)
FinalOutcome — Outcome of the session (e.g., Assignment Completed)
UsedAgain — Target variable (1 = Yes, 0 = No)
SatisfactionRating — Satisfaction rating given by the student

📊 Project Workflow

1. EDA (Exploratory Data Analysis)

Checked dataset shape, data types, and missing values
Summary statistics (describe())
Class distribution visualization for the target variable
Distribution plots for numeric variables
Count plots for categorical variables
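
The non-graphical checks above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper name `quick_eda` and the toy frame are hypothetical, and the real data would be loaded from the CSV instead. Plots are left out to keep the example dependency-light.

```python
import pandas as pd


def quick_eda(df: pd.DataFrame, target: str = "UsedAgain") -> dict:
    """Collect shape, dtypes, missing counts, summary stats, and class shares."""
    return {
        "shape": df.shape,
        "dtypes": df.dtypes.astype(str).to_dict(),
        "missing": df.isna().sum().to_dict(),
        "summary": df.describe(),
        # Share of each class in the target column
        "class_share": df[target].value_counts(normalize=True).to_dict(),
    }


# Toy frame standing in for the real dataset (values are made up)
toy = pd.DataFrame({
    "SessionLengthMin": [10.0, 25.5, None, 40.0],
    "TotalPrompts": [3, 8, 5, 12],
    "TaskType": ["Coding", "Studying", "Coding", "Studying"],
    "UsedAgain": [1, 0, 1, 1],
})

report = quick_eda(toy)
print(report["shape"])
print(report["missing"])
print(report["class_share"])
```

On the real data, the same function would be called on the frame returned by `pd.read_csv` for the project's CSV file.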

2. Feature Engineering

Additional features were created to improve model performance:

Date-based features extracted from SessionDate:
- year
- month
- dayofweek
- day
- weekofyear
Interaction-based feature:
- prompts_per_min = TotalPrompts / SessionLengthMin
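
A possible implementation of these features with pandas is sketched below. The function name `add_features` is hypothetical, and the handling of zero-length sessions (mapping them to NaN instead of dividing by zero) is an assumption; the README does not say how the notebook treats that case.

```python
import numpy as np
import pandas as pd


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with date parts and prompts_per_min added."""
    out = df.copy()
    dates = pd.to_datetime(out["SessionDate"])

    out["year"] = dates.dt.year
    out["month"] = dates.dt.month
    out["dayofweek"] = dates.dt.dayofweek  # Monday = 0, Sunday = 6
    out["day"] = dates.dt.day
    out["weekofyear"] = dates.dt.isocalendar().week.astype(int)  # ISO week

    # Assumption: a session length of 0 gives NaN rather than inf
    minutes = out["SessionLengthMin"].replace(0, np.nan)
    out["prompts_per_min"] = out["TotalPrompts"] / minutes
    return out


# Toy rows standing in for the real dataset (values are made up)
toy = pd.DataFrame({
    "SessionDate": ["2024-11-03", "2025-01-15"],
    "SessionLengthMin": [20.0, 0.0],
    "TotalPrompts": [5, 4],
})
features = add_features(toy)
print(features[["year", "month", "dayofweek", "day", "weekofyear", "prompts_per_min"]])
```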

3. Preprocessing

Encoding categorical variables with LabelEncoder (for simplicity)
Scaling numeric features using StandardScaler
SMOTE applied to handle class imbalance
Train-test split (80/20 ratio)

4. Modeling

Models tested:

Logistic Regression
Random Forest Classifier
XGBoost Classifier

Final chosen model: RandomForestClassifier

Achieved accuracy: ~75%
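
A comparison loop along these lines could look like the sketch below. It is an illustration only: the data comes from scikit-learn's `make_classification`, not from the project's dataset, the hyperparameters are placeholders, and the scores it prints say nothing about the accuracy reported above. XGBoost is left out to keep the example to a single dependency; `xgboost.XGBClassifier` follows the same `fit`/`predict` interface and could be added to the dictionary if the package is installed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (not the project's dataset)
X, y = make_classification(n_samples=500, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Candidate models; hyperparameters are illustrative placeholders
models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "RandomForestClassifier": RandomForestClassifier(
        n_estimators=200, random_state=42
    ),
}

# Fit each model and record its test-set accuracy
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

best = max(scores, key=scores.get)
print(scores)
print("Best on this toy split:", best)
```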

5. Evaluation

Confusion Matrix
Classification Report (Precision, Recall, F1-score)
Accuracy Score
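
The three metrics above map directly onto functions in `sklearn.metrics`. The short example below uses hand-written toy labels, not the project's predictions, so the numbers are purely illustrative.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Toy labels (1 = used again, 0 = did not); not the project's real predictions
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1, 1, 1]

# Rows are true classes, columns are predicted classes, in label order [0, 1]
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
acc = accuracy_score(y_true, y_pred)

print(cm)
print(classification_report(y_true, y_pred, zero_division=0))
print("Accuracy:", acc)
```

With imbalanced classes, the per-class precision and recall in the classification report are usually more informative than accuracy alone.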

Dependencies include imbalanced-learn (for SMOTE) and xgboost.

Repository files:

README.md
ai_assistant_usage.ipynb
ai_assistant_usage_student_life.csv