RealSpark is a self-contained application that estimates the likelihood that an uploaded image is AI-generated rather than human-made art. It focuses on hand-made paintings, e.g. oil on canvas, acrylic on board, or watercolor.
The application uses a pre-trained Vision Transformer (ViT) model locally for high-accuracy detection of AI-generated content.
- Model: Ateeqq/ai-vs-human-image-detector (available on Hugging Face).
- Architecture: Vision Transformer (ViT).
- Implementation: Uses transformers and torch for offline inference.
It provides a probability score (0.0 to 1.0), where higher values indicate a higher likelihood of AI generation.
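As a rough illustration of how a two-class classifier's raw outputs could become the 0.0 to 1.0 score described above, here is a minimal sketch using only the standard library. The label names ("ai", "hum"), the example logits, and the helper names are assumptions made for illustration; the actual mapping comes from the model's configuration and may differ.

```python
import math


def softmax(logits):
    """Convert raw classifier logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ai_probability(logits, id2label, ai_label="ai"):
    """Return the probability assigned to the (assumed) AI class."""
    probs = softmax(logits)
    for index, label in id2label.items():
        if label == ai_label:
            return probs[index]
    raise KeyError(f"label {ai_label!r} not found in id2label")


# Hypothetical logits and label mapping, for illustration only.
example_logits = [2.0, 0.0]
example_labels = {0: "ai", 1: "hum"}
score = ai_probability(example_logits, example_labels)
print(f"AI probability: {score:.3f}")
```

In a real run the logits would come from the ViT forward pass and the label mapping from the model's config, rather than from hard-coded values.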
The application performs a multi-stage analysis to identify the artistic medium and verify its physical consistency.
- Physical Texture Analysis (DINOv2):
  - The image is tiled into overlapping 224x224 patches.
  - DINOv2 (facebook/dinov2-base) generates high-dimensional embeddings for each patch.
  - Vector Search & Clustering: Patches are compared using cosine similarity. Repeatable textures (e.g., "scratchiness" of a dry brush or specific impasto strokes) cluster together.
  - Consistency Scoring: Measures texture uniformity. Low consistency suggests complex physical brushwork, while high consistency often points to digital media or uniform washes.
- High-Level Labeling (CLIP):
  - CLIP (openai/clip-vit-base-patch32) provides zero-shot classification for the entire image against labels like Watercolor, Oil, Acrylic, Digital painting, etc.
- Cross-Verification: The local texture findings from DINOv2 are contrasted with global CLIP labels to provide a nuanced description of the medium and its authenticity.
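The tiling and consistency steps above can be sketched in plain Python. This is only a sketch under stated assumptions: the stride of 112 pixels (50% overlap) and the use of mean pairwise cosine similarity as the consistency score are illustrative choices, not confirmed details of the project, and the embeddings would in practice come from DINOv2 rather than from hand-written vectors.

```python
import math
from itertools import combinations


def tile_origins(width, height, patch=224, stride=112):
    """Top-left corners of overlapping patch x patch tiles covering the image.

    The stride value is an assumption; the source only says "overlapping".
    """
    def axis(length):
        if length <= patch:
            return [0]
        starts = list(range(0, length - patch + 1, stride))
        if starts[-1] != length - patch:
            starts.append(length - patch)  # final tile hugs the border
        return starts

    return [(x, y) for y in axis(height) for x in axis(width)]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def consistency_score(embeddings):
    """Mean pairwise cosine similarity (assumed metric); 1.0 for < 2 patches."""
    pairs = list(combinations(embeddings, 2))
    if not pairs:
        return 1.0
    return sum(cosine_similarity(a, b) for a, b in pairs) / len(pairs)
```

With this metric, near-identical patch embeddings push the score toward 1.0 (uniform texture), while varied brushwork pulls it lower.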
The application identifies physical objects within the artwork to assist in authentication and context.
- Model: hustvl/yolos-tiny
- Architecture: Vision Transformer (ViT) specialized for object detection.
- Function: Detects elements like "person", "frame", "signature", or "canvas" and includes them in the technical summary.
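One plausible way to turn raw detections into a line for the technical summary is to drop low-confidence hits and count the remaining labels. The dictionary keys ("label", "score", "box"), the 0.5 threshold, and the output format below are assumptions for illustration, not the project's confirmed data shape.

```python
from collections import Counter


def summarize_detections(detections, min_score=0.5):
    """Count detected labels whose confidence meets the (assumed) threshold."""
    kept = [d for d in detections if d["score"] >= min_score]
    counts = Counter(d["label"] for d in kept)
    return ", ".join(f"{label} x{n}" for label, n in sorted(counts.items()))


# Hypothetical detector output, for illustration only.
sample = [
    {"label": "person", "score": 0.91, "box": [10, 20, 200, 380]},
    {"label": "person", "score": 0.32, "box": [220, 40, 300, 200]},
    {"label": "frame", "score": 0.74, "box": [0, 0, 512, 512]},
]
print(summarize_detections(sample))
```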
A specialized "synthesizer" step condenses all previous technical findings into a single, professional conclusion for an appraiser.
- Model: google/flan-t5-small.
- Architecture: Instruction-tuned Text-to-Text Transfer Transformer (T5).
- Function: It translates metrics like "85% AI probability" and "DINOv2 consistency scores" into a human-readable insight.
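Since Flan-T5 is instruction-tuned, the metrics would typically be packed into a text prompt before generation. The sketch below shows one possible prompt layout; the function name, wording, and field order are hypothetical and not taken from the project.

```python
def build_insight_prompt(ai_probability, consistency, medium, objects):
    """Assemble an instruction-style prompt from the numeric findings.

    The prompt wording is an assumption made for illustration.
    """
    findings = [
        f"AI probability: {ai_probability:.0%}",
        f"DINOv2 texture consistency: {consistency:.2f}",
        f"Likely medium: {medium}",
        f"Detected objects: {', '.join(objects) if objects else 'none'}",
    ]
    return (
        "Summarize these findings for an art appraiser in one or two sentences.\n"
        + "\n".join(findings)
    )


prompt = build_insight_prompt(0.85, 0.42, "Oil", ["person", "frame"])
print(prompt)
```

The resulting string would then be tokenized and passed to the model's generate step; keeping the prompt short matters for a small model such as flan-t5-small.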
- Web Interface:
  - Open your browser and navigate to http://localhost:8080.
  - Use the upload form to select an image file (supported formats: JPEG, PNG, etc.).
  - Submit the form to upload the image.
  - The application will analyze the image and display statistics related to its composition and features.
- Python: 3.10.x, 3.11.x, 3.12.x, or 3.13.x
- pip: Python package manager
- Node.js: v18+ (Required for API generation and Frontend tests)
- npm: Node package manager
- Clone the repository.
- Create a virtual environment (optional but recommended):
  python3 -m venv venv
  source venv/bin/activate
- Install the required dependencies:
  # For production:
  pip install -r requirements.txt
  # For development (tests, linting, api-gen):
  pip install -r requirements-dev.txt
The preferred development workflow uses VS Code Dev Containers. This project uses a multi-stage Dockerfile to provide a preconfigured environment.
- Automated Setup: Installs Python 3.12, Node.js v18 (for API generation), and all development tools.
- Isolated Environment: Bypasses local dependency conflicts (especially with PyTorch/Transformers on Intel Macs).
- Integrated Tools: Pre-configures extensions and services (Formatter: ruff, Linter: PyLance).
How to start:
- Open the project folder in VS Code.
- Click "Reopen in Container" when prompted (or via the Command Palette).
- The environment will automatically build using the development target, installing both production and development requirements.
The application will be available at http://localhost:8080.
When developing within the Dev Container, you have two options to ensure your Python code changes are reflected:
- Automatic Reload (Recommended): The Dockerfile is configured with the --reload flag. Any save to a .py file will trigger an automatic restart of the Uvicorn server. Specifically, the --reload-dir app flag restricts watching to the app directory, which excludes the frequent .py changes in the /cache directory.
  - Note: If you have just updated your local files and your container was already running, you may need to "Rebuild Container" to apply the new Dockerfile configuration.
- Manual Restart Script: If you don't want to rebuild the container, you can manually trigger a restart with auto-reload enabled by running:
  bash app_restart.sh
  This is useful if the auto-detection fails or if you've made changes to the environment.
.
├── app/ # Main application package
│ ├── main.py # FastAPI entry point
│ ├── models.py # Generated Pydantic models [GENERATED]
│ ├── database.py # Database management (DuckDB)
│ ├── analysis/ # Analysis sub-package
│ │ ├── analysis.py # Image processing & Feature extraction
│ │ ├── aiclassifiers.py # AI classification logic (ViT)
│ │ ├── object_detection.py # Object detection logic (YOLOS-Tiny)
│ │ ├── fractaldim.py # Fractal dimension computation
│ │ ├── histogram.py # Color histogram computation
│ │ ├── artmedium/ # Art Medium classification (DINOv2, CLIP)
│ │ └── summarizer.py # AI Insight generation (Flan-T5)
│ ├── static/ # Frontend assets
│ │ └── js/modules/ # Modular UI logic [Alpine.js]
│ └── templates/ # Jinja2 HTML templates
├── tests/ # Test suite
│ └── ...
├── data/ # Local data storage (DuckDB files)
├── tmp/ # Temporary file storage
├── cache/ # Local cache (AI model weights)
├── openapi.yaml # OpenAPI specification
├── generate-api.sh # API code generation script
├── pytest.ini # Pytest configuration
├── package.json # Node.js dependencies/scripts
├── Dockerfile # Docker image definition
├── docker-compose.yml # Docker composition
├── requirements.txt # Production dependencies
├── requirements-dev.txt # Development & Test dependencies
└── ...
The project uses pytest for backend testing. To run the test suite:
On local environment:
./venv/bin/pytest
On Docker environment:
pytest
This will execute all tests located in the tests/ directory.
The project uses Jest for unit testing the frontend logic (validators, renderers, etc.).
Setup and Execution:
- Select Node.js version:
  nvm use v18.20.8
- Install dependencies:
  npm install
- Run tests:
  npm test
- Run with coverage:
  npm run test:coverage
The project uses Playwright for browser-based integration testing of the complete upload and analysis workflow.
Prerequisites:
- Python 3.10-3.13 (Fully compatible with NumPy 2.0+).
- Python virtual environment activated
- Playwright browsers installed
Setup (One-time):
- Ensure you're using the correct Python version:
  python3 --version # Should show 3.10.x - 3.13.x
- Install Python dependencies:
  pip install -r requirements.txt
- Install Playwright browsers:
  playwright install chromium
Running Frontend Integration Tests:
- Run all frontend integration tests (headless mode):
  pytest tests/test_frontend_integration.py -v
- Run with visible browser (for debugging):
  pytest tests/test_frontend_integration.py -v --headed
- Run a specific test:
  pytest tests/test_frontend_integration.py::test_upload_displays_all_steps -v
- Run all tests (backend + frontend):
  pytest -v
What the Frontend Integration Tests Cover:
- ✅ All execution steps are displayed after upload
- ✅ Parallel step execution indicators (running state)
- ✅ Progressive display of partial results
- ✅ Final results display with all cards visible
- ✅ Step progression sequence and status indicators
- ✅ UI element visibility flow
- ✅ Timeout handling
- ✅ Multiple timeouts displayed correctly
- ✅ Error handling and error messages
- ✅ Invalid file upload rejection
- ✅ Task abandonment (canceling old task on new upload)
The project utilizes an OpenAPI-driven development workflow. The central source of truth for the API is openapi.yaml.
You must run the generation script whenever you modify the API definition in openapi.yaml. This is essential because:
- Backend (Python): The Pydantic models in app/models.py are automatically updated to match the latest schema. These models are used throughout app/main.py.
- Frontend (JavaScript): The UI uses standard fetch() calls and Alpine.js for reactivity. No openapi.yaml-based SDK is generated. This keeps the frontend lightweight and build-free, while alignment with the backend is maintained by manually implementing the shared API contract.
The easiest way is to use the provided npm script:
Recommended Method (npm):
npm run generate-api
Alternative (Bash):
bash generate-api.sh
- Python Models: Generated using datamodel-code-generator (installed automatically in venv). It parses openapi.yaml and produces app/models.py.
The project uses a multi-stage build process to ensure a secure and lean production environment.
- base: Common runtime environment.
- development: Includes full build toolchain (git, curl, wget, Node.js, build-essential) and development dependencies for testing and API generation.
- production: Minimal image containing only the application code and production dependencies for maximum security and performance.
This method handles volume mapping for database persistence and model caching.
To start for local development (includes dev tools):
docker compose up -d --build
Note: The docker-compose.yml is configured to use the development target by default, which includes all tools for API generation and testing.
To view logs:
docker compose logs -f
To stop:
docker compose down -v
To build a lean image without development tools:
docker build --target production -t art-analysis-prod .
If running with docker compose, the following volumes are mapped to persist data:
- data/: Persists the analysis history (image_stats.duckdb). Mounting the directory allows for data locking and temporary files required by DuckDB.
- tmp/: Persists temporary files.
- cache/transformers/: Persists the AI model weights to avoid downloading them on every restart.
You can pass environment variables to the container for custom configuration (e.g., in docker-compose.yml or using -e flag):
POST /upload
- Starts an asynchronous image analysis task.
- Body: multipart/form-data with file field.
- Response: {"task_id": "uuid..."}
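For clients that cannot use the web form, the upload request could be built with the standard library alone. This is a sketch under assumptions: the default base URL is the local address mentioned in this document, the content type defaults to image/png, and the helper names are hypothetical.

```python
import urllib.request
import uuid


def encode_multipart(field, filename, payload, content_type="image/png"):
    """Build a multipart/form-data body containing a single file field."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return head + payload + tail, headers


def build_upload_request(image_bytes, filename, base_url="http://localhost:8080"):
    """Prepare (but do not send) a POST /upload request with a `file` field."""
    body, headers = encode_multipart("file", filename, image_bytes)
    return urllib.request.Request(
        f"{base_url}/upload", data=body, headers=headers, method="POST"
    )
```

Sending the request with urllib.request.urlopen and decoding the JSON body should then yield the task_id used by the progress endpoint.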
GET /progress/{task_id}
- Returns the status and progress of a task.
- Response: JSON containing:
  - status: (string) Current status message.
  - progress: (int) Progress percentage (0-100).
  - steps: (list) List of total steps:
    - Preprocessing: Basic image loading and metadata extraction.
    - Metadata Analysis: Examining EXIF and software tags for AI signatures.
    - Color Intensity Distribution: Computing RGB color histograms.
    - AI Classifier: Running ViT inference for AI-vs-human detection.
    - Fractal Dimension: Computing fractal dimensionality.
    - Art Medium Analysis: DINOv2 patch analysis and CLIP classification.
    - Object Detection: Running YOLOS-Tiny for element identification.
    - Saving to Database: Storing results and metadata in DuckDB.
    - Insight Summary: Generates the final human-readable conclusion.
  - current_step: (string) The step currently executing.
  - completed_steps: (list) List of completed steps.
  - partial_results: (object, optional) Real-time results as they become available:
    - histogram_r, histogram_g, histogram_b: (arrays) RGB histogram data (256 bins each).
    - ai_probability: (float) AI detection probability (0.0-1.0).
    - fd_default: (float) Fractal dimension value.
    - object_detection: (array) List of detected objects with labels, scores, and boxes.
    - summary: (string) The generated AI Insight text.
  - result: (object, optional) Final result when complete (includes id, url, and stats).
  - error: (string, optional) Error message if failed.
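A client would usually poll this endpoint until result or error appears. The sketch below separates the polling loop from the HTTP call so the loop can be exercised without a running server; the polling interval, the poll limit, and both function names are assumptions for illustration.

```python
import json
import time
import urllib.request


def http_fetch_progress(task_id, base_url="http://localhost:8080"):
    """Fetch the progress JSON for one task (requires a running server)."""
    with urllib.request.urlopen(f"{base_url}/progress/{task_id}") as resp:
        return json.load(resp)


def wait_for_result(task_id, fetch_progress, interval=1.0, max_polls=120,
                    sleep=time.sleep):
    """Poll until the task reports a result or an error.

    `fetch_progress` is any callable returning the progress dict, which
    keeps this loop independent of the transport.
    """
    for _ in range(max_polls):
        state = fetch_progress(task_id)
        if state.get("error"):
            raise RuntimeError(state["error"])
        if state.get("result") is not None:
            return state["result"]
        sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")
```

Against a live instance this would be called as wait_for_result(task_id, http_fetch_progress); partial_results could be read from each intermediate state if progressive display is needed.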
GET /stats
- Retrieves aggregate statistics of all analyzed images.
GET /tmp/{filename}
- Serves generated files.
GET /ready_models
- Returns whether the heavy AI models (ViT, T5, YOLOS) have finished loading and warming up.
- Response: {"status": "ready" | "loading"}
This project prioritizes security and environment integrity through the following measures:
The application is configured to listen on 127.0.0.1 (localhost) when running natively. When running via Docker, it is exposed only to the local machine via the published port on 127.0.0.1:8080. This ensures the application is not accessible from your local network or public internet unless explicitly configured.
Due to the use of specific AI/ML frameworks (PyTorch 2.2+, Transformers), the project uses multi-stage builds to mitigate risks:
- Environment Isolation: Production images exclude all high-risk development tools (git, curl, build-essential, etc.).
- Known Vulnerabilities: Some local Intel-Mac versions of libraries may have CVEs; Docker provides a modern Linux environment where the latest non-vulnerable versions can be run regardless of your host OS.
The application only downloads and loads pre-trained weights from verified official repositories (OpenAI, Facebook/Meta) on Hugging Face.
Docker containers run as a non-root user (vscode) to prevent privilege escalation within the container environment.
- Application Code: Licensed under the MIT License - see the LICENSE file for details.
- AI Models: All pre-trained models used in this project are released under the Apache 2.0 License:
  - AI Classifier: Ateeqq/ai-vs-human-image-detector
  - Texture Analysis: facebook/dinov2-base
  - Labeling: openai/clip-vit-base-patch32
  - Summarizer: google/flan-t5-small
  - Object Detection: hustvl/yolos-tiny
