A System for Ethical AI Use & Authorship Transparency in Assessments
INTEGRITY SHIELD is a document-layer watermarking system that embeds schema-aware, item-level watermarks into assessment PDFs while keeping their human-visible appearance unchanged. These watermarks consistently prevent MLLMs from answering shielded exam PDFs and encode stable, item-level signatures that can be reliably recovered from model or student responses.
Large language models (LLMs) can now solve entire exams directly from uploaded PDF assessments, raising urgent concerns about academic integrity and the reliability of grades and credentials. INTEGRITY SHIELD addresses this challenge through document-layer watermarking that:
- ✅ Prevents AI solving: 91–94% exam-level blocking across GPT-5, Claude Sonnet-4.5, Grok-4.1, and Gemini-2.5 Flash
- ✅ Enables authorship detection: 89–93% signature retrieval from model responses
- ✅ Maintains visual integrity: PDFs remain visually unchanged for human readers
- ✅ Supports ethical assessment: Provides interpretable authorship signals without invasive monitoring
INTEGRITY SHIELD exploits the render-parse gap in PDFs: what humans see often differs from what AI parsers ingest. By injecting invisible text, glyph remappings, and lightweight overlays, we influence model interpretation while leaving exams visually unchanged.
┌─────────────────┐
│ Upload Exam │
│ PDF + Answers │
└────────┬────────┘
│
▼
┌─────────────────────────────────────┐
│ Extraction & Structure Analysis │
│ • PyMuPDF + MLLM extraction │
│ • Question type detection │
│ • Answer schema identification │
└────────┬────────────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ Schema-Aware Watermark Planning │
│ • LLM-based tactic selection │
│ • Per-question strategy │
│ • Answer type logic │
└────────┬────────────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ Watermark Engine Application │
│ • code-glyph remapping │
│ • Invisible text injection │
│ • TrapDoc phantom tokens │
│ • In-context watermarks (ICW) │
└────────┬────────────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ Output Generation │
│ • Shielded PDF variants (IS-v1/v2) │
│ • Vulnerability reports │
│ • Attribution signatures │
└─────────────────────────────────────┘
- Python 3.9+ (PyMuPDF compatibility)
- Node.js 18+ and npm
- PostgreSQL 14+ (or SQLite for local dev)
- API Keys: OpenAI, Anthropic, Google AI, and/or Mistral (at least one required)
# Navigate to backend directory
cd backend
# Create and activate virtual environment
python3 -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -r requirements.txtConfigure Environment Variables:
Create a .env file in the backend/ directory:
# Environment Configuration
INTEGRITYSHIELD_ENV=development
INTEGRITYSHIELD_PORT=8000
INTEGRITYSHIELD_LOG_LEVEL=INFO
# Database Configuration
# SQLite (default for local dev):
INTEGRITYSHIELD_DATABASE_URL=sqlite:///./data/integrityshield.db
# PostgreSQL (production):
# INTEGRITYSHIELD_DATABASE_URL=postgresql+psycopg2://user:pass@localhost:5432/integrityshield
# AI Provider API Keys (at least one required)
OPENAI_API_KEY=your_openai_api_key_here
ANTHROPIC_API_KEY=your_anthropic_api_key_here
GOOGLE_AI_KEY=your_google_ai_key_here
MISTRAL_API_KEY=your_mistral_api_key_here
# Model Configuration
ANTHROPIC_MODEL=claude-sonnet-4-5-20250929
POST_FUSER_MODEL=gpt-5
# Development Tools
INTEGRITYSHIELD_ENABLE_DEV_TOOLS=true
INTEGRITYSHIELD_AUTO_APPLY_MIGRATIONS=trueStart the Backend:
# From the project root
bash backend/scripts/run_dev_server.shThe server will start on http://localhost:8000. The startup script automatically:
- Loads environment variables from
backend/.env - Verifies required API keys
- Runs database migrations
- Starts the Flask server
# Navigate to frontend directory
cd frontend
# Install dependencies
npm install
# Start development server
npm run devThe frontend will be available at http://localhost:5173. The Vite dev server automatically proxies API requests to the backend.
- Upload Assessment: Navigate to the dashboard and upload your exam PDF and answer key
- Structure Extraction: The system automatically extracts question structure, detecting:
- Multiple-choice questions (MCQ)
- True/False questions
- Long-form questions
- Diagrams and tables
- Preview Strategy: Review the planned watermark tactics for each question
- Generate Shielded PDFs: The system creates two watermark variants:
- IS-v1: Lighter watermarking for minimal perturbation
- IS-v2: Stronger multi-layer watermarking for maximum robustness
- AI Calibration: Automatically evaluates watermark effectiveness across multiple MLLMs
- Review Reports: Inspect prevention rates and detection reliability
- Deploy Assessment: Distribute shielded PDF to students
- Collect Responses: Export student answers from your LMS
- Analyze Authorship: Upload responses to view:
- Per-question watermark retrieval scores
- Exam-level authorship degrees
- Cohort-level distributions
- Human Review: Use high authorship scores as signals for follow-up (oral checks, additional assessments)
Evaluated across 30 multi-page exams spanning STEM, humanities, and medical reasoning:
| Metric | GPT-5 | Claude Sonnet-4.5 | Grok-4.1 | Gemini-2.5 Flash |
|---|---|---|---|---|
| Prevention (Exam-Level Blocking) | 93.6% | 92.9% | 92.3% | 91.7% |
| Detection (Signature Retrieval) | 92.8% | 92.1% | 91.6% | 91.0% |
Compared to baselines:
- ICW (In-Context Watermarking): 3-7% prevention/detection
- code-glyph: 81-86% prevention/detection
- TRAPDOC: 40-89% prevention/detection (unstable across models)
- Framework: Flask with SQLAlchemy ORM
- Database: PostgreSQL (production) or SQLite (development)
- Services:
DocumentIngestionService: PDF parsing and structure extractionWatermarkPlanningService: LLM-based strategy selectionPdfRewritingService: Document-layer watermark applicationAuthorshipService: Response scoring and attribution
- Framework: React 18 + TypeScript
- Build Tool: Vite
- UI Library: InstUI (Instructure's Canvas design system)
- State Management: React Context API
- Invisible Text Injection: Hidden spans anchored near stems and options
- Glyph Remapping: CMap-based font substitutions (visually identical, parsed differently)
- Off-Page Overlays: Clipped content that influences parsing without visual changes
- Phantom Tokens: TrapDoc-inspired document-layer perturbations
- Setup Guide: Detailed installation and configuration
- API Reference: Backend endpoints and contracts
- Pipeline Stages: Detailed stage descriptions
- Architecture: System design and data flow
- Troubleshooting: Common issues and solutions
We welcome contributions from the research community! To contribute:
- Fork the repository
- Create a feature branch (
git checkout -b feature/your-feature) - Make your changes with clear commit messages
- Add or update tests as appropriate
- Update documentation
- Submit a pull request
Please ensure your code follows the existing style and passes all tests.
If you use INTEGRITY SHIELD in your research, please cite our paper:
@inproceedings{shekhar2025integrityshield,
title={INTEGRITY SHIELD: A System for Ethical AI Use \& Authorship Transparency in Assessments},
author={Shekhar, Ashish Raj and Agarwal, Shiven and Bordoloi, Priyanuj and Shah, Yash and Anvekar, Tejas and Gupta, Vivek},
booktitle={Proceedings of the 2025 Conference of the European Chapter of the Association for Computational Linguistics (EACL)},
year={2025}
}This project is licensed under the MIT License - see the LICENSE file for details.
INTEGRITY SHIELD is designed for ethical and transparent AI use in educational assessment settings. The system:
- Does NOT monitor students: No keystroke logging, webcam tracking, or device control
- Respects privacy: All data stays within institutional infrastructure
- Requires transparency: Institutions should communicate AI-use policies and watermarking presence to students
- Supports human judgment: Authorship scores are signals for review, not automatic evidence for sanctions
- ✅ Formal educational assessments with clear governance
- ✅ Research on AI-assisted learning and academic integrity
- ✅ Institutional policy development for ethical AI use
- ❌ Surveillance or covert monitoring
- ❌ Automatic sanctions without human review
- ❌ Non-assessment documents without clear authorization
This work was conducted at Arizona State University. We thank the research community for valuable feedback and discussions on ethical AI use in education.
Authors: Ashish Raj Shekhar*, Shiven Agarwal*, Priyanuj Bordoloi, Yash Shah, Tejas Anvekar, Vivek Gupta *Equal contribution
For questions, issues, or collaboration opportunities:
- Project Page: https://shivena99.github.io/IntegrityShield/
- Demo: https://shivena99.github.io/IntegrityShield/
- Issues: GitHub Issues