The Credit for Prior Learning (CPL) Website is an AI-powered system that automates the evaluation of student documents for credit transfer and prior learning assessment at Northeastern University. The system integrates IBM watsonx.ai, vector databases, and conversational AI to provide intelligent document analysis and streamlined workflows for both students and faculty.
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│    Frontend     │────│     Node.js     │────│     Python      │
│    (HTML/JS)    │    │     Gateway     │    │     Backend     │
│   Port: 3000    │    │   Port: 3000    │    │   Port: 5000    │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │                      │
         │             ┌─────────────────┐             │
         └─────────────│ Watson Assistant│             │
                       │   (IBM Cloud)   │             │
                       └─────────────────┘             │
                                                       │
        ┌───────────────────┬───────────────────┬──────┴────────────┐
        │                   │                   │                   │
┌───────────────┐   ┌───────────────┐   ┌───────────────┐   ┌───────────────┐
│  watsonx.ai   │   │    Milvus     │   │    IBM COS    │   │ watsonx.data  │
│ (Embeddings)  │   │   (Vectors)   │   │  (Documents)  │   │   (Iceberg)   │
└───────────────┘   └───────────────┘   └───────────────┘   └───────────────┘
Frontend:
- HTML5, CSS3, JavaScript (ES6+)
- IBM Watson Assistant (Conversational AI)
- Google Fonts (Lato)
Backend:
- Node.js Gateway: Express.js, CORS, Multer, FormData
- Python AI Service: Flask, IBM watsonx.ai SDK, PyMilvus
- Document Processing: PyPDF2, python-docx, LangChain
External Services:
- IBM watsonx.ai: AI embeddings and vector search
- Milvus: Vector database for semantic search
- IBM Cloud Object Storage: Original document storage
- IBM watsonx.data: Apache Iceberg tables for metadata
CPL-Website/
├── backend/ # Python AI processing services
│ ├── services/ # Main application services
│ ├── handlers/ # External service handlers
│ ├── scripts/ # Setup and maintenance scripts
│ ├── utils/ # Utility functions
│ └── config/ # Configuration files
├── deployments/ # Jupyter notebook that deploys the prompt and grounded documents as an AI service
│ └── watsonx-ai/
├── frontend/ # Web application
│ ├── pages/ # HTML pages
│ ├── assets/css/ # Stylesheets
│ ├── assets/js/ # JavaScript files
│ └── assets/images/ # Image assets
├── docs/ # Documentation
│ ├── setup/ # Installation guides
│ ├── api/ # API documentation
│ └── architecture/ # System design docs
├── tests/ # Test suites
│ ├── backend/ # Python backend tests
│ ├── frontend/ # JavaScript frontend tests
│ ├── integration/ # API integration tests
│ └── e2e/ # End-to-end browser tests
├── sql/schemas/ # Database schemas
├── scripts/ # Deployment and dev scripts
└── config/ # Project configuration
- Document Upload: Upload transcripts, resumes, course syllabi via Watson Assistant
- AI-Powered Analysis: Automatic text extraction and embedding generation
- Status Tracking: Check CPL request status and progress
- Natural Language Interface: Conversational interaction through Watson Assistant
- Request Management: Review and evaluate CPL requests
- Document Access: Download and preview original student documents
- Status Updates: Approve/deny requests with credit allocation
- Metadata Search: Find requests by student, course, or request type
- Semantic Search: Vector-based similarity search through document content
- Intelligent Chunking: Context-aware document segmentation
- Metadata Enrichment: Embedded student context in searchable content
- Multi-format Support: PDF, DOCX, and TXT document processing
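The metadata enrichment idea above can be pictured as prepending a short context header to each chunk before it is embedded. The following is a minimal sketch only: the `RequestContext` fields and the header layout are hypothetical and are not taken from the project code.

```python
from dataclasses import dataclass


@dataclass
class RequestContext:
    """Hypothetical metadata for one CPL request (field names are assumptions)."""
    nuid: str
    student_name: str
    request_type: str
    target_course: str


def enrich_chunk(chunk: str, ctx: RequestContext) -> str:
    """Prefix a text chunk with a one-line student context header."""
    header = (
        f"[Student: {ctx.student_name} | NUID: {ctx.nuid} | "
        f"Request: {ctx.request_type} | Course: {ctx.target_course}]"
    )
    # The header travels with the chunk, so a similarity search on the
    # student or course can match chunks that never mention them directly.
    return f"{header}\n{chunk.strip()}"
```

Calling `enrich_chunk(text, ctx)` on every chunk before embedding keeps the student context searchable alongside the document content.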
- Node.js: 18.17.0 or higher
- Python: 3.9 or higher
- IBM Cloud Account: For watsonx.ai, COS, and watsonx.data access
- Clone Repository:
git clone <repository-url>
cd CPL-Website
- Install Node.js Dependencies:
npm install
- Install Python Dependencies:
pip install -r docs/setup/requirements.txt
- Configure Environment Variables:
Create a .env file in the project root with the following variables:
# IBM watsonx.ai Configuration
WATSONX_AI_APIKEY=your_watsonx_api_key
WATSONX_AI_SERVICE_URL=https://us-south.ml.cloud.ibm.com
WATSONX_AI_PROJECT_ID=your_project_id
# Milvus Vector Database
MILVUS_CONNECTION_ID=your_milvus_connection_id
MILVUS_HOST=your_milvus_host
MILVUS_PORT=32668
MILVUS_USERNAME=your_username
MILVUS_PASSWORD=your_password
# IBM Cloud Object Storage
COS_API_KEY=your_cos_api_key
COS_INSTANCE_ID=your_cos_instance_id
COS_ENDPOINT=https://s3.us-south.cloud-object-storage.ibmcloud.com
COS_BUCKET_NAME=cpl-documents
# IBM watsonx.data (Iceberg)
WATSONX_DATA_HOST=your_presto_host
WATSONX_DATA_PORT=30670
WATSONX_DATA_USER=ibmlhapikey_your_email
WATSONX_DATA_PASSWORD=your_api_key
ICEBERG_CATALOG=iceberg_data
ICEBERG_SCHEMA=cpl_schema
ICEBERG_TABLE=cpl_requests
- Create Milvus Collection:
cd backend/scripts
python create_cpl_collection.py
- Create Iceberg Table:
# Execute the SQL schema in watsonx.data
cat sql/schemas/CREATE-TABLE.sql
- Start Python Backend:
cd backend/services
python watson_upload.py
# Service runs on http://localhost:5000
- Start Node.js Gateway:
npm start
# Service runs on http://localhost:3000
- Access Application:
Open browser to: http://localhost:3000/frontend/pages/index.html
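Before starting the services, it can help to confirm that every variable from the environment configuration step is set. The sketch below is illustrative and assumes the variables are already present in the process environment. How the project loads its `.env` file is not shown in this README, and `missing_env_vars` is a hypothetical helper.

```python
import os

# Variable names copied from the environment configuration step.
REQUIRED_VARS = [
    "WATSONX_AI_APIKEY", "WATSONX_AI_SERVICE_URL", "WATSONX_AI_PROJECT_ID",
    "MILVUS_CONNECTION_ID", "MILVUS_HOST", "MILVUS_PORT",
    "MILVUS_USERNAME", "MILVUS_PASSWORD",
    "COS_API_KEY", "COS_INSTANCE_ID", "COS_ENDPOINT", "COS_BUCKET_NAME",
    "WATSONX_DATA_HOST", "WATSONX_DATA_PORT",
    "WATSONX_DATA_USER", "WATSONX_DATA_PASSWORD",
    "ICEBERG_CATALOG", "ICEBERG_SCHEMA", "ICEBERG_TABLE",
]


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

A startup script could call `missing_env_vars()` and exit with a clear message if the returned list is not empty.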
Node.js Gateway (port 3000):
- POST /api/upload - Upload student documents
- GET /api/download-document/:documentId/:filename - Download documents
- GET /api/preview-document/:documentId/:filename - Preview documents
- GET /api/requests - Get all CPL requests
- GET /api/requests-by-nuid/:nuid - Get requests by student ID
- PUT /api/requests/:id/status - Update request status
- GET /health - Service health check
Python Backend (port 5000):
- POST /api/upload-to-watsonx - Process documents with AI
- GET /api/get-requests - Query Iceberg for requests
- PUT /api/update-status - Update request status in Iceberg
- POST /api/search - Vector search through documents
- GET /health - Service health check
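Several routes above take path parameters such as `:documentId`, `:filename`, and `:nuid`. As a rough sketch of calling them from Python, the hypothetical helpers below build the URLs with percent-encoding. They assume these routes are served by the Node.js gateway at `http://localhost:3000`, as in the installation steps.

```python
from urllib.parse import quote

# Assumed base URL: the gateway address used in the installation steps.
GATEWAY_BASE = "http://localhost:3000"


def download_url(document_id: str, filename: str, base: str = GATEWAY_BASE) -> str:
    """Build the URL for GET /api/download-document/:documentId/:filename."""
    # safe="" also encodes "/" so a filename cannot add extra path segments.
    return (
        f"{base}/api/download-document/"
        f"{quote(document_id, safe='')}/{quote(filename, safe='')}"
    )


def requests_by_nuid_url(nuid: str, base: str = GATEWAY_BASE) -> str:
    """Build the URL for GET /api/requests-by-nuid/:nuid."""
    return f"{base}/api/requests-by-nuid/{quote(nuid, safe='')}"
```

Encoding each segment separately matters for filenames with spaces, which are common in uploaded transcripts.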
npm test
cd tests
python -m pytest backend/ -v --cov=backend
cd tests/frontend
npm test
python -m pytest tests/integration/ -v
npx playwright test tests/e2e/
The system uses 800-character chunks with 150-character overlap, optimized for the 512-token limit of the IBM embedding model.
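To make the chunk size and overlap concrete, here is a minimal fixed-window sketch. It is not the project's splitter: the stack lists LangChain, whose splitters also respect separators, and `chunk_text` is a hypothetical name. By the common rule of thumb of roughly four characters per token for English text, an 800-character chunk is on the order of 200 tokens, which would leave headroom under a 512-token limit.

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 150) -> list:
    """Split text into fixed-size character windows that share `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # each new window starts 650 characters later
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With the defaults, consecutive chunks repeat the last 150 characters of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk when it is shorter than the overlap.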
- Collection: cpl_documents_v5
- Index Type: HNSW with L2 metric
- Dimensions: 768 (embedding vector size)
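The collection settings above can be written as plain Python values. The dictionary follows the `index_type` / `metric_type` / `params` shape that PyMilvus accepts when creating an index. The HNSW values `M` and `efConstruction` are illustrative assumptions, since this README does not state them, and the two helper functions are hypothetical.

```python
import math

COLLECTION_NAME = "cpl_documents_v5"
EMBEDDING_DIM = 768

INDEX_PARAMS = {
    "index_type": "HNSW",
    "metric_type": "L2",
    # Assumed build parameters; the project's actual values are not documented here.
    "params": {"M": 16, "efConstruction": 200},
}


def validate_embedding(vector) -> None:
    """Raise ValueError if a vector does not match the collection dimension."""
    if len(vector) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM} dimensions, got {len(vector)}")


def l2_distance(a, b) -> float:
    """Euclidean distance between two vectors; smaller means more similar."""
    return math.dist(a, b)
```

Checking the dimension before inserting gives a clearer error than a rejected insert, and `l2_distance` only illustrates the metric rather than how Milvus computes it internally.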
- All API keys stored in environment variables
- CORS enabled for cross-origin requests
- File type validation for uploads
- Authentication through IBM Cloud services
- Use environment-specific .env files
- Enable HTTPS for all services
- Configure load balancing for high availability
- Set up monitoring and logging
- Implement backup strategies for vector data
Update the following for production deployment:
- Use production IBM Cloud service URLs
- Configure production Milvus cluster
- Set up production COS buckets
- Use production Iceberg catalogs
- Code Style: Follow ESLint for JavaScript, Black for Python
- Testing: Maintain test coverage above 80%
- Documentation: Update README for significant changes
- Version Control: Use feature branches and pull requests
Service Connection Errors:
- Verify environment variables are correctly set
- Check IBM Cloud service status
- Ensure Milvus and Iceberg services are running
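A quick way to confirm that both local services are up is to call their `/health` endpoints. This is a sketch under assumptions: it uses the local ports from the installation steps and treats an HTTP 200 response as healthy, which this README does not specify. The `opener` parameter is a hypothetical hook so the function can be exercised without a running server.

```python
from urllib.request import urlopen

# Assumed local addresses, taken from the installation steps.
SERVICES = {
    "gateway": "http://localhost:3000/health",
    "backend": "http://localhost:5000/health",
}


def is_healthy(url: str, timeout: float = 3.0, opener=urlopen) -> bool:
    """Return True if the endpoint answers with HTTP 200 within the timeout."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, timeouts and refused connections are all OSError.
        return False


def check_all(opener=urlopen) -> dict:
    """Map each service name to its health status."""
    return {name: is_healthy(url, opener=opener) for name, url in SERVICES.items()}
```

Running `check_all()` shows which side of the stack is down: if only `backend` is false, the Python service on port 5000 is the one to restart.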
File Upload Failures:
- Check file size limits (default: 50MB)
- Verify supported file types: PDF, DOCX, TXT
- Ensure COS bucket permissions are correct
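The first two checks above can be reproduced locally before uploading. The sketch below is illustrative and may differ from the gateway's own validation. It assumes the 50MB limit means 50 × 1024 × 1024 bytes, and `validate_upload` is a hypothetical helper.

```python
import os

ALLOWED_EXTENSIONS = {".pdf", ".docx", ".txt"}
MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # assumed interpretation of the 50MB default


def validate_upload(filename: str, size_bytes: int) -> list:
    """Return a list of human-readable problems; an empty list means the file passes."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_UPLOAD_BYTES}")
    return problems
```

For example, `validate_upload("transcript.PDF", 1024)` returns an empty list because the extension check is case-insensitive.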
Watson Assistant Issues:
- Verify Watson Assistant credentials
- Check CORS configuration for embedded iframe
- Ensure proper integration ID and region settings
- Node.js logs: Console output from npm start
- Python logs: Console output from backend services
- Browser logs: Developer Tools Console
- Vector database logs: Milvus server logs
This project is licensed under the ISC License - see the package.json file for details.
- Northeastern University College of Professional Studies
- IBM watsonx.ai and IBM Cloud services
- Open source community for Python and Node.js packages
- Reach out to me for any collaborations - ummaraali2020@gmail.com
- Please see the contributions.md file