A semantic search engine for X-ray images supporting text queries and reverse image search, powered by OpenAI CLIP.
- Text Search: Enter queries like "chest x-ray" or "fractured bone"
- Image Search: Upload an X-ray to find visually similar images
- Category Filters: Narrow results to specific X-ray types
- Ranking Scores: See similarity percentages for each result
- REST API: Programmatic access via FastAPI endpoints
# Python dependencies
pip install torch torchvision transformers pandas Pillow fastapi uvicorn python-multipart icrawler
# Frontend dependencies
cd frontend && npm install
# Collect the dataset (run from the project root)
python src/collect_data_icrawler.py
# Generate CLIP embeddings
python src/indexer.py
Start Backend (Terminal 1):
python -m uvicorn src.api:app --reload --port 8000
Start Frontend (Terminal 2):
cd frontend && npm run dev
Access:
- Frontend: http://localhost:3000
- API Docs: http://localhost:8000/docs
xray_project/
├── src/
│ ├── api.py # FastAPI backend
│ ├── model.py # CLIP model wrapper
│ ├── indexer.py # Embedding generator
│ └── collect_data_icrawler.py # Data collection script
├── frontend/
│ ├── src/App.jsx # React UI
│ └── package.json
├── dataset/
│ ├── images/ # 550 X-ray images (9 categories)
│ ├── metadata.csv # Image metadata
│ └── embeddings.pt # Pre-computed CLIP embeddings
├── archive/ # Assignment details and reference
├── Screenshots/ # Screenshots
├── README.md
├── report.md
└── walkthrough.md
- Total Images: 550
- Categories: chest, spine, dental, fracture, skull, lung, knee, hand, pelvis
- Source: Bing Image Search via icrawler
- Metadata: `dataset/metadata.csv` with columns `image_name`, `source_url`, `category`
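As a rough illustration of how the metadata file might be consumed, the sketch below parses CSV text with the three columns listed above and counts images per category. The sample rows, file names, and URLs are made up for the example; only the column names come from the dataset description.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; the real data lives in dataset/metadata.csv.
SAMPLE_CSV = """image_name,source_url,category
chest_001.jpg,https://example.com/a.jpg,chest
chest_002.jpg,https://example.com/b.jpg,chest
knee_001.jpg,https://example.com/c.jpg,knee
"""


def count_by_category(csv_file):
    """Return a Counter mapping each category to its number of images."""
    reader = csv.DictReader(csv_file)
    return Counter(row["category"] for row in reader)


counts = count_by_category(io.StringIO(SAMPLE_CSV))
print(counts.most_common())
```

To run this against the real file, pass an open file handle such as `open("dataset/metadata.csv", newline="")` in place of the `io.StringIO` object.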
| Endpoint | Method | Description |
|---|---|---|
| `/search/text?q=query` | GET | Text-based semantic search |
| `/search/image` | POST | Image-based similarity search |
| `/categories` | GET | List all categories |
| `/images/{path}` | GET | Serve image files |
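A minimal client sketch for the text search endpoint, using only the Python standard library. The endpoint path and the `q` parameter come from the table above, and the base URL from the setup steps; the helper names `text_search_url` and `text_search` are hypothetical, and the response is assumed to be JSON (FastAPI's default), since its exact shape is not documented here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # backend address from the setup steps


def text_search_url(query, base_url=BASE_URL):
    """Build the GET URL for /search/text with the query URL-encoded."""
    return f"{base_url}/search/text?{urlencode({'q': query})}"


def text_search(query, base_url=BASE_URL):
    """Call the endpoint and decode the body, assuming it returns JSON."""
    with urlopen(text_search_url(query, base_url)) as response:
        return json.load(response)


# Building the URL does not require the backend to be running.
print(text_search_url("fractured bone"))
```

Calling `text_search("fractured bone")` requires the backend to be running on port 8000. The `/search/image` endpoint expects a POST upload; the form field name is not specified in this README, so it is not sketched here.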
- Backend: FastAPI, PyTorch, CLIP (openai/clip-vit-base-patch32)
- Frontend: React, Vite, TailwindCSS
- Search: Cosine similarity on CLIP embeddings
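To make the search step concrete, here is a small sketch of cosine-similarity ranking. It is written in plain Python so it runs without dependencies; the project itself presumably does the equivalent with PyTorch tensors. The function names, the toy 3-dimensional vectors, and the zero-vector convention are assumptions for illustration, not the project's actual code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def rank(query, index, top_k=3):
    """Return the top_k (name, score) pairs, highest similarity first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional vectors standing in for real CLIP embeddings.
toy_index = {
    "chest_001.jpg": [1.0, 0.0, 0.0],
    "knee_001.jpg": [0.0, 1.0, 0.0],
    "hand_001.jpg": [1.0, 1.0, 0.0],
}
results = rank([1.0, 0.0, 0.0], toy_index, top_k=2)
print(results)
```

Because both text queries and images are embedded into the same CLIP space, the same ranking applies to either search mode. How the raw score is converted into the similarity percentage shown in the UI is not specified here.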
"I designed the system to be compatible with Pinecone for production scaling (to handle millions of images), but kept the local `.pt` implementation for this assignment to ensure ease of setup for the reviewer without needing external API keys."

