AI-Powered Photo Metadata Generation for SmugMug
Automatically generate descriptive captions and relevant tags for your SmugMug photos using local AI vision models. smugVision combines computer vision, face recognition, and EXIF metadata to create rich, context-aware descriptions for your photo albums.
✨ AI-Powered Metadata Generation
- Generate descriptive captions using local Llama 3.2 Vision model
- Create relevant keyword tags automatically
- Context-aware prompts with location and person information
👤 Face Recognition
- Identify people in photos automatically
- Organize reference faces in a simple folder structure
- Configurable confidence thresholds
📍 Location Intelligence
- Extract GPS coordinates from EXIF data
- Reverse geocoding for human-readable locations
- Automatic location context in captions and tags
🖼️ Smart Image Processing
- Support for HEIC/HEIF formats
- Automatic orientation correction
- Skip already-processed images
- Video file detection and exclusion (optional)
🔄 SmugMug Integration
- OAuth 1.0a authentication
- Batch album processing
- Preserve existing metadata (optional)
- Dry-run mode for safe previewing
🚀 Performance & Reliability
- Local caching to avoid re-downloading
- Configurable image sizes
- Progress tracking and detailed logging
- Comprehensive error handling
- Python 3.9 or higher
- Ollama with the `llama3.2-vision` model
- SmugMug account with API credentials
```shell
# Install from source
pip install git+https://github.com/yourusername/smugvision.git

# Or install locally for development
git clone https://github.com/yourusername/smugvision.git
cd smugvision
pip install -e .
```

Alternatively, clone the repository and install dependencies directly:

```shell
git clone https://github.com/yourusername/smugvision.git
cd smugvision
pip install -r requirements.txt
```

Pull the vision model for Ollama:

```shell
ollama pull llama3.2-vision
```

Run the interactive configuration setup:
```shell
# If installed via pip:
smugvision-config

# Or using Python module:
python -m smugvision.config.manager --setup
```

This will create `~/.smugvision/config.yaml` and prompt you for:
- SmugMug API key and secret
- SmugMug user token and secret
- Default processing options
1. Get API Key & Secret:
   - Visit https://api.smugmug.com/api/developer/apply
   - Create a new application
   - Note your API Key and Secret
2. Get User Token & Secret:
   - Run the OAuth helper:

     ```shell
     # If installed via pip:
     smugvision-get-tokens

     # Or using the script:
     python scripts/get_smugmug_tokens.py
     ```

   - Follow the OAuth flow in your browser
   - Copy the user token and secret to your config
Process an album by SmugMug album key:

```shell
# If installed via pip:
smugvision --gallery abc123

# Or using Python module:
python -m smugvision --gallery abc123
```

Process an album by URL:

```shell
smugvision --url "https://site.smugmug.com/path/to/n-XXXXX/album-name"
```

Preview what changes would be made without updating SmugMug:

```shell
smugvision --gallery abc123 --dry-run
```

Reprocess images even if they already have the smugvision marker tag:

```shell
smugvision --gallery abc123 --force-reprocess
```

By default, video files are skipped. To include them:

```shell
smugvision --gallery abc123 --include-videos
```

Enable detailed debug logging:

```shell
smugvision --gallery abc123 --verbose
```
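The command-line flags above could be wired up with `argparse` along these lines. This is a sketch of the CLI surface, not the project's actual entry-point module; the flag names come from the examples in this section.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical sketch of the smugvision CLI described above."""
    parser = argparse.ArgumentParser(prog="smugvision")
    # An album is identified either by key or by URL, not both.
    target = parser.add_mutually_exclusive_group(required=True)
    target.add_argument("--gallery", help="SmugMug album key, e.g. abc123")
    target.add_argument("--url", help="Full SmugMug album URL")
    parser.add_argument("--dry-run", action="store_true",
                        help="Preview changes without updating SmugMug")
    parser.add_argument("--force-reprocess", action="store_true",
                        help="Reprocess images that already carry the marker tag")
    parser.add_argument("--include-videos", action="store_true",
                        help="Process video files too (skipped by default)")
    parser.add_argument("--verbose", action="store_true",
                        help="Enable detailed debug logging")
    parser.add_argument("--config", help="Path to an alternate config.yaml")
    return parser
```

With this shape, `build_parser().parse_args(["--gallery", "abc123", "--dry-run"])` yields a namespace with `gallery="abc123"` and `dry_run=True`.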
Use a custom configuration file:

```shell
smugvision --gallery abc123 --config /path/to/config.yaml
```

smugVision stores its configuration in `~/.smugvision/config.yaml`. Here's an overview of the key settings:
```yaml
smugmug:
  api_key: "your_api_key"
  api_secret: "your_api_secret"
  user_token: "your_user_token"
  user_secret: "your_user_secret"

vision:
  model: "llama3.2-vision"
  endpoint: "http://localhost:11434"
  temperature: 0.7
  max_tokens: 150

face_recognition:
  enabled: true
  reference_faces_dir: "~/.smugvision/reference_faces"
  tolerance: 0.6
  model: "hog"
  detection_scale: 0.5
  min_confidence: 0.25

processing:
  generate_captions: true
  generate_tags: true
  preserve_existing: true
  marker_tag: "smugvision"
  image_size: "Medium"
  skip_videos: true
  use_exif_location: true

cache:
  directory: "~/.smugvision/cache"
  preserve_structure: true
```

For a complete configuration example, see `config.yaml.example`.
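A user config like the one above typically only overrides a subset of settings, with the rest falling back to built-in defaults. A minimal sketch of that overlay, assuming a hypothetical `merge_config` helper (the project's real loader may differ):

```python
from copy import deepcopy

# Illustrative defaults mirroring a few settings from the overview above.
DEFAULTS = {
    "vision": {"model": "llama3.2-vision",
               "endpoint": "http://localhost:11434",
               "temperature": 0.7},
    "processing": {"generate_captions": True, "skip_videos": True},
}


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Recursively overlay user settings on top of the defaults."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged
```

For example, overriding only `vision.temperature` leaves `vision.model` and every other default intact.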
1. Create the reference faces directory:

   ```shell
   mkdir -p ~/.smugvision/reference_faces
   ```

2. Organize reference faces:

   ```
   ~/.smugvision/reference_faces/
   ├── John_Doe/
   │   ├── photo1.jpg
   │   ├── photo2.jpg
   │   └── photo3.jpg
   └── Jane_Smith/
       ├── photo1.jpg
       └── photo2.jpg
   ```

3. Optimize reference faces (optional but recommended):

   ```shell
   # If installed via pip:
   smugvision-optimize-faces

   # Or using the script:
   python scripts/optimize_reference_faces.py ~/.smugvision/reference_faces
   ```

   This resizes images for faster loading and processing.
- Use 3-5 clear, well-lit photos per person
- Include photos from different angles
- Avoid sunglasses or heavy shadows
- Larger faces work better (crop to face if needed)
- Name folders like `First_Last` (underscores will be converted to spaces)
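The folder layout above maps directly to people's names. A small sketch of how such a directory could be scanned, using hypothetical helper names (the project's actual face module may differ):

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".heic"}


def person_name_from_folder(folder: str) -> str:
    # "John_Doe" -> "John Doe": underscores become spaces, per the tip above.
    return folder.replace("_", " ")


def list_reference_faces(root: str) -> dict:
    """Map each person's display name to the reference photos in their folder."""
    faces = {}
    for person_dir in sorted(Path(root).iterdir()):
        if person_dir.is_dir():
            photos = sorted(p.name for p in person_dir.iterdir()
                            if p.suffix.lower() in IMAGE_SUFFIXES)
            faces[person_name_from_folder(person_dir.name)] = photos
    return faces
```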
- Album Retrieval: Connects to SmugMug API and fetches album metadata and image list
- Image Download: Downloads images to local cache (configurable size: Thumb, Small, Medium, Large, XLarge)
- EXIF Extraction: Reads EXIF data for GPS coordinates, date/time, orientation
- Location Lookup: Reverse geocodes coordinates to human-readable location names
- Face Recognition: Detects and identifies known people in photos
- Context Building: Combines location, people, and EXIF data into context
- AI Generation: Sends images with context to Llama 3.2 Vision for captions and tags
- Metadata Formatting: Combines AI-generated metadata with extracted context
- SmugMug Update: Patches image metadata via SmugMug API
- Progress Tracking: Reports statistics and any errors
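The skip behavior implied by the steps above (marker tag, video exclusion, `--force-reprocess`) can be sketched as a single predicate. `should_process` is a hypothetical helper for illustration, not the project's actual function:

```python
def should_process(keywords, marker_tag="smugvision",
                   force_reprocess=False, is_video=False, skip_videos=True):
    """Decide whether an item needs processing.

    Videos are excluded by default, and the marker tag short-circuits
    already-processed images unless force_reprocess is set.
    """
    if is_video and skip_videos:
        return False
    if marker_tag in keywords and not force_reprocess:
        return False
    return True
```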
smugVision can extract GPS coordinates from EXIF data and convert them to readable locations:
- Geocoding Provider: Uses Nominatim (OpenStreetMap) by default
- Custom User Agent: Configure in `~/.smugvision/geocoding_config.yaml`
- Rate Limiting: Respects Nominatim's usage policy (1 request/second)
- Caching: Location lookups are cached to minimize API calls
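Two pieces of the above are easy to illustrate: EXIF stores GPS positions as degrees/minutes/seconds plus an N/S/E/W reference, and rounding coordinates lets nearby photos share a single geocoding call. Both helpers below are hypothetical sketches, not the project's actual utilities:

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert EXIF GPS degrees/minutes/seconds to a signed decimal degree."""
    value = degrees + minutes / 60.0 + seconds / 3600.0
    # South and West hemispheres are negative.
    return -value if ref in ("S", "W") else value


_geocode_cache = {}


def cached_reverse_geocode(lat, lon, lookup):
    """Round coordinates so nearby points share one cached lookup call."""
    key = (round(lat, 4), round(lon, 4))
    if key not in _geocode_cache:
        _geocode_cache[key] = lookup(*key)
    return _geocode_cache[key]
```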
Add context about relationships between people in photos:
Create `~/.smugvision/relationships.yaml`:

```yaml
relationships:
  John_Doe:
    Jane_Smith: "wife"
    Billy_Doe: "son"
  Jane_Smith:
    John_Doe: "husband"
```

This helps the AI generate more contextual captions like "John with his wife Jane at the beach."
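One way such a mapping could feed the prompt is to emit relationship hints only for people actually detected in the photo. A sketch with a hypothetical `relationship_context` helper (the real prompt-building code may differ):

```python
# Mirrors the relationships.yaml example above, as a plain dict.
RELATIONSHIPS = {
    "John_Doe": {"Jane_Smith": "wife", "Billy_Doe": "son"},
    "Jane_Smith": {"John_Doe": "husband"},
}


def relationship_context(people, relationships=RELATIONSHIPS):
    """Build hints like "Jane Smith is John Doe's wife" for detected people."""
    hints = []
    for person in people:
        for other, relation in relationships.get(person, {}).items():
            if other in people:  # only mention people present in this photo
                hints.append(f"{other.replace('_', ' ')} is "
                             f"{person.replace('_', ' ')}'s {relation}")
    return hints
```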
Customize the AI prompts in your `config.yaml`:

```yaml
prompts:
  caption: |
    Analyze this image and provide a detailed, engaging caption
    that describes the scene, subjects, and atmosphere.
  tags: |
    Generate descriptive keyword tags for this image.
    Focus on subjects, activities, location, mood, and composition.
```

Run the individual test scripts to verify each component:

```shell
python tests/test_smugmug.py --gallery abc123
python tests/test_vision.py path/to/image.jpg
python tests/debug_face_recognition.py path/to/image.jpg
python tests/test_processor.py --gallery abc123 --dry-run
```

Install development dependencies:

```shell
pip install -e ".[dev]"
```

This installs additional tools for testing and development:

- `pytest` for running tests
- `black` for code formatting
- `flake8` for linting
- `mypy` for type checking
- Ensure Ollama is running: `ollama serve`
- Verify the model is installed: `ollama list`
- Check the endpoint in config: `vision.endpoint`
- Verify API credentials in `~/.smugvision/config.yaml`
- Regenerate user tokens with `smugvision-get-tokens`
- Check that your SmugMug account has API access enabled
- Ensure the reference faces directory exists and contains images
- Try raising `face_recognition.tolerance` (higher values are more permissive)
- Verify reference faces are clear and well-lit
- Run `smugvision-optimize-faces` to improve performance
- Check SmugMug album permissions (must be accessible via API)
- Verify the album key or URL is correct
- Try a different `image_size` in config
- Check network connectivity
- Reduce `face_recognition.detection_scale` (e.g., 0.25)
- Use a smaller `image_size` for processing
- Process albums in smaller batches
- Use Medium-sized images: Balances quality and speed
- Optimize reference faces: Run `smugvision-optimize-faces` once
- Enable caching: Avoid re-downloading images (default: enabled)
- Skip videos: Video processing is slower (default: skipped)
- Adjust detection scale: Lower values mean faster face detection
- Use marker tags: Automatically skip already-processed images
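The "adjust detection scale" tip works because face detection runs on a copy of the image downscaled by `face_recognition.detection_scale`, and the resulting boxes are then mapped back to full-resolution coordinates. A sketch with a hypothetical `scale_boxes` helper:

```python
def scale_boxes(boxes, detection_scale):
    """Map boxes found on a downscaled image back to full-resolution pixels.

    Detection ran on an image scaled by `detection_scale` (e.g. 0.5), so
    every coordinate is multiplied by the inverse factor.
    """
    factor = 1.0 / detection_scale
    return [tuple(int(round(c * factor)) for c in box) for box in boxes]
```

Halving the detection scale quarters the pixels the detector must scan, which is why lower values are faster.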
smugVision is organized into modular components:
- `smugmug/`: SmugMug API client and data models
- `vision/`: Vision model abstraction and Llama integration
- `face/`: Face detection and recognition system
- `processing/`: Image processing orchestration and metadata formatting
- `cache/`: Local image caching and management
- `utils/`: EXIF extraction, geocoding, and utilities
- `config/`: Configuration management and validation
For detailed architecture documentation, see DESIGN.md.
- Local Processing Only: Requires local Ollama installation
- Single Album at a Time: No batch folder processing yet (planned)
- SmugMug API Rate Limits: Respects SmugMug's rate limiting
- Face Recognition Accuracy: Depends on quality of reference faces
- Geocoding Rate Limits: Nominatim allows 1 request/second
See DESIGN.md for detailed roadmap. Planned features include:
- Batch folder processing
- Web UI for monitoring and control
- Multiple vision model support (GPT-4V, Claude Vision)
- Smart duplicate detection
- Custom metadata templates
- Integration with other photo services
- Docker deployment option
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Ollama: Local LLM runtime
- Meta's Llama: Vision model
- face_recognition: Face detection library
- SmugMug API: Photo hosting platform
- Nominatim: Geocoding service
- Issues: GitHub Issues
- Documentation: `DESIGN.md` for architecture details
- Face Recognition Guide: `README_FACE_RECOGNITION.md`
Built with ❤️ for photographers who love automation