Two-phase pipeline to cut source images into grid tiles (Phase 1) and reassemble them with a mask-free, border-based best-buddies solver (Phase 2). Includes an interactive GUI for viewing before/after results.
- Quick Start
- Setup
- Data Layout
- Architecture Overview
- Phase 1: Preprocessing & Tile Cutting
- Phase 2: Puzzle Solving
- GUI: Interactive Viewer
- Design Justifications
- References
- Troubleshooting
## Quick Start

One command to do everything:

```bash
python run_all.py
```

This will:
- ✓ Check if Phase 1 has been run (runs it if not)
- ✓ Check if Phase 2 has been run (runs it if not)
- ✓ Launch interactive GUI while processing continues in background
- ✓ Puzzles automatically ordered: image 0 (2x2, 4x4, 8x8), image 1 (2x2, 4x4, 8x8), etc.
- ✓ Shows "Loading..." for images being processed
- ✓ Click Next/Previous to browse while data is being generated
## Setup

```powershell
python -m venv .venv
& .\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
python run_all.py
```
## Data Layout
- Input images live under `dataset_images/`, grouped by folders that include `2x2`, `4x4`, or `8x8` in their name (e.g., `puzzle_4x4/0/*.png`).
- Phase 1 outputs land in `phase1_outputs/<group>/<image_id>/tiles/` with `metadata.json`.
- Phase 2 writes assembled puzzles to `phase2_outputs/<group>/<image_id>.png`.
---
## Architecture Overview
The pipeline consists of two distinct phases:
```
┌─────────────────────────────────────────────────┐
│                  INPUT DATASET                  │
│   (Unscrambled source images in grid folders)   │
└────────────────────────┬────────────────────────┘
                         │
                         ▼
         ┌──────────────────────────────┐
         │      PHASE 1: CUTTING        │
         │  • Grid detection            │
         │  • Deterministic tiling      │
         │  • Edge-preserving enhance   │
         └──────────────┬───────────────┘
                        │
                        ▼
         ┌──────────────────────────────┐
         │      PHASE 1 OUTPUTS         │
         │  • tiles/tile_r_c.png        │
         │  • metadata.json             │
         └──────────────┬───────────────┘
                        │
                        ▼
         ┌──────────────────────────────┐
         │      PHASE 2: SOLVING        │
         │  • Border feature extraction │
         │  • Best-buddies matching     │
         │  • Beam-search placement     │
         └──────────────┬───────────────┘
                        │
                        ▼
         ┌──────────────────────────────┐
         │      PHASE 2 OUTPUTS         │
         │  • {image_id}.png (solved)   │
         └──────────────┬───────────────┘
                        │
                        ▼
         ┌──────────────────────────────┐
         │         GUI VIEWER           │
         │  • Before/After display      │
         │  • Interactive navigation    │
         └──────────────────────────────┘
```
---
## Phase 1: Preprocessing & Tile Cutting
### Purpose
Transform raw puzzle images into standardized, enhanced tiles that preserve edge information for accurate matching.
### Process Flow
#### 1. **Grid Detection**
```python
import re

def detect_grid_from_folder(folder_name: str) -> tuple[int, int]:
    # Detects grid size (2x2, 4x4, 8x8) from the folder name,
    # e.g. "puzzle_4x4" -> (4, 4)
    match = re.search(r"(\d+)x(\d+)", folder_name)
    if not match:
        raise ValueError(f"No grid size in folder name: {folder_name!r}")
    return int(match.group(1)), int(match.group(2))
```

Justification: Using a folder naming convention for grid specification is deterministic, eliminates the need for automatic grid detection (which is ambiguous), and keeps grid choice under user control.
#### 2. **Deterministic Tiling**

Input image → uniformly divided into exact grid cells with no overlaps or gaps.

Formula (H = tile_height, W = tile_width):

```
tile_height = image_height / num_rows
tile_width  = image_width  / num_cols
tile[r, c]  = image[r*H : (r+1)*H, c*W : (c+1)*W]
```
Justification:
- Deterministic approach ensures reproducibility
- No contour detection avoids misalignment issues
- Uniform tiles preserve edge alignment for matching
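The tiling rule above can be sketched in a few lines. This is an illustrative helper (the name `cut_tiles` is not the repository's actual API); trailing pixels are dropped when the image size is not exactly divisible by the grid, so every tile has identical dimensions:

```python
import numpy as np

def cut_tiles(image: np.ndarray, rows: int, cols: int) -> dict:
    """Uniformly slice an image into rows x cols tiles with no overlap or gaps."""
    h = image.shape[0] // rows  # tile_height
    w = image.shape[1] // cols  # tile_width
    return {
        (r, c): image[r * h:(r + 1) * h, c * w:(c + 1) * w]
        for r in range(rows)
        for c in range(cols)
    }
```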
#### 3. **Edge-Preserving Enhancement**

Applied to each tile to enhance matching features while preserving borders:

```
Input Image
    ↓
[1] Bilateral Denoising
    └─ Preserves edges while smoothing textures
    └─ cv2.bilateralFilter(d=9, sigmaColor=40, sigmaSpace=40)
    ↓
[2] Guided Filtering (or Bilateral Fallback)
    └─ Smooths without blur while preserving edges
    └─ Uses tile itself as guide
    └─ radius=8, eps=1e-2
    ↓
[3] Soft Unsharp Masking
    └─ Crisp edges without halos
    └─ addWeighted(img, 1.12, blur, -0.12)
    ↓
[4] Frequency Fusion
    └─ Blend original (55%) + enhanced (45%)
    └─ Retains detail while keeping enhancement subtle
    ↓
Output: Enhanced Tile
```
Justification:
- Bilateral denoising: Classic edge-preserving technique from computer vision (Tomasi & Manduchi, 1998)
- Guided filter: Advanced edge-preserving smoothing (He et al., 2010) better than Gaussian blur
- Unsharp masking: Standard technique to enhance local contrast
- Frequency fusion: Prevents over-enhancement by blending with original; avoids artificial artifacts that fool edge matchers
- Adaptive parameters: Scales with tile size to remain effective across 2x2, 4x4, and 8x8 grids
#### 4. **Metadata**

```json
{
  "grid": [4, 4],
  "tile_height": 256,
  "tile_width": 256,
  "tile_filenames": ["tile_0_0.png", "tile_0_1.png", ...]
}
```

Justification: Metadata allows Phase 2 to load tiles in a deterministic order, independent of filesystem ordering.
### Output Structure

```
phase1_outputs/
├── puzzle_2x2/
│   ├── 0/
│   │   ├── tiles/
│   │   │   ├── tile_0_0.png
│   │   │   ├── tile_0_1.png
│   │   │   ├── tile_1_0.png
│   │   │   └── tile_1_1.png
│   │   └── metadata.json
│   ├── 1/
│   └── ...
├── puzzle_4x4/
└── puzzle_8x8/
```
## Phase 2: Puzzle Solving

### Purpose

Reconstruct the original image by analyzing tile borders and finding correct placements with a best-buddies matching algorithm.

### Mask-Free Approach

Unlike traditional puzzle solvers that rely on contour segmentation masks, this approach analyzes pixel borders directly:
#### 1. **Border Feature Extraction**

For each tile, extract 4 directional border patches (0=top, 1=right, 2=bottom, 3=left):

```
        Border 0 (Top)
    ┌──────────────────┐
    │░░░░░░░░░░░░░░░░░░│  strip_width = 1 pixel
    │                  │
 B3 │       TILE       │ B1 (Right)
    │                  │
    │░░░░░░░░░░░░░░░░░░│
    └──────────────────┘
       Border 2 (Bottom)
```
Feature vector for each border includes:
- LAB Color (3 channels): separates luminance (L) from color (A, B)
- Gradient Magnitude (1 channel): edge strength via Sobel operators
- Gradient Direction (1 channel): edge orientation
- Laplacian (1 channel): second-derivative edge detector

```
feature_vector = [L, A, B, |∇|, θ(∇), ∇²]
```

Justification:
- LAB color: perceptually uniform color space; the L channel matches human brightness perception
- Gradient: captures edge sharpness and orientation at tile borders
- Laplacian: detects fine texture and edge transitions
- Multiple cues: combines different edge representations for robustness
- No contours: avoids segmentation artifacts; works directly on pixels
#### 2. **Compatibility Matrices**

Build 4 matrices (one per direction) measuring how well adjacent tiles match. For side 0 (top), tile i's top border should match tile j's bottom border:

```
compat[0][i, j] = distance(border_i[top], border_j[bottom])
```

Formula with weighted distances:

```
D(A, B) = [Σ|A − B|^p]^(q/p)
```

Where:
- p = 0.3 (sub-linear norm)
- q = 1/16 (compression factor)
- Weighted combination:
  - 0.4 × color_distance
  - \+ 0.2 × gradient_magnitude_distance
  - \+ 0.2 × gradient_direction_distance
  - \+ 0.4 × laplacian_distance

Justification:
- Sub-linear norm (p = 0.3): makes outliers less influential; fewer false positives
- Weighted channels: Laplacian and color weighted equally (0.4) as primary cues; gradient magnitude/direction (0.2) as secondary
- 4 directional matrices: separate constraint spaces for directional matching
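The weighted sub-linear distance can be expressed directly (function names are illustrative; weights and exponents are the values listed above):

```python
import numpy as np

P, Q = 0.3, 1.0 / 16.0  # sub-linear norm and compression factor

def sublinear_dist(a: np.ndarray, b: np.ndarray) -> float:
    """D(A, B) = [sum |A - B|^p]^(q/p)."""
    return float(np.sum(np.abs(a - b) ** P) ** (Q / P))

def border_distance(fa: np.ndarray, fb: np.ndarray) -> float:
    """Weighted distance between two (N, 6) border feature patches
    laid out as [L, A, B, grad_mag, grad_dir, laplacian]."""
    color = sublinear_dist(fa[:, :3], fb[:, :3])
    gmag = sublinear_dist(fa[:, 3], fb[:, 3])
    gdir = sublinear_dist(fa[:, 4], fb[:, 4])
    lap = sublinear_dist(fa[:, 5], fb[:, 5])
    # Weights from the document (sum to 1.2 as specified)
    return 0.4 * color + 0.2 * gmag + 0.2 * gdir + 0.4 * lap
```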
#### 3. **Best-Buddies Matching**

Find strongly reciprocal matches:

```
Tile A's best match in direction 0 (top)    = Tile B
Tile B's best match in direction 2 (bottom) = Tile A
→ Strong confidence that A and B are adjacent
```

Algorithm:
- For each tile and direction, find the top-K candidates below a distance threshold
- Check bidirectional agreement: A→B in direction 0 AND B→A in direction 2
- Build confident edges in the placement graph

Justification:
- Bidirectional agreement: dramatically reduces false matches vs. single-direction voting
- Beam search: explores multiple promising partial solutions instead of greedy left-to-right placement
- Dynamic seeding: starts with the most confident placements, allowing less confident corners to be resolved later
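The bidirectional check might look like this. A simplified sketch: the top-K candidate filtering and distance threshold mentioned above are omitted for brevity, and `compat` is assumed to map each direction to an n×n distance matrix:

```python
import numpy as np

OPPOSITE = {0: 2, 1: 3, 2: 0, 3: 1}  # top<->bottom, right<->left

def best_buddies(compat: dict) -> set:
    """Return (i, j, side) triples where tile i's best match on `side` is j
    AND tile j's best match on the opposite side is i."""
    buddies = set()
    for side, mat in compat.items():
        opp = compat[OPPOSITE[side]]
        n = mat.shape[0]
        for i in range(n):
            row = mat[i].copy()
            row[i] = np.inf                 # a tile cannot neighbour itself
            j = int(np.argmin(row))         # i's best candidate on this side
            back = opp[j].copy()
            back[j] = np.inf
            if int(np.argmin(back)) == i:   # reciprocal agreement
                buddies.add((i, j, side))
    return buddies
```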
#### 4. **Beam-Search Placement**

Build the solution incrementally using beam search:

```
Initialization:
  For each possible start tile:
    Place at (0,0)
    Propagate constraints via best-buddies graph
    → State = [partial placement, cost, unplaced tiles]

Iteration:
  For each state in beam (top K by cost):
    Try placing each remaining tile at an empty position
    If best buddies confirm it:
      Add to new beam
  Prune to top K states by cost

Termination:
  When all tiles are placed or the beam is exhausted
  → Return best state found
```

Dynamic seeding benefits:
- Doesn't assume the tile at (0,0) is known
- Explores multiple root placements in parallel
- Adapts to missing edge matches

Justification:
- Beam search: explores K promising paths vs. a single greedy path (more robust to local minima)
- Best-buddies prioritization: heavily weighted constraints reduce the search space
- Cost function balances:
  - Strongly rewarding best-buddy placements
  - Penalizing mismatches
  - Handling missing constraints gracefully
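Stripped of puzzle-specific details, the loop above is a standard beam search. In this generic sketch, `expand` and `score` stand in for the placement and cost logic described above (lower score is better):

```python
def beam_search(initial_states, expand, score, k=5):
    """Keep the k best partial solutions per iteration.

    initial_states: starting states (dynamic seeding: one per start tile)
    expand(state) -> iterable of successor states (empty when complete)
    score(state)  -> cost, lower is better
    """
    beam = sorted(initial_states, key=score)[:k]
    best = beam[0] if beam else None
    while beam:
        candidates = [s2 for s in beam for s2 in expand(s)]
        if not candidates:
            break                                   # beam exhausted
        beam = sorted(candidates, key=score)[:k]    # prune to top k
        if score(beam[0]) < score(best):
            best = beam[0]
    return best
```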
#### 5. **Assembly**

Once a placement is found:

```
For each position (r, c):
  Retrieve tile at grid[r, c]
  Extract tile image from phase1_outputs
  Place at canvas position (r*tile_h, c*tile_w)
Save as phase2_outputs/{group}/{image_id}.png
```
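The assembly step is a straightforward canvas paste (a sketch; `assemble` is an illustrative name, and `grid[r][c]` is assumed to hold the index of the tile placed at that position):

```python
import numpy as np

def assemble(grid, tiles, tile_h, tile_w):
    """Paint placed tiles onto a blank canvas in grid order."""
    rows, cols = len(grid), len(grid[0])
    canvas = np.zeros((rows * tile_h, cols * tile_w, 3), dtype=np.uint8)
    for r in range(rows):
        for c in range(cols):
            tile = tiles[grid[r][c]]
            canvas[r * tile_h:(r + 1) * tile_h,
                   c * tile_w:(c + 1) * tile_w] = tile
    return canvas
```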
### Output Structure

```
phase2_outputs/
├── puzzle_2x2/
│   ├── 0.png
│   ├── 1.png
│   └── ...
├── puzzle_4x4/
└── puzzle_8x8/
```
## GUI: Interactive Viewer

### Features

- Left panel: original puzzle image from the dataset (unscrambled reference)
- Right panel: solved puzzle image (output from Phase 2)
- Navigation: browse through puzzles while Phase 1/2 processing continues in the background
- Smart ordering: puzzles grouped by image ID first, then grid size (compare the same image across difficulties)
- Dynamic loading: shows "Loading..." for incomplete data; auto-updates when ready
- Refresh: force a re-scan of available puzzles

### Implementation

- tkinter: GUI framework (built-in, cross-platform)
- PIL/Pillow: image display and resizing
- OpenCV (cv2): image manipulation
- Threading: non-blocking background processing
- Subprocess: launches Phase 1/2 without blocking the GUI
## Design Justifications

### Why Mask-Free Border Matching?

Traditional approaches (e.g., jigsaw solvers using contour masks):
- Require precise contour detection
- Fail on worn edges, curved jigsaw pieces, or puzzle-like images
- Add computational overhead

This approach (border pixels directly):
- Works on any image grid (photos, art, documents)
- Robust to worn or curved edges
- Simpler implementation
### Why Best-Buddies + Beam Search?

Alternatives:
- Greedy left-to-right: fast but error-prone; early mistakes cascade
- Hungarian algorithm: optimal assignment but O(n³) complexity; too slow for 64+ tiles
- Genetic algorithms: slow convergence

Best-buddies + beam search:
- Exploits strong reciprocal constraints (A↔B must match)
- Beam search explores K promising paths (more robust than greedy)
- Near-linear complexity for well-constrained puzzles
- Scales well to 64-tile puzzles (8x8 grid)
### Why LAB Color Space?

- Separates luminance from color: gradient detection is more robust
- Perceptually uniform: color differences match human perception
- Standard in image processing: the basis of perceptual color-difference metrics such as ΔE
### Why Adaptive Parameters?

Tiles cut from different grids (2x2 vs. 4x4 vs. 8x8) differ in size and texture scale:
- Bilateral filter radius scales with tile size
- Guided filter radius matches tile structure

→ The same code works well across 2x2, 4x4, and 8x8 without per-size tuning.
### Why Beam Width K = 5?

Default: K = 5 candidate states per iteration
- Explores 5 promising partial solutions in parallel
- Balances exploration against computation time
- For most 4x4/8x8 puzzles, a solution is found within the first 1-2 iterations
- Falls back to a lower K if the time limit is exceeded
### Why 55/45 Frequency Fusion?

- Pure enhanced image: over-processed; artificial edges fool the matcher
- Pure original image: blurry; loses edge detail
- 55/45 blend: retains original texture while boosting real edges

→ The best-buddies matcher sees natural, convincing borders.
## References

1. Best-Buddies Matching — Pomeranz, Shemesh & Ben-Shahar (2011), "A Fully Automated Greedy Square Jigsaw Puzzle Solver"
   - Foundational reciprocal (best-buddies) matching concept for square puzzles
2. LAB Color Space — CIE 1976 (L\*a\*b\*) color space
   - Perceptually uniform opponent color model
3. Bilateral Filtering — Tomasi & Manduchi (1998), "Bilateral Filtering for Gray and Color Images"
   - Edge-preserving denoising technique
4. Guided Filter — He, Sun & Tang (2010), "Guided Image Filtering"
   - Advanced edge-preserving smoothing
### Standard Techniques

- Sobel/Laplacian operators: gradient detection (standard edge detection)
- Gaussian blur: low-pass filtering
- Unsharp masking: local contrast enhancement
- Image resizing (bilinear interpolation): cv2.INTER_LINEAR
- Beam search: classic search algorithm used in NLP and planning
- Compatibility matrices: a Constraint Satisfaction Problem (CSP) concept
- Greedy/approximate matching: practical alternative to exhaustive search
- "No puzzles found": ensure
dataset_images/is populated and folders contain2x2,4x4, or8x8in their names. Phase 2 will attempt to generate missing Phase 1 outputs automatically. - "Cannot read image": verify image paths and permissions under
dataset_images/. - GUI not launching: Ensure tkinter is installed (built-in on Windows/Mac; on Linux run
sudo apt-get install python3-tk). - Puzzle solving varies: Due to random seed initialization in beam search, results may differ slightly between runs. Re-running the solver on the same puzzle may yield different (but equally valid) solutions.
pytest -q

