Skip to content

sam22ridhi/SpectraVIT

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 

Repository files navigation


SpectraViT: Hybrid Spectral Analysis with Vision Transformers for Melanoma Classification

SpectraViT is a novel deep learning model designed to enhance melanoma classification through a combination of Fourier transforms, Wavelet transforms, and Vision Transformers (ViT). The model combines the spectral analysis capabilities of Fourier and Wavelet transforms with the attention-driven feature extraction of Vision Transformers to capture both fine-grained and holistic details from dermatoscopic images of skin lesions.

Overview

Project Objective

The goal of SpectraViT is to improve the accuracy of binary classification of melanoma by:

  • Utilizing spectral transformations to capture high-frequency details and textures.
  • Leveraging the Vision Transformer’s capacity for capturing global dependencies within an image.
  • Combining these techniques to create a hybrid model that performs both low-level feature extraction and high-level contextual analysis.

Approach

SpectraViT integrates Fourier and Wavelet transforms as preprocessing steps, with a Vision Transformer layer to aggregate both local and global information from the images. The main steps in SpectraViT are as follows:

  1. Fourier Transform Layer: Extracts high-frequency components of the image, enhancing the model’s ability to identify textural details.
  2. Wavelet Transform Layer: Performs a multi-level decomposition (Level 2 Haar wavelet) to capture spatial-frequency features at different resolutions.
  3. Hybrid Pooling: Combines outputs from the Fourier and Wavelet transforms to reduce dimensions while retaining essential features.
  4. Vision Transformer: Processes the spectral information as well as spatial context, enhancing the model’s ability to understand the structure and distribution of image patterns.

Results

  • Accuracy: SpectraViT achieves an accuracy of 92%* on the melanoma classification dataset, showing a 2% improvement over standard ViT.
  • Robustness: The model is resilient to noise and effectively captures both fine and coarse details, making it suitable for medical imaging tasks where precision is crucial.

Contact

For questions or feedback, please contact at samridhiraj04@gmail.com.


About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors