This repository contains the code for the tasks of the first homework of the course Generative artificial intelligence for graphics and multimedia (01VRWOV, 01VRWYG) at the Polytechnic University of Turin.
The tasks cover the implementation of several GAN (Generative Adversarial Network) based architectures trained on the Oxford Flowers102 dataset:
- Objective: train and test an unconditional GAN from scratch
- Architecture: DCGAN-based architecture
- Implementation Details:
- ConvTranspose2d layers replaced with Upsample + Conv2d to avoid checkerboard patterns
- images resized to 64x64 and pixels normalized to [-1, 1]
- Results Preview:
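
The Upsample + Conv2d replacement listed in the implementation details of this exercise can be illustrated with a minimal sketch. The class name `UpsampleConvBlock`, the nearest-neighbour mode, the 3x3 kernel, and the BatchNorm/ReLU choice are assumptions for illustration, not necessarily what the notebook uses.

```python
import torch
import torch.nn as nn


class UpsampleConvBlock(nn.Module):
    """Hypothetical generator block: upsample by 2x, then convolve.

    Intended as a stand-in for a stride-2 ConvTranspose2d; the uneven kernel
    overlap of transposed convolutions is a common source of checkerboard
    artifacts, which this ordering avoids.
    """

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="nearest"),  # doubles H and W
            nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.block(x)


# Usage: an 8x8 feature map becomes 16x16 with half the channels.
x = torch.randn(4, 128, 8, 8)
y = UpsampleConvBlock(128, 64)(x)
```

Stacking such blocks from a small spatial seed (for example 4x4) up to 64x64 would give a DCGAN-style generator without transposed convolutions.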
- Objective: extend Ex.1 to achieve conditional generation through labels
- Architecture: DCGAN-based architecture
- Implementation Details:
- ConvTranspose2d layers replaced with Upsample + Conv2d to avoid checkerboard patterns
- images resized to 64x64 and pixels normalized to [-1, 1]
- labels concatenated in the first convolutional layer
- Results Preview:
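
One possible reading of "labels concatenated in the first convolutional layer" is sketched below for the discriminator side: the label is embedded into a single 64x64 map and stacked onto the image as a fourth channel before the first convolution. The module name `ConditionalInput`, the embedding-to-map approach, and the layer sizes are assumptions; the notebook may condition differently.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 102  # Oxford Flowers102 has 102 classes
IMG_SIZE = 64


class ConditionalInput(nn.Module):
    """Hypothetical first layer of a label-conditioned discriminator."""

    def __init__(self, num_classes: int = NUM_CLASSES, img_size: int = IMG_SIZE,
                 out_channels: int = 64):
        super().__init__()
        self.img_size = img_size
        # Each label is mapped to one img_size x img_size plane.
        self.label_embedding = nn.Embedding(num_classes, img_size * img_size)
        # 3 RGB channels + 1 label channel.
        self.first_conv = nn.Conv2d(3 + 1, out_channels, kernel_size=4, stride=2, padding=1)

    def forward(self, images: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        label_map = self.label_embedding(labels).view(-1, 1, self.img_size, self.img_size)
        x = torch.cat([images, label_map], dim=1)  # concatenate along channels
        return self.first_conv(x)


images = torch.randn(8, 3, IMG_SIZE, IMG_SIZE)
labels = torch.randint(0, NUM_CLASSES, (8,))
features = ConditionalInput()(images, labels)
```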
- Objective: re-implement Ex.1 using the Wasserstein objective with Gradient Penalty
- Architecture: DCGAN-based architecture
- Implementation Details:
- ConvTranspose2d layers replaced with Upsample + Conv2d to avoid checkerboard patterns
- BatchNorm2d layers replaced with InstanceNorm2d layers in the critic
- images resized to 64x64 and pixels normalized to [-1, 1]
- Results Preview:
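
For reference, the gradient penalty term of the Wasserstein objective used in this exercise can be sketched as follows. The function name `gradient_penalty`, the toy critic, and the weight of 10 mentioned in the comment are assumptions for illustration, not the notebook's actual code.

```python
import torch
import torch.nn as nn


def gradient_penalty(critic: nn.Module, real: torch.Tensor, fake: torch.Tensor) -> torch.Tensor:
    """Penalize deviations of the critic's gradient norm from 1 on interpolated samples."""
    batch_size = real.size(0)
    # One random mixing coefficient per sample, broadcast over C, H, W.
    eps = torch.rand(batch_size, 1, 1, 1, device=real.device)
    interpolated = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(interpolated)
    grads = torch.autograd.grad(
        outputs=scores,
        inputs=interpolated,
        grad_outputs=torch.ones_like(scores),
        create_graph=True,  # keep the graph so the penalty itself can be backpropagated
    )[0]
    grad_norm = grads.reshape(batch_size, -1).norm(2, dim=1)
    return ((grad_norm - 1) ** 2).mean()


# Toy critic (assumed shape), using InstanceNorm2d instead of BatchNorm2d.
critic = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=4, stride=2, padding=1),  # 64x64 -> 32x32
    nn.InstanceNorm2d(8, affine=True),
    nn.LeakyReLU(0.2),
    nn.Flatten(),
    nn.Linear(8 * 32 * 32, 1),
)
real = torch.randn(4, 3, 64, 64)
fake = torch.randn(4, 3, 64, 64)  # in training: generator output, detached
gp = gradient_penalty(critic, real, fake)
# A typical critic loss would be: fake_scores.mean() - real_scores.mean() + 10 * gp
```

The penalty is computed per sample, which is one reason batch-dependent normalization such as BatchNorm2d is usually avoided in the critic.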
- Objective: extend Ex.3 to achieve conditional generation through labels
- Architecture: DCGAN-based architecture
- Implementation Details:
- ConvTranspose2d layers replaced with Upsample + Conv2d to avoid checkerboard patterns
- BatchNorm2d layers replaced with GroupNorm layers in the critic
- images resized to 64x64 and pixels normalized to [-1, 1]
- labels processed separately from the images and combined with them via a dot product of their final representations
- Results Preview:
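
The dot-product conditioning described in this exercise can be sketched as below: the image and the label are encoded separately, and the critic score is the inner product of the two final vectors. The class name `DotProductCritic`, the layer sizes, and the group count are assumptions; projection-style critics often also add an unconditional linear term, which is omitted here to match the description.

```python
import torch
import torch.nn as nn


class DotProductCritic(nn.Module):
    """Hypothetical conditional critic: score = <image features, label embedding>."""

    def __init__(self, num_classes: int = 102, feat_dim: int = 128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1),         # 64 -> 32
            nn.LeakyReLU(0.2),
            nn.Conv2d(64, feat_dim, kernel_size=4, stride=2, padding=1),  # 32 -> 16
            nn.GroupNorm(num_groups=8, num_channels=feat_dim),
            nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),                                                 # (N, feat_dim)
        )
        self.label_embedding = nn.Embedding(num_classes, feat_dim)

    def forward(self, images: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        image_repr = self.features(images)          # (N, feat_dim)
        label_repr = self.label_embedding(labels)   # (N, feat_dim)
        return (image_repr * label_repr).sum(dim=1)  # per-sample dot product, shape (N,)


images = torch.randn(8, 3, 64, 64)
labels = torch.randint(0, 102, (8,))
scores = DotProductCritic()(images, labels)
```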
- Objective: implement the Pix2Pix architecture for paired image translation from dark and noisy to bright images
- Architecture: Pix2Pix (PatchGAN discriminator + UNet generator)
- Implementation Details:
- dark and noisy images created by multiplying the bright images by a darkening factor (from 10% to 40%) and adding Gaussian noise
- images resized to 256x256 and pixels normalized to [-1, 1]
- to avoid overfitting, training was done on the val and test splits (about 7k images), leaving the roughly 1k images of the training split for inference
- given the high computational cost of processing 256x256 images, a batch size of 1 is used
- Results Preview:
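
The degradation used to build the paired dataset can be sketched as follows. This assumes the darkening factor is a multiplier drawn uniformly between 0.1 and 0.4 and that the noise standard deviation is 0.05; both the interpretation and the values, as well as the function names, are assumptions rather than the notebook's actual settings.

```python
import torch


def make_dark_noisy(image: torch.Tensor, min_factor: float = 0.1,
                    max_factor: float = 0.4, noise_std: float = 0.05) -> torch.Tensor:
    """Darken an image in [0, 1] by a random factor and add Gaussian noise."""
    factor = torch.empty(1).uniform_(min_factor, max_factor).item()
    dark = image * factor
    noisy = dark + noise_std * torch.randn_like(image)
    return noisy.clamp(0.0, 1.0)  # keep pixel values valid


def to_model_range(image: torch.Tensor) -> torch.Tensor:
    """Map pixels from [0, 1] to [-1, 1], the range expected by the networks."""
    return image * 2.0 - 1.0


torch.manual_seed(0)
bright = torch.rand(3, 256, 256)   # stand-in for a resized flower image
dark = make_dark_noisy(bright)
model_input = to_model_range(dark)     # generator input
model_target = to_model_range(bright)  # ground-truth output
```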
- Dependencies: Python 3.10+, PyTorch, Torchvision, OpenCV (cv2), NumPy, Matplotlib. Install them with:
pip install -r requirements.txt

The trained weights for the Ex.5 Pix2Pix Generator are tracked using Git Large File Storage (LFS) due to the large memory footprint of the 256x256 UNet.
To successfully clone this repository and download the actual .pth file (rather than a 130-byte text pointer), ensure you have Git LFS installed on your system before cloning:
# install Git LFS
git lfs install
# clone the repository
git clone https://github.com/Malgesw/GenAI-Homework1

If you already cloned the repository without Git LFS, install it and pull the weights directly:
# install Git LFS
git lfs install
# pull the weights
git lfs pull

To run training and/or inference with one of the architectures, simply run the corresponding Jupyter notebook.
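
If loading the Pix2Pix weights fails, it may be worth checking whether the .pth file is still a Git LFS pointer. The sketch below relies on the fact that LFS pointer files are small text files starting with a `version https://git-lfs.github.com/spec/v1` line; the helper name `is_lfs_pointer` and the temporary demo files are hypothetical, so substitute the actual path of the weights file.

```python
import tempfile
from pathlib import Path

LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path) -> bool:
    """Return True if the file starts with the Git LFS pointer header."""
    with Path(path).open("rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


# Demo on temporary files (stand-ins for the real weights file).
with tempfile.TemporaryDirectory() as tmp:
    pointer = Path(tmp) / "pointer.pth"
    pointer.write_bytes(LFS_POINTER_PREFIX + b"\noid sha256:0000\nsize 123\n")
    binary = Path(tmp) / "weights.pth"
    binary.write_bytes(b"\x80\x02fake-binary-payload")
    pointer_detected = is_lfs_pointer(pointer)
    binary_detected = is_lfs_pointer(binary)
```

If the check returns True for the real file, running `git lfs pull` as described above should replace the pointer with the actual weights.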




