
DPTN Step Generation

A PyTorch implementation based on the DPTN model.


Generating complex and fine-grained textures in one step is very challenging, often leading to poor results.

We propose a step-by-step approach, starting from generating images with coarse-grained textures and progressively moving to fine-grained images.

Generate Image Step-by-Step


Coarse-Grained Image Sampling Process

  • Discretize constants $d_j \in \{1, 2, \dots, 255\}$:
    • $d_j > d_{j+1}$
  • The $j^{th}$ coarse-grained image $I^j$ can be described as
    • $I^j = Q^j \cdot d_j$
    • $Q^j$ and $R^j$ represent the $j^{th}$ quotient and remainder of dividing the original image by $d_j$, respectively.
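The decomposition above can be sketched with integer division: the quotient keeps the coarse structure and the remainder carries the fine detail. This is a minimal illustration; the step constants used here are arbitrary examples, not the values used in the repository.

```python
import numpy as np

def coarse_grained(image, d):
    """Return (I^j, R^j) where I^j = Q^j * d_j and Q^j, R^j are the
    quotient and remainder of dividing the image by d_j."""
    q = image // d          # quotient Q^j
    r = image % d           # remainder R^j (the discarded fine detail)
    return q * d, r

# Toy 8-bit "image"; the constants decrease (d_j > d_{j+1}),
# so each successive step recovers finer texture.
img = np.array([[200, 37], [128, 255]], dtype=np.int64)
for d in (64, 16, 4, 1):    # illustrative step constants
    coarse, _ = coarse_grained(img, d)
    print(d, coarse.ravel().tolist())
```

At d=1 the coarse image equals the original, so the sequence of steps converges from coarse structure to the full fine-grained image.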

Step Prediction (ACGAN)

Building upon the framework of Auxiliary Classifier GAN (ACGAN) (PMLR 2017), we have incorporated an additional training process: the step prediction loss. This enhancement aims to stabilize the training phase.

Key Improvement

  • Step Prediction Loss: This added component mitigates a common issue in the original ACGAN framework. Without step prediction, the Discriminator often struggles to accurately determine which step of image generation should be executed. This confusion can lead to the generation of artifacts, such as image cracking. By integrating step prediction loss into the training process, we enhance the Discriminator's ability to more accurately guide the generation process, thereby reducing the occurrence of artifacts.

This modification to the ACGAN framework not only stabilizes the training process but also improves the overall quality of the generated images.
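The idea can be sketched as a discriminator with two heads: the usual real/fake output plus an auxiliary head that classifies which generation step produced the image, trained with a cross-entropy step-prediction loss. The architecture, names, and loss weighting below are illustrative assumptions, not the repository's actual DPTN-based model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StepAwareDiscriminator(nn.Module):
    """Toy ACGAN-style discriminator with an auxiliary step head.
    Illustrative only: the real model in this repo is a DPTN variant."""
    def __init__(self, n_steps=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.adv_head = nn.Linear(16, 1)         # real/fake logit
        self.step_head = nn.Linear(16, n_steps)  # which step produced the image

    def forward(self, x):
        h = self.features(x)
        return self.adv_head(h), self.step_head(h)

D = StepAwareDiscriminator(n_steps=4)
imgs = torch.randn(8, 3, 64, 64)       # batch of generated or real images
steps = torch.randint(0, 4, (8,))      # ground-truth step index per image

adv_logit, step_logit = D(imgs)
adv_loss = F.binary_cross_entropy_with_logits(adv_logit, torch.ones(8, 1))
step_loss = F.cross_entropy(step_logit, steps)  # the added step-prediction loss
total = adv_loss + step_loss           # relative weighting is a hyperparameter
```

Because the step head must identify the generation step, the discriminator's gradients also tell the generator which level of texture detail is expected at each step, which is the stabilizing effect described above.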


Installation

Requirements

  • Python 3
  • PyTorch 1.7.1
  • CUDA 10.2

Conda Installation

# 1. Create and activate a conda virtual environment.
conda create -n DPTN_step python=3.8
conda activate DPTN_step

# 2. Install PyTorch and torchvision.
conda install -c pytorch pytorch=1.7.1 torchvision cudatoolkit=11.3

# 3. Install the remaining dependencies.
pip install -r requirements.txt

Dataset

  • Download img_highres.zip of the DeepFashion Dataset from In-shop Clothes Retrieval Benchmark.

  • Unzip img_highres.zip. You will need to request the password from the dataset maintainers. Then rename the obtained folder to img and put it under the ./dataset/deepfashion directory.

  • We split the train/test set following GFLA. Several images with significant occlusions are removed from the training set. Download the train/test pairs and the keypoints pose.zip extracted with Openpose by running:

    cd scripts
    ./download_dataset.sh

    Or you can download these files manually:

    • Download the train/test pairs from Google Drive including train_pairs.txt, test_pairs.txt, train.lst, test.lst. Put these files under the ./dataset/deepfashion directory.
    • Download the keypoints pose.rar extracted with Openpose from Google Drive. Unzip it and put the obtained folder under the ./dataset/deepfashion directory.
  • Run the following code to save images to lmdb dataset.

    python -m scripts.prepare_data \
    --root ./dataset/deepfashion \
    --out ./dataset/deepfashion

Training

This project supports multi-GPU training. The following example trains the model on 256x176 images using 4 GPUs.

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch \
--nproc_per_node=4 \
--master_port 1234 train.py \
--id $name_of_your_experiment \
--netG dptn \
--batchsize 20 \
--num_workers 10

Inference

  • Run the following code to evaluate the trained model:

    CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch \
    --nproc_per_node=4 \
    --master_port 1234 test.py \
    --id $name_of_your_experiment \
    --save_id $save_folder \
    --netG dptn \
    --batchsize 20 \
    --num_workers 10 \
    --simple_test

Evaluation

Run eval_step.py to compute evaluation metrics on the generated results.

Results


The proposed training scheme generates more detailed texture than DPTN (baseline).

  • Achieves a much lower FID score than DPTN.



Style Mixing


We can mix styles without any additional processes, simply by inputting different style images at different steps.

This experiment reveals that in the initial steps, the synthesis primarily involves detailed styles (such as hair, gender, facial features, etc.), and as we progress to the later steps, more universal styles (like clothing color, etc.) are integrated into the synthesis.
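The mixing procedure can be sketched as a loop that swaps the style input at a chosen step. The generate_step function below is a hypothetical stand-in for the actual generator call (whose real signature is not shown in this README); it simply blends vectors so the control flow can be demonstrated end to end.

```python
def generate_step(prev, style):
    # Hypothetical stand-in for one generation step: blend the previous
    # result with this step's style input (toy arithmetic, not the model).
    return [(p + s) / 2 for p, s in zip(prev, style)]

def mix_styles(style_a, style_b, switch_step, n_steps=4):
    """Feed style_a at the early (detail) steps, style_b afterwards."""
    out = [0.0] * len(style_a)
    for step in range(n_steps):
        style = style_a if step < switch_step else style_b
        out = generate_step(out, style)
    return out

mixed = mix_styles([1.0, 0.0], [0.0, 1.0], switch_step=2)
```

In this toy version the early style still contributes to the final output, but the later style dominates, mirroring the observation that early steps set detailed styles while later steps inject more universal ones.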
