55 changes: 55 additions & 0 deletions Promt.slump
@@ -0,0 +1,55 @@
#!/bin/bash
#SBATCH --partition=GPUQ
#SBATCH --account=<ACCOUNT-NAME-HERE>
#SBATCH --time=9-00:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --gres=gpu:a100:4
#SBATCH --constraint="gpu40g|gpu80g"
#SBATCH --job-name="generate_prompts"
#SBATCH --output=generate_prompts.out
#SBATCH --mem=64G

module purge
module --ignore_cache load foss/2022a
module --ignore_cache load Python/3.10.4-GCCcore-11.3.0

VENV_DIR=$(mktemp -d -t env-repaint-XXXXXXXXXX)
python3 -m venv "$VENV_DIR"
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip

pip uninstall -y numpy opencv-python pillow scipy scikit-image # remove conflicting libs

pip install --no-cache-dir --force-reinstall \
    numpy==1.23.5 \
    opencv-python==4.6.0.66 \
    pillow==9.3.0 \
    scipy==1.9.3 \
    scikit-image==0.19.3 \
    einops==0.6.0 \
    lmdb==1.3.0 \
    lpips==0.1.4 \
    PyYAML==6.0 \
    tensorboardX==2.5.1 \
    timm==0.6.12 \
    torch==1.13.0 \
    torchsummaryX==1.3.0 \
    torchvision==0.14.0 \
    tqdm \
    gradio==3.39.0

pip install Ninja
pip install tensorboard scikit-image
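# NOTE: the following upgrades torch/torchvision beyond the versions pinned above.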
pip install -U torch torchvision
pip install ema-pytorch
pip install diffusers transformers accelerate scipy safetensors

export PYTORCH_CUDA_ALLOC_CONF="expandable_segments:True"


python generate_prompts.py --images_folder /path/to/images --output_file prompts.json

deactivate
rm -rf "$VENV_DIR"
36 changes: 26 additions & 10 deletions README.md
@@ -1,30 +1,32 @@
# Flickr Diverse Faces - FDF

Flickr Diverse Faces (FDF) is a dataset with **1.5M faces** "in the wild".
FDF has a large diversity in terms of facial pose, age, ethnicity, occluding objects, facial painting, and image background.
The dataset is designed for generative models for face anonymization, and it was released with the paper "_DeepPrivacy: A Generative Adversarial Network for Face Anonymization_".

![](media/header_im.jpg)

The dataset was crawled from the website Flickr ([YFCC-100M dataset](http://projects.dfki.uni-kl.de/yfcc100m/)) and automatically annotated.
Each face is annotated with **7 facial landmarks** (left/right ear, left/right eye, left/right shoulder, and nose) and a **bounding box** of the face. [Our paper]() goes into more detail about the automatic annotation.



## Licenses

The images are collected from the YFCC-100M dataset, and each image in our dataset is free to use for **academic** or **open source** projects.
For each face, the corresponding original license is given in the metadata. Some of the images require giving proper credit to the original author, as well as indicating any changes that were made to the images. The original author is given in the metadata.

The dataset contains images with the following licenses:

- [CC BY-NC-SA 2.0](https://creativecommons.org/licenses/by-nc-sa/2.0/): 623,598 Images (23.4 GB)
- [CC BY-SA 2.0](https://creativecommons.org/licenses/by-sa/2.0/): 199,502 Images (7.4 GB)
- [CC BY 2.0](https://creativecommons.org/licenses/by/2.0/): 352,961 Images (13.1 GB)
- [CC BY-NC 2.0](https://creativecommons.org/licenses/by-nc/2.0/): 295,192 Images (10.9 GB)

The FDF metadata is under [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).

## Citation

If you find this code or dataset useful, please cite the following:

```
@InProceedings{10.1007/978-3-030-33720-9_44,
author="Hukkel{\aa}s, H{\aa}kon
@@ -47,19 +49,20 @@ isbn="978-3-030-33720-9"
pip install wget tqdm
```

2. To download metadata, run (expects Python 3.6+):

```
python download.py --target_directory data/fdf
```

3. To also download the images, run:

```
python download.py --target_directory data/fdf --download_images
```


## Metainfo

Each face in the dataset has the following metainfo:

```
@@ -68,7 +71,7 @@
"author": "flickr_username",
"bounding_box": [], # List with 4 eleemnts [xmin, ymin, xmax, ymax] indicating the bounding box of the face in the FDF image. In range 0-1.
"category": "validation", # validation or training set
"date_crawled": "2019-3-6",
"date_crawled": "2019-3-6",
"date_taken": "2010-01-16 21:47:59.0",
"date_uploaded": "2010-01-16",
"landmark": [], # List with shape (7,2). Each row is (x0, y0) indicating the position of the landmark. Landmark order: [nose, r_eye, l_eye, r_ear, l_ear, r_shoulder, l_shoulder]. In range 0-1.
@@ -84,23 +87,36 @@
}
```
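
The bounding box and landmark coordinates are normalized to the range 0-1. Below is a minimal sketch of mapping them back to pixel coordinates, assuming a single face's metadata has already been loaded into a Python dict named `record` with the fields above (the helper name and the loading step are illustrative, not part of the dataset tooling):

```
def to_pixel_coords(record, width, height):
    """Convert one face's normalized bounding box and landmarks to pixel
    coordinates for an image of size (width, height)."""
    xmin, ymin, xmax, ymax = record["bounding_box"]  # values in [0, 1]
    bbox_px = [xmin * width, ymin * height, xmax * width, ymax * height]
    # Landmark order: nose, r_eye, l_eye, r_ear, l_ear, r_shoulder, l_shoulder.
    landmarks_px = [(x * width, y * height) for x, y in record["landmark"]]
    return bbox_px, landmarks_px
```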

## Prompt Generation

For tasks that require text prompts, we have added a separate script, `generate_prompts.py`. It can be run with the following command. The workload can be demanding, so we recommend submitting it as a batch job using the ['Promt.slump' script](https://www.hpc.ntnu.no/idun/documentation/running-jobs/).

```
python generate_prompts.py --images_folder /path/to/images --output_file prompts.json
```
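
On the cluster, the job can be submitted with `sbatch Promt.slump` once the account name and the image/output paths in the script have been filled in. The script writes a JSON file mapping each image filename to its generated caption; below is a minimal sketch of reading it back (the filename and caption shown are only illustrative):

```
import json

with open("prompts.json") as f:
    prompts = json.load(f)

# prompts is a dict of {image filename: caption}, e.g.
# {"000001.png": "a man wearing a hat and glasses"}
for filename, caption in prompts.items():
    print(f"{filename}: {caption}")
```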

## Statistics

### Distribution of image licenses

![](media/license_pie_chart.png)

### Training vs Validation Percentage

There are 50,000 validation images and 1,421,253 training images.

![](media/category_pie_chart.png)

### Original Face size
Each face in the original image has a minimum resolution of:

![](media/face_size_chart.png)

## Citation

If you find the dataset useful, please cite the following:

```
@InProceedings{10.1007/978-3-030-33720-9_44,
author="Hukkel{\aa}s, H{\aa}kon
78 changes: 78 additions & 0 deletions generate_prompts.py
@@ -0,0 +1,78 @@
"""

This script generates text prompts (captions) for images in a specified folder

using a BLIP image-to-text pipeline. The generated prompts are saved as key-value
pairs in a JSON file, where the key is the image filename and the value is the
corresponding prompt

Added as part of masters project Spring 2025
"""

from json import dump
from os import listdir, path

from PIL import Image
from transformers import pipeline


def generate_prompts_for_folder(images_folder: str, output_file: str) -> None:
    """
    Processes all images in `images_folder` to generate a text prompt (caption)
    for each image using a BLIP image-to-text pipeline, and then saves them as
    key-value pairs in a JSON file (image filename: prompt).
    """
    captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

    prompts_dict = {}

    for filename in listdir(images_folder):
        if not filename.lower().endswith((".png", ".jpg", ".jpeg")):
            continue  # Skip non-image files.

        image_path = path.join(images_folder, filename)
        try:
            image = Image.open(image_path).convert("RGB")
        except Exception as e:
            print(f"Error loading {image_path}: {e}")
            continue

        try:
            generated = captioner(image)
            # Extract the generated text from the pipeline output.
            prompt_text = generated[0]["generated_text"]
        except Exception as e:
            print(f"Error generating caption for {filename}: {e}")
            prompt_text = "error generating caption"

        prompts_dict[filename] = prompt_text
        print(f"Processed {filename}: {prompt_text}")

    with open(output_file, "w") as f:
        dump(prompts_dict, f, indent=4)
    print(f"Saved prompts to {output_file}")


if __name__ == "__main__":
    from argparse import ArgumentParser

    parser = ArgumentParser(
        description="Generate image captions for all images in a folder and save to a JSON file."
    )
    parser.add_argument(
        "--images_folder",
        type=str,
        required=True,
        help="Path to the folder containing images",
    )
    parser.add_argument(
        "--output_file",
        type=str,
        default="image_prompts.json",
        help="Path to the output JSON file",
    )
    args = parser.parse_args()

    generate_prompts_for_folder(args.images_folder, args.output_file)