A professional, research-friendly Text-to-Image generation project built using Stable Diffusion XL (SDXL), PyTorch, and Diffusers, designed to run smoothly inside Jupyter Lab.
This project focuses on:
- Clean architecture
- Efficient GPU/CPU usage
- Simple two-cell workflow
- High-quality image generation
👉 Download the SDXL Base Model (safetensors)
⬇️ File to download:
sd_xl_base_1.0.safetensors
📁 After downloading, place the model here:
models/sd_xl_base_1.0.safetensors
⚠️ The notebook will NOT run unless the model is placed correctly inside the models/ folder.
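A small path check can confirm the placement before the notebook is opened. This is a minimal sketch, not part of the notebook itself: the helper name `find_model` is hypothetical, and it assumes it is run from the project root.

```python
from pathlib import Path

# Relative location this README asks for; the helper below is illustrative only.
MODEL_RELATIVE_PATH = Path("models") / "sd_xl_base_1.0.safetensors"


def find_model(project_root="."):
    """Return the model path if the file exists, otherwise raise a clear error."""
    model_path = Path(project_root) / MODEL_RELATIVE_PATH
    if not model_path.is_file():
        raise FileNotFoundError(
            f"Expected the SDXL checkpoint at '{model_path}'. "
            "Download sd_xl_base_1.0.safetensors and place it in the models/ folder."
        )
    return model_path
```

Calling `find_model()` from the project root either returns the path or fails early with a message that points at the missing file.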
text-to-image-sdxl/
│
├── models/
│ └── sd_xl_base_1.0.safetensors
│
├── text_to_image.ipynb
├── README.md
└── requirements.txt (optional)
Stable Diffusion XL (SDXL) is a state-of-the-art text-to-image generative model capable of producing:
- Ultra-realistic images
- Cinematic lighting
- High-resolution outputs
- Strong prompt understanding
This project uses local inference, meaning:
- No API cost
- No internet dependency after setup
- Full control over generation
Download Python 3.10.x from:
https://www.python.org/downloads/
✔️ Make sure Python is added to PATH during installation.
python -m venv venv
Activate it:
Windows
venv\Scripts\activate
Linux / macOS
source venv/bin/activate
python -m pip install --upgrade pip
With an NVIDIA GPU (CUDA 12.1 wheels):
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
Otherwise (default PyPI wheels):
pip install torch torchvision torchaudio
pip install diffusers transformers accelerate safetensors
pip install jupyterlab ipython
Launch Jupyter:
jupyter lab
The notebook is intentionally designed with ONLY TWO MAIN CELLS.
📌 Purpose of the first cell (setup)
- Loads the SDXL model into memory
- Detects GPU/CPU automatically
- Applies memory optimizations
- Prepares pipeline for reuse
📌 Rule
🚫 Run this cell ONLY ONCE per session
⚠️ Running it again reloads the model, which wastes memory and slows performance.
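The setup steps listed above could look roughly like the sketch below. It is not the notebook's actual cell: the function names are hypothetical, half precision on GPU is an assumption, and the two slicing calls are just one plausible choice of memory optimization.

```python
from pathlib import Path

MODEL_PATH = Path("models") / "sd_xl_base_1.0.safetensors"


def choose_device(cuda_available):
    """Map CUDA availability to a (device, dtype name) pair.

    Assumption: float16 on GPU to save VRAM, float32 on CPU.
    """
    return ("cuda", "float16") if cuda_available else ("cpu", "float32")


def load_pipeline(model_path=MODEL_PATH):
    """Load SDXL from a local .safetensors file and apply memory optimizations."""
    # Heavy imports stay inside the function so nothing loads until it is called.
    import torch
    from diffusers import StableDiffusionXLPipeline

    device, dtype_name = choose_device(torch.cuda.is_available())
    pipe = StableDiffusionXLPipeline.from_single_file(
        str(model_path),
        torch_dtype=getattr(torch, dtype_name),
    )
    pipe = pipe.to(device)

    # Assumed optimizations: both trade a little speed for lower peak memory.
    pipe.enable_attention_slicing()
    pipe.vae.enable_slicing()
    return pipe


# In the first cell (run once per session):
# pipe = load_pipeline()
```

Keeping the result in a single `pipe` variable is what lets the second cell reuse the loaded model instead of loading it again.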
📌 Purpose of the second cell (generation)
- Accepts text prompt
- Generates image
- Displays output inside Jupyter
📌 You can safely run this cell multiple times
- Change prompt
- Adjust parameters
- Generate unlimited images
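A generation cell along these lines might be sketched as follows. The function name `generate_image` and its default values (30 steps, guidance 7.0, 768 × 768) are assumptions for illustration, and `pipe` is assumed to be an already loaded Diffusers SDXL pipeline.

```python
def generate_image(pipe, prompt, negative_prompt="", width=768, height=768,
                   num_inference_steps=30, guidance_scale=7.0):
    """Run one text-to-image generation and return the first image."""
    import torch  # imported lazily; the loaded pipeline already depends on it

    # no_grad() skips gradient tracking, which inference does not need.
    with torch.no_grad():
        result = pipe(
            prompt=prompt,
            negative_prompt=negative_prompt,
            width=width,
            height=height,
            num_inference_steps=num_inference_steps,
            guidance_scale=guidance_scale,
        )
    return result.images[0]


# In the second cell (safe to re-run with a new prompt or parameters):
# image = generate_image(pipe, "Ultra realistic cinematic scene, golden hour lighting")
# image  # a PIL image as the last expression of a cell is rendered inline by Jupyter
```

Because the function only reads from `pipe`, changing the prompt or parameters and re-running the cell does not reload the model.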
✔️ Keep prompts descriptive but concise
✔️ Use negative prompts to remove artifacts
✔️ Good default resolution: 768 × 768 (SDXL was trained at 1024 × 1024, so try that if memory allows)
✔️ A CFG scale (guidance_scale) between 6 and 8 works best
Example:
Ultra realistic cinematic scene, golden hour lighting,
photorealistic, high detail
| Parameter | Meaning |
|---|---|
| width / height | Image resolution |
| num_inference_steps | More steps = better detail |
| guidance_scale | Prompt control strength |
| negative_prompt | Removes unwanted artifacts |
| torch.no_grad() | Faster & memory-safe inference |
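The parameters in the table can be grouped and sanity-checked before they reach the pipeline. The sketch below is one possible way to do that; the class name `GenerationSettings` and all default values are assumptions, not something the notebook defines.

```python
from dataclasses import asdict, dataclass


@dataclass
class GenerationSettings:
    """Container for the generation parameters (defaults are assumptions)."""
    width: int = 768
    height: int = 768
    num_inference_steps: int = 30
    guidance_scale: float = 7.0
    negative_prompt: str = "blurry, low quality, distorted"

    def validate(self):
        # Diffusers rejects image sizes that are not multiples of 8.
        if self.width % 8 or self.height % 8:
            raise ValueError("width and height must be divisible by 8")
        if self.num_inference_steps < 1:
            raise ValueError("num_inference_steps must be at least 1")
        if self.guidance_scale < 0:
            raise ValueError("guidance_scale must not be negative")
        return self

    def as_kwargs(self):
        """Return a dict that can be unpacked into the pipeline call."""
        return asdict(self.validate())
```

With a loaded pipeline, the settings can then be unpacked in one step, for example `pipe(prompt="...", **GenerationSettings(guidance_scale=6.5).as_kwargs())`.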
- CPU (works but slower)
- 16 GB RAM
- NVIDIA GPU (8 GB+ VRAM)
- CUDA enabled
- SSD storage
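To see how a machine compares with the list above, a short check like the following may help. It is a sketch under assumptions: the helper names are hypothetical, and the package list simply mirrors the install commands in this README.

```python
import importlib.util

REQUIRED_PACKAGES = ["torch", "diffusers", "transformers", "accelerate", "safetensors"]


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the packages from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def describe_hardware():
    """Return a short, human-readable summary of the available compute device."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"
    import torch

    if torch.cuda.is_available():
        props = torch.cuda.get_device_properties(0)
        vram_gb = props.total_memory / 1024**3
        return f"GPU: {props.name} ({vram_gb:.1f} GB VRAM)"
    return "CPU only (generation will work but be slower)"


if __name__ == "__main__":
    print("Missing packages:", missing_packages() or "none")
    print(describe_hardware())
```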
✅ Local inference (no API dependency)
✅ Jupyter-friendly workflow
✅ Memory-optimized SDXL loading
✅ Clear separation of setup & inference
✅ Easy for demos, research, and presentations
This project is intended for:
- Educational use
- Research
- Demonstrations
Users are responsible for generated content.
If you like this project:
- ⭐ Star the repository
- 🍴 Fork it
- 🧠 Experiment with prompts
Happy Generating 🚀