Hands On Waidrin #52

@accemlcc

Hey Pew, the AI coder and I thought we'd pimp Waidrin a little:

  • Main feature is dynamic image generation with Flux 2 Klein via ComfyUI
  • Every NPC gets an avatar
  • Avatars are embedded in scenes via multi-reference Kontext conditioning
  • Each dialogue turn generates a scene illustration
  • Genre selection (Fantasy / Sci-Fi) – influences prompts, races, image style
  • D&D-lite RPG stats (STR/DEX/CON/INT/WIS/CHA + HP) with genre-specific enforcement
  • Language selection at the beginning → prompts are automatically translated
  • Random character/world generation with re-roll dice button
  • Optional custom appearance text field for the protagonist
  • State JSON and the rest remain largely original (minor schema addition for image references)
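To make the last point concrete, here is a minimal sketch of what a "minor schema addition for image references" could look like. All type and field names (`Character`, `avatarImage`, `sceneReferences`) are assumptions made for illustration and are not taken from Waidrin's actual state schema.

```typescript
// Hypothetical shapes: these names are illustrative, not Waidrin's real schema.
type Genre = "fantasy" | "sci-fi";

// D&D-lite stats as listed above: six abilities plus hit points.
interface Stats {
  str: number;
  dex: number;
  con: number;
  int: number;
  wis: number;
  cha: number;
  hp: number;
}

interface Character {
  name: string;
  stats: Stats;
  // The assumed schema addition: an optional reference to a generated avatar.
  avatarImage?: string;
}

// Returns a copy of the character with the avatar reference attached,
// leaving the original state object untouched.
function withAvatar(character: Character, imageRef: string): Character {
  return { ...character, avatarImage: imageRef };
}

// Collects the avatar references of all characters present in a scene,
// e.g. to pass them on as reference images for the scene illustration.
function sceneReferences(present: Character[]): string[] {
  return present
    .map((c) => c.avatarImage)
    .filter((ref): ref is string => ref !== undefined);
}
```

Keeping the field optional means existing saved states without images would still be valid under this sketch, which fits the goal of leaving the original state largely unchanged.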

I think that's pretty cool for immersion. The images are still a bit cringe, but that should be manageable.

Disadvantages:

  • Currently only runs with multi-GPU (1× llama.cpp / 2× ComfyUI), though single-GPU would be theoretically possible with sequential loading
  • Image generation lags slightly behind the text

The question is: are you interested in a PR? The changes to Waidrin are not insignificant, and until it's cleanly integrated, it could be a minor undertaking ;-)

The multi-GPU setup in particular is likely to be a deal breaker for most people. So the question of effort/return arises – ultimately, I don't care, I can just make a fork at some point.
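For the single-GPU idea, one possible starting point is to serialize all GPU work so that text and image generation never run at the same time. The sketch below only shows that scheduling part. `GpuQueue` and `fakeJob` are hypothetical names, and the actual loading and unloading of the llama.cpp and ComfyUI models is not shown.

```typescript
// Runs GPU jobs strictly one after another, in the order they were submitted.
class GpuQueue {
  private tail: Promise<void> = Promise.resolve();

  run<T>(job: () => Promise<T>): Promise<T> {
    const result = this.tail.then(job);
    // Keep the chain alive even if a job fails, so later jobs still run.
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }
}

// Hypothetical stand-in for a real text or image generation call.
const log: string[] = [];
async function fakeJob(name: string): Promise<string> {
  log.push(`start ${name}`);
  await new Promise<void>((resolve) => setTimeout(resolve, 5));
  log.push(`end ${name}`);
  return name;
}

const gpu = new GpuQueue();
// Text is queued first; the image job starts only after the text job ends.
const done = Promise.all([
  gpu.run(() => fakeJob("text")),
  gpu.run(() => fakeJob("image")),
]);
```

With this ordering the image necessarily arrives after the text, so the lag mentioned above would likely grow on a single GPU, on top of whatever time model swapping costs.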
