A project to visualize and sonify the uncertainty of a language model's "thinking" process. Designed for the Penn State Behrend DIGIT program and VARIA lab.
This is a tool for visualizing a language model's uncertainty in real time as it generates text. At the alpha stage, running only in the shell, each token is color-coded by confidence level: from cool blue (certain) through yellow/orange (hesitating) to red (genuinely torn between multiple possibilities). High-uncertainty moments trigger a dramatic pause, helping to visualize the model's "thinking process." As the project develops, we intend to build this into an art exhibit with light and sound effects to dramatize the experience of uncertainty in the model's processing.
The alpha version runs only in a shell environment with Python 3 (we recommend 3.13), and depends on a local Ollama installation to pull in a Qwen model. The script is set to pull qwen2.5:1.5b, but can readily be changed to access a different model.
As the model generates each word (token), the script retrieves the probability the model assigned to that choice alongside its top alternative candidates. It computes a Shannon entropy score from those probabilities: a measure of how evenly spread the possibilities were. High entropy means the model was genuinely uncertain; low entropy means it was confident. This score drives the color display and the pacing of the output.
Shannon entropy is a calculation we make after the model assigns probabilities to the likely next tokens.
In our Python script, this line pulls in information from our language model about the array of probabilities:

```python
probs = [math.exp(lp) for lp in top_logprobs.values()]
```

Here's the chain of events for each token's prediction, and how it relates to the Shannon entropy we're working with:
```
neural network
    ↓
logits (raw scores, any real number)
    ↓
softmax (converts to probabilities summing to 1.0)
    ↓
log-probs (what Ollama sends us: the natural log of those probabilities)
    ↓
math.exp() (we reverse the log to get probabilities back)
    ↓
Shannon entropy (we measure how spread out the distribution is)
    ↓
uncertainty score (normalized to 0–1 and applied to our color / pause /
visualization / sonification logic; converted to percentages for the token readouts)
```
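The pipeline above can be sketched in a few lines of standard-library Python. This is an illustrative reimplementation, not the script itself; the helper name `uncertainty_score` and the sample distributions are ours:

```python
import math

def uncertainty_score(top_logprobs):
    """Turn a dict of {token: natural-log probability} into a 0-1 uncertainty score."""
    probs = [math.exp(lp) for lp in top_logprobs.values()]   # undo the log
    total = sum(probs)
    probs = [p / total for p in probs]                       # renormalize the top-k slice
    # Shannon entropy: highest when probability mass is evenly spread
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    max_entropy = math.log2(len(probs))                      # uniform over k candidates
    return entropy / max_entropy if max_entropy > 0 else 0.0

# One dominant candidate: the model is confident
confident = {"the": math.log(0.97), "a": math.log(0.02), "an": math.log(0.01)}
# Three near-equal candidates: the model is genuinely torn
torn = {"sad": math.log(0.34), "quiet": math.log(0.33), "vast": math.log(0.33)}
```

Here `uncertainty_score(confident)` comes out low (about 0.14), while `uncertainty_score(torn)` is close to 1.0, which is what drives the blue-to-red color ramp and the pause length.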
Ollama sends us probabilities in natural-log form rather than as plain probabilities because token probabilities can be extremely small (on the order of 10⁻³⁰) and awkward to work with directly. Logarithms keep the numbers in a manageable range.
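To see the scale involved, a quick check in the Python REPL (standard library only; the sample value is ours):

```python
import math

p = 1e-30                # a plausibly tiny token probability
lp = math.log(p)         # about -69.08: a modest, easy-to-handle number
back = math.exp(lp)      # reversing the log recovers roughly 1e-30
```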
Temperature is another property we can tinker with: it adjusts how likely low-probability responses are to be selected. Raising the temperature makes for "wilder," more random associations, and different models have different recommended temperature settings for lucid responses.
We can adjust the temperature in the script; it is applied before softmax, by dividing all the logits by the temperature value:

```python
logits_adjusted = logits / temperature
```
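For illustration, here is a standalone sketch of that scaling on a toy distribution (the function name and sample logits are ours, not from the script):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax: divide logits by temperature before normalizing."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                          # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
sharp = softmax(logits, temperature=0.5)   # low temp: the top token dominates
flat = softmax(logits, temperature=2.0)    # high temp: "wilder" tokens gain ground
```

With these numbers, the top token's probability is roughly 0.84 at temperature 0.5, 0.63 at 1.0, and 0.48 at 2.0: lower temperature sharpens the distribution, higher temperature flattens it.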
- Python 3.13
- Ollama (runs the language model locally)
These instructions work for both macOS and Windows. Commands are the same unless noted.
Download and install Ollama from ollama.com.
On macOS, Ollama installs as a menu bar app and starts automatically at login. You'll see a small llama icon in your menu bar when it's running. You do not need to run any command to start it.
On Windows, Ollama runs as a background service after installation. Check the system tray for the icon.
Once installed, pull the language models we are using in the project. We started with Qwen but are now evaluating small language models transparently trained on data that is not under copyright. This includes PleIAs, trained on Common Corpus, a 2.3-trillion-token dataset composed entirely of open data.
```shell
ollama pull qwen2.5:1.5b
```

This downloads about 1GB and only needs to be done once.
See "Installing and Configuring with PleIAs Models" below! You need to have worked through the rest of the setup and configured a Python environment before working with other models.
If you use git, clone this repo. If you downloaded/received a ZIP file for this project, unzip it to a folder of your choice. If you're cloning from GitHub:
```shell
git clone https://github.com/YOUR-USERNAME/YOUR-REPO-NAME.git
cd YOUR-REPO-NAME
```

A virtual environment keeps the project's dependencies isolated from the rest of your Python installation.
macOS / Linux:

```shell
python3.13 -m venv .venv
source .venv/bin/activate
```

Windows (Command Prompt):

```shell
python -m venv .venv
.venv\Scripts\activate.bat
```

Windows (PowerShell):

```shell
python -m venv .venv
.venv\Scripts\Activate.ps1
```

You'll know the virtual environment is active when your terminal prompt shows (.venv) at the start.
You need the requests library. Make sure your virtual environment is active (you'll see (.venv) in the prompt), then:

```shell
pip install requests
```

or install everything at once from the requirements file:

```shell
pip install -r requirements.txt
```

Then run:

```shell
python uncertainty-monitor.py
```

This launches interactive mode. You'll be prompted to type a sentence for the model to continue. Press Enter with no text to use the built-in default prompt. Type quit or press Ctrl-C to exit.
You can also pass a prompt directly as a command-line argument for a single run:
```shell
python uncertainty-monitor.py "The thing that surprised me most was"
```

(If you add dependencies, you can regenerate the requirements file with `pip freeze > requirements.txt`.)

Prompts that work best are ones where multiple continuations are equally plausible: open emotional descriptions, incomplete sentences, ambiguous scenarios. Factual questions tend to produce low uncertainty throughout. Try prompts like:

- "The feeling I get when I look at the night sky is"
- "What I remember most about that day is"
- "The strange thing about silence is"
All tunable parameters are at the top of uncertainty-monitor.py:
| Variable | Default | What it does |
|---|---|---|
| `MODEL` | ?????? | Which Ollama model to use |
| `UNCERTAINTY_THRESHOLD` | 0.55 | Score above which a hesitation pause triggers |
| `MAX_PAUSE_SECONDS` | 2.0 | Maximum pause length at peak uncertainty |
| `TOP_K_CANDIDATES` | 5 | How many alternative tokens to request per step |
| `PROMPT` | (see file) | Default prompt used when Enter is pressed with no input |
The color palette is also editable — look for the COLOR_* variables in the color section of the file.
The installation is a little more involved because PleIAs models are not in Ollama's library and must be downloaded from Hugging Face and imported manually. (We will evaluate other PleIAs models, too, but this is a good starting point.)
Make sure your virtual environment (.venv) is active. Then:

```shell
pip install huggingface_hub
```

Confirm that it works with:

```shell
huggingface-cli --version
```

Run the download from inside your project folder. It is ~700MB and may take several minutes depending on your connection:
```shell
huggingface-cli download PleIAs/Pleias-350m-Preview \
  --local-dir ./pleias-350m-src \
  --local-dir-use-symlinks False
```

The `--local-dir-use-symlinks False` flag is essential; without it, Ollama will reject the files with an "insecure path" error.

When complete, verify you have a real file (not a symlink) by checking the size:

```shell
ls -lh pleias-350m-src/model.safetensors
```

You should see something around 700MB. If it shows a tiny file with an arrow (->), the symlink flag didn't take; delete the folder and try again.
A Modelfile-pl-350m is included in the repository. Run:

```shell
ollama create pleias-350m -f Modelfile-pl-350m
```

Ollama will convert the downloaded weights to its internal format. This may take a minute or two and will show a progress indicator. Confirm with:

```shell
ollama list
```

You should see pleias-350m:latest in the list.
We have saved a version of the uncertainty-monitor Python script to work with PleIAs: uncertainty-monitor-alpha.py.
We can point this script at other PleIAs models by changing the MODEL variable, using the value we see in `ollama list`. Currently, in uncertainty-monitor-alpha.py, this is set to the PleIAs 350m model with:

```python
MODEL = "pleias-350m:latest"
```

Then run the script as usual:

```shell
python uncertainty-monitor-alpha.py
```

The following options are already set in uncertainty-monitor-alpha.py; they are NOT the same settings as for the Qwen model.
```python
"options": {
    "temperature": 0.2,       # PleIAs recommends low temperature
    "repeat_penalty": 1.2,    # prevents repetitive looping
    "top_k": TOP_K_CANDIDATES,
    "num_predict": 80,        # hard cap: model doesn't self-terminate cleanly
},
```

- The `ollama run hf.co/PleIAs/...` shortcut does NOT work for these models.
- Plain `curl` downloads a tiny Git LFS pointer file (~15 bytes), not the real model.
- `huggingface-cli` must be used with `--local-dir-use-symlinks False`, otherwise Ollama will refuse to load the files with a security error.
- The RAG-specialized variants (Pleias-Pico, Pleias-Nano) do not respond well to open-ended creative prompts (but are worth looking at just for fun).
- We should probably work with the Preview base models for this exhibit (including Pleias-350m).
The function on_uncertainty_event(token, score, candidates) in the script fires on every single token. It currently does nothing (pass), but this is where lighting and sound triggers will go. It receives:
- `token`: the word the model just generated
- `score`: a float from 0.0 (confident) to 1.0 (maximally uncertain)
- `candidates`: a dictionary of the top alternative tokens and their log probabilities
Replace pass with your Arduino serial write, MCP tool call, OSC message, or any other hardware trigger.
Cannot reach Ollama at http://localhost:11434
Ollama is not running. On macOS, check the menu bar for the llama icon. On Windows, check the system tray. If it's not there, open the Ollama application to start it.
(no logprob data) in the output table
Make sure "top_logprobs" is set in the request payload (it should be by default). Also confirm your Ollama version is 0.12.11 or later: run ollama --version to check.
Colors don't display correctly in VS Code's terminal

VS Code's integrated terminal can have issues rendering true-color ANSI. Run the script in your system terminal (Terminal.app on macOS, Windows Terminal on Windows) instead; this is the recommended environment for the exhibit anyway.
PowerShell says "running scripts is disabled on this system"

Run this command once to allow local scripts, then try activating the virtual environment again:

```shell
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
```

Developed with assistance from Claude Sonnet 4.6.