
Releases: 7ZoneSystems/NativeLab

Windows_x86_64_

06 Mar 17:22
281f63f



Native Lab Pro v2 — Windows Release

Native Lab Pro v2 is the first public Windows release of Native Lab Pro — a fully local, privacy-first desktop application for running large language models directly on your machine using llama.cpp.

No API keys.
No cloud.
No telemetry.
Your models and data stay entirely on your system.


🚀 Key Features

Fully Local LLM Chat

Run GGUF models directly on your machine using llama.cpp with a native PyQt6 desktop interface.

Multi-Model Architecture

Load multiple models simultaneously and assign them specialized roles:

  • General — main chat model
  • Reasoning — architectural reasoning and analysis
  • Summarization — document summarization
  • Coding — code generation tasks
  • Secondary — additional pipeline insight engine

Pipeline Mode

Coding prompts can run through a multi-stage reasoning pipeline:

  1. Non-coding models produce architectural insights
  2. The coding model receives those insights as context
  3. Final structured code is generated

This produces more structured and reliable code output.
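As a rough illustration of the three stages above, here is a minimal sketch. The function name `run_coding_pipeline` and the "one callable per model" interface are hypothetical and are not taken from the Native Lab Pro source.

```python
from typing import Callable, Dict

# Hypothetical interface: a model is any callable taking a prompt and returning text.
Model = Callable[[str], str]


def run_coding_pipeline(prompt: str,
                        insight_models: Dict[str, Model],
                        coding_model: Model) -> str:
    """Stage 1: collect insights; stage 2: pass them as context; stage 3: generate code."""
    insights = []
    for role, model in insight_models.items():
        insights.append(f"[{role}] " + model(f"Give architectural insights for: {prompt}"))
    context = "\n".join(insights)
    return coding_model(f"Insights:\n{context}\n\nTask: {prompt}")
```

With stub models this can be exercised without loading any GGUF file, which is handy for testing the prompt wiring on its own.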

Document Reference Engine

Attach documents or source code files to a session and ask questions about them.

Supported reference types:

  • PDFs
  • Text files
  • Source code files

The engine automatically retrieves the most relevant excerpts and injects them into prompts.

Structured Script Parsing

Source code files are parsed to extract:

  • imports
  • functions
  • classes
  • constants
  • type definitions

The model receives structured context instead of raw text chunks.
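For Python sources, this kind of extraction can be sketched with the standard `ast` module. This is an assumption about one possible approach, not the parser the application actually uses; the function name `outline_python_source` is hypothetical, and type definitions are left out for brevity.

```python
import ast


def outline_python_source(source: str) -> dict:
    """Collect top-level imports, functions, classes and UPPER_CASE constants."""
    outline = {"imports": [], "functions": [], "classes": [], "constants": []}
    for node in ast.parse(source).body:  # top-level statements only
        if isinstance(node, ast.Import):
            outline["imports"] += [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            outline["imports"].append(node.module or ".")
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            outline["functions"].append(node.name)
        elif isinstance(node, ast.ClassDef):
            outline["classes"].append(node.name)
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                # Convention-based guess: upper-case names are constants.
                if isinstance(target, ast.Name) and target.id.isupper():
                    outline["constants"].append(target.id)
    return outline
```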

Long Document Summarization

Built-in pipeline for summarizing large documents using chunked processing with context carryover.

Features:

  • pause / resume long jobs
  • automatic state saving
  • multi-PDF cross-document summarization
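Chunked processing with context carryover might look roughly like the sketch below. The chunk size, the carryover length, and the prompt wording are all assumptions; `summarize` stands in for whatever call sends a prompt to the loaded model.

```python
from typing import Callable, List


def chunk_text(text: str, size: int) -> List[str]:
    """Split text into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def summarize_chunks(text: str,
                     summarize: Callable[[str], str],
                     chunk_size: int = 2000,
                     carry_chars: int = 300) -> List[str]:
    """Summarize each chunk, feeding the tail of the previous summary forward."""
    summaries: List[str] = []
    carry = ""
    for chunk in chunk_text(text, chunk_size):
        if carry:
            prompt = f"Previous context: {carry}\n\nSummarize:\n{chunk}"
        else:
            prompt = f"Summarize:\n{chunk}"
        summary = summarize(prompt)
        summaries.append(summary)
        carry = summary[-carry_chars:]  # carried into the next chunk's prompt
    return summaries
```

Because the loop state is just the list of summaries plus the chunk position, saving those two values would be enough to pause and resume a job.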

Parallel Model Loading

Multiple models can run simultaneously through separate llama-server instances.

⚠ Each model consumes its full RAM allocation.

Quantization Detection

Automatic detection of GGUF quantization formats including:

  • K-Quants (Q2_K → Q6_K)
  • imatrix quants (IQ series)
  • legacy quants (Q4_0, Q8_0)
  • float formats (F16, BF16)

Models are labeled with human-readable quality tiers.
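A filename-based detector could be sketched as follows. The regular expression and the tier names ("low", "balanced", "high", "full precision") are illustrative assumptions; the release notes do not state how the application derives its labels.

```python
import re
from typing import Optional

# IQ-series first, then K-quants, then legacy quants, then float formats.
_QUANT_RE = re.compile(
    r"(IQ\d+_[A-Z0-9]+|Q\d+_K(?:_[SML])?|Q\d+_[01]|BF16|F16|F32)",
    re.IGNORECASE,
)


def detect_quant(filename: str) -> Optional[str]:
    """Return the quantization tag found in a GGUF filename, if any."""
    match = _QUANT_RE.search(filename)
    return match.group(1).upper() if match else None


def quality_tier(quant: Optional[str]) -> str:
    """Map a quantization tag to a hypothetical human-readable tier."""
    if quant is None:
        return "unknown"
    if quant in ("F16", "BF16", "F32"):
        return "full precision"
    bits = int(re.search(r"\d+", quant).group())
    if bits <= 3:
        return "low"
    if bits <= 5:
        return "balanced"
    return "high"
```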

Prompt Template Auto-Detection

Correct prompt templates are automatically selected based on model filename.

Supported families include:

  • LLaMA-2 / LLaMA-3
  • Mistral / Mixtral
  • DeepSeek / DeepSeek-R1
  • Phi-3
  • Qwen
  • Gemma
  • Falcon
  • Vicuna
  • Yi
  • Zephyr
  • Starling
  • CodeLlama
  • Orca
  • Command-R
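Filename-based family detection can be approximated with an ordered substring table, as in this sketch. The table covers only some of the families listed above, and both the key strings and the `detect_family` name are assumptions made for illustration.

```python
# More specific keys come first so "deepseek-r1" wins over "deepseek"
# and "codellama" wins over the plain llama entries.
_FAMILY_KEYS = [
    ("deepseek-r1", "DeepSeek-R1"),
    ("deepseek", "DeepSeek"),
    ("codellama", "CodeLlama"),
    ("llama-3", "LLaMA-3"),
    ("llama-2", "LLaMA-2"),
    ("mixtral", "Mixtral"),
    ("mistral", "Mistral"),
    ("phi-3", "Phi-3"),
    ("qwen", "Qwen"),
    ("gemma", "Gemma"),
]


def detect_family(filename: str, default: str = "generic") -> str:
    """Return the first family whose key appears in the normalized filename."""
    name = filename.lower().replace("_", "-")
    for key, family in _FAMILY_KEYS:
        if key in name:
            return family
    return default
```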

Smart Memory Management

A RAM watchdog prevents crashes during large document processing by automatically spilling reference caches to disk when memory pressure is detected.


🖥 System Requirements

Linux (primary supported platform)
Windows (this release)

Minimum RAM depends on the model used.

Typical requirements:

Model | RAM Required
-- | --
7B Q4 | ~4-5 GB
13B Q5 | ~9-10 GB
70B Q4 | ~38-40 GB

📦 Dependencies

Python 3.10+

Required:

PyQt6

Optional:

psutil    # RAM monitoring
PyPDF2    # PDF loading and summarization

Install with:

pip install PyQt6 psutil PyPDF2

llama.cpp Requirement

Native Lab Pro requires llama.cpp.

Compile or download it and configure the binary paths inside the application.

Default paths used by the application:

LLAMA_CLI    = /home/hrirake/llama.cpp/build/bin/llama-cli
LLAMA_SERVER = /home/hrirake/llama.cpp/build/bin/llama-server

You can modify these paths in the source if needed.


Model Directory

The default directory scanned for models is:

localllm

You can also add models manually through the Models tab.

Supported format:

*.gguf

▶ Launching the Application

After extracting the release, start Native Lab Pro using:

run.bat (as administrator)

Then launch Native Lab Pro from the Start menu,

or directly run the Python file:

python native_lab_pro_v2.py

💾 Data Storage

Native Lab Pro stores data locally in the application directory:

Folder | Purpose
-- | --
sessions/ | chat history
paused_jobs/ | paused summarization jobs
ref_cache/ | reference text cache
ref_index/ | reference metadata
model_configs.json | per-model settings
app_config.json | global configuration

No data is sent outside your system.


⌨ Keyboard Shortcuts

Shortcut | Action
-- | --
Ctrl + N | New session
Ctrl + Q | Quit
Ctrl + B | Toggle sidebar
Ctrl + L | Logs tab
Ctrl + M | Models tab
Enter | Send message
Shift + Enter | New line

⚠ Notes

  • This is the first Windows release of Native Lab Pro.
  • GPU acceleration depends on your llama.cpp build configuration.
  • Running multiple models simultaneously requires significant RAM.
  • Place your model GGUF files in the app's localllm folder.

🔒 Privacy

Native Lab Pro runs entirely offline.

  • No telemetry
  • No external APIs
  • No cloud services

All computation happens locally on your machine.


Linux_x86_64_cpu_only_Ubuntu_ServerConfig_update

06 Mar 00:17
45f02c6


Native Lab Pro — Release Notes

v2.1.0 · Server & Binary Configuration

March 2026 · Additive, non-breaking change · 1 new tab · 9 change sets


Overview

v2.1.0 introduces a dedicated 🖥️ Server & Binary Configuration tab, giving users full control over llama-cli and llama-server binary paths without hardcoding. Previously, binary locations were resolved at startup from a fixed directory structure. This update makes those paths user-configurable, adds per-OS detection, persists settings to disk, and plumbs the configuration through every inference code path.


Motivation

The previous approach had several limitations:

  • Binaries had to live in a fixed relative path (./llama/bin/) — no flexibility for system-wide installs or custom builds
  • Windows, macOS, and Linux users needed to manually edit source to point to the correct executable
  • The llama-server host and port range were hardcoded to 127.0.0.1:8600–8700 with no way to change them
  • Extra launch flags (e.g. --numa, --flash-attn, --no-mmap) required source edits
  • There was no in-app way to verify a binary was present and functional before loading a model

What's New

🖥️ Server & Binary Configuration Tab

A new tab added to the main interface alongside Models, Config, Logs, and Appearance. It contains four sections:

  • Binary Paths — browse for llama-cli and llama-server executables with live ✅/❌ file-exists indicators
  • Server Settings — configure bind host and port scan range used when starting llama-server instances
  • Extra Launch Flags — append arbitrary flags to every CLI or server launch without touching source code
  • Binary Test — run --version against either binary and see the output inline to verify it works

ServerConfig Dataclass & Persistence

A new ServerConfig dataclass stores all settings and serialises them to localllm/server_config.json. Settings survive restarts and are loaded at app startup. The class also exposes detected_os, default_cli_name, and default_server_name properties that adapt to Windows, macOS, and Linux automatically.
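A dataclass with JSON persistence along these lines is sketched below. The field names are the ones mentioned in these notes; the class name `ServerConfigSketch`, the `.exe` suffix rule, and the choice to ignore unknown keys on load are assumptions, not the shipped implementation.

```python
import json
import platform
from dataclasses import asdict, dataclass, fields
from pathlib import Path


@dataclass
class ServerConfigSketch:
    cli_path: str = ""
    server_path: str = ""
    host: str = "127.0.0.1"
    port_range_lo: int = 8600
    port_range_hi: int = 8700
    extra_cli_args: str = ""
    extra_server_args: str = ""

    @property
    def detected_os(self) -> str:
        return platform.system()  # "Windows", "Darwin" or "Linux"

    @property
    def default_server_name(self) -> str:
        # Assumption: only Windows builds carry an .exe suffix.
        return "llama-server.exe" if self.detected_os == "Windows" else "llama-server"

    def save(self, path: Path) -> None:
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(asdict(self), indent=2), encoding="utf-8")

    @classmethod
    def load(cls, path: Path) -> "ServerConfigSketch":
        if not path.exists():
            return cls()  # a missing file means "use defaults"
        data = json.loads(path.read_text(encoding="utf-8"))
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})
```

Falling back to defaults when the file is absent is what keeps an upgrade non-breaking for existing installations.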

Dynamic Binary Resolution

A _resolve_binary() helper and _refresh_binary_paths() function update the module-level LLAMA_CLI and LLAMA_SERVER variables whenever settings are saved. All existing inference code paths — CLI workers, server launch, pipeline mode, chunked summary, and multi-PDF — automatically pick up the new paths without any further changes.
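The resolution rule described here ("custom path if set and valid, otherwise the built-in default") can be sketched in a few lines. The signature is a guess; treating "valid" as "is an existing file" is also an assumption.

```python
import os


def resolve_binary(custom_path: str, default_path: str) -> str:
    """Prefer the user-configured path when it points at an existing file."""
    if custom_path and os.path.isfile(custom_path):
        return custom_path
    return default_path
```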

Configurable Port Range

_free_port() now reads port_range_lo and port_range_hi from ServerConfig instead of using hardcoded 8600/8700. Defaults remain unchanged, so existing setups are unaffected.
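A port scan over a configurable range can be sketched with the standard `socket` module. This is not the application's `_free_port()`; the name `find_free_port`, the inclusive upper bound, and the bind-to-probe strategy are assumptions.

```python
import socket


def find_free_port(lo: int = 8600, hi: int = 8700, host: str = "127.0.0.1") -> int:
    """Return the first port in [lo, hi] that can currently be bound on host."""
    for port in range(lo, hi + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                continue  # port in use or not permitted; try the next one
            return port
    raise RuntimeError(f"no free port in {lo}-{hi}")
```

The probe socket is closed before the port is returned, so another process could still claim the port before the server binds it; the health check after launch covers that case.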


Detailed Change Log

# | Location | Description | Type
-- | -- | -- | --
1 | SERVER_CONFIG_FILE | New constant pointing to localllm/server_config.json | ADDED
2 | ServerConfig dataclass | Stores cli_path, server_path, host, port ranges, extra flags. Includes save(), load(), and OS detection. | ADDED
3 | _resolve_binary() | Helper that returns a custom path if set and valid, otherwise the built-in default | ADDED
4 | _refresh_binary_paths() | Updates module-level LLAMA_CLI / LLAMA_SERVER from ServerConfig at runtime | ADDED
5 | LLAMA_CLI / LLAMA_SERVER | Renamed default constants to _LLAMA_CLI_DEFAULT / _LLAMA_SERVER_DEFAULT. Live vars now resolve through _refresh_binary_paths() | CHANGED
6 | _free_port() | Default lo/hi params changed to 0; reads ServerConfig.port_range_lo/hi at call time | CHANGED
7 | LlamaEngine._start_server() | Reads SERVER_CONFIG.host and extra_server_args; appends extra flags to the launch command | CHANGED
8 | LlamaEngine.create_worker() | CLI branch appends SERVER_CONFIG.extra_cli_args to the llama-cli command list | CHANGED
9 | ServerTab (UI class) | New widget — binary browse, server settings, flag editor, test runner. Saved via ServerConfig. | ADDED
10 | MainWindow._build_ui() | Registers ServerTab as new tab between Config and Logs | CHANGED
11 | MainWindow._toggle_theme() | Rebuilds ServerTab on theme switch to pick up updated palette colours | CHANGED


Files Changed

native_lab_pro.py      (+~420 lines added, ~30 lines modified)
localllm/server_config.json   (new — created on first save)

Migration & Compatibility

This release is fully backwards-compatible. No action required for existing installations:

  • If server_config.json does not exist, all defaults match previous behaviour exactly
  • LLAMA_CLI and LLAMA_SERVER resolve to the same built-in paths as before unless overridden
  • Port range defaults remain 8600–8700, host defaults remain 127.0.0.1
  • No database migrations, no session format changes, no breaking API changes

How to Use

  1. Open the app and navigate to the new 🖥️ Server tab
  2. Under Binary Paths, click Browse… next to llama-cli and select your binary
  3. Repeat for llama-server
  4. Optionally adjust the bind host, port range, and any extra launch flags
  5. Click Test to verify each binary responds correctly
  6. Click Save Server Settings — written to localllm/server_config.json
  7. Reload your model via Models tab or Model > Reload Model

The ✅ / ❌ indicators next to each path field update in real-time as you type or browse, showing whether the file exists before you save.


Known Limitations

  • Extra flags are passed as a whitespace-split string — flags with spaces in their values are not supported
  • The Test button uses --version, which not all llama.cpp builds support; some may exit non-zero but still be functional
  • Changing the port range does not affect already-running server instances — only new launches pick up the updated range
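The first limitation comes from plain whitespace splitting. The sketch below contrasts it with `shlex.split` from the standard library, which honours quotes; the flag `--label` is a made-up placeholder, not a llama.cpp option.

```python
import shlex
from typing import List


def split_naive(flags: str) -> List[str]:
    """Whitespace split: quoted values containing spaces are broken apart."""
    return flags.split()


def split_quoted(flags: str) -> List[str]:
    """Shell-style split: quoted values stay together and quotes are removed."""
    return shlex.split(flags)
```

For the input `--numa --label "two words"` the naive version yields four tokens, while the quoted version yields three. Note that `shlex.split` in its default POSIX mode treats backslashes as escapes, so Windows paths would need extra care.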

Upcoming

  • Per-role binary overrides (e.g. a separate CUDA build for the coding engine)
  • GPU layer configuration (--n-gpu-layers) exposed in the Server tab
  • Auto-discovery scan that searches common install paths and populates fields automatically
  • Quoted-argument support for extra flag fields

NativeLabPro_Major_featureUpdate_v4

06 Mar 23:14
0796de1


NativeLab Pro — Release Notes

Version 2.5 · Development Session Changelog


Overview

This document covers every feature, improvement, and bug fix applied to nativelab.py during this development session. Changes are grouped by area. Each section describes what was added, how it works, and what files / classes were touched.


1. GPU Acceleration Support

Area: Server Tab · ServerConfig dataclass · ServerTab class

What was added

The Server tab now contains a dedicated GPU Acceleration card that auto-detects available graphics hardware on startup and exposes all GPU launch flags visually — no more hand-editing the Extra Launch Flags box.

How it works

A new utility function _detect_gpus() is called once when the Server tab builds. It probes three backends in order: nvidia-smi for NVIDIA CUDA cards, system_profiler on macOS for Apple Metal GPUs, and vulkaninfo as a Vulkan fallback. Each probe runs in a subprocess with a short timeout so the UI never freezes. The result is a list of dicts carrying device index, name, VRAM in MB, and backend type.
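The NVIDIA probe could be sketched as below, using the `nvidia-smi --query-gpu` CSV output. The function names, the three-second timeout, and the dictionary keys are assumptions; the Metal and Vulkan probes are omitted.

```python
import subprocess
from typing import Dict, List


def parse_nvidia_smi(output: str) -> List[Dict]:
    """Parse 'index, name, memory.total' CSV lines into GPU dicts."""
    gpus = []
    for line in output.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3 or not parts[0].isdigit():
            continue  # skip anything that is not a well-formed row
        index, name, vram = parts
        gpus.append({
            "index": int(index),
            "name": name,
            "vram_mb": int(vram) if vram.isdigit() else 0,
            "backend": "cuda",
        })
    return gpus


def detect_nvidia_gpus(timeout: float = 3.0) -> List[Dict]:
    """Run nvidia-smi with a timeout; return [] if it is missing or fails."""
    cmd = ["nvidia-smi", "--query-gpu=index,name,memory.total",
           "--format=csv,noheader,nounits"]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return []
    return parse_nvidia_smi(result.stdout) if result.returncode == 0 else []
```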

The GPU card renders a backend badge (🟢 CUDA / Metal, 🟡 Vulkan, ⚪ None) followed by a list of all detected GPUs with their VRAM. Four controls appear:

  • Enable GPU offloading checkbox — disabled automatically if no GPU was detected.
  • GPU layers spin box (−1 to 999): a value of −1 is displayed as "All (−1)" and offloads every layer.
  • Primary GPU combo box — populated from the detected device list.
  • Tensor split line edit — for multi-GPU ratio strings like 0.6,0.4.

On save (_save()), the GPU flags are serialised into ServerConfig and also injected into extra_server_args as --ngl N [--main-gpu N] [--tensor-split X,Y] using a regex strip-then-prepend so existing manually typed flags are preserved. The existing launch code picks them up with zero changes.
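A strip-then-prepend step of this kind might look like the following sketch. The flag spellings are copied from the paragraph above and should be checked against your llama.cpp build's `--help`; the function name and the exact regex are assumptions.

```python
import re
from typing import Optional

# Remove any GPU flag (with its value) that a previous save or the user typed.
_GPU_FLAG_RE = re.compile(r"--(?:ngl|main-gpu|tensor-split)\s+\S+\s*")


def inject_gpu_flags(extra_args: str,
                     enable_gpu: bool,
                     ngl: int = -1,
                     main_gpu: Optional[int] = None,
                     tensor_split: str = "") -> str:
    """Strip old GPU flags, then prepend fresh ones ahead of the remaining flags."""
    cleaned = _GPU_FLAG_RE.sub("", extra_args).strip()
    if not enable_gpu:
        return cleaned
    gpu = [f"--ngl {ngl}"]
    if main_gpu is not None:
        gpu.append(f"--main-gpu {main_gpu}")
    if tensor_split:
        gpu.append(f"--tensor-split {tensor_split}")
    return " ".join(gpu + ([cleaned] if cleaned else []))
```

Stripping first makes the operation idempotent: saving twice does not duplicate the GPU flags.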

New fields in ServerConfig

enable_gpu    bool   = False
ngl           int    = -1
main_gpu      int    = 0
tensor_split  str    = ""

2. HuggingFace GGUF Model Download Tab

Area: New tab · ModelDownloadTab widget · HfSearchWorker / HfDownloadWorker QThreads

What was added

A new ⬇️ Download tab that lets users search HuggingFace for GGUF model files and download them directly into any local folder, with live progress and cancellation — no browser needed.

How it works

Search flow: The user types a repo ID (e.g. TheBloke/Mistral-7B-GGUF) and clicks Search. A HfSearchWorker QThread calls https://huggingface.co/api/models/{repo}, filters the siblings list to .gguf files only, and emits results_ready(list). Results appear in a QListWidget with colour-coded quantisation badges (Q2 = red through Q8 = green) and a human-readable file size.
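The search step can be sketched with `urllib` alone. The endpoint is the one quoted above; the `rfilename` key reflects the shape of the Hugging Face model-info response as I understand it and should be treated as an assumption, as should the function names.

```python
import json
import urllib.request
from typing import List


def gguf_files(model_info: dict) -> List[str]:
    """Keep only the .gguf entries from a model-info 'siblings' list."""
    return [
        s["rfilename"]
        for s in model_info.get("siblings", [])
        if s.get("rfilename", "").lower().endswith(".gguf")
    ]


def fetch_model_info(repo_id: str, timeout: float = 15.0) -> dict:
    """Fetch the JSON description of a repository (network access required)."""
    url = f"https://huggingface.co/api/models/{repo_id}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Keeping the filter separate from the network call lets it be tested offline with a hand-written dictionary.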

Download flow: The user selects a file, picks a destination folder, and clicks Download. A HfDownloadWorker QThread fetches https://huggingface.co/{repo}/resolve/main/{filename} in 256 KB chunks, emitting progress(int) on each chunk. If the user cancels or the download errors, the partial file is deleted. On successful completion MODEL_REGISTRY.add(path) is called so the model appears immediately in all model lists, and a success dialog offers to open the folder.
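The chunked copy with cleanup on cancel or error might be structured as in this sketch. The 256 KB chunk size and the URL pattern come from the description above; the function names and callback signatures are hypothetical stand-ins for the Qt signals.

```python
import os
import urllib.request
from typing import BinaryIO, Callable

CHUNK_SIZE = 256 * 1024  # 256 KB, as described above


def copy_in_chunks(src: BinaryIO, dest_path: str, total: int,
                   on_progress: Callable[[int], None],
                   is_cancelled: Callable[[], bool]) -> bool:
    """Stream src to dest_path; delete the partial file unless the copy completes."""
    ok = False
    try:
        with open(dest_path, "wb") as out:
            done = 0
            while not is_cancelled():
                block = src.read(CHUNK_SIZE)
                if not block:
                    ok = True  # reached end of stream
                    break
                out.write(block)
                done += len(block)
                if total:
                    on_progress(min(100, done * 100 // total))
    finally:
        # Runs on cancel and on any exception, after the file has been closed.
        if not ok and os.path.exists(dest_path):
            os.remove(dest_path)
    return ok


def download_gguf(repo: str, filename: str, dest_path: str,
                  on_progress: Callable[[int], None],
                  is_cancelled: Callable[[], bool]) -> bool:
    """Download one file from a repository (network access required)."""
    url = f"https://huggingface.co/{repo}/resolve/main/{filename}"
    with urllib.request.urlopen(url) as resp:
        total = int(resp.headers.get("Content-Length") or 0)
        return copy_in_chunks(resp, dest_path, total, on_progress, is_cancelled)
```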

Only Python standard library (urllib) is used — no new pip dependencies.


3. MCP Server Management Tab

Area: New tab · McpTab widget · MCP_CONFIG_FILE constant

What was added

A new 🔌 MCP tab for managing Model Context Protocol servers. Users can add, start, stop, and remove MCP servers (stdio or SSE transport) and see live log output from each.

How it works

Configuration is stored in ./localllm/mcp_config.json as {"servers": [...]}. Each server entry holds name, transport type (stdio / sse), command or URL, and description.

The tab has three panels: a server list with 🟢/⚪ running indicators, a control row (Start / Stop / Remove), and a log pane with timestamps. Clicking Start on a stdio server launches it via subprocess.Popen(shell=True) and stores the handle in a _procs dict keyed by server name. SSE servers just log their URL since they run externally. Log lines are appended from stdout polling. Stop terminates the process and removes it from _procs.


4. Pipeline Builder — Full Logic Block System

Area: PipelineBlockType · PipelineCanvas · PipelineExecutionWorker · PipelineBuilderTab sidebar

What was added

Seven new Python-evaluated logic blocks that let users build conditional, branching, and transforming pipelines without writing any model calls.

Block types

⑂ IF / ELSE evaluates a Python boolean expression (len(text) > 200, 'error' in text.lower()) against the incoming context. TRUE routes to the E port, FALSE to the W port. Users draw two labelled arrows to set up the branches.

⑃ SWITCH evaluates a Python expression that returns a string key ('long' if len(text) > 300 else 'short'). Each outgoing arrow carries a user-supplied label. Only the arm whose label matches the returned key is followed. A default labelled arm catches unmatched keys.

⊘ FILTER acts as a gate. If the condition is True the text continues unchanged. If False the pipeline terminates cleanly with a [FILTER DROPPED] message in the Output tab — no crash, no silent drop.

⟲ TRANSFORM performs instant deterministic text operations with no model: prefix, suffix, find-and-replace, upper, lower, strip whitespace, or truncate to N characters.

⊕ MERGE collects every context queued for it in the current execution pass (from multiple incoming arrows) and joins them. Modes: concat with separator, prepend, append, or JSON array.

⑁ SPLIT broadcasts the exact same text to every outgoing arrow simultaneously. No configuration needed — just draw multiple outgoing arrows.

⌥ Custom Code opens a full code editor dialog (described below) where the user writes arbitrary Python.
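The IF / ELSE and SWITCH routing described above can be sketched with `eval` against a `text` variable. The port letters and the `default` label follow the descriptions; the function names and the tiny builtin allowlist are assumptions.

```python
from typing import List, Optional

# Minimal allowlist for the expression; the real block may expose more names.
_EXPR_GLOBALS = {"__builtins__": {"len": len, "str": str, "int": int}}


def route_if_else(expr: str, text: str) -> str:
    """Return 'E' when the expression is truthy for text, otherwise 'W'."""
    return "E" if eval(expr, _EXPR_GLOBALS, {"text": text}) else "W"


def route_switch(expr: str, text: str, labels: List[str]) -> Optional[str]:
    """Return the arrow label matching the expression's key, or 'default' if present."""
    key = str(eval(expr, _EXPR_GLOBALS, {"text": text}))
    if key in labels:
        return key
    return "default" if "default" in labels else None
```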

Multi-output fan-out

Before this change every source port was limited to one outgoing arrow (the old code deleted the previous connection on each new draw). Logic blocks are now added to a _LOGIC_BTYPES set that skips that deletion, allowing any number of arrows to fan out from the same port. Duplicate connections (same from_bid + from_port + to_bid) are silently ignored. Normal flow blocks (Input, Model, Output, Intermediate) still enforce single-output-per-port.

Branch label badges on arrows

When an arrow leaves an IF/ELSE or SWITCH block, a branch label is stored as a dynamic attribute conn.branch_label on the PipelineConnection object. The _draw_arrow method reads this attribute and renders a small rounded badge at 35% along the Bezier curve — green for TRUE, red for FALSE, pipeline-colour for other labels.

Custom Code Editor Dialog (_CodeEditorDialog)

A QDialog with a QTextEdit code editor (Consolas 11pt, 28-unit tab stops), a live syntax-check label that updates on every keystroke using compile(), an available-variables reference table, and a 🧪 Test button that runs the code in a sandboxed exec() with sample text and shows the result and log output in a QMessageBox. Saving validates syntax first and refuses to close if there is a syntax error. The block label is automatically set to the first non-comment code line.

The sandbox exposes only safe builtins: len str int float bool list dict tuple range enumerate zip map filter sorted min max sum abs round isinstance hasattr getattr repr type print — no open, os, subprocess, or __import__.
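A restricted `exec` of this kind can be sketched as follows. The allowlist is the one quoted above; the function name and the convention that the block reads `text` and writes `result` are assumptions. Restricting builtins limits accidents but is not a hard security boundary in Python.

```python
import builtins

SAFE_NAMES = (
    "len str int float bool list dict tuple range enumerate zip map filter "
    "sorted min max sum abs round isinstance hasattr getattr repr type print"
).split()
SAFE_BUILTINS = {name: getattr(builtins, name) for name in SAFE_NAMES}


def run_custom_code(code: str, text: str) -> str:
    """Syntax-check the code, then run it with only the allowlisted builtins."""
    compile(code, "<custom-block>", "exec")  # raises SyntaxError early
    scope = {"__builtins__": SAFE_BUILTINS, "text": text, "result": text}
    exec(code, scope)
    return scope["result"]
```

Because `__import__` is absent from the builtins mapping, an `import` statement inside the block fails with `ImportError`, and a call to `open` fails with `NameError`.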


5. Pipeline Builder — LLM Logic Block System

Area: PipelineBlockType · PipelineCanvas · PipelineExecutionWorker · PipelineBuilderTab sidebar · _LlmLogicEditorDialog

What was added

Five new LLM-evaluated logic blocks that are functionally identical to the Python logic blocks above except every condition or instruction is written in plain English and evaluated at runtime by an attached GGUF model over the llama-server HTTP API.

Block types

🧠 LLM IF / ELSE sends the incoming text plus a plain-English condition to the model with a tight system prompt demanding a single word: YES or NO. The parser accepts YES Y TRUE 1 PASS POSITIVE as truthy. Routes to E (YES) or W (NO).

🧠 LLM SWITCH presents the model with the incoming text and the classification task. The valid category names are automatically extracted from the branch labels on the outgoing arrows and included in the prompt. Case-insensitive matching with a substring fallback ensures robustness. A default labelled arm catches unmatched classifications.

🧠 LLM FILTER demands PASS or STOP. On STOP the pipeline ends with a structured message showing the filter name, condition, model decision, and original text — so the user can inspect exactly what was blocked and why.

🧠 LLM TRANSFORM uses a higher default token budget (512), provides a system prompt that demands output-only with no preamble, and automatically strips common model preamble phrases (Here is, Result:, Output:, Transformed:) before the result flows downstream.

🧠 LLM SCORE extracts the first integer 1–10 from the model response using regex (handles prose like "I'd give it a 7"), maps to LOW (1–3) / MID (4–7) / HIGH (8–10) bands routed to E / S / W ports. A score-labelled outgoing arrow receives the raw numeric string instead of the original context.
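Two of the parsers described above, SWITCH label matching and SCORE extraction, can be sketched as pure functions. The band boundaries and port letters follow the text; the function names and the exact regex are assumptions.

```python
import re
from typing import List, Optional


def match_label(reply: str, labels: List[str]) -> Optional[str]:
    """Case-insensitive exact match, then substring fallback, then 'default'."""
    cleaned = reply.strip().lower()
    by_lower = {label.lower(): label for label in labels}
    if cleaned in by_lower:
        return by_lower[cleaned]
    for low, original in by_lower.items():
        if low != "default" and low in cleaned:
            return original
    return by_lower.get("default")


def parse_score(reply: str) -> Optional[int]:
    """Return the first standalone integer from 1 to 10 in the reply."""
    match = re.search(r"\b(10|[1-9])\b", reply)
    return int(match.group(1)) if match else None


def score_band(score: int) -> str:
    """LOW for 1-3, MID for 4-7, HIGH for 8-10."""
    if score <= 3:
        return "LOW"
    return "MID" if score <= 7 else "HIGH"


BAND_PORT = {"LOW": "E", "MID": "S", "HIGH": "W"}
```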

LLM Logic Editor Dialog (_LlmLogicEditorDialog)

A QDialog tailored per block type. It shows: a colour-coded about-panel describing exactly what the model will receive and return, a branch-routing hint (e.g. "E port = YES / W port = NO"), a model selector combo box populated from MODEL_REGISTRY with a Browse button, and a multi-lin...


NativeLabPro_x86_64_Linux_Major_featureUpdate_v3

06 Mar 04:34
4253ddd


Native Lab Pro — Patch Release Notes

All changes applied in this session on top of Native Lab Pro v2.
Patches are listed in the order they were implemented.


🏗️ Pipeline Builder (new feature)

A fully interactive visual pipeline editor added as a dedicated 🔗 Pipeline tab in the main window.

Canvas

  • PipelineCanvas — custom QWidget with its own paintEvent; renders blocks and connections entirely via QPainter (no external libs)
  • Drag-and-drop blocks anywhere on the canvas; position snaps to an 8 px grid
  • 8-directional connection ports (N · S · E · W on each block) drawn as dots; drag from one port dot to another to create a connection
  • Connections render as smooth cubic Bézier curves with an arrowhead at the target port
  • Loop connections detected automatically when a back-edge would form a cycle; rendered in C["pipeline"] cyan instead of the default C["acc2"] purple, with a ×N multiplier badge
  • Selecting a connection and pressing Delete removes it; selecting a block removes it along with all its connections
  • Double-clicking a block opens a configuration dialog appropriate to its type
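Back-edge detection for loop connections can be sketched as a reachability check: a new connection closes a cycle when its source is already reachable from its target. The edge representation (pairs of block IDs) and the function name are assumptions.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def creates_cycle(edges: Iterable[Tuple[Hashable, Hashable]],
                  new_from: Hashable, new_to: Hashable) -> bool:
    """True if adding new_from -> new_to would close a cycle in the graph."""
    adjacency: Dict[Hashable, List[Hashable]] = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
    stack, seen = [new_to], set()
    while stack:  # iterative depth-first search starting at the target
        node = stack.pop()
        if node == new_from:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(adjacency.get(node, []))
    return False
```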

Block Types

Block | Colour | Purpose
-- | -- | --
▶ Input | C["ok"] green | Entry point; receives the user's prompt
◈ Intermediate | C["warn"] amber | Passes output of one model into the next; required between two MODEL blocks
■ Output | C["err"] red | Terminal block; captures and displays the final result
🤖 Model | C["pipeline"] cyan | Wraps a loaded local model; configurable role, label, and model file
📎 Reference | C["acc"] purple | Injects a pasted text or loaded file snippet ahead of the context
💡 Knowledge | C["acc2"] lavender | Prepends a knowledge-base chunk to the context
📄 PDF Summary | C["pipeline"] cyan | Extracts / summarises a PDF and prepends the result

Sidebar

  • FLOW BLOCKS section — one-click add for Input, Intermediate, and Output blocks
  • CONTEXT BLOCKS section — one-click add for Reference, Knowledge, and PDF Summary blocks
  • MODELS (dbl-click to add) — live list of all models in MODEL_REGISTRY; double-click inserts a pre-configured Model block
  • ↻ Refresh button re-scans the model registry without restarting
  • CANVAS CONTROLS — Clear All button wipes blocks and connections

Right Panel

  • Server status badge — polls every 2 s; shows green when a llama-server is ready, amber when loading
  • ▶ Run Pipeline button — validates the graph then kicks off PipelineExecutionWorker
  • ⏹ Stop Execution — aborts the worker mid-stream
  • 📋 Log tab — live execution log with timestamps
  • ■ Output tab — rendered final output with Markdown formatting
  • Per-intermediate ◈ BlockName tabs created dynamically as execution reaches each Intermediate block; each streams tokens live

Execution Engine (PipelineExecutionWorker)

  • Runs entirely in a QThread; never blocks the UI
  • Server-mode only — calls ensure_server_or_reload() on the engine before starting; retries up to 3 times with a 6 s delay
  • Walks the connection graph in topological order; context is carried forward through each block
  • Signals: step_started, step_token, step_done, intermediate_live, pipeline_done, err, log_msg
  • Loop connections cause the enclosed sub-graph to execute loop_times times, accumulating context on each pass
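Walking a graph in topological order while carrying context forward can be sketched with the standard `graphlib` module (Python 3.9+). Loop connections are assumed to be excluded before sorting, since `static_order()` raises `CycleError` on cycles; the handler-per-block interface is hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, Iterable, List, Tuple


def execution_order(block_ids: Iterable[str],
                    connections: Iterable[Tuple[str, str]]) -> List[str]:
    """Order blocks so every block runs after all of its upstream blocks."""
    sorter = TopologicalSorter({bid: set() for bid in block_ids})
    for src, dst in connections:
        sorter.add(dst, src)  # dst depends on src
    return list(sorter.static_order())


def run_blocks(order: List[str],
               handlers: Dict[str, Callable[[str], str]],
               prompt: str) -> str:
    """Pass the context through each block's handler in the given order."""
    context = prompt
    for bid in order:
        context = handlers[bid](context)
    return context
```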

Validation

  • Refuses to run if no INPUT block is present
  • Refuses to run if two MODEL blocks are connected directly without an INTERMEDIATE block between them
  • Checks for server readiness before dispatch; surfaces errors as dialog boxes rather than silent failures
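The two structural checks can be sketched over a simple graph description. The representation (a dict of block ID to type name, plus ID pairs for connections) and the error wording are assumptions; the server readiness check is omitted.

```python
from typing import Dict, Iterable, List, Tuple


def validate_pipeline(blocks: Dict[str, str],
                      connections: Iterable[Tuple[str, str]]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if "INPUT" not in blocks.values():
        errors.append("pipeline has no INPUT block")
    for src, dst in connections:
        if blocks.get(src) == "MODEL" and blocks.get(dst) == "MODEL":
            errors.append(
                f"MODEL {src} -> MODEL {dst} needs an INTERMEDIATE block in between"
            )
    return errors
```

Returning all problems at once, rather than stopping at the first, lets a dialog list everything the user needs to fix.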

🔗 Pipeline Save / Load System

Patches A → D

Added full pipeline persistence so canvas states survive restarts.

  • Pipelines serialised to ~/.native_lab/pipelines/<name>.json (version 2 format)
  • _pipeline_to_dict / _pipeline_from_dict helpers handle blocks + connections including all metadata
  • Block IDs are remapped on load so counter collisions never occur
  • list_saved_pipelines(), save_pipeline(), load_pipeline() module-level helpers
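Remapping block IDs on load could work roughly as sketched here. The JSON shape (`blocks` with `id`, `connections` with `from` and `to`) is invented for illustration; the version 2 format itself is not documented in these notes.

```python
from typing import Dict


def remap_block_ids(data: Dict, next_id: int) -> Dict:
    """Give every loaded block a fresh ID starting at next_id and rewire connections."""
    mapping = {
        block["id"]: next_id + offset
        for offset, block in enumerate(data["blocks"])
    }
    blocks = [{**block, "id": mapping[block["id"]]} for block in data["blocks"]]
    connections = [
        {**conn, "from": mapping[conn["from"]], "to": mapping[conn["to"]]}
        for conn in data["connections"]
    ]
    return {**data, "blocks": blocks, "connections": connections}
```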

New sidebar buttons in Pipeline Builder:

Button | Action
-- | --
💾 Save Pipeline… | Prompts for name, overwrites if exists
📂 Load Pipeline… | Lists saved pipelines; includes inline delete option
🗑 Delete | Available inside the Load dialog

🔧 Pipeline Execution Fixes

Patches E → F

  • OUTPUT block no longer accumulates all intermediate text — it receives only what was directly piped into it
  • Sender block label is captured and included in pipeline_done payload as JSON { "text": …, "sender": … }
  • _on_pipeline_done in PipelineBuilderTab renders output with a **Output from: BlockName** header

🔗 Pipeline Button in Main Chat Window

Patches G → H

  • Added 🔗 Pipeline button next to the Send button in InputBar
  • Button emits pipeline_run_requested signal (proper PyQt signal chain — avoids direct cross-widget method calls)
  • Connected in MainWindow._build_ui to _on_pipeline_from_chat()
  • Clicking the button opens a pipeline selection dialog, reads the current chat input, and runs PipelineExecutionWorker without leaving the Chat tab
  • Each pipeline stage renders as its own labelled chat bubble:
    • ⚡ Processing block: Name… — system note bubble
    • ◈ Name — intermediate output — amber intermediate bubble
    • ■ Output (from: Name) — standard assistant bubble

🎨 New Chat Bubble Roles

Patch I

Two new MessageWidget roles added alongside user / assistant:

Role | Colour | Label | Use
-- | -- | -- | --
pipeline_intermediate | C["warn"] amber | ◈ Intermediate | Mid-pipeline block output
system_note | C["txt3"] muted | ⚡ System | Stage progress notes

📜 Pipeline Builder Sidebar — Scrollable

The sidebar previously clipped content when many models were loaded.

  • Inner sidebar QWidget wrapped in a QScrollArea (214 px wide, accounts for scrollbar)
  • setFixedWidth moved from the widget to the scroll area
  • setHorizontalScrollBarPolicy(AlwaysOff) — horizontal scrolling disabled
  • addStretch() appended before wrapping so buttons stay top-aligned when content is short
  • All existing block buttons, model list, and canvas controls remain fully functional

✨ Fluid UI Animations

Animated Chat Bubbles

  • Every new MessageWidget fades in over 220 ms (ease-out cubic) via _fade_in() helper
  • _fade_in uses QGraphicsOpacityEffect + QPropertyAnimation on opacity property
  • Guard clause skips PipelineCanvas and ThinkingBlock to prevent QPainter re-entry conflicts

Empty-State Placeholder

  • ChatArea now shows "Hi, message me up when you are ready." when no messages exist
  • Placeholder is centred, 22 px light-weight text in C["txt3"]
  • Fades in at 300 ms on clear_messages() and hides automatically when the first message arrives

Tab Switch Fade

  • _FadeOverlay — a sibling QWidget placed on top of tab content, never a QGraphicsOpacityEffect on the tab itself (avoids QPainter conflicts with PipelineCanvas.paintEvent)
  • alpha animated from 220 → 0 over 180 ms on every tab change via pyqtProperty(int)
  • Overlay covers only the newly-visible tab page geometry; hidden immediately after animation completes

Reference Panel Slide-In

  • References panel slides in from the right edge over 240 ms (ease-out cubic) by animating maximumWidth from 0 → 260
  • Slides out over 200 ms (ease-in cubic); setVisible(False) fires only after animation completes
  • No QGraphicsOpacityEffect involved — zero painter conflicts

🌐 API Models Tab

A new 🌐 API Models tab in the main window lets users connect to any cloud or local API endpoint. Once verified, the API engine is treated identically to a loaded local model.

Supported Providers (pre-configured)

Provider | Format | Notable Models
-- | -- | --
OpenAI | OpenAI | GPT-4o, o1, o3-mini
Anthropic | Anthropic | claude-opus-4-5, claude-sonnet-4-5, claude-haiku-4-5
Groq | OpenAI-compat | Llama-3.3-70b, Mixtral-8x7b
Mistral | OpenAI-compat | mistral-large, codestral
Together AI | OpenAI-compat | Llama-3.3-70b-Turbo, Qwen-2.5-72b
OpenRouter | OpenAI-compat | GPT-4o, Claude-3.5, Gemini-Pro
Ollama | OpenAI-compat | llama3.2, qwen2.5, phi4
Custom | Configurable | Any OpenAI-compatible endpoint

How it works

  1. Fill in provider, model, API key, base URL, max tokens
  2. Click ⚡ Test & Load — sends a 1-token "hi" message to verify connectivity
  3. On success the engine swaps into the main chat system; status bar updates to 🌐 Provider · model-id
  4. Click 💾 Save Config to persist the connection for future sessions
  5. Saved configs appear as cards with ▶ Load and 🗑 Delete buttons

Architecture

  • ApiConfig dataclass — serialisable to/from ~/.localllm/api_models.json
  • ApiRegistry — load/save/add/remove configs
  • ApiStreamWorker(QThread) — SSE streaming for both OpenAI-compatible and Anthropic formats
  • ApiEngine — drop-in replacement for LlamaEngine; implements is_loaded, status_text, create_worker, ensure_server, shutdown
  • _active_engine_for() in MainWindow prioritises ApiEngine when loaded
  • Full structured message history (last 60 turns) passed to API models instead of raw prompt string
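For the OpenAI-compatible format, parsing the streamed server-sent events can be sketched as below. It assumes the usual `data: {json}` lines with text under `choices[].delta.content` and a `data: [DONE]` terminator; the Anthropic format uses different event shapes and is not covered. The function name is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_openai_sse_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style streaming chat completion lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank keep-alive lines and other fields are ignored
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            piece = choice.get("delta", {}).get("content")
            if piece:
                yield piece
```

In a worker thread each yielded fragment would be emitted as a token signal, so the chat bubble fills in as the response streams.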

🎨 Custom Prompt Format for API Models

Prompt Template Presets

Seven built-in presets selectable from a dropdown:

Preset | Used by
-- | --
Default | Provider handles formatting (recommended)
ChatML | OpenHermes, Qwen, and ChatML-trained models
Llama-2 Chat | Meta Llama-2 instruction models

Linux_x86_64_cpu_only_Ubuntu_hotfix0.1

05 Mar 13:51
3b9fefa


Fixed issues where models in the reasoning and coding pipelines could output training data or other irrelevant text.
Fixed automatic server/CLI switching issues.

Applied multiple stability fixes, especially in the coding and reasoning areas and in model-specific prompt templates.

Linux_x86_64_cpu_only_Ubuntu_hotfix0.2

05 Mar 18:27
620c3be


Native Lab Pro — Changelog


[2.2.0] — 2026

Overview

This release extends the stability work begun in 2.1.0 with two major areas of improvement: a comprehensive PDF summarisation pipeline overhaul, and a deeper second pass on server process safety and role assignment reliability. The summarisation pipeline gains mode selection, live pause/abort controls, and a robust fallback chain for the final consolidation pass. Server management gains a global process registry so every spawned server is tracked and killed precisely on exit or reload. Role assignment gains atomic state tracking to eliminate the remaining race conditions and failure-point edge cases not addressed in 2.1.0.


📄 PDF Summarisation Pipeline

Summary mode selector. A new combo box in the input toolbar lets users choose between three summarisation modes before starting a job. Summary produces a standard structured overview. Logical produces a mechanism/methodology breakdown with numbered steps and sub-bullets, focused on how and why things work. Advisory extracts actionable recommendations framed as a practical brief. The selected mode is embedded in every section prompt and the final consolidation prompt, so the model's output is consistently shaped throughout the entire pipeline rather than only at the final pass.

Live pause/abort banner. As soon as a chunked summarisation job begins, a control bar appears inline in the chat panel showing the current status, a "Pause & Save" button, and an "Abort" button. Previously, the only way to stop a job was from a separate Config tab. The banner updates its status text as chunks complete and is removed automatically when the job finishes, pauses, or errors.

Input blocking during summarisation. The message input field is disabled and given a descriptive placeholder while a summarisation job is active. Attempting to send a message via _on_send while _summarizing_active is True shows a dialog explaining that the job must be paused or aborted first. This prevents the inference queue from being silently overloaded by a second request competing for the same model.

Final consolidation fallback chain. Previously, if the final consolidation pass failed (e.g. due to a context-length overflow on a very long document), the entire job returned an error. The method now attempts the pass on the secondary/reasoning engine first, falls back to the primary engine if that fails, and as a last resort concatenates the section summaries with a clear [Auto-fallback] header rather than losing the work already done.
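
The fallback order described above can be sketched roughly as follows. This is an illustration only, not the application's code: `consolidate_with_fallback` and the two demo engine functions are hypothetical names, and each engine is modelled as a plain callable that either returns text or raises.

```python
from typing import Callable, Sequence


def consolidate_with_fallback(
    section_summaries: Sequence[str],
    engines: Sequence[Callable[[str], str]],
) -> str:
    """Try each engine in order; concatenate the sections if all of them fail."""
    prompt = "\n\n".join(section_summaries)
    for run in engines:
        try:
            result = run(prompt)
        except Exception:
            continue  # e.g. a context-length overflow; try the next engine
        if result.strip():
            return result
    # Last resort: keep the work already done instead of returning an error.
    return "[Auto-fallback]\n\n" + "\n\n".join(section_summaries)


# Hypothetical stand-ins for the secondary/reasoning and primary engines.
def failing_engine(prompt: str) -> str:
    raise RuntimeError("context length exceeded")


def working_engine(prompt: str) -> str:
    return "FINAL: " + prompt
```

Passing `[failing_engine, working_engine]` exercises the second step of the chain; passing only `[failing_engine]` falls through to the concatenated output.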

Error message fix for section failures. The inference error message previously hardcoded "Inference failed on section 4" regardless of which section actually failed. It now interpolates the correct section index.

Resume preserves mode. Paused jobs now save summary_mode to disk alongside the existing state. When a job is resumed from the Config tab, the original mode is restored so the remaining sections and final pass use the same instructions as the sections already completed.

Summary bubble and input bar fully restored on all exit paths. The pause banner removal, _summarizing_active flag reset, and input re-enable are now applied in _on_summary_final, _on_summary_err_or_pause (both the pause path and the error path), and _on_summary_err individually, so no code path can leave the UI in a locked state.


🛡️ Llama Server Process Management

Global server registry. A module-level _SERVER_REGISTRY dictionary maps every port to the PID of the server that was bound to it. The registry is populated in LlamaEngine._start_server() immediately after the server passes its health check, and entries are removed in shutdown() and _kill_server_on_port(). This gives the application a precise, session-scoped picture of every server it owns, independently of psutil process enumeration.

Port cleared before every bind. _start_server() now calls _kill_server_on_port() on the chosen port before attempting to start a new server. A 150 ms settle is observed after the kill. This eliminates the "address already in use" failure that could occur when reloading a model quickly on the same port slot, which previously required a manual app restart to recover from.

_kill_server_on_port() utility. A new top-level helper kills whatever OS process is occupying a given port, using psutil.net_connections when available and falling back to netstat/lsof cross-platform parsing otherwise. It also prunes the port from the registry so subsequent _free_port() scans reflect the true state.

_kill_all_registered_servers() on exit. closeEvent calls this after shutting down all named engine instances. It iterates the registry and kills by both PID and port, ensuring any server that was started but whose engine reference was subsequently lost (e.g. replaced by a failed reload) is still terminated. The registry is cleared on completion.

Startup orphan scan. _scan_and_kill_orphaned_servers() is scheduled via QTimer.singleShot(0, ...) during MainWindow.__init__. It uses psutil.process_iter to find any llama-server processes occupying ports in the 8600–8700 range that are not present in the current session's registry, and terminates them. On systems without psutil the function exits immediately, preserving the no-dependency fallback behaviour.

shutdown() kills by port as well as PID. In addition to the process-tree kill added in 2.1.0, shutdown() now calls _kill_server_on_port() as a belt-and-suspenders step and zeros self.server_port afterward. This covers the edge case where the Popen handle's PID no longer matches the process actually using the port (e.g. after a rapid respawn).

closeEvent stops all workers before killing servers. The summary worker and multi-PDF worker are now explicitly stopped (abort() + wait(1000)) in closeEvent alongside the existing inference and pipeline worker teardown. This prevents a background thread from attempting to POST to a server that has already been killed.


⚡ Multi-Model Role Assignment

_roles_loading concurrency guard. A set instance tracks which roles currently have an active ModelLoaderThread. _start_role_engine_load() returns immediately if the target role is already in the set. The role is added at load start and removed in _on_role_engine_loaded(), including on failure. This is an explicit set-based check that complements the signal-disconnection approach from 2.1.0.

_set_role_buttons_enabled() helper. A new method enables or disables the full strip of model-management buttons (btn_load_role_engine, btn_unload_all, btn_load_primary, btn_browse_model, btn_remove_model, btn_save_cfg) in a single call. It is called with False when any load begins and with True only when _roles_loading becomes empty again. In 2.1.0 only the load button was disabled during a load; disabling the entire management surface prevents config saves and model removals from racing against an in-progress load.
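
The set-based guard and the "re-enable only when nothing is loading" condition can be sketched without any Qt code. `RoleLoadGuard` and its methods are hypothetical names; only the `_roles_loading` set mirrors the notes above.

```python
class RoleLoadGuard:
    """Tracks which roles currently have a load in flight."""

    def __init__(self) -> None:
        self._roles_loading: set = set()

    def try_begin(self, role: str) -> bool:
        """Return False if a load for this role is already running."""
        if role in self._roles_loading:
            return False
        self._roles_loading.add(role)
        return True

    def finish(self, role: str) -> None:
        """Call on success and on failure so the role never stays locked."""
        self._roles_loading.discard(role)

    @property
    def idle(self) -> bool:
        """True when no loads remain, i.e. buttons may be re-enabled."""
        return not self._roles_loading
```

In this sketch the caller would disable the management buttons whenever `try_begin` returns True and re-enable them only once `idle` is True again.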

Engine cleanup on load failure. If _on_role_engine_loaded receives ok=False, it calls shutdown() on the newly created engine and sets the role attribute back to None. Previously, a failed load left a partially-initialised LlamaEngine in place, which could cause the next load attempt to inherit stale port or mode state.

_unload_all_engines() clears loading state. The unload-all handler now calls self._roles_loading.clear() and self._set_role_buttons_enabled(True) before shutting down engines, preventing a scenario where an in-progress load had disabled the buttons and was then overtaken by an unload-all, leaving the UI permanently locked.

Model list selection preserved across refresh. _refresh_model_list() now records the currently selected model path before clearing the list and restores the selection afterward using blockSignals(True/False) to avoid triggering spurious currentItemChanged callbacks. Previously, any operation that saved a config change (and thus called _refresh_model_list()) would silently deselect the model the user was configuring.

Role attribute kept in sync on config save. _save_model_config() now reads the old role from the registry before writing the new config. If the role changed and the model is currently loaded in a live engine, the engine's role attribute is updated in place and _refresh_engine_status() is called. This keeps the engine status list consistent with the registry without requiring a reload.

LlamaEngine carries a role attribute. LlamaEngine.__init__ now accepts an optional role string (default "general"). All engine construction sites pass the appropriate role. This makes role attribution intrinsic to the engine object rather than inferred from which attribute it was stored under, simplifying status display and future logging.


Bug Fixes

Fixed pause banner left visible after summary error. If the summarisation pipeline raised an error before the final pass, the pause banner widget was left in the chat indefinitely. All error-exit paths now call chat_area.remove_pause_banner().

Fixed input field stuck disabled after summary abort. If the user aborted a job via the banner's Abort button, input_bar.input.setEnabled(True) was not always reached. The enable call is now present on every exit path including the __PAUSED__ signal branch.

Fixed _on_summary_err called with stale _summary_bubble. On certain timing paths, _on_summary_err appended an error message to a bubble that had already been nulled out. The write is now guarded with if self._summary_bubble:.


🔧 Internal / Developer Notes

  • _SERVER_REGISTRY is a plain module-level dict. It is intentionally not enca...

Linux_x86_64_cpu_only_Ubuntu_UI0.1_Major

05 Mar 23:19
620c3be


NativeLab Pro — UI Theme Changelog

All changes relate to the dual light/dark theme system, appearance consistency, and the live Theme Editor tab introduced during this session.


Phase 1 — Dual Theme Architecture (Initial Implementation)

Added the foundational infrastructure to support runtime theme switching.

  • Added CURRENT_THEME = "light" global variable to track active theme state.
  • Split the single C colour dictionary into C_DARK (original palette) and C_LIGHT (new light palette), with C assigned dynamically based on CURRENT_THEME.
  • Converted the static QSS stylesheet string into a _build_qss(c: dict) function so the stylesheet can be regenerated from any palette at runtime.
  • Added a View → Switch to Light/Dark Theme menu item with a dynamic label that updates to reflect the current state.
  • Implemented _toggle_theme() and _update_theme_action_label() methods on MainWindow.
  • Added theme persistence: the active theme is saved to app_config.json and restored on next launch.

Initial C_LIGHT palette

The first iteration used a warm cream aesthetic: #faf7f2 canvas, sage green #4a7652 accent, and warm brown #1c1810 text.


Phase 2 — Professional Light Palette (Stripe/Linear aesthetic)

Replaced the cream/sage palette with a clinical, high-contrast SaaS-style palette after user feedback that the initial version looked unprofessional.

Key values introduced:

Token   Value     Purpose
bg0     #ffffff   Pure white canvas
bg1     #f7f7f8   Sidebar/panel
acc     #2563eb   Vivid blue accent (Stripe standard)
txt     #0d0d10   Near-black primary text
bdr     #e4e4e7   Barely-there zinc-200 border
usr     #eff6ff   Whisper-blue user bubble

All warm-neutral tones; no cool greys. Accent colour shifted from blue to burnt orange to harmonise with the peach base.


Phase 7 — Live Appearance / Theme Editor Tab

Added a full 🎨 Appearance tab allowing users to edit every colour token of the active theme in real time using colour swatches, hex inputs, and HSL sliders.

New class: AppearanceTab(QWidget)

  • Emits theme_changed = pyqtSignal(dict) whenever any colour is modified.
  • Colour tokens are grouped into six logical sections: Backgrounds, Text, Accent, Bubbles, Borders, Semantic.
  • Each token row contains: a labelled swatch button (opens QColorDialog), a hex QLineEdit, and three HSL QSlider widgets (Hue 0–360, Saturation 0–100, Lightness 0–100).
  • All three controls stay in sync — editing one updates the others.
  • Reset button reverts to the current built-in palette for the active theme.
  • Save button persists the custom palette to app_config.json.
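
Keeping the hex field and the HSL sliders in sync comes down to converting between the two representations. The sketch below uses Python's standard `colorsys` module; the function names are hypothetical, and the changelog does not say how the application performs the conversion (it may well rely on QColor instead).

```python
import colorsys


def hex_to_hsl(hex_colour: str) -> tuple:
    """'#rrggbb' -> (hue 0-360, saturation 0-100, lightness 0-100)."""
    h = hex_colour.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) / 255 for i in (0, 2, 4))
    # colorsys uses HLS ordering: hue, lightness, saturation.
    hue, light, sat = colorsys.rgb_to_hls(r, g, b)
    return round(hue * 360), round(sat * 100), round(light * 100)


def hsl_to_hex(hue: int, sat: int, light: int) -> str:
    """Slider values (0-360, 0-100, 0-100) -> '#rrggbb'."""
    r, g, b = colorsys.hls_to_rgb(hue / 360, light / 100, sat / 100)
    return "#{:02x}{:02x}{:02x}".format(
        round(r * 255), round(g * 255), round(b * 255)
    )
```

Because the sliders are integer-valued, a hex to HSL to hex round trip is not guaranteed to be exact for every colour, so an editor built this way would typically treat the hex value as the source of truth.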

Separate persistence per theme

  • Light mode saves to APP_CONFIG["custom_light_palette"].
  • Dark mode saves to APP_CONFIG["custom_dark_palette"].
  • Both are loaded and merged at startup independently, so customising one theme does not affect the other.
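
The independent merge at startup might look something like this. The config keys are the ones named above; `load_palettes` is a hypothetical helper, and the built-in palette values are shortened placeholders.

```python
# Shortened built-in palettes; the dark values are placeholders.
C_LIGHT = {"bg0": "#ffffff", "acc": "#2563eb"}
C_DARK = {"bg0": "#101014", "acc": "#7aa2f7"}


def load_palettes(app_config: dict) -> tuple:
    """Overlay each saved custom palette on its own built-in palette."""
    light = {**C_LIGHT, **app_config.get("custom_light_palette", {})}
    dark = {**C_DARK, **app_config.get("custom_dark_palette", {})}
    return light, dark
```

Merging onto a copy of the built-in dictionary means tokens missing from an older saved palette fall back to their defaults, and a custom light palette never leaks into dark mode.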

QSS additions for Appearance tab

Added rules for: #appearance_bar, #appearance_hdr, #appearance_group_hdr, #appearance_row_lbl, #appearance_sl_lbl, QLineEdit#appearance_hex, QSlider#appearance_slider (groove, handle, sub-page), QPushButton#appearance_btn, QPushButton#appearance_btn_acc.

MainWindow wiring

  • AppearanceTab is instantiated in _build_ui and wired via theme_changed → _on_appearance_changed.
  • _on_appearance_changed updates C_LIGHT or C_DARK (whichever is active), rebuilds QSS, and calls self.setStyleSheet(QSS) — changes are visible instantly without restarting.
  • _toggle_theme calls appearance_tab.load_palette(...) after switching so the editor always reflects the current theme's colours.
  • Palette loading at startup was moved into __init__ after _build_ui returns, so it executes after QApplication is fully initialised.

setStyleSheet migration

All QApplication.instance().setStyleSheet(QSS) calls were replaced with self.setStyleSheet(QSS) on the QMainWindow instance to avoid NoneType errors during initialisation. Stylesheet inheritance from the top-level window to all child widgets is identical.


End of changelog.

Linux_x86_64_cpu_only_Ubuntu

04 Mar 15:35
f8d4f93



Native Lab Pro v2 — Linux Release

Native Lab Pro v2 is the first public Linux release of Native Lab Pro — a fully local, privacy-first desktop application for running large language models directly on your machine using llama.cpp.

No API keys.
No cloud.
No telemetry.
Your models and data stay entirely on your system.


🚀 Key Features

Fully Local LLM Chat

Run GGUF models directly on your machine using llama.cpp with a native PyQt6 desktop interface.

Multi-Model Architecture

Load multiple models simultaneously and assign them specialized roles:

  • General — main chat model
  • Reasoning — architectural reasoning and analysis
  • Summarization — document summarization
  • Coding — code generation tasks
  • Secondary — additional pipeline insight engine

Pipeline Mode

Coding prompts can run through a multi-stage reasoning pipeline:

  1. Non-coding models produce architectural insights
  2. The coding model receives those insights as context
  3. Final structured code is generated

This produces more structured and reliable code output.

Document Reference Engine

Attach documents or source code files to a session and ask questions about them.

Supported reference types:

  • PDFs
  • Text files
  • Source code files

The engine automatically retrieves the most relevant excerpts and injects them into prompts.

Structured Script Parsing

Source code files are parsed to extract:

  • imports
  • functions
  • classes
  • constants
  • type definitions

The model receives structured context instead of raw text chunks.

Long Document Summarization

Built-in pipeline for summarizing large documents using chunked processing with context carryover.

Features:

  • pause / resume long jobs
  • automatic state saving
  • multi-PDF cross-document summarization
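
Chunked processing with context carryover can be sketched as a loop that hands each chunk to the model together with the previous chunk's summary. The function names and the character-based chunking are assumptions made for illustration; pause/resume and state saving are left out.

```python
from typing import Callable, List


def chunk_text(text: str, chunk_chars: int) -> List[str]:
    """Split text into fixed-size character chunks (a simplifying assumption)."""
    return [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]


def summarize_document(
    text: str,
    chunk_chars: int,
    summarize: Callable[[str, str], str],
) -> List[str]:
    """Summarize each chunk, passing the previous summary along as context."""
    summaries: List[str] = []
    carry = ""  # context carried over from the previous chunk
    for chunk in chunk_text(text, chunk_chars):
        summary = summarize(chunk, carry)
        summaries.append(summary)
        carry = summary
    return summaries
```

Here `summarize` stands in for a call to the summarization model; it receives the chunk and the carried-over context and returns that section's summary.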

Parallel Model Loading

Multiple models can run simultaneously through separate llama-server instances.

⚠ Each model consumes its full RAM allocation.

Quantization Detection

Automatic detection of GGUF quantization formats including:

  • K-Quants (Q2_K → Q6_K)
  • imatrix quants (IQ series)
  • legacy quants (Q4_0, Q8_0)
  • float formats (F16, BF16)

Models are labeled with human-readable quality tiers.
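
One plausible way to detect the quantization format is to match the tag in the model's filename. The regular expression and function names below are assumptions, not the application's detector, and the sketch only groups tags into the four families listed above rather than reproducing the quality-tier labels.

```python
import re
from typing import Optional

# Alternatives are ordered so IQ* is tried before Q*, and BF16 before F16.
_QUANT_RE = re.compile(
    r"(IQ\d+_[A-Z0-9]+|Q\d+_K(?:_[SML])?|Q\d+_[01]|BF16|F16)",
    re.IGNORECASE,
)


def detect_quant(filename: str) -> Optional[str]:
    """Return the quantization tag found in a filename, or None."""
    match = _QUANT_RE.search(filename)
    return match.group(1).upper() if match else None


def quant_family(tag: str) -> str:
    """Map a tag to one of the families named in the release notes."""
    if tag.startswith("IQ"):
        return "imatrix quant"
    if "_K" in tag:
        return "K-quant"
    if tag in ("F16", "BF16"):
        return "float"
    return "legacy quant"
```

For example, `detect_quant("llama-2-7b.Q4_K_M.gguf")` yields the tag `Q4_K_M`, which `quant_family` classifies as a K-quant.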

Prompt Template Auto-Detection

Correct prompt templates are automatically selected based on model filename.

Supported families include:

  • LLaMA-2 / LLaMA-3
  • Mistral / Mixtral
  • DeepSeek / DeepSeek-R1
  • Phi-3
  • Qwen
  • Gemma
  • Falcon
  • Vicuna
  • Yi
  • Zephyr
  • Starling
  • CodeLlama
  • Orca
  • Command-R
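
Filename-based detection could be as simple as an ordered substring lookup, as in the sketch below. The substrings, their order, and the "generic" fallback are assumptions; the prompt templates themselves are not reproduced here, and Yi is omitted because such a short name is too ambiguous for a plain substring match.

```python
# More specific names come first so "codellama" is not mistaken for "llama".
_FAMILY_PATTERNS = [
    ("codellama", "CodeLlama"),
    ("deepseek-r1", "DeepSeek-R1"),
    ("deepseek", "DeepSeek"),
    ("llama-3", "LLaMA-3"),
    ("llama-2", "LLaMA-2"),
    ("mixtral", "Mixtral"),
    ("mistral", "Mistral"),
    ("phi-3", "Phi-3"),
    ("qwen", "Qwen"),
    ("gemma", "Gemma"),
    ("falcon", "Falcon"),
    ("vicuna", "Vicuna"),
    ("zephyr", "Zephyr"),
    ("starling", "Starling"),
    ("orca", "Orca"),
    ("command-r", "Command-R"),
]


def detect_family(filename: str, default: str = "generic") -> str:
    """Return the first model family whose pattern occurs in the filename."""
    name = filename.lower()
    for needle, family in _FAMILY_PATTERNS:
        if needle in name:
            return family
    return default
```

The ordering is the main design point: the first match wins, so longer and more specific names have to precede the shorter ones they contain.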

Smart Memory Management

A RAM watchdog prevents crashes during large document processing by automatically spilling reference caches to disk when memory pressure is detected.
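
The spill-to-disk idea can be illustrated with a small cache that writes its entries to files once free memory drops below a threshold. Everything here is a hypothetical sketch: the class name, the 15% threshold, and the file layout are assumptions, and the free-memory fraction is passed in by the caller (with psutil installed it could be computed from `psutil.virtual_memory()` as `available / total`).

```python
import os
import tempfile


class SpillableCache:
    """In-memory text cache that can move its entries to disk on demand."""

    def __init__(self, spill_dir: str = "") -> None:
        self._mem: dict = {}
        self._spill_dir = spill_dir or tempfile.mkdtemp(prefix="ref_cache_")

    def _path(self, key: str) -> str:
        return os.path.join(self._spill_dir, f"{key}.txt")

    def put(self, key: str, text: str) -> None:
        self._mem[key] = text

    def get(self, key: str) -> str:
        """Read from memory first, then from the spilled file."""
        if key in self._mem:
            return self._mem[key]
        with open(self._path(key), encoding="utf-8") as fh:
            return fh.read()

    def spill_if_needed(self, free_fraction: float, threshold: float = 0.15) -> int:
        """Write every in-memory entry to disk when memory is tight.

        Returns the number of entries spilled.
        """
        if free_fraction >= threshold:
            return 0
        spilled = 0
        for key in list(self._mem):
            with open(self._path(key), "w", encoding="utf-8") as fh:
                fh.write(self._mem.pop(key))
            spilled += 1
        return spilled
```

A watchdog built on this would call `spill_if_needed` periodically; reads keep working afterwards because `get` falls back to the spilled file.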


🖥 System Requirements

Linux (primary supported platform)

Minimum RAM depends on the model used.

Typical requirements:

Model     RAM Required
7B Q4     ~4-5 GB
13B Q5    ~9-10 GB
70B Q4    ~38-40 GB

📦 Dependencies

Python 3.10+

Required:

PyQt6

Optional:

psutil    # RAM monitoring
PyPDF2    # PDF loading and summarization

Install with:

pip install PyQt6 psutil PyPDF2

llama.cpp Requirement

Native Lab Pro requires llama.cpp.

Compile or download it and configure the binary paths inside the application.

Default paths used by the application:

LLAMA_CLI    = /home/hrirake/llama.cpp/build/bin/llama-cli
LLAMA_SERVER = /home/hrirake/llama.cpp/build/bin/llama-server

You can modify these paths in the source if needed.


Model Directory

The default directory scanned for models is:

/home/hrirake/localllm

You can also add models manually through the Models tab.

Supported format:

*.gguf

▶ Launching the Application

After extracting the release, start Native Lab Pro using:

nativelabpro.desktop

or directly run the Python file:

python native_lab_pro_v2.py

💾 Data Storage

Native Lab Pro stores data locally in the application directory:

Folder               Purpose
sessions/            chat history
paused_jobs/         paused summarization jobs
ref_cache/           reference text cache
ref_index/           reference metadata
model_configs.json   per-model settings
app_config.json      global configuration

No data is sent outside your system.


⌨ Keyboard Shortcuts

Shortcut        Action
Ctrl + N        New session
Ctrl + Q        Quit
Ctrl + B        Toggle sidebar
Ctrl + L        Logs tab
Ctrl + M        Models tab
Enter           Send message
Shift + Enter   New line

⚠ Notes

  • This is the first Linux release of Native Lab Pro.
  • GPU acceleration depends on your llama.cpp build configuration.
  • Running multiple models simultaneously requires significant RAM.
  • Place your GGUF model files in the application's /locallm folder.

🔒 Privacy

Native Lab Pro runs entirely offline.

  • No telemetry
  • No external APIs
  • No cloud services

All computation happens locally on your machine.