Hybrid Research Rig — Ensemble AI Decision Engine, Ender Dragon Automation, Vision, HUD & Security Hardening #716

Draft
Z0mb13V1 wants to merge 12 commits into mindcraft-bots:develop from Z0mb13V1:pr/hybrid-rig-v0.1.3

@Z0mb13V1 Z0mb13V1 commented Mar 2, 2026

Hybrid Research Rig v0.1.3

This PR adds a Hybrid Research Rig in which two AI bots run simultaneously and share a persistent Minecraft world on a Paper 1.21.11 server: CloudGrok, which uses a cloud ensemble of multiple LLMs on AWS EC2, and DragonSlayer, which uses local GPU inference on an RTX 3090.

12 commits · 149 files changed · +13,320 / −1,065 lines


Key Features

  • Ensemble Decision Engine — 4-model all-Grok panel with 3-phase arbiter: heuristic scoring, LLM-as-Judge (grok-4), and ChromaDB vector memory for past-experience retrieval
  • Baritone A* Pathfinding (RC25) — custom pathfinder with distance-adaptive timeouts, ghost block handling, and Paper server compatibility
  • Ender Dragon Automation (RC31) — DragonSlayer profile with 6-chunk autonomous progression: diamond pickaxe ✅, nether portal, blaze rods, ender pearls, stronghold, dragon fight
  • getDiamondPickaxe 4-Tier Rewrite (RC29–RC31) — wood→stone→iron→diamond with ensureSticksAndTable() helper, furnace crafting, iron smelting, strip-mining, and pillarUp shaft escape
  • Auto-Redirect Blocked Commands (RC31) — andy-4:q8_0 never generates !dragonProgression on its own; agent.js intercepts blocked crafting commands and redirects to the orchestrator
  • Broadened Error Handler (RC30) — 16+ non-fatal patterns (Baritone noise, timeouts, navigation failures) caught and swallowed to prevent process crashes
  • New Skills: pillarUp() for shaft escape, stripMineForOre() for horizontal tunnel mining with wall/ceiling/floor ore detection
  • Vision System — both bots see via screenshots using Xvfb + Mesa software rendering in Docker, with graceful fallback
  • HUD Overlay — gaming-style web dashboard with per-bot runtime tracker, goal/action display, scrollable command log, and live bot camera feeds
  • Paper Server Compatibility — timeout-protected collectBlock, Geyser cross-play support, Aikar GC flags
  • Auto-Bed & Door Navigation — bots prefer doors over block-breaking, auto-sleep at night
  • Security Hardening — recursive prototype pollution guard, TTS command injection fix, path traversal prevention, Docker non-root user, enhanced message validation, rate limiter hardening
  • Agent Lifecycle Resilience — backoff restart, ghost restart prevention, action deadlock detection, post-respawn grace period
  • Multi-Hop Explore — smart exploration with unstuck pause, water avoidance, barren biome escape, 200-block relocation
  • Remote MindServer — local agents appear in EC2 web UI
  • Discord Bot Enhancements — direct bot chat, auto-fix monitor, role-based admin, bot prompt hardening
  • CI/CD — zero-warning ESLint enforcement, Docker build gate, GitHub Actions lint + EC2 deploy workflows
  • 23 LLM providers supported via dynamic model routing

What This PR Adds

CloudGrok — Cloud Ensemble Bot

A fully autonomous Minecraft bot powered by a voting ensemble of 4 Grok models. On every LLM call:

  1. Panel — all 4 models respond in parallel (60s timeout): grok-4, grok-4-fast-non-reasoning, grok-4-1-fast-non-reasoning, grok-code-fast-1
  2. Arbiter — heuristic scoring (length, completeness, valid commands, latency); escalates to judge if top-2 within 0.08 margin
  3. Judge: grok-4 picks the best response (30s timeout, fallback to arbiter)
  4. ChromaDB Memory — retrieves similar past decisions (cosine > 0.6) as [PAST EXPERIENCE], then logs outcome for future retrieval
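
The arbiter's escalation rule (step 2) can be sketched roughly as follows. This is a minimal illustration, not the code in arbiter.js: the weights, the field names, and the assumption that each signal is normalised to 0..1 are all hypothetical. Only the 0.08 margin comes from the description above.

```javascript
// Hypothetical sketch of the arbiter's escalation rule.
// Weights and field names are illustrative assumptions; only the
// 0.08 margin is taken from the PR description.
const ESCALATION_MARGIN = 0.08;

// Combine per-response signals (each assumed normalised to 0..1) into one score.
function scoreResponse(r) {
  return 0.2 * r.length + 0.3 * r.completeness + 0.4 * r.validCommands + 0.1 * r.latency;
}

// Rank the panel responses and report whether the top two are close
// enough that the LLM judge should break the tie.
function arbitrate(responses) {
  const ranked = responses
    .map((r) => ({ ...r, score: scoreResponse(r) }))
    .sort((a, b) => b.score - a.score);
  const needsJudge =
    ranked.length > 1 && ranked[0].score - ranked[1].score < ESCALATION_MARGIN;
  return { winner: ranked[0], needsJudge, ranked };
}
```

With this shape, a caller would use `winner` directly when `needsJudge` is false and hand the top candidates to the judge otherwise.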

DragonSlayer — Local GPU Bot

An autonomous Ender Dragon speedrunner running sweaterdog/andy-4:q8_0 via Ollama on an RTX 3090. Features:

  • 6-chunk progression system with persistent state in dragon_progress.json
  • RC31 auto-redirect: blocked crafting/collection commands intercepted by agent.js and redirected to !dragonProgression orchestrator
  • 4-tier getDiamondPickaxe (RC29): wood→stone→iron→diamond with ensureSticksAndTable(), tier-skip logic, furnace crafting, strip-mining, sideways relocation
  • RC30 error handler: 16+ non-fatal exception patterns swallowed to prevent Baritone/pathfinder crashes
  • New skills: pillarUp() (cobblestone/dirt self-scaffold), stripMineForOre() (1×2 tunnel with 5-face ore detection)
  • Chunk 1 (diamond pickaxe) COMPLETE — bot successfully mined 3 deepslate diamond ore and crafted diamond pickaxe
  • Chunk 2 (nether portal) IN PROGRESS — debug tracing added, surface recovery for underground starts

Shared Infrastructure

  • Paper 1.21.11 on AWS EC2 (<EC2_PUBLIC_IP>:19565), ONLINE_MODE=FALSE, ENFORCE_SECURE_PROFILE=FALSE
  • ChromaDB-backed vector memory (3072-dim Gemini embeddings) shared across bots
  • HUD dashboard at :8080 with per-bot runtime, goal, and command log
  • Discord bot with direct chat, admin commands, auto-fix monitor, and usage tracking
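
The similarity cut-off used by the shared vector memory can be illustrated with a small in-memory stand-in. This sketch does not call ChromaDB; the function names and the memory record shape are assumptions, and only the 0.6 threshold comes from the pipeline description.

```javascript
// Hypothetical in-memory stand-in for the vector-memory lookup.
// It does not use ChromaDB; names and record shape are assumptions.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep only memories whose similarity to the query exceeds the threshold,
// most similar first.
function recallPastExperience(queryEmbedding, memories, threshold = 0.6) {
  return memories
    .map((m) => ({ ...m, similarity: cosineSimilarity(queryEmbedding, m.embedding) }))
    .filter((m) => m.similarity > threshold)
    .sort((x, y) => y.similarity - x.similarity);
}
```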

Table of Contents

  1. Key Features
  2. Minecraft Version Compatibility
  3. Commit 1 — Ensemble Decision Engine
  4. Commit 2 — Ender Dragon Automation, RC13–RC31 Skills, Baritone Pathfinding
  5. Commit 3 — Vision System, HUD Overlay, Discord Bot
  6. Commit 4 — Security Hardening
  7. Commit 5 — AWS / Docker Infrastructure
  8. Commit 6 — Profiles, Core Agent, Model Providers, Config
  9. Commits 7–12 — RC29–RC31 DragonSlayer Fixes
  10. Breaking Changes
  11. Test Plan
  12. Copilot Review Responses
  13. How to Review

Minecraft Version Compatibility

Server — Java Edition

The server runs Paper with VERSION: LATEST (currently 1.21.11) and automatically tracks the latest stable Paper release across rebuilds.

Plugin / Layer | Versions covered | What it does
ViaVersion | Latest Java clients on older servers (always-on) | Protocol translation: modern clients to latest server
ViaBackwards | Java 1.9 – 1.21.x clients | Allows older Java clients to connect to a newer server
ViaRewind | Java 1.7.4 – 1.8.9 clients | Extends backwards compat to legacy Java clients
Geyser (beta) | All current Bedrock Edition versions | Full Bedrock cross-play on UDP :19132

Floodgate is not needed — the server runs ONLINE_MODE=FALSE, so Bedrock players connect directly through Geyser without a linked Java account.

Result: any Java client from 1.7.4 → latest and any Bedrock client on any current Bedrock version can join. No client-side mods required.


Bots — mineflayer version negotiation

The bots connect using mineflayer with automatic version detection:

// settings.js
"minecraft_version": "auto"   // auto-negotiates protocol version on connect

mineflayer officially supports up to 1.21.6; with ENFORCE_SECURE_PROFILE=FALSE on the Paper server it connects cleanly to 1.21.11 without signed-chat rejection.

Scenario | minecraft_version setting
Connect to latest Paper (1.21.11) | "auto" ✅ (default)
Connect to older server (e.g., 1.20.4) | "1.20.4"
Connect to unsupported/future version | Use ViaProxy sidecar (see below)

Bedrock Edition — Playing alongside the bots

Bedrock players (Windows, iOS, Android, Console, Xbox) can join the same world the bots are playing in:

Field | Value
Server address | <EC2_PUBLIC_IP>
Port | 19132 (UDP, Geyser default)
Client mods required | None
Account required | None (ONLINE_MODE=FALSE)

Geyser handles full Java↔Bedrock protocol translation server-side. The bots are unaffected — they connect on the Java TCP port (19565) and see Bedrock players as normal entities.


ViaProxy — Bot-side version bridging (optional)

For cases where the target server runs a version mineflayer doesn't natively support (pre-1.7 or future snapshots), a ViaProxy Docker sidecar is included:

# Enable the ViaProxy sidecar (docker-compose.yml profile)
docker compose --profile viaproxy up

# Point the bot at the proxy instead of the real server
# settings.js:
"host": "host.docker.internal",
"port": 25568

ViaProxy listens on :25568, auto-detects the target version, and translates. The bot always speaks modern 1.21.x protocol; ViaProxy handles all downward translation. auth-method: NONE works for offline servers; Microsoft account auth is also supported via the ViaProxy CLI.

⚠️ Disabled on EC2 by default — ViaProxy requires >256 MB JVM heap. The t3.large is at memory capacity with three containers (mindcraft, ChromaDB, LiteLLM). Enable only on hosts with ≥4 GB RAM headroom.


Full version matrix

Client type | Supported versions | Port | Protocol
Java bot (mineflayer) | auto / 1.7–1.21.x | 19565 TCP | mineflayer + ViaVersion on Paper
Java player (vanilla) | 1.7.4 – latest | 19565 TCP | ViaVersion + ViaBackwards + ViaRewind
Bedrock player | Any current Bedrock version | 19132 UDP | Geyser (beta)
Bot via ViaProxy | Any (incl. unsupported) | 25568 → 19565 | ViaProxy sidecar; off by default on EC2

Commit 1 — Ensemble Decision Engine

Location: src/ensemble/ (6 new files)

File | LOC | Purpose
controller.js | 191 | EnsembleModel: drop-in replacement for any single model provider
panel.js | 146 | Queries 4 models in parallel with configurable timeout
arbiter.js | 156 | Heuristic scoring; escalates if top-2 within 0.08 margin
judge.js | 92 | LLM-as-judge (grok-4); 30s timeout with arbiter fallback
feedback.js | 178 | ChromaDB vector memory; stores and retrieves past decisions (similarity > 0.6)
logger.js | 114 | Writes every decision to bots/{BotName}/ensemble_log.json

Integration: EnsembleModel implements the same sendRequest(turns, systemMessage) interface as every other model class, so prompter.js uses it as a transparent drop-in when profile.model is "ensemble".
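
A rough sketch of what such a drop-in could look like is given below. It is not the PR's controller.js: the class name, the member shape, and the "longest answer wins" rule (a stand-in for the arbiter and judge) are assumptions made for illustration. Only the sendRequest(turns, systemMessage) signature and the 60s panel timeout come from the description.

```javascript
// Hypothetical sketch, not the PR's controller.js: an ensemble exposing the
// same sendRequest(turns, systemMessage) shape as a single-model provider.

// Reject if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

class EnsembleSketch {
  constructor(members, timeoutMs = 60_000) {
    this.members = members;     // each member: { name, sendRequest(turns, systemMessage) }
    this.timeoutMs = timeoutMs; // 60 s panel timeout, as described in the PR
  }

  async sendRequest(turns, systemMessage) {
    // Query every member in parallel; slow or failing members are dropped.
    const settled = await Promise.allSettled(
      this.members.map((m) => withTimeout(m.sendRequest(turns, systemMessage), this.timeoutMs)),
    );
    const answers = settled.filter((s) => s.status === 'fulfilled').map((s) => s.value);
    if (answers.length === 0) throw new Error('no panel member answered in time');
    // Stand-in for the arbiter and judge (assumption): pick the longest answer.
    return answers.reduce((best, a) => (a.length > best.length ? a : best));
  }
}
```

Because the caller only sees a string coming back from sendRequest, code written against a single provider would not need to know how many models were consulted.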


Commit 2 — Ender Dragon Automation, RC13–RC31 Skills, Baritone Pathfinding

Ender Dragon Automation

Location: src/agent/library/dragon_runner.js (1589 LOC) + dragon_progress.js (359 LOC)

Chunk | Function | What it does | Status
1 | getDiamondPickaxe() | Mine wood → stone → iron → diamond; craft tools at each tier | ✅ Complete
2 | buildNetherPortal() | Mine 10 obsidian, build frame, light with flint & steel | 🔄 In Progress
3 | collectBlazeRods() | Enter Nether, locate fortress, kill blazes for 7+ rods | ⏳ Pending
4 | collectEnderPearls() | Kill endermen for pearls, craft 15+ eyes of ender | ⏳ Pending
5 | locateStronghold() | Throw eyes to triangulate stronghold, find End portal | ⏳ Pending
6 | defeatEnderDragon() | Enter End, destroy crystals, kill dragon | ⏳ Pending

Orchestrator: loads persistent state on start, 5 retries per chunk with exponential backoff, death recovery (return to drop coords, re-craft lost tools), dimension-aware. Atomic state writes to dragon_progress.json.
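
The per-chunk retry behaviour might look roughly like this. The sketch assumes "5 retries" means five attempts in total and a 1-second base delay that doubles each time; both are guesses, and the function names are hypothetical.

```javascript
// Hypothetical sketch of a per-chunk retry loop with exponential backoff.
// MAX_ATTEMPTS = 5 follows the PR text; the base delay is an assumption.
const MAX_ATTEMPTS = 5;

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function runChunkWithRetry(runChunk, baseDelayMs = 1000) {
  let lastError;
  for (let attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
    try {
      return await runChunk(attempt);
    } catch (err) {
      lastError = err;
      // Wait 1x, 2x, 4x, ... the base delay before the next attempt.
      if (attempt < MAX_ATTEMPTS - 1) await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError; // every attempt failed
}
```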

getDiamondPickaxe — 4-Tier Rewrite (RC29–RC31)

The getDiamondPickaxe() function was completely rewritten across RC29–RC31 to handle the specific limitations of the andy-4:q8_0 model:

Tier | Tools | Key Logic
1 (Wood) | Wooden pickaxe | Collect logs → ensureSticksAndTable() → craft wooden pickaxe. Skip if already have cobblestone + sticks.
2 (Stone) | Stone pickaxe | Mine 3 cobblestone → upgrade. Cobblestone inventory check before mining.
3 (Iron) | Iron pickaxe | stripMineForOre() at y=16 → smelt with furnace → craft. Auto-crafts furnace (8 cobblestone), finds fuel (coal/planks).
4 (Diamond) | Diamond pickaxe | stripMineForOre() at y=-58 → craft. Sideways cardinal relocation on retry (no goToSurface). Perpendicular strip-mine fallback. MAX_FALL_BLOCKS = 5 for cave gaps.

Helper: ensureSticksAndTable() — ensures 4+ sticks and a reachable crafting table (distance ≤ 4 blocks). Places table from inventory if navigation fails. Converts logs→planks if plank shortage.

Baritone A* Pathfinding (RC25+)

patches/@miner-org+mineflayer-baritone+4.5.0.patch + world.js + skills.js:

  • Custom A* pathfinder; distance-adaptive timeouts (dist × 1s + 10s, min 20s)
  • Ghost block handling for Paper servers (re-fetch block after nav)
  • isClearPath() temporarily disables break/place to test feasibility
  • Null-guard on isWalkable to prevent boundingBox crash on respawn
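
The distance-adaptive timeout formula above (dist × 1s + 10s, minimum 20s) works out as in the following sketch. The function name and the use of straight-line distance are assumptions; the PR description does not say how distance is measured.

```javascript
// Hypothetical helper for the distance-adaptive timeout:
// timeout = dist × 1 s + 10 s, with a floor of 20 s.
// Straight-line (Euclidean) distance is an assumption.
function navTimeoutMs(from, to) {
  const dist = Math.hypot(to.x - from.x, to.y - from.y, to.z - from.z);
  return Math.max(20_000, dist * 1000 + 10_000);
}
```

Under this formula the floor applies to any target closer than 10 blocks, and a target 100 blocks away gets 110 seconds.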

Skills RC13–RC31

RC | Fix
13 | safeToBreak filter rejecting tree logs + log-type fallback
14–17 | Multi-hop smart explore (200-block relocation), water avoidance, wider collect range
18–20 | Resilient collectBlock: retry on "Digging aborted" from combat, exclude failed positions
22 | Aikar GC flags
23 | Bypass bot.collectBlock.collect() for Paper compatibility
24 | Timeout-protected dig: goToPosition(15s), dig(10s), pickup(8s)
26 | Prefer doors over block-breaking; stale dig fix (re-fetch after nav)
27 | 9 runtime bug fixes from log review
29 | getDiamondPickaxe 4-tier rewrite, ensureSticksAndTable(), tier-skip logic
30 | pillarUp(), stripMineForOre(), broadened error handler (16+ non-fatal patterns), MAX_FALL_BLOCKS 2→5
31 | Auto-redirect blocked commands → !dragonProgression, furnace crafting + iron smelting, craftRecipe unreachable table fallback, debug tracing, surface recovery

Commit 3 — Vision System, HUD Overlay, Discord Bot

Vision System

  • Dynamic import() for camera module — graceful fallback if WebGL/canvas unavailable
  • Xvfb + Mesa software rendering in Docker (LIBGL_ALWAYS_SOFTWARE=1)
  • 2-second delay after Xvfb start for WebGL context initialization
  • Patched prismarine-viewer: entity bone parent null check, unknown entity suppression
  • Per-bot vision model support (e.g., grok-2-vision-1212 for CloudGrok)

HUD Overlay (mindserver.js + public/index.html)

  • Gaming-style web dashboard at :8080
  • Per-bot panels: runtime tracker, current goal, action display, scrollable command log
  • Live bot camera feeds via protocol-aware viewer iframes
  • Responsive CSS with toolbar and bot controls

Discord Bot (discord-bot.js, 1130 LOC)

  • Direct bot chat: talk to any bot via mapped Discord channels
  • Admin commands: !start, !stop, !restart with group-based control
  • Auto-fix monitor: watches bot-output events for errors, suggests fixes
  • Role-based access: DISCORD_ADMIN_IDS for privileged commands
  • Usage tracking: !usage [agent|all] shows token/cost stats
  • Path traversal guard + command injection detection on all user input

Commit 4 — Security Hardening

Module | What changed
src/utils/message_validator.js | Injection detection (shell commands, backticks, pipe-to-shell), character sanitization, length limits
src/utils/rate_limiter.js | Sliding-window per-user limiter with stale entry cleanup
src/utils/usage_tracker.js | Token usage tracking with cost estimation for active providers
src/utils/keys.js | Environment variables always override keys.json; .env support
src/agent/speak.js | TTS sanitization to prevent command injection
src/agent/commands/index.js | isCommandBlocked() against per-profile blocked_actions array
settings.js | deepSanitize() strips __proto__, constructor, prototype from SETTINGS_JSON overrides
discord-bot.js | Path traversal guard on profile loading
Dockerfile | Non-root user, minimal apt packages
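
A sliding-window, per-user limiter with stale-entry cleanup, as described for rate_limiter.js, could be sketched like this. The class and method names are assumptions and the real module may differ; the explicit `now` parameter is only there to make the behaviour easy to follow.

```javascript
// Hypothetical sketch of a sliding-window per-user rate limiter.
// Class and method names are assumptions, not the PR's rate_limiter.js API.
class SlidingWindowLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;       // max requests allowed inside one window
    this.windowMs = windowMs; // window length in milliseconds
    this.hits = new Map();    // userId -> timestamps of accepted requests
  }

  // Return true and record the request if the user is under the limit.
  allow(userId, now = Date.now()) {
    const cutoff = now - this.windowMs;
    const recent = (this.hits.get(userId) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(now);
    this.hits.set(userId, recent);
    return allowed;
  }

  // Drop users whose every recorded request has aged out of the window.
  cleanup(now = Date.now()) {
    const cutoff = now - this.windowMs;
    for (const [userId, times] of this.hits) {
      if (times.every((t) => t <= cutoff)) this.hits.delete(userId);
    }
  }
}
```

In a long-running process, cleanup() would typically be called on a timer so that users who stop sending messages do not accumulate in the map.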

Commit 5 — AWS / Docker Infrastructure

Docker / Compose

  • docker-compose.aws.yml: production config with LiteLLM proxy (:4000), ChromaDB, Tailscale sidecar, ENFORCE_SECURE_PROFILE=FALSE (required for mineflayer unsigned chat on Paper 1.19.1+)
  • docker-compose.yml: dev compose with Ollama host routing, port 19565 (was 25565)
  • Dockerfile: non-root user, Xvfb/Mesa for vision, memory 1536M → 2560M
  • Tasks.Dockerfile: separate image for task evaluation runner

AWS Scripts (aws/)

Script | Purpose
ec2-go.sh | One-command deploy: pull, rebuild, restart. IMDSv2 support, auto-detects local vs remote
setup.sh | Full EC2 provisioning: Docker, Tailscale, ChromaDB, environment setup
deploy.sh | Rsync-based deployment with SSM secret pulling
ec2-deploy.sh | Self-contained bootstrap for EC2 browser SSH
backup.sh / restore.sh | S3 world backup/restore
env-toggle.sh | Switch between cloud/local/hybrid environment configs
setup-ollama-proxy.sh | socat systemd service for Tailscale→Ollama routing

Observability

  • prometheus-aws.yml: Prometheus scrape config for EC2
  • grafana-provisioning/: dashboards, datasources, and alerting rules

Server Security

  • whitelist.json: pre-built offline UUIDs (avoids Playerdb API crash for ONLINE_MODE=FALSE bot names)
  • Port hardened to 19565 (non-default, external); internal remains 25565
  • ENFORCE_SECURE_PROFILE=FALSE: Paper 1.19.1+ requires cryptographically signed chat; mineflayer sends unsigned packets which are silently dropped without this flag

Commit 6 — Profiles, Core Agent, Model Providers, Config

New Profiles

Profile | Model | Purpose
ensemble.json | All-Grok 4-panel + grok-4 judge | CloudGrok ensemble bot
cloud-persistent.json | grok-4 | Single-model cloud fallback
dragon-slayer.json | sweaterdog/andy-4:q8_0 via Ollama | Autonomous Ender Dragon speedrun
local-research.json | sweaterdog/andy-4 via Ollama | Research and exploration

Core Agent Improvements

  • agent.js: death handler + respawn recovery, post-respawn grace period (5s), human-player message priority, max-command cap at 15
  • action_manager.js: deadlock detection, cancelResume(), improved error propagation
  • modes.js: night bed mode, door navigation preference, anti-team-kill guards, food panic spiral prevention
  • history.js / learnings.js: atomic writes (.tmp + rename), EBADF retry with exponential backoff
  • agent_process.js: 3 quick-exit retries with increasing delay
  • conversation.js: inter-bot messaging protocol; alias system (/msg gkGrok_En)
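
The ".tmp + rename" pattern mentioned for history.js and learnings.js can be sketched as below. The function name is hypothetical, the EBADF retry is left out for brevity, and the dynamic import is used only so the snippet loads from either CommonJS or ES modules.

```javascript
// Hypothetical sketch of an atomic JSON write using ".tmp + rename".
// The function name is an assumption; the EBADF retry is omitted.
async function writeJsonAtomic(filePath, data) {
  // Dynamic import keeps the sketch loadable as CommonJS or as an ES module.
  const { writeFile, rename } = await import('node:fs/promises');
  const tmpPath = `${filePath}.tmp`;
  // 1. Write the full payload to a temporary sibling file.
  await writeFile(tmpPath, JSON.stringify(data, null, 2), 'utf8');
  // 2. Swap it into place. On the same filesystem the rename replaces the
  //    target in one step, so a crash mid-write cannot leave a half-written file.
  await rename(tmpPath, filePath);
}
```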

Model Providers

  • prompter.js: $WIKI, $LEARNINGS, $MEMORY, $INVENTORY, $STATS placeholder injection
  • grok.js: updated for grok-4 / grok-4-fast-non-reasoning API
  • ollama.js: switched to http.request (avoids headers timeout), num_ctx/num_gpu params
  • All other providers: price table and API compatibility updates
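
Placeholder injection of the kind listed for prompter.js can be illustrated with a small substitution helper. This is a sketch under assumptions: the function name and the choice to leave unknown placeholders untouched are hypothetical, and the real prompter may resolve values asynchronously.

```javascript
// Hypothetical sketch of $PLACEHOLDER substitution in a prompt template.
// Unknown placeholders are left as-is (an assumption, not confirmed by the PR).
function fillPlaceholders(template, values) {
  // A replacer function is used so that "$" sequences inside the injected
  // text are inserted literally rather than treated as replacement patterns.
  return template.replace(/\$([A-Z_]+)/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(values, name) ? String(values[name]) : match,
  );
}
```

For example, `fillPlaceholders('Inventory: $INVENTORY', { INVENTORY: 'stick x4' })` would yield `'Inventory: stick x4'`.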

Config / Tooling

  • settings.js: deepSanitize(), SETTINGS_JSON env override, allow_vision, port 19565
  • eslint.config.js: flat config migration; zero-warning enforcement
  • .husky/pre-commit: lint gate on every commit
  • data/minecraft_wiki.json: wiki data for $WIKI prompt injection
  • CLAUDE.md: AI assistant guidance for codebase navigation
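
A recursive guard in the spirit of deepSanitize() might look like the following. It is a sketch, not the function in settings.js: the name is changed to make that clear, and the decision to rebuild objects rather than mutate them is an assumption.

```javascript
// Hypothetical sketch of a recursive prototype-pollution guard for parsed
// SETTINGS_JSON overrides. Not the PR's actual deepSanitize() implementation.
const FORBIDDEN_KEYS = new Set(['__proto__', 'constructor', 'prototype']);

function deepSanitizeSketch(value) {
  if (Array.isArray(value)) return value.map(deepSanitizeSketch);
  if (value === null || typeof value !== 'object') return value;
  const clean = {};
  for (const key of Object.keys(value)) {
    if (FORBIDDEN_KEYS.has(key)) continue; // drop dangerous keys at every depth
    clean[key] = deepSanitizeSketch(value[key]);
  }
  return clean;
}
```

JSON.parse creates `__proto__` as an ordinary own property, so iterating Object.keys is enough to see and drop it before the override is merged into settings.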

Commits 7–12 — RC29–RC31 DragonSlayer Fixes

These 6 commits represent iterative debugging and stabilization of the DragonSlayer bot during live testing on the EC2 server.

Commit 7 — Repetition Guard & Self-Prompt Safety

  • Repetition guard, ocean escape, context truncation, self-prompt safety

Commit 8 — Anti-Stuck Routing & Terrain Escape

  • Anti-stuck routing, terrain escape, getDiamondPickaxe log fallback

Commit 9 — CI/CD Workflows

  • GitHub Actions: ESLint lint workflow + EC2 deploy workflow

Commit 10 — Copilot Code Review Fixes

  • Ensemble judge/controller lint fixes, Discord bot improvements, speak.js fix, docker-compose corrections

Commit 11 — RC31: Full DragonSlayer Overhaul

The largest single change — complete rewrite of the DragonSlayer pipeline:

skills.js (+552 lines):

  • getDiamondPickaxe() 4-tier rewrite with ensureSticksAndTable() helper
  • pillarUp(bot, distance) — cobblestone/dirt self-scaffold to escape shafts
  • stripMineForOre(bot, oreNames, length) — 1×2 horizontal tunnel with 5-face ore detection
  • craftRecipe() unreachable table fallback — places crafting_table from inventory
  • digDown() MAX_FALL_BLOCKS increased from 2 to 5
  • Furnace crafting + fuel checking for iron smelting
  • Tier 4 sideways cardinal relocation (no goToSurface which caused nav timeouts)

agent.js (+25 lines):

  • RC31 auto-redirect: when a blocked command (!craftRecipe, !collectBlocks, !searchForBlock, !getCraftingPlan, !newAction) is detected, agent automatically executes !dragonProgression instead
  • This is the KEY mechanism — andy-4:q8_0 never generates !dragonProgression on its own
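
The redirect can be pictured with the sketch below. It is not the code in agent.js: the regex used to find the command name and the return shape are assumptions. The blocked command names and the !dragonProgression target are taken from the list above.

```javascript
// Hypothetical sketch of the RC31 auto-redirect. Command names come from the
// PR description; the parsing regex and return shape are assumptions.
const REDIRECTED_COMMANDS = new Set([
  '!craftRecipe',
  '!collectBlocks',
  '!searchForBlock',
  '!getCraftingPlan',
  '!newAction',
]);
const ORCHESTRATOR_COMMAND = '!dragonProgression';

// Pull the first "!commandName" token out of a model response, if any.
function extractCommandName(message) {
  const found = message.match(/![A-Za-z]+/);
  return found ? found[0] : null;
}

// Swap a blocked command for the orchestrator; pass everything else through.
function redirectIfBlocked(message) {
  const name = extractCommandName(message);
  if (name !== null && REDIRECTED_COMMANDS.has(name)) {
    return { command: ORCHESTRATOR_COMMAND, redirectedFrom: name };
  }
  return { command: message, redirectedFrom: null };
}
```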

init_agent.js (+34 lines):

  • RC30 broadened error handler with 16+ NON_FATAL_PATTERNS
  • isNonFatal(msg) helper catches Baritone pathfinder noise, navigation timeouts, connection resets
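
A pattern-based isNonFatal(msg) check could be structured as follows. The four patterns shown are illustrative guesses; the PR's actual NON_FATAL_PATTERNS list (16+ entries) is not reproduced in this description.

```javascript
// Hypothetical sketch of a non-fatal error classifier. The patterns below are
// illustrative assumptions, not the PR's actual NON_FATAL_PATTERNS list.
const NON_FATAL_PATTERNS_SKETCH = [
  /timed? ?out/i,            // navigation or request timeouts
  /path(finder|finding)/i,   // pathfinder noise
  /ECONNRESET/,              // connection resets
  /digging aborted/i,        // dig interrupted, e.g. by combat
];

function isNonFatal(msg) {
  const text = String(msg);
  return NON_FATAL_PATTERNS_SKETCH.some((pattern) => pattern.test(text));
}

// Decide what a global error handler should do with an error.
function classifyError(err) {
  const message = err && err.message ? err.message : String(err);
  return isNonFatal(message) ? 'swallow' : 'fatal';
}
```

A global handler (for example on `uncaughtException`) would log and continue on 'swallow' and exit on 'fatal', so that unknown errors still stop the process.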

dragon_runner.js (+46 lines):

  • RC31 debug tracing ([RC31] console.log markers) throughout prepareForChunk, orchestration loop, chunk runner, buildNetherPortal
  • Surface recovery for nether portal chunk (goToSurface + pillarUp fallback when y < 50)

dragon-slayer.json (profile overhaul):

  • blocked_actions: ["!startConversation", "!moveAway", "!craftRecipe", "!collectBlocks", "!searchForBlock", "!getCraftingPlan", "!newAction"]
  • 18+ conversation examples including blocked command redirects and chunk transitions
  • Self-prompt directs bot to ALWAYS call !dragonProgression

Commit 12 — Merge Commit

  • Merged remote Copilot review fixes into PR branch

Breaking Changes

  • Node.js v18+ required (v20 LTS recommended; v24+ may have issues with native modules)
  • .env file now required for API keys (previously keys.json only)
  • allow_insecure_coding defaults to false; !newAction is blocked unless explicitly enabled
  • Profile format: new fields (blocked_actions, conversation_examples, mode configs) — old profiles load but won't use new features
  • Server port changed from 25565 to 19565
  • Docker containers now request 2560M minimum memory

Test Plan

Manually Tested

  • Both bots connect and play simultaneously on shared world
  • Ensemble 3-phase pipeline produces correct responses (CloudGrok on EC2)
  • DragonSlayer (andy-4:q8_0 via Ollama on RTX 3090) connects and plays autonomously
  • Vision system works in Docker with Xvfb + Mesa
  • HUD overlay displays correctly at :8080
  • Baritone pathfinding handles Paper server ghost blocks
  • ENFORCE_SECURE_PROFILE=FALSE allows mineflayer bot chat to appear in-game
  • Auto-bed and door navigation work in survival mode
  • Dragon chunk 1 — diamond pickaxe crafted successfully (4-tier progression: wood→stone→iron→diamond)
  • Strip-mine found 4 iron ore blocks in 40-block tunnel at y=16
  • Furnace crafted, raw_iron smelted to iron_ingots
  • Iron pickaxe crafted (tier 3 milestone)
  • Diamond pickaxe crafted from 3 deepslate_diamond_ore (tier 4 milestone)
  • RC31 auto-redirect intercepts blocked commands and runs !dragonProgression
  • RC30 error handler catches Baritone noise without crashing
  • Death recovery resumes from checkpoint
  • Discord bot commands work (start/stop/restart/usage)
  • ESLint passes with zero warnings
  • Docker build succeeds
  • EC2 deployment via ec2-go.sh works
  • Whitelist correctly allows bot join without Playerdb API call
  • Bedrock client can connect via Geyser on port 19132
  • Java 1.8 client can connect via ViaRewind

In Progress

  • Dragon chunk 2 — nether portal: debug tracing added, surface recovery implemented, crash under investigation
  • Dragon chunks 3–6: pending chunk 2 completion

Automated

  • ESLint zero-warning gate (pre-commit hook + CI Docker build)
  • GitHub Actions lint workflow

Suggested Community Testing

  • Run with different LLM providers (OpenAI, Claude, etc.)
  • Test on various Minecraft server versions
  • Try ensemble with different panel model combinations
  • Run !beatMinecraft from a fresh world
  • Test Bedrock cross-play with a phone/console client

Copilot Review Responses

# | File | Issue | Status
1 | Tasks.Dockerfile | git clone pulls unpinned HEAD | Acknowledged: upstream task runner pattern. SHA pinning is valid but out of scope.
2 | src/mindcraft/mindcraft.js | agent_name in /tmp path unsanitized | Fixed: path.basename() sanitization added.
3 | src/agent/conversation.js | endAllConversations() async but doesn't await | Fixed: removed unnecessary async keyword.
4 | src/agent/library/world.js | isClearPath() assumes bot.ashfinder exists | Fixed: added bot.ashfinder?.config null-guard with fallback.
5 | src/agent/commands/index.js | delete commandList.find(...) doesn't remove from array | Acknowledged: blocklist enforced via isCommandBlocked() at execution time; vestigial delete cleaned up.
6 | src/agent/vision/vision_interpreter.js | Camera may be undefined during async init | Fixed: capture() guards on this.camera?.ready.
7 | eslint.config.js | Broad globals may hide real no-undef bugs | Acknowledged: globals are required for the SES sandbox runtime (coder.js).
8 | Tasks.Dockerfile | Supply-chain risk from unpinned clone | Same as #1.

How to Review

Suggested order for a large PR:

  1. New standalone modules: src/ensemble/ (6 files) and src/agent/library/dragon_runner.js / dragon_progress.js. Self-contained, don't modify existing code.
  2. Version compatibility: docker-compose.aws.yml plugin list, ENFORCE_SECURE_PROFILE flag, settings.js minecraft_version: "auto".
  3. Security additions: src/utils/message_validator.js, rate_limiter.js, settings.js (deepSanitize). Small, focused files.
  4. Core agent changes: agent.js, action_manager.js, modes.js. These modify existing behavior.
  5. Skills changes: skills.js is the largest diff (+552 lines in RC31 alone). The RC13–RC31 changes are incremental, each with a clear commit message.
  6. Infrastructure: docker-compose*.yml, Dockerfile, aws/ scripts. Review for correctness and security.
  7. Profiles: JSON configuration. Spot-check a few for valid structure.

Files that can be skimmed or skipped: README.md (docs only), discord-bot.js (additive, doesn't touch core), start.ps1 (Windows convenience script), CLAUDE.md (AI assistant guidance).

…DB memory

Adds the 3-phase ensemble pipeline used by CloudGrok:
- panel.js: queries 4 Grok models in parallel (60s timeout)
- arbiter.js: heuristic scoring (length, completeness, action quality, latency);
  escalates to judge when top-2 within 0.08 margin
- judge.js: LLM-as-judge (grok-4) picks best response; 30s timeout with fallback
- feedback.js: ChromaDB vector memory (3072-dim Gemini embeddings); retrieves
  similar past decisions (similarity > 0.6) and injects as [PAST EXPERIENCE]
- logger.js: writes every decision to bots/{BotName}/ensemble_log.json
- controller.js: EnsembleModel class — drop-in replacement for any single model

Integration: profile.model = 'ensemble' routes through EnsembleModel, which
implements the same sendRequest() interface as all other model providers.
Z0mb13V1 added 5 commits March 2, 2026 06:39
…tone pathfinding

Ender Dragon automation (dragon_runner.js + dragon_progress.js):
- 6 gameplay chunks: getDiamondPickaxe, buildNetherPortal, collectBlazeRods,
  collectEnderPearls, locateStronghold, defeatEnderDragon
- Persistent state (dragon_progress.json) with atomic writes and corruption recovery
- 5 retries per chunk with exponential backoff; death recovery returns to drop coords
- !beatMinecraft / !dragonProgression commands (120-180 min timeout)

Baritone A* pathfinding (RC25+):
- Custom A* pathfinder replacing mineflayer-pathfinder; distance-adaptive timeouts
- Ghost block handling for Paper servers (re-fetch block after nav)
- isClearPath() with Baritone integration; null-guard on isWalkable for respawn

Skills RC13-RC29 (skills.js + world.js):
- RC13: safeToBreak filter fix for tree logs
- RC14-17: multi-hop smart explore (200-block relocation), water avoidance, wider collect range
- RC18-20: resilient collectBlock — retry on combat, exclude failed positions
- RC22: Aikar GC flags; RC23: bypass bot.collectBlock.collect() for Paper
- RC24: timeout-protected dig (goToPosition 15s, dig 10s, pickup 8s)
- RC26: prefer doors over block-breaking; stale dig fix (re-fetch after nav)
- RC27: 9 runtime bug fixes from log review

Vision system (src/agent/vision/):
- Dynamic import() for camera module with graceful fallback if WebGL unavailable
- Xvfb + Mesa software rendering in Docker (LIBGL_ALWAYS_SOFTWARE=1)
- 2-second delay after Xvfb start for WebGL context initialization
- Patched prismarine-viewer: entity bone parent null check, unknown entity suppression
- Per-bot vision model support (grok-2-vision-1212 for CloudGrok)

HUD overlay (mindserver.js + public/index.html):
- Gaming-style web dashboard at :8080
- Per-bot panels: runtime tracker, current goal, action display, scrollable command log
- Live bot camera feeds via protocol-aware viewer iframes
- Toolbar with bot controls; responsive CSS

Discord bot (discord-bot.js):
- Direct bot chat via Discord channels
- Admin commands: !start, !stop, !restart with group-based control
- Auto-fix monitor: watches bot-output events for errors, suggests fixes
- Role-based access (DISCORD_ADMIN_IDS), usage tracking (!usage [agent|all])
- Path traversal guard + command injection detection on all user input
- MindServer integration: live agent status display

Windows launcher (start.ps1): one-command start/stop/detach for all bot profiles
…ctions, key loading

New security modules:
- message_validator.js: injection detection (shell commands, backticks, pipe-to-shell),
  character sanitization, and length limits for Discord/Minecraft chat
- rate_limiter.js: sliding-window per-user limiter with automatic stale entry cleanup
- usage_tracker.js: token usage tracking with cost estimation for active providers

Existing file hardening:
- keys.js: environment variables always override keys.json; added .env support
- speak.js: TTS sanitization to prevent command injection in text-to-speech
- commands/index.js: isCommandBlocked() check against per-profile blocked_actions array
- settings.js: deepSanitize() strips __proto__, constructor, prototype from SETTINGS_JSON
…Grafana monitoring

Docker / Compose:
- docker-compose.aws.yml: production config with LiteLLM proxy (:4000),
  ChromaDB, Tailscale sidecar, ENFORCE_SECURE_PROFILE=FALSE for mineflayer chat
- docker-compose.yml: dev compose with Ollama host routing, port 19565 (was 25565)
- Dockerfile: non-root user, Xvfb/Mesa for vision, memory 1536M→2560M
- Tasks.Dockerfile: separate image for task evaluation runner
- .dockerignore: exclude bot logs, node_modules, world saves from build context

AWS scripts (aws/):
- ec2-go.sh: one-command deploy (pull/rebuild/restart); IMDSv2 support, auto-detects
  local vs remote execution
- setup.sh: full EC2 provisioning — Docker, Tailscale, ChromaDB, environment setup
- deploy.sh: rsync-based deployment with SSM secret pulling
- ec2-deploy.sh: self-contained bootstrap for EC2 browser SSH
- env-toggle.sh: switch between cloud/local/hybrid environment configs
- backup.sh / restore.sh: S3 world backup and restore
- setup-ollama-proxy.sh: socat systemd service for Tailscale→Ollama routing

Observability:
- prometheus-aws.yml: Prometheus scrape config for EC2 deployment
- grafana-provisioning/: dashboards, datasources, and alerting rules

Security:
- whitelist.json: pre-built offline UUIDs (avoids Playerdb crash for ONLINE_MODE=FALSE)
- .env.example / keys.example.json: API key templates
- .husky/pre-commit: ESLint zero-warning gate on every commit

Patches:
- prismarine-viewer: entity bone parent null check, unknown entity suppression

Copilot AI left a comment

Pull request overview

This PR introduces a “Hybrid Research Rig” setup for running multiple autonomous Minecraft agents concurrently (cloud ensemble + local GPU), adds an ensemble decision pipeline with ChromaDB-backed experience retrieval, expands automation/task profiles (notably Ender Dragon progression), and includes multiple security/ops hardening changes across runtime, Docker/AWS, and messaging.

Changes:

  • Adds an ensemble model controller (panel → heuristic arbiter → optional LLM judge) with decision logging and ChromaDB similarity memory.
  • Adds/updates infra & runtime features: remote agent support, usage tracking, rate limiting/message validation, Docker/AWS scripts/configs.
  • Expands profiles/tasks for dragon progression and multi-mode provider routing.

Reviewed changes

Copilot reviewed 146 out of 147 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| Tasks.Dockerfile | Reworks the tasks runner image (Node 22 slim, Java 21, AWS CLI, non-root) and switches to copying repo source instead of git clone. |
| package.json | Adds/updates dependencies and introduces prepare/lint/lint-staged scripts and additional overrides. |
| src/utils/usage_tracker.js | Adds per-agent token/cost usage tracking with periodic persistence and rolling RPM/TPM stats. |
| src/utils/rate_limiter.js | Adds an in-memory sliding-window rate limiter with periodic stale-entry cleanup. |
| src/utils/message_validator.js | Adds message validation/sanitization for Discord and Minecraft chat plus username validation. |
| src/utils/keys.js | Makes env vars override keys.json, adds sanitization and one-time warnings for legacy key usage. |
| src/ensemble/* | Adds ensemble querying, scoring, judging, logging, and ChromaDB-backed feedback/memory retrieval. |
| src/process/init_agent.js | Adds remote MindServer connection mode and global handlers for uncaught errors/rejections. |
| src/process/agent_process.js | Adds remote-agent spawn arguments, restart/backoff logic, and process.execPath spawning. |
| Dockerfile | Improves caching, runs tests in build, and drops root privileges for runtime. |
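
For readers unfamiliar with the pattern named for src/utils/rate_limiter.js, here is a minimal sketch of an in-memory sliding-window limiter with stale-entry cleanup. It is not the PR's code: the class name `SlidingWindowRateLimiter`, its methods, and the injectable `now` argument are all assumptions made for illustration.

```javascript
// Hypothetical sliding-window limiter; names and parameters are illustrative
// and are not taken from src/utils/rate_limiter.js.
class SlidingWindowRateLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;       // max allowed calls per window
    this.windowMs = windowMs; // window length in milliseconds
    this.hits = new Map();    // key -> array of timestamps (ms), oldest first
  }

  // Returns true and records the call if `key` is under its limit.
  // `now` is injectable so the behaviour can be checked deterministically.
  allow(key, now = Date.now()) {
    const cutoff = now - this.windowMs;
    const recent = (this.hits.get(key) || []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(key, recent);
    return true;
  }

  // Drops keys that have no timestamps left inside the window.
  cleanup(now = Date.now()) {
    const cutoff = now - this.windowMs;
    for (const [key, times] of this.hits) {
      if (!times.some((t) => t > cutoff)) this.hits.delete(key);
    }
  }
}
```

A caller would typically run `cleanup()` from a `setInterval` timer (calling `.unref()` on it in Node so the timer does not keep the process alive); whether the PR does this is not shown in the thread.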


COPY ./server_data.zip /mindcraft/
RUN unzip -q server_data.zip && rm server_data.zip

RUN npm ci --omit=dev

Copilot AI Mar 2, 2026

npm ci --omit=dev will skip devDependencies, but this repo relies on install scripts that invoke dev tools (e.g., postinstall: patch-package and prepare: husky install in package.json). This is likely to fail the image build because patch-package/husky won’t be present. Consider either (1) running npm ci without --omit=dev and then pruning dev deps, or (2) disabling install scripts (--ignore-scripts / HUSKY=0) and ensuring patches are applied another way.

Suggested change
- RUN npm ci --omit=dev
+ RUN npm ci && npm prune --omit=dev

read -r -p "Are you sure? Type 'yes' to confirm: " CONFIRM
[[ "$CONFIRM" == "yes" ]] || { echo "Aborted."; exit 0; }

SSH_OPTS="-i ${KEY_FILE} -o StrictHostKeyChecking=no"

Copilot AI Mar 2, 2026

This restore helper also invokes ssh with -o StrictHostKeyChecking=no, which disables host key verification and leaves the session open to a transparent man-in-the-middle attack. A network attacker could impersonate the EC2 host so that your restore runs against an attacker-controlled machine, potentially exposing world data or other state. Replace StrictHostKeyChecking=no with accept-new (or pre-seed known_hosts) so that SSH records the host key on first contact and rejects a changed key on subsequent connections.

Suggested change
- SSH_OPTS="-i ${KEY_FILE} -o StrictHostKeyChecking=no"
+ SSH_OPTS="-i ${KEY_FILE} -o StrictHostKeyChecking=accept-new"

return
fi
info "Stopping Mindcraft containers on EC2..."
SSH_OPTS="-i ${KEY_FILE} -o StrictHostKeyChecking=no -o ConnectTimeout=5"

Copilot AI Mar 2, 2026

Stopping the EC2 workloads via ssh with -o StrictHostKeyChecking=no disables host key verification, so a man-in-the-middle can impersonate your instance and receive these administrative commands. While this script currently only stops containers and the instance, using insecure SSH options here normalizes a pattern that can lead to more sensitive operations being MITM'd later. Prefer StrictHostKeyChecking=accept-new or managing known_hosts so that SSH refuses connections when the EC2 host key does not match the expected value.

Suggested change
- SSH_OPTS="-i ${KEY_FILE} -o StrictHostKeyChecking=no -o ConnectTimeout=5"
+ SSH_OPTS="-i ${KEY_FILE} -o StrictHostKeyChecking=accept-new -o ConnectTimeout=5"

[[ -n "${KEY_FILE:-}" ]] || error "KEY_FILE not set in config.env"
[[ -f "$KEY_FILE" ]] || error "SSH key not found: ${KEY_FILE}. Run aws/setup.sh first."

SSH_OPTS="-i ${KEY_FILE} -o StrictHostKeyChecking=no -o ConnectTimeout=10"

Copilot AI Mar 2, 2026

Using ssh with -o StrictHostKeyChecking=no disables host key verification, which allows a man-in-the-middle to impersonate your EC2 instance and run arbitrary commands under the ubuntu user when you run this deploy script. An attacker on the network path could intercept code syncs and remote commands, or trick you into configuring secrets on their host instead of the real server. Switch to StrictHostKeyChecking=accept-new (or pre-populate known_hosts) so the SSH client verifies the EC2 host key on subsequent connections.

Suggested change
- SSH_OPTS="-i ${KEY_FILE} -o StrictHostKeyChecking=no -o ConnectTimeout=10"
+ SSH_OPTS="-i ${KEY_FILE} -o StrictHostKeyChecking=accept-new -o ConnectTimeout=10"

# shellcheck source=/dev/null
source "$CONFIG_FILE"
[[ -n "${EC2_IP:-}" ]] || { echo "EC2_IP not set"; exit 1; }
SSH_OPTS="-i ${KEY_FILE} -o StrictHostKeyChecking=no"

Copilot AI Mar 2, 2026

Here ssh is invoked with -o StrictHostKeyChecking=no, which disables verification of the EC2 host key and makes it possible for a man-in-the-middle to impersonate your instance during backup. An attacker able to intercept this SSH connection could cause backups to run against their own host or observe any data the script might later transmit over this channel. Use StrictHostKeyChecking=accept-new or manage known_hosts explicitly so that SSH refuses connections when the host key changes unexpectedly.

Suggested change
- SSH_OPTS="-i ${KEY_FILE} -o StrictHostKeyChecking=no"
+ SSH_OPTS="-i ${KEY_FILE} -o StrictHostKeyChecking=accept-new"

Z0mb13V1 force-pushed the pr/hybrid-rig-v0.1.3 branch from 01679b6 to a270f05 on March 2, 2026 11:40
Z0mb13V1 and others added 6 commits March 2, 2026 06:57
…safety

- agent.js: inject [ANTI-LOOP] system message after _forceExplore fires so
  small models (andy-4:q8_0) cannot loop back to !collectBlocks
- dragon_runner.js (death-recovery): ocean escape before wood-collection loop:
  check for any log within 64 blocks; if none, explore up to 3×200 blocks;
  fall back to getDiamondPickaxe() if still no trees after 600 blocks
- history.js: after each memSaving cycle, cap this.turns at 15 entries to
  prevent context overflow / instruction collapse on small local models
- dragon-slayer.json: sharpen ANTI-STUCK rules (never retry !collectBlocks
  immediately; use !getDiamondPickaxe for the full chain); add conversation
  example showing correct response after repeated collect-0 + ANTI-LOOP hint
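
The history.js cap described in this commit can be pictured with a small sketch. The helper name `capTurns` is hypothetical, and keeping the most recent entries is an assumption; the commit message only says that `this.turns` is capped at 15.

```javascript
// Hypothetical helper: trims a turn list to its most recent `maxTurns` entries.
// The real history.js may trim differently; only the cap of 15 comes from the commit message.
function capTurns(turns, maxTurns = 15) {
  if (turns.length <= maxTurns) return turns;
  return turns.slice(turns.length - maxTurns);
}

// Example: 40 synthetic turns are reduced to the last 15.
const exampleTurns = Array.from({ length: 40 }, (_, i) => ({ role: 'user', content: `turn ${i}` }));
const cappedTurns = capTurns(exampleTurns);
console.log(cappedTurns.length, cappedTurns[0].content); // prints: 15 turn 25
```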
…ain hardening (#3)

* Initial plan

* fix: security, correctness, and reliability bugs across 7 files

- judge.js: fix timer leak in Promise.race (clear timeout on success and error)
- controller.js: wrap getSimilar() in try-catch to prevent crash when ChromaDB goes down
- feedback.js: add null check for _client before deleteCollection in error handler
- speak.js: use '--' end-of-options for macOS 'say' (consistent with espeak, prevents flag injection)
- discord-bot.js: move Gemini API key from URL query param to x-goog-api-key header
- dragon_runner.js: add try-catch to death handler, validate coordinates are finite
- docker-compose.aws.yml: pin 5 image tags to specific versions instead of :latest
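
The judge.js item above (clearing the timeout when a `Promise.race` settles) follows a common pattern, sketched below. The helper `withTimeout` and its parameters are hypothetical; the actual function names in judge.js are not shown in this thread.

```javascript
// Hypothetical helper illustrating timer cleanup around Promise.race.
// Without clearTimeout, the pending timer keeps running (and can keep the
// process alive) after `promise` has already settled.
function withTimeout(promise, ms, label = 'operation') {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  // finally() runs on both fulfilment and rejection, so the timer is cleared either way.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

A call such as `await withTimeout(judgeRequest, 30000, 'judge')` would reject with a timeout error if `judgeRequest` (a placeholder for any pending promise) does not settle within 30 seconds.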

Co-authored-by: Z0mb13V1 <201696719+Z0mb13V1@users.noreply.github.com>

* fix: revert macOS say to space-prefix guard (say does not support --)

Co-authored-by: Z0mb13V1 <201696719+Z0mb13V1@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Z0mb13V1 <201696719+Z0mb13V1@users.noreply.github.com>
…llback

- dragon-slayer: redirect all stuck messages to !getDiamondPickaxe (not !explore);
  add examples covering moveAway-blocked, nav timeout, and collectBlocks-0 cases
- skills.js: collectBlocks failure message now points to !getDiamondPickaxe;
  explore() gains terrain-escape branch (goToSurface + random hop) on consecutive
  path failures instead of silently breaking; getDiamondPickaxe retries log search
  with explore(200) before giving up, re-reads inventory to handle mismatched log types
- local-research: switch model to sweaterdog/andy-4:F16
…

- Rewrote getDiamondPickaxe with 4-tier progression (wood→stone→iron→diamond)
- Added ensureSticksAndTable helper, furnace crafting, iron smelting
- Added pillarUp and stripMineForOre skills
- RC29/RC31 auto-redirect: blocked crafting commands → !dragonProgression
- Blocked actions: craftRecipe, collectBlocks, searchForBlock, getCraftingPlan, newAction
- RC30 broadened error handler (16+ non-fatal patterns for Baritone noise)
- craftRecipe unreachable table fallback (place from inventory)
- digDown MAX_FALL_BLOCKS 2→5, tier 4 sideways relocation
- dragon_runner.js: RC31 debug tracing, surface recovery for nether portal chunk
- Profile overhaul: conversation examples, self_prompt, blocked_actions
- Chunk 1 (diamond_pickaxe) COMPLETE, chunk 2 (nether_portal) in progress
async function callGemini(systemPrompt, userMessage, history = []) {
if (!GOOGLE_API_KEY) return null;
try {
const url = 'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent';

@uukelele uukelele Mar 3, 2026

Why don't you just use the google genai client library?

uukelele commented Mar 3, 2026

This PR is far too bloated to ever be merged. Just include what you need, don't make countless adjustments. And don't include things that don't need to be included, like AWS integration or a discord bot.

uukelele commented Mar 4, 2026

[2601.15494] Vibe Coding Kills Open Source

Why did you open like 20 different PRs...
