fix: Dockerfile patch order, viewer iframe hostname, Xvfb timing, EPIPE handling, 14 code review fixes, ensemble system, HUD overlay #710
Closed
Z0mb13V1 wants to merge 239 commits into mindcraft-bots:develop from
Conversation
- Removed hardcoded Discord bot token from all files
- Removed hardcoded Discord webhook URL from backup scripts
- Implemented environment variable system (.env)
- Updated .gitignore with comprehensive secret patterns
- Created security documentation and incident reports
- All secrets now managed via .env file (not committed)

This commit starts a fresh history without any exposed credentials. Previous commits contained sensitive data and have been purged.

Security incident ref: Discord Security Alert - Token exposure detected
Date: February 24, 2026
Bot changes:
- Renamed: gemini -> Gemini_1, gemini2 -> Gemini_2, Grok -> Grok_1
- Added Grok_1 profile (grok-code-fast-1 via xAI Cloud)
- Added Grok_1 to settings.js profiles array
- All conversing prompts updated with identity facts (name, model, provider, compute)
- init_message now requests model + compute type on spawn

Embedding fix:
- gemini.js embed() return: result.embeddings -> result?.embedding?.values (@google/genai v1.x API change)
- Embedding model: text-embedding-004 -> gemini-embedding-001 (text-embedding-004 not available in v1beta endpoint)
- grok.json embedding: openai -> google/gemini-embedding-001
- skill_library.js: log actual error message in catch block

Infrastructure:
- Container names: minecraft-pc -> minecraft-server, added mindcraft-agents
- settings.js host: 192.168.0.30 -> minecraft-server
- allow_vision: false (WebGL unavailable in Docker)
- check-backup.ps1: fix PS env var eval in param() defaults
- .env.example: MINDSERVER_HOST updated to mindcraft-agents

Docs:
- CHANGELOG.md: created with full version history
…docs Bugs fixed: - grok.js: res?.replace() null safety (API can return null content -> TypeError) - gemini.js sendVisionRequest: generationConfig -> config (v1.x SDK field) - discord-bot.js HELP_TEXT: gemini: example -> Gemini_1: - AVAILABLE-MODELS.md: stale profile names + grok-beta -> grok-code-fast-1 Docs: - CHANGELOG.md: added [Unreleased] workflow instructions + bug fix entries - AVAILABLE-MODELS.md: model selection guide updated
…MD022/MD032/MD024
- Added _modes config (cloud, local, hybrid) to all 18 profile JSON files - Created switch-mode.ps1 script to swap between modes - Cloud: uses cloud API models (Gemini, Grok, Claude, GPT, etc.) - Local: uses Ollama models via host.docker.internal:11434 - Hybrid: cloud model for chat, local Ollama model for code generation - Script auto-updates conversing prompt identity (Compute: line) - Default targets: Gemini_1, Gemini_2, Grok_1. Use -All for all profiles. - Usage: .\switch-mode.ps1 -Mode local [-Profile gemini,grok] [-Restart]
- Added !mode [cloud|local|hybrid] [profile] command to discord-bot.js - !mode with no args shows current modes for all active profiles - Switches model, embedding, code_model, and conversing prompt compute type - Auto-restarts affected agents via MindServer after switching - Mounted profiles/ directory into discord-bot container - Updated HELP_TEXT with !mode command
…oup) - Added bot groups: all, gemini (both), grok, cloud, 1 (slot 1), 2 (slot 2) - !start/!stop/!restart now accept groups, names, or comma-separated lists - Added !startall command alongside existing !stopall - resolveAgents() supports exact match, partial match, and group expansion - Updated HELP_TEXT with group syntax and examples
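The group resolution described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from discord-bot.js: the roster, the group membership, and the lookup order (group first, then exact name, then partial match) are all assumptions, and the `cloud` group is omitted because its membership is not stated.

```javascript
// Hypothetical roster and group table; the real ones live in discord-bot.js.
const AGENTS = ['Gemini_1', 'Gemini_2', 'Grok_1'];
const GROUPS = {
  all: AGENTS,
  gemini: ['Gemini_1', 'Gemini_2'],
  grok: ['Grok_1'],
  1: ['Gemini_1', 'Grok_1'], // assumed meaning of "slot 1"
  2: ['Gemini_2'],           // assumed meaning of "slot 2"
};

// Expand a comma-separated argument into a de-duplicated list of agent names.
function resolveAgents(input) {
  const resolved = new Set();
  for (const raw of input.split(',')) {
    const token = raw.trim().toLowerCase();
    if (!token) continue;

    // 1. Group expansion.
    if (Object.hasOwn(GROUPS, token)) {
      GROUPS[token].forEach((name) => resolved.add(name));
      continue;
    }
    // 2. Exact (case-insensitive) name match.
    const exact = AGENTS.find((name) => name.toLowerCase() === token);
    if (exact) {
      resolved.add(exact);
      continue;
    }
    // 3. Partial match: every agent whose name contains the token.
    AGENTS.filter((name) => name.toLowerCase().includes(token))
      .forEach((name) => resolved.add(name));
  }
  return [...resolved];
}
```

With this table, `resolveAgents('grok_1, 2')` would yield `Grok_1` and `Gemini_2`, and an unknown name yields an empty list that the caller can report back to the user.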
…king - Migrate all 18 profiles from Ollama to vLLM (google/gemma-3-12b-it) for local/hybrid modes - Rewrite src/models/vllm.js with _lastUsage tracking, <think> tag handling, vision support - Add persistent learning system (src/agent/learnings.js) that tracks command outcomes and injects $LEARNINGS into prompts for continuous bot improvement - Enhance usage tracker with RPM/TPM rolling window and offline disk-read fallback - Update discord-bot.js formatAgentUsage() to show cost, RPM, TPM per agent - Update mindserver.js to serve usage data for offline agents from disk - Add commented LiteLLM service block to docker-compose.yml for future use - Fix check-backup.ps1 parse error (add CmdletBinding), switch-mode.ps1 lint warnings - Add tmpclaude-* to .gitignore Co-Authored-By: Claude <noreply@anthropic.com>
Major improvements:
1. API Keys: Environment-first loading (env vars > keys.json)
2. Input Validation: New sanitization for Discord & Minecraft messages
3. vLLM GPU Memory: 90% → 70% for stability (RTX 3090)
4. Discord Rate Limiting: 5 msgs/60s per user
5. Async File I/O: Non-blocking profile operations in bot commands

New files:
- src/utils/message_validator.js (70 lines) - message/username validation
- src/utils/rate_limiter.js (59 lines) - in-memory rate limiter
- test-phase2.js - automated test suite for all optimizations
- services/vllm/install.sh, start.sh, test.py - vLLM setup
- setup-wsl-vllm.ps1 - WSL2 + vLLM initialization script
- PHASE2-VERIFICATION.txt - complete optimization summary

Changes:
- discord-bot.js: +validation, +rate limiting, +async file I/O, !mode async
- src/agent/agent.js: +message validation, +alias support
- src/utils/keys.js: refactored to env-first pattern
- docker-compose.yml: +./src volume mount for discord-bot
- All profiles: updated model references (compatible with vLLM)

Tests: All 5 optimizations verified and functional
Bot: Running stable with agents connected
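A per-user limit of 5 messages per 60 seconds, as listed above, could be implemented with a small in-memory sliding window. The sketch below is an assumption about the shape of such a limiter; the class name and method are hypothetical and are not taken from src/utils/rate_limiter.js.

```javascript
// Hypothetical in-memory sliding-window limiter (not the repo's rate_limiter.js).
class RateLimiter {
  constructor(maxEvents = 5, windowMs = 60_000) {
    this.maxEvents = maxEvents;
    this.windowMs = windowMs;
    this.hits = new Map(); // userId -> timestamps (ms) of accepted messages
  }

  // Returns true if the message is allowed, false if the user is over the limit.
  // `now` is injectable so the behaviour can be tested without waiting.
  allow(userId, now = Date.now()) {
    const recent = (this.hits.get(userId) ?? [])
      .filter((t) => now - t < this.windowMs); // drop entries older than the window
    const allowed = recent.length < this.maxEvents;
    if (allowed) recent.push(now);
    this.hits.set(userId, recent);
    return allowed;
  }
}
```

Because the state lives in a `Map`, the counters reset whenever the bot process restarts, which is usually acceptable for chat throttling but would not be for billing or abuse tracking.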
Add `* text=auto eol=lf` to enforce LF line endings for all text files in the repo, preventing CRLF/LF noise diffs in WSL environments. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Gemini_1 and Grok_1 were configured in hybrid mode with code generation routed to a local vLLM server (Qwen/Qwen2.5-7B-Instruct at port 8000) that is not running, causing all !newAction calls to silently fail. Both bots now use their cloud chat models for code generation until vLLM is available again. The _modes.hybrid config is preserved for future use when vLLM is set up. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- speak.js: replace exec() shell interpolation with spawn() arg arrays; Windows TTS text now passed via stdin (never touches shell command line)
- lockdown.js: document evalTaming trade-off (unsafeEval required for mineflayer/protodef compatibility; AI code still runs in Compartment)
- controller.js: use path.resolve() + boundary check to prevent path traversal when loading NPC construction JSON files
- settings.js: filter __proto__/constructor/prototype keys from SETTINGS_JSON env var before Object.assign to block prototype pollution

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
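Two of the hardening steps in this commit, the path boundary check and the key filter, follow well-known patterns that can be sketched as below. The function names `resolveInside` and `mergeSafe` are hypothetical, and the filter shown is shallow (top-level keys only); the actual controller.js and settings.js code may differ.

```javascript
const path = require('node:path');

// Resolve fileName relative to baseDir and refuse anything that lands outside it.
function resolveInside(baseDir, fileName) {
  const base = path.resolve(baseDir);
  const target = path.resolve(base, fileName);
  // Comparing against base + separator also rejects siblings such as "/data/npc-evil".
  if (target !== base && !target.startsWith(base + path.sep)) {
    throw new Error(`Path escapes ${base}: ${fileName}`);
  }
  return target;
}

const FORBIDDEN_KEYS = new Set(['__proto__', 'constructor', 'prototype']);

// Copy top-level keys from untrusted JSON onto target, skipping keys that
// could be used to tamper with the prototype chain.
function mergeSafe(target, untrustedJson) {
  const parsed = JSON.parse(untrustedJson);
  for (const [key, value] of Object.entries(parsed)) {
    if (!FORBIDDEN_KEYS.has(key)) target[key] = value;
  }
  return target;
}
```

The filter matters because `JSON.parse` creates `__proto__` as an ordinary own property, and a plain assignment or `Object.assign` of that key onto another object would replace that object's prototype.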
…=true, update deps (discord.js^14.25.1, mineflayer-auto-eat^5.0.3 w/ loader), add axios^1.7.7 override (vulns)
- Add nvidia-gpu-exporter service to docker-compose (port 9835) - Add Prometheus scrape target for nvidia-gpu-exporter - Add Docker Monitoring Grafana dashboard (mindcraft-bots#179) - Remove dead dcgm-exporter service (subscription-gated, never worked) - Remove dcgm Prometheus target and DCGM Grafana dashboard - Fix YAML indentation in prometheus.yml (nvidia-gpu job was over-indented) - Fix YAML indentation in grafana-provisioning/dashboards.yml - Add nvidia_gpu_exporter/ source repo to .gitignore Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- grafana: add NVIDIA GPU Metrics dashboard (#14574) to dashboards.yml - grafana: add explicit uid to prometheus datasource (required for alert refs) - grafana: add alerting/alerts.yml with 5 rules (GPU temp/util/VRAM, exporter down, mindcraft container down); mount alerting dir in docker-compose - package.json: pin undici ^6.23.0 in overrides to fix GHSA-g9mf-h72j-4rw9 without downgrading discord.js v14 - profiles: fix _active_mode hybrid→cloud on both bots (no code_model at top level, so hybrid label was incorrect) - profiles/gemini.json: fix identity facts (Cloud-based), remove invalid !jump reference, add failure recovery instruction - profiles/grok.json: fix identity facts, add sleep/use door to task triggers, add failure recovery instruction Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The deploy.resources.reservations.devices hierarchy was missing between the ports and driver entries, causing VS Code YAML linter to flag "All mapping items must start at the same column" error at line 81. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- tar: override to ^7.5.8 (fixes GHSA-8qq5-rm4j-mr97, GHSA-34x7-hfp2-rc4v, GHSA-r6q2-hw4h-h46w, GHSA-83g3-92jg-28cx — hardlink/symlink issues in node-gyp used by gl/node-canvas-webgl; macOS APFS risk, low on Linux) - undici: relax override to >=6.23.0 (allows 6.23.0+ and 7.x, fixes GHSA-g9mf-h72j-4rw9; cheerio@1.2.0 now resolves to 7.22.0 instead of outside override range; discord.js/rest override to 6.x+ compatible) npm audit still shows warnings because node_modules are root-owned from Docker. Apply fixes via Docker rebuild: run 'docker build .' from Windows CMD. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Previous dashboards.yml used an invalid url/id format that Grafana does
not support for provisioning. Grafana requires a file-based provider
pointing to local JSON dashboard files.
- dashboards.yml: rewrite to proper 'providers' format pointing to
/etc/grafana/provisioning/dashboards/json
- dashboard-json/: add downloaded JSON for all 3 dashboards:
- node-exporter-full.json (id 1860 - Node Exporter Full)
- docker-monitoring.json (id 179 - Docker Monitoring)
- nvidia-gpu.json (id 14574 - NVIDIA GPU Metrics)
nvidia-gpu.json: replace \${DS_PROMETHEUS} template vars with
datasource name 'prometheus' for provisioning compatibility
- docker-compose.monitoring.yml: mount dashboard-json/ dir into
Grafana container at provisioning path
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
…ng.yml Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Adds full infrastructure-as-code for migrating the Mindcraft stack to AWS EC2 using GitHub Student Pack credits (~$100).

Infrastructure (aws/setup.sh):
- VPC, subnet, IGW, route table
- Security group: Minecraft (0.0.0.0/0:25565), admin ports (Tyler IP only)
- S3 bucket: block all public access, SSE-S3, versioning, HTTPS-only policy
- IAM role with least-privilege S3 + SSM access
- SSM Parameter Store for all API keys (SecureString, never in git)
- EC2 t3.large, Ubuntu 24.04, 30GB gp3, IAM role attached

EC2 bootstrap (aws/user-data.sh):
- Installs Docker, Docker Compose plugin, AWS CLI v2 on first boot

Deployment (aws/deploy.sh):
- rsync code to EC2 (excludes secrets, node_modules, world data)
- Pulls all API keys from SSM → writes keys.json + .env on EC2
- Starts containers with docker-compose.aws.yml

Docker Compose AWS variant (docker-compose.aws.yml):
- MEMORY=2G (Minecraft, down from 4G), VIEW_DISTANCE=6
- Resource limits on all services (fits t3.large / t3.medium)
- No nvidia-gpu-exporter (no GPU on EC2)
- All monitoring services included (prometheus, grafana, cadvisor, node-exporter)

Prometheus AWS config (prometheus-aws.yml):
- No nvidia-gpu scrape job

Backup/restore (aws/backup.sh, aws/restore.sh):
- Sync minecraft-data + bot memory to S3 with SSE-S3
- Stops Minecraft briefly for consistent world snapshot
- Auto-runs every 6h via cron

Teardown (aws/teardown.sh):
- Destroys EC2, IAM, SG, VPC, SSM params
- S3 deletion is opt-in to protect backups

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rer, local-research) - Reduce max chat from 2 sentences to 1, prefer commands with zero words - bot_responder: strongly prefer 'ignore' for all inter-bot messages - ClaudeExplorer cooldown: 2s → 4s - Add explicit "do NOT narrate" instructions to all active profiles Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Wrap mode.update() in try-catch so a navigation timeout (or any mode error) disables the mode instead of crashing the process. Add a 5-second grace period after respawn before self_defense re-engages mobs, breaking the death spiral where bots immediately charge mobs on respawn. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
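The two behaviours described here, isolating a failing mode and delaying self_defense after respawn, might look roughly like the sketch below. It is synchronous for brevity (the real `update()` may be async), and `runModes`, `state.lastRespawnAt`, and the mode object shape are assumptions for illustration.

```javascript
const RESPAWN_GRACE_MS = 5_000; // grace period after respawn, per the commit message

// Run one tick of every enabled mode. `state.lastRespawnAt` is a hypothetical
// timestamp (ms) recorded when the bot last respawned.
function runModes(modes, state, now = Date.now()) {
  for (const mode of modes) {
    if (!mode.on) continue;

    // Do not re-engage mobs until the grace period has elapsed.
    const inGrace = now - state.lastRespawnAt < RESPAWN_GRACE_MS;
    if (mode.name === 'self_defense' && inGrace) continue;

    try {
      mode.update();
    } catch (err) {
      // A failing mode is switched off instead of taking the process down.
      mode.on = false;
      console.error(`Mode ${mode.name} disabled after error: ${err.message}`);
    }
  }
}
```

Disabling only the offending mode keeps the remaining modes ticking, at the cost of needing some other path to turn the mode back on later.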
Ban !stay, "spawn protection", and "giving up" concepts. Make biome escape the highest priority after respawn — immediately goToCoordinates 300+ blocks away instead of searching/exploring locally. Prevent saving_memory from persisting "trapped" or "spawn protection" beliefs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
bot.dig() silently no-ops when given a stale block reference (chunks unloaded/reloaded during pathfinding), causing collectBlocks to report items 0→0. Re-fetch via bot.blockAt() after navigation and skip if the block changed. Applied to both normal and mustCollectManually paths. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
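A minimal sketch of the re-fetch step, assuming a mineflayer-style `bot` that exposes `blockAt(position)` and `dig(block)`. The helper name `digIfUnchanged` and the choice to compare blocks by `name` are assumptions; the actual collectBlocks change may compare other fields.

```javascript
// Re-read the block at the target position after navigation and dig only if
// it is still the block we set out to collect. Returns true if a dig was issued.
async function digIfUnchanged(bot, staleBlock) {
  const fresh = bot.blockAt(staleBlock.position); // null if the chunk is not loaded
  if (!fresh || fresh.name !== staleBlock.name) {
    return false; // block is gone or changed, so skip it
  }
  await bot.dig(fresh); // dig the fresh reference, never the stale one
  return true;
}
```

Returning a boolean lets the caller count only blocks that were actually dug, which avoids reporting progress for skipped targets.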
Log block stateId, bot position before dig, and block name after dig to determine if bot.dig() is silently failing server-side. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- coder.js: invert missingSkills filter (was returning existing skills, not missing ones)
- coder.js: fix undefined variable 'result' -> 'write_result' in error log
- coder.js: add configurable execution timeout via Promise.race (respects code_timeout_mins setting)
- settings.js + keys.js + discord-bot.js + prompter.js: export deepSanitize() and apply to all untrusted JSON.parse calls (profile files, keys.json) to prevent prototype pollution
- Dockerfile: drop root privileges with chown -R node:node /app and USER node before CMD

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
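The Promise.race timeout mentioned for coder.js is a common pattern; a generic version is sketched below. The helper name `withTimeout` is hypothetical, and the real code presumably derives the limit from the `code_timeout_mins` setting rather than taking milliseconds directly.

```javascript
// Reject if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms, label = 'code execution') {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms,
    );
  });
  // Whichever settles first wins; always clear the timer so it cannot
  // keep the process alive after the race is decided.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

Note that losing the race does not cancel the underlying work: the original promise keeps running, so long-lived actions still need their own interruption mechanism.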
Documents the end-game goal sequence (diamond pickaxe → nether portal → blazerods → ender pearls → stronghold → ender dragon) as isolated, composable chunks in every profile's _TODO field. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
State machine skill + command wrapper design for the first end-game progression chunk. Approved and ready for implementation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
4-task plan: getDiamondPickaxe skill, !getDiamondPickaxe command, manual verification, push. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…s#712) Add entity existence checks in attackEntity() before sending attack packets. Prevents invalid_entity_attacked kicks when entities despawn between targeting and attack — affects single-attack, post-navigation, and pvp.attack() paths. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…us monitoring script
… action deadlocks - Lazy-load prismarine-viewer/canvas to prevent crash on Node v24 - Add windowsHide to spawn() to prevent CLI windows on Windows - Force-cancel actions after 10s instead of killing the entire process - Add bot.quit() before process.exit for clean MC session release - Use exit code 88 for name conflicts with 60s reconnect delay - Add 30s restart delay (normal) and 20s min quick-exit delay - Ignore unnamed restart-agent WebSocket events from stale UI - Filter *used ...* action broadcasts from unrecognized bots - Fix self-prompt/init_message race condition on restart - Set only_chat_with to Zombie_Virus; disable auto_open_ui - Disable CloudGrok profile locally (already running on EC2) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Overrides bad default conversation_examples that taught the bot to increase search range instead of exploring (root cause of stuck loops) - New examples explicitly map "Unable to reach" / "collected 0" / "timed out" to !explore(200) - Simplifies conversing prompt 5x (3500→700 chars), flat numbered IF/THEN rules with anti-stuck at top of context window - Phase-based self_prompt: wood→stone→iron→diamond pickaxe progression - Removes all CloudGrok references (bot runs solo) - Adds hunting:true mode for passive food gathering - Switches model to andy-4:q8_0 (F16 caused OOM on long prompts) - Adds 90s timeout + timeout handler to Ollama HTTP requests Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…e found getNearestFreeSpace() returns undefined when no valid placement spot exists, causing TypeError on pos.x. Now returns false with a log message instead. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add full 6-chunk dragon progression (fresh world → dragon defeat):
- dragon_runner.js: modular chunks (nether portal, blaze rods, ender pearls, stronghold, dragon fight) with orchestrator and retry logic
- 6 new skills: safeMoveTo, rangedAttack, buildPanicRoom, ensureFed, autoManageInventory, stockpileFood
- 2 new modes: auto_eat (hunger < 14), panic_defense (health < 6)
- 13 new commands registered in actions.js
- dragon-slayer.json profile + updated local-research.json
- 7 task JSON files for individual/full-run automation

Fix bot getting stuck in gather loops and unstuck crash:
- action_manager: set _forceExplore flag after 3 zero-collect results instead of warning message the LLM ignores
- agent.js: auto-execute !explore(200) when flag is set, bypassing LLM
- modes.js unstuck: increase crash timeout 10s→30s (moveAway needs ~13.4s), add try/catch with brute-force walk fallback
- Fix ReferenceError: log→skills.log in searchForBlock handler

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
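The gather-loop fix amounts to a small counter plus an override. The sketch below shows one way to express it; the class `GatherWatchdog` is hypothetical, and it assumes the three zero-collect results must be consecutive, which the commit message does not state.

```javascript
// Hypothetical stand-in for the _forceExplore logic split across
// action_manager and agent.js.
class GatherWatchdog {
  constructor(limit = 3) {
    this.limit = limit;
    this.zeroStreak = 0;
    this.forceExplore = false;
  }

  // Call after each collect action with the number of items gathered.
  record(collectedCount) {
    this.zeroStreak = collectedCount === 0 ? this.zeroStreak + 1 : 0;
    if (this.zeroStreak >= this.limit) {
      this.forceExplore = true;
      this.zeroStreak = 0;
    }
  }

  // Returns the command to run next: the model's choice, unless the flag
  // is set, in which case the exploration command is substituted once.
  nextCommand(llmCommand) {
    if (!this.forceExplore) return llmCommand;
    this.forceExplore = false;
    return '!explore(200)';
  }
}
```

Substituting the command directly, rather than appending a warning to the prompt, removes the dependency on the model noticing and obeying the hint.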
…rld saves, stale bots, experiments, etc.) Preserved: full dragon automation (dragon_runner.js, tasks/dragon/, dragon-slayer profile, ensemble, custom patches)
- Removed CI badge (workflow deleted), dead doc links (docs/ removed), stale refs to .env.example, litellm, rig-go.ps1, experiments/ - Replaced docs table with CLAUDE.md + wiki - Removed unused writeFileSync imports from history.js, learnings.js, prompter.js - DragonSlayer profile: removed dead base_profile key, added vision_model, bot_responder, \/\ template vars, fixed memory label - Removed stale commented profiles from settings.js
- Baritone breakBlocks default → false (mcdata.js) - startDoorInterval 2-phase: doors at 1.2s, last-resort break at 8s - New night_bed mode: auto-navigates to bed at dusk, crafts+places if needed - Door-preference & bed instructions added to all profile prompts - night_bed: true added to _default.json, local-research, dragon-slayer modes
- Fix corrupted learnings.json (empty 0 bytes) with valid [] JSON
- Add JSON load fallback: empty/corrupted memory files start fresh instead of crash-looping
- Atomic JSON writes: write to .tmp then rename to prevent corruption on crash
- Deferred log fallback: only expand to all log types after primary type yields 0 results
- Raise slow loop threshold for craftRecipe (5→8) to allow legitimate crafting chains
- Guard selectAPI() against undefined profile parameter
- Distinguish blocked commands from hallucinated: better log messages and LLM feedback
- Fix checkDigProgress false aborts: require 2 consecutive unharvestable checks before stopping
- Wrap Baritone executor.js path-undefined crash with process error handlers and promise catch
- Add uncaught exception/rejection handlers in init_agent.js for non-fatal Baritone errors
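The load fallback and atomic write from this commit can be sketched with Node's synchronous `fs` API as below. The function names are hypothetical and the repo may use the async variants; the key idea is that a rename within one directory replaces the file in a single step, so a crash leaves either the old file or the new one.

```javascript
const fs = require('node:fs');

// Read a JSON file, falling back to `fallback` if the file is missing,
// empty, or not valid JSON, instead of throwing and crash-looping.
function loadJsonOrDefault(file, fallback) {
  try {
    const text = fs.readFileSync(file, 'utf8');
    if (text.trim() === '') return fallback;
    return JSON.parse(text);
  } catch (err) {
    return fallback;
  }
}

// Write to a sibling .tmp file first, then rename it over the target.
// A crash during the write can only damage the .tmp file.
function writeJsonAtomic(file, data) {
  const tmp = `${file}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(data, null, 2));
  fs.renameSync(tmp, file);
}
```

The temporary file sits next to the target on purpose: renaming across filesystems is not atomic and can fail outright.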
5528992 to 9c9d77f
This was referenced Mar 3, 2026
3/3/2026
PR #716 consolidated everything from #710, #714, #717, and #718.
This PR was superseded by #716.
Summary
Test Plan