security: add input validation, rate limiting, and prototype pollution guard #718

Draft
Z0mb13V1 wants to merge 1 commit into mindcraft-bots:develop from Z0mb13V1:security/input-validation-rate-limiting

@Z0mb13V1 Z0mb13V1 commented Mar 3, 2026

3/3/2026
PR #716 consolidated everything from #710, #714, #717, and #718.
This PR was superseded by #716.

Summary

Adds three complementary security layers that harden Mindcraft against malicious or malformed user input — without changing any existing behaviour for well-formed traffic.

| Layer | File | Purpose |
| --- | --- | --- |
| Message validation | src/utils/message_validator.js | Sanitise & validate Discord / Minecraft messages and usernames before they reach the LLM pipeline |
| Rate limiting | src/utils/rate_limiter.js | Per-user sliding-window rate limiter with automatic stale-entry cleanup |
| Prototype-pollution guard | settings.js | Recursively strips __proto__, constructor, and prototype keys from SETTINGS_JSON before merging |

Details

src/utils/message_validator.js (new)

  • Three exports: validateDiscordMessage(), validateMinecraftMessage(), validateUsername()
  • Each returns { valid, error?, sanitized?, warnings? } so callers get both a pass/fail flag and a sanitised string
  • Detects shell command-injection patterns (rm, curl, wget, backtick execution, $(...) substitution, pipe-to-shell)
  • Strips control characters (\x00-\x1F, BOM) and replaces null bytes / newlines in Minecraft messages
  • Validates usernames against Minecraft's [a-zA-Z0-9_]{3,16} format
  • Configurable length caps (512 chars Discord, 256 chars Minecraft)
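
A minimal sketch of how a validator with the `{ valid, error?, sanitized?, warnings? }` shape described above might look. This is not the code from the PR: the function name `validateChatMessage`, the pattern list, and the warning text are assumptions made for illustration, and only a subset of the listed injection patterns is covered.

```javascript
// Hypothetical sketch; names and patterns are assumptions, not the PR's code.
const MAX_LENGTH = 256; // Minecraft cap quoted in the description

// Illustrative subset of the shell-injection patterns listed above.
const SUSPICIOUS_PATTERNS = [
    /`[^`]*`/,            // backtick execution
    /\$\([^)]*\)/,        // $(...) substitution
    /\|\s*(?:ba)?sh\b/,   // pipe-to-shell
];

function validateChatMessage(message) {
    if (typeof message !== 'string') {
        return { valid: false, error: 'Message must be a string' };
    }
    if (message.length > MAX_LENGTH) {
        return { valid: false, error: `Message exceeds ${MAX_LENGTH} characters` };
    }
    if (SUSPICIOUS_PATTERNS.some((re) => re.test(message))) {
        return { valid: false, error: 'Message contains a suspicious pattern' };
    }

    const warnings = [];
    // Newlines become spaces; remaining control characters and the BOM are dropped.
    const sanitized = message
        .replace(/[\r\n]+/g, ' ')
        .replace(/[\x00-\x1F\uFEFF]/g, '')
        .trim();
    if (sanitized !== message) warnings.push('Message was sanitized');
    if (!sanitized) return { valid: false, error: 'Empty message' };

    return { valid: true, sanitized, warnings };
}
```

Callers would branch on `valid` and pass only `sanitized` downstream, treating `warnings` as advisory.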

src/utils/rate_limiter.js (new)

  • RateLimiter class — sliding-window algorithm, default 5 requests / 60 s per user
  • _purgeStale() runs on a background interval (default 5 min) to reclaim memory from inactive users
  • interval.unref() so the timer never prevents clean process exit
  • getStats() for observability; destroy() for clean shutdown in tests
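
The sliding-window behaviour described above can be sketched roughly as follows. The constructor defaults, `_purgeStale()`, `getStats()`, and `destroy()` mirror names from the description; the class name `SlidingWindowLimiter`, the `isAllowed()` method, its optional `now` parameter, and the shape of the stats object are assumptions added so the example can run on its own.

```javascript
// Hypothetical sketch of a sliding-window limiter; not the PR's implementation.
class SlidingWindowLimiter {
    constructor(maxRequests = 5, windowMs = 60000, cleanupIntervalMs = 300000) {
        this.maxRequests = maxRequests;
        this.windowMs = windowMs;
        this.requests = new Map(); // userId -> array of request timestamps (ms)

        this._cleanupInterval = setInterval(() => this._purgeStale(), cleanupIntervalMs);
        // unref() lets Node exit even while the timer is still pending.
        if (this._cleanupInterval.unref) this._cleanupInterval.unref();
    }

    // Returns true and records the request if the user is under the limit.
    isAllowed(userId, now = Date.now()) {
        const cutoff = now - this.windowMs;
        const recent = (this.requests.get(userId) || []).filter((t) => t > cutoff);
        if (recent.length >= this.maxRequests) {
            this.requests.set(userId, recent);
            return false;
        }
        recent.push(now);
        this.requests.set(userId, recent);
        return true;
    }

    // Drops users whose timestamps have all aged out of the window.
    _purgeStale(now = Date.now()) {
        const cutoff = now - this.windowMs;
        for (const [userId, stamps] of this.requests) {
            if (stamps.every((t) => t <= cutoff)) this.requests.delete(userId);
        }
    }

    getStats() {
        return { trackedUsers: this.requests.size };
    }

    destroy() {
        clearInterval(this._cleanupInterval);
        this.requests.clear();
    }
}
```

Because each request is checked against timestamps from the trailing `windowMs`, a user regains capacity gradually as old requests age out rather than all at once at a fixed boundary.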

settings.js: deepSanitize()

  • Recursive function that walks any depth of nested objects/arrays
  • Strips keys in { __proto__, constructor, prototype } before Object.assign
  • Applied to JSON.parse(process.env.SETTINGS_JSON) so no polluted key ever reaches settings
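
As a rough illustration of the guard described above, the sketch below strips the three blocked keys at every depth before merging. It is an assumption-laden stand-in rather than the PR's function body: the sample JSON string and the `host` default are invented for the example.

```javascript
// Hypothetical sketch; the PR's actual deepSanitize() may differ in detail.
const BLOCKED_KEYS = new Set(['__proto__', 'constructor', 'prototype']);

function deepSanitize(value) {
    if (Array.isArray(value)) return value.map((item) => deepSanitize(item));
    if (value === null || typeof value !== 'object') return value;

    const clean = {};
    for (const key of Object.keys(value)) {
        if (BLOCKED_KEYS.has(key)) continue; // drop polluting keys
        clean[key] = deepSanitize(value[key]);
    }
    return clean;
}

// JSON.parse creates "__proto__" as an own property, so Object.keys sees it.
const parsed = JSON.parse(
    '{"port": 8080, "__proto__": {"isAdmin": true}, ' +
    '"nested": {"constructor": 1, "ok": [1, {"prototype": 2}]}}'
);
const settings = Object.assign({ host: 'localhost' }, deepSanitize(parsed));
```

Without the sanitising step, `Object.assign` would copy the own `__proto__` property through the target's prototype setter and replace the prototype of the merged settings object.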

src/agent/agent.js — integration

  • Imports validateMinecraftMessage and validateUsername
  • respondFunc now validates the username and message early; rejects with a console warning if either is invalid
  • All downstream code receives the sanitised cleanMessage instead of the raw string
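
The early-rejection flow described above might be sketched as below. To keep the example runnable in isolation, the validators and the downstream handler are injected; `makeRespondFunc`, `handle`, and `warn` are hypothetical names, whereas in the PR the validators are imported from message_validator.js and the logic sits inside the existing respondFunc.

```javascript
// Hypothetical sketch of the early-rejection flow; dependencies are injected
// so the example does not rely on the rest of agent.js.
function makeRespondFunc({ validateUsername, validateMinecraftMessage, handle, warn = console.warn }) {
    return function respondFunc(username, message) {
        const userCheck = validateUsername(username);
        if (!userCheck.valid) {
            warn(`Rejected username: ${userCheck.error}`);
            return false;
        }
        const msgCheck = validateMinecraftMessage(message);
        if (!msgCheck.valid) {
            warn(`Rejected message from ${username}: ${msgCheck.error}`);
            return false;
        }
        const cleanMessage = msgCheck.sanitized;
        handle(username, cleanMessage); // downstream code never sees the raw string
        return true;
    };
}
```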

Testing

  • All four modified/new files pass npx eslint with zero new warnings
  • Remaining ESLint output is pre-existing upstream (no-undef for process, semi in settings object literal, require-await)
  • Manual verification: valid messages pass through unchanged; messages containing shell injection patterns, control chars, or oversized payloads are rejected or trimmed as expected

Backward Compatibility

  • Zero behaviour change for legitimate users sending normal chat messages
  • Rate limiter defaults (5 req / 60 s) are generous enough for normal play; constructor accepts custom values for different deployments
  • deepSanitize is a no-op when the JSON contains no prototype-polluting keys (i.e. all current usage)

…n guard

- Add src/utils/message_validator.js: validates and sanitizes Discord/Minecraft
  messages and usernames; detects command injection, Unicode exploits, and
  protocol-level attack strings
- Add src/utils/rate_limiter.js: per-user sliding-window rate limiter with
  automatic stale-entry cleanup to prevent memory leaks
- Add deepSanitize() to settings.js: recursively strips __proto__, constructor,
  and prototype keys from SETTINGS_JSON before merging, preventing prototype
  pollution via environment variables
- Integrate message validation in src/agent/agent.js respondFunc: rejects
  invalid usernames and sanitises messages before they reach the LLM pipeline
Copilot AI review requested due to automatic review settings March 3, 2026 15:55

Copilot AI left a comment

Pull request overview

This PR aims to harden inbound/untrusted inputs across the app by adding message sanitization utilities, a prototype-pollution guard for SETTINGS_JSON, and a (currently standalone) in-memory rate limiter, plus integrating Minecraft message/username validation into the agent’s inbound chat handling.

Changes:

  • Added src/utils/message_validator.js with Discord/Minecraft message and username validation/sanitization helpers.
  • Added src/utils/rate_limiter.js implementing a sliding-window per-user rate limiter with periodic stale-entry cleanup.
  • Integrated Minecraft message/username validation into src/agent/agent.js and added deepSanitize() to settings.js to strip prototype-polluting keys from SETTINGS_JSON before merging.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| src/utils/rate_limiter.js | Introduces RateLimiter utility, but no integration/callers found in src/. |
| src/utils/message_validator.js | Adds validators/sanitizers; includes some validation-order issues and API/behavior mismatches with PR description. |
| src/agent/agent.js | Applies username/message validation and propagates sanitized Minecraft messages downstream. |
| settings.js | Adds recursive sanitization to prevent prototype pollution from SETTINGS_JSON before Object.assign. |

Comments suppressed due to low confidence (1)

src/agent/agent.js:24

  • PR description says validators return { valid, message, cleanMessage }, but the implementation returns { valid, sanitized, ... } (and the agent integration expects .sanitized). Either update the API to match the documented contract or update the PR description / JSDoc so callers have a consistent, stable shape.
import { validateMinecraftMessage, validateUsername } from '../utils/message_validator.js';

export class Agent {
    async start(load_mem=false, init_message=null, count_id=0) {
        this.last_sender = null;

Comment on lines +22 to +27
export function validateDiscordMessage(message) {
    if (!message) return { valid: false, error: 'Empty message' };
    if (typeof message !== 'string') return { valid: false, error: 'Message must be a string' };
    if (message.length > MAX_DISCORD_MESSAGE_LENGTH) {
        return { valid: false, error: `Message exceeds ${MAX_DISCORD_MESSAGE_LENGTH} characters` };
    }

Copilot AI Mar 3, 2026

validateDiscordMessage checks !message before validating typeof message === 'string', so non-string falsy values (e.g. 0) will be reported as "Empty message" instead of a type error. Consider checking the type first, then treating empty/whitespace-only strings as empty after trimming.
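
One way the reordering suggested here could look, as a sketch only: the type check runs first, and emptiness is decided on the trimmed string. The 512-character cap comes from the PR description; returning the trimmed text as `sanitized` and omitting the other sanitisation steps are simplifying assumptions.

```javascript
// Hypothetical reordering; the remaining sanitisation steps are omitted.
const MAX_DISCORD_MESSAGE_LENGTH = 512; // cap quoted in the PR description

function validateDiscordMessage(message) {
    // Type check first, so 0, null, and undefined get a type error.
    if (typeof message !== 'string') {
        return { valid: false, error: 'Message must be a string' };
    }
    // Whitespace-only strings count as empty.
    const trimmed = message.trim();
    if (!trimmed) return { valid: false, error: 'Empty message' };
    if (trimmed.length > MAX_DISCORD_MESSAGE_LENGTH) {
        return { valid: false, error: `Message exceeds ${MAX_DISCORD_MESSAGE_LENGTH} characters` };
    }
    return { valid: true, sanitized: trimmed };
}
```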

Comment on lines +64 to +69
export function validateMinecraftMessage(message) {
    if (!message) return { valid: false, error: 'Empty message' };
    if (typeof message !== 'string') return { valid: false, error: 'Message must be a string' };
    if (message.length > MAX_MINECRAFT_MESSAGE_LENGTH) {
        return { valid: false, error: `Message exceeds ${MAX_MINECRAFT_MESSAGE_LENGTH} characters` };
    }

Copilot AI Mar 3, 2026

validateMinecraftMessage checks !message before checking typeof message === 'string', so non-string falsy inputs will be misclassified as "Empty message". Reorder the validations (type check first), and consider trimming/collapsing whitespace after replacing newlines to avoid producing leading spaces.

Comment on lines +93 to +95
    if (!username) return { valid: false, error: 'Username is empty' };
    if (typeof username !== 'string') return { valid: false, error: 'Username must be a string' };
    if (!/^[a-zA-Z0-9_]{3,16}$/.test(username)) {

Copilot AI Mar 3, 2026

validateUsername checks !username before validating the input type, so a non-string falsy value would return "Username is empty" rather than "Username must be a string". Reorder validations to check typeof username first, then validate emptiness/format on the trimmed string.

Suggested change
- if (!username) return { valid: false, error: 'Username is empty' };
- if (typeof username !== 'string') return { valid: false, error: 'Username must be a string' };
- if (!/^[a-zA-Z0-9_]{3,16}$/.test(username)) {
+ if (typeof username !== 'string') return { valid: false, error: 'Username must be a string' };
+ const trimmedUsername = username.trim();
+ if (!trimmedUsername) return { valid: false, error: 'Username is empty' };
+ if (!/^[a-zA-Z0-9_]{3,16}$/.test(trimmedUsername)) {

Comment on lines +174 to 179
      const cleanMessage = msgValidation.sanitized;

      if (settings.only_chat_with.length > 0 && !settings.only_chat_with.includes(username)) return;
      try {
-         if (ignore_messages.some((m) => message.startsWith(m))) return;
+         if (ignore_messages.some((m) => cleanMessage.startsWith(m))) return;


Copilot AI Mar 3, 2026

cleanMessage may start with a space after sanitization (e.g. when the raw message begins with \n/\r and those are replaced with spaces). That can bypass the ignore_messages.some(m => cleanMessage.startsWith(m)) filter. Consider trimming (or at least trimStart()) the sanitized message before startsWith checks and downstream processing.
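
The bypass described here can be reproduced in a few lines. The prefix list below is invented for the demonstration (the real ignore_messages array is defined in agent.js and is not shown in this PR), and `isIgnored` is a hypothetical helper name.

```javascript
// Hypothetical prefix list; the real ignore_messages array lives in agent.js.
const ignorePrefixes = ['Teleported ', 'Gave '];

// Trim leading whitespace before the prefix check so a sanitised
// leading newline cannot hide an otherwise ignored message.
function isIgnored(cleanMessage) {
    const candidate = cleanMessage.trimStart();
    return ignorePrefixes.some((prefix) => candidate.startsWith(prefix));
}

// A raw message starting with "\n" begins with a space after sanitisation.
const sanitizedSample = '\nTeleported Steve to 0, 64, 0'.replace(/[\r\n]/g, ' ');

const untrimmedMatch = ignorePrefixes.some((p) => sanitizedSample.startsWith(p)); // false
const trimmedMatch = isIgnored(sanitizedSample);                                  // true
```

Trimming inside the validator itself would arguably be cleaner, since every downstream consumer would then see the same normalised string.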

Comment on lines +1 to +18
/**
 * Simple in-memory rate limiter to prevent abuse.
 * Includes automatic stale entry cleanup to prevent memory leaks.
 */

export class RateLimiter {
    constructor(maxRequests = 5, windowMs = 60000, cleanupIntervalMs = 300000) {
        this.maxRequests = maxRequests; // Max requests per window
        this.windowMs = windowMs; // Time window in milliseconds
        this.requests = new Map(); // userId → [timestamps]

        // Periodically purge stale entries to prevent unbounded memory growth
        this._cleanupInterval = setInterval(() => this._purgeStale(), cleanupIntervalMs);
        // Allow the process to exit even if the interval is still running
        if (this._cleanupInterval.unref) {
            this._cleanupInterval.unref();
        }
    }

Copilot AI Mar 3, 2026

RateLimiter is introduced but not referenced anywhere in src/ (no callers found), so this PR doesn't actually apply rate limiting as described in the PR summary/title. Either integrate it at the appropriate ingress (e.g., message receive path) or adjust the PR description/title to reflect the current scope.
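
A sketch of what integration at the message receive path might look like. The excerpt above does not show the limiter's check method, so `isAllowed(userId)` is an assumed name, and `withRateLimit` and `makeCountingLimiter` are hypothetical helpers; the counting limiter is only a stand-in so the example runs without rate_limiter.js.

```javascript
// Hypothetical wiring; `limiter` is any object exposing isAllowed(userId),
// which is an assumed method name.
function withRateLimit(limiter, handler, onLimited = () => {}) {
    return (username, message) => {
        if (!limiter.isAllowed(username)) {
            onLimited(username); // e.g. log a warning or notify the sender
            return false;
        }
        handler(username, message);
        return true;
    };
}

// Tiny stand-in limiter so the sketch runs on its own:
// it allows `max` calls per user and never resets.
function makeCountingLimiter(max) {
    const counts = new Map();
    return {
        isAllowed(userId) {
            const n = (counts.get(userId) || 0) + 1;
            counts.set(userId, n);
            return n <= max;
        },
    };
}
```

Wrapping the handler keeps the limiter check in one place and lets tests swap in a deterministic limiter.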

Comment on lines +63 to +69
/**
 * Recursively strips prototype-polluting keys from an object.
 * Prevents __proto__, constructor, and prototype injection at any depth
 * when parsing untrusted JSON (e.g. SETTINGS_JSON env var).
 */
function deepSanitize(obj) {
    const BLOCKED_KEYS = new Set(['__proto__', 'constructor', 'prototype']);

Copilot AI Mar 3, 2026

deepSanitize recreates BLOCKED_KEYS on every recursive call, which is unnecessary work (especially on large nested inputs). Move BLOCKED_KEYS to a module-level constant (or at least outside the function body) so recursion reuses the same set.

Suggested change
- /**
-  * Recursively strips prototype-polluting keys from an object.
-  * Prevents __proto__, constructor, and prototype injection at any depth
-  * when parsing untrusted JSON (e.g. SETTINGS_JSON env var).
-  */
- function deepSanitize(obj) {
-     const BLOCKED_KEYS = new Set(['__proto__', 'constructor', 'prototype']);
+ const BLOCKED_KEYS = new Set(['__proto__', 'constructor', 'prototype']);
+ /**
+  * Recursively strips prototype-polluting keys from an object.
+  * Prevents __proto__, constructor, and prototype injection at any depth
+  * when parsing untrusted JSON (e.g. SETTINGS_JSON env var).
+  */
+ function deepSanitize(obj) {

@Z0mb13V1 Z0mb13V1 closed this Mar 3, 2026
@Z0mb13V1 Z0mb13V1 deleted the security/input-validation-rate-limiting branch March 3, 2026 18:35
@Z0mb13V1 Z0mb13V1 restored the security/input-validation-rate-limiting branch March 3, 2026 18:36
@Z0mb13V1 Z0mb13V1 deleted the security/input-validation-rate-limiting branch March 3, 2026 19:09
@Z0mb13V1 Z0mb13V1 restored the security/input-validation-rate-limiting branch March 3, 2026 20:37
@Z0mb13V1 Z0mb13V1 reopened this Mar 3, 2026
@Z0mb13V1 Z0mb13V1 marked this pull request as draft March 3, 2026 20:39

2 participants