ProcmonMCP is a Model Context Protocol (MCP) server (single Python file, ~2300 lines) that enables LLMs to autonomously analyze Process Monitor (Procmon) XML log files. The project loads a Procmon XML file into memory with string interning and index optimizations, then exposes 14 MCP tools for querying events, inspecting processes, and exporting results.
The project is functional and demonstrates solid domain knowledge of both the MCP protocol and Procmon data analysis. The review below covers architecture, code quality, security, performance, testing, and packaging.
Strengths:
- Two-pass XML parsing (processes first, then events) is well-designed for building the process lookup table before event enrichment.
- String interning via `StringInterner` is an effective memory optimization for repetitive Procmon data (process names, operations, paths, results).
- Pre-built indices for process name and operation enable fast filtering without full scans.
- Compression support (gzip, bz2, lzma) adds practical flexibility.
- Graceful fallbacks (lxml -> stdlib ElementTree, psutil optional, MCP SDK mock) are pragmatic.
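The interning technique can be illustrated with a minimal sketch (a simplified stand-in, not the project's actual `StringInterner`):

```python
class MiniInterner:
    """Simplified stand-in for an interner: maps each distinct string to a
    small integer ID so repeated values (process names, operations, paths)
    are stored once and events hold cheap integer references."""

    def __init__(self):
        self._ids = {}      # string -> id
        self._strings = []  # id -> string

    def get_id(self, s):
        """Return the ID for s, interning it on first sight."""
        if s not in self._ids:
            self._ids[s] = len(self._strings)
            self._strings.append(s)
        return self._ids[s]

    def get_string(self, sid):
        """Reverse lookup: ID back to the original string."""
        return self._strings[sid]

interner = MiniInterner()
ids = [interner.get_id(n) for n in ["explorer.exe", "svchost.exe", "explorer.exe"]]
# Repeated names share one ID, so millions of events store ints, not strings.
```

With millions of events but only hundreds of distinct process names and operations, this trades one dictionary lookup per field for a large reduction in duplicated string storage.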
Concerns:
- Single-file monolith (2303 lines): The entire server—data structures, XML parsing, MCP tool definitions, filtering engine, export logic, CLI entry point—lives in one file. Splitting into modules (e.g., `models.py`, `parser.py`, `tools.py`, `filters.py`) would improve navigability and testability.
- Global mutable state: `LOADED_DATA` is a module-level global modified by `main_execution()` and read by all MCP tool handlers. This makes unit testing difficult and prevents running multiple server instances in-process. Consider dependency injection or a server context object.
- Single-file limitation: Only one Procmon file can be loaded per server instance. This is documented but architecturally baked in through the global state pattern.
- Semicolons as statement separators are used extensively (e.g., `procmon-mcp.py:92-96`, `procmon-mcp.py:461`, `procmon-mcp.py:1583`), placing multiple statements on a single line. This reduces readability. Standard Python style uses one statement per line.
- Inconsistent line lengths: Some lines exceed 200 characters, making the code difficult to read without horizontal scrolling.
- Commented-out code: Dead code blocks remain in `_clear_elem()` (`procmon-mcp.py:178-188`) and `StringInterner.get_id()` (`procmon-mcp.py:213-218`). These should be removed; version control preserves history.
- Duplicate `io` import (`procmon-mcp.py:7` and `procmon-mcp.py:26`): `import io` and `import io as pstats_io` import the same module twice under different names. Use `io.StringIO` directly instead.
- `StopAsyncIteration` raised inside async generator (`procmon-mcp.py:1018,1021`): Per PEP 479 (enforced in Python 3.7+), raising `StopIteration` or `StopAsyncIteration` inside a generator body is converted to `RuntimeError`. The correct pattern is to simply `return` from the generator:

  ```python
  # Current (problematic):
  raise StopAsyncIteration("Log data not loaded.")

  # Should be:
  await ctx.error("Filtering failed: Log data not loaded.")
  return
  ```

- `exit()` vs `sys.exit()` (`procmon-mcp.py:2158,2179,2182`, etc.): The built-in `exit()` is intended for the interactive interpreter. Use `sys.exit()` in scripts for clarity and reliability.
- CSV export mutates dictionaries (`procmon-mcp.py:2113-2115`): `export_query_results` converts `extra_data`, `stack_trace`, and `process_details_summary` to JSON strings in-place on the `row_dict` returned by `_get_formatted_event_details`. While these are temporary copies in practice, mutating dicts during iteration is a code smell. Create copies or use a serialization helper.
- `iterparse` `tag` parameter compatibility (`procmon-mcp.py:408,498`): The `tag` keyword argument to `iterparse` is not supported in the same way across all Python/lxml versions. In CPython's `xml.etree.ElementTree`, `tag` support in `iterparse` was not available until Python 3.13. This could cause a `TypeError` on older Python versions despite the 3.7+ minimum claim.
- Process list parsing state machine (`procmon-mcp.py:421-427`): The state transitions from `seeking_procmon` to `seeking_processlist` on any end-event that isn't `procmon`. If the XML has unexpected top-level elements, this could skip the processlist or misparse. A more robust approach would check for the `processlist` start tag explicitly.
- README typo (`README.md:97`): "To use GhidraMCP with Cline" should read "To use ProcmonMCP with Cline".
- Path traversal prevention in `export_query_results` (`procmon-mcp.py:2011-2042`) uses `os.path.commonpath` to ensure output files stay within the working directory. This is correctly implemented.
- Security warning in the README is prominent and detailed.
- No authentication on SSE transport: When using `--transport sse`, the server binds to a network interface and accepts unauthenticated connections. Any process or user that can reach the host:port has full read access to all loaded Procmon data. The README warns about trusted environments, but the server could at minimum support a shared secret or API key.
- Regex Denial of Service (ReDoS): User-supplied regex patterns (`filter_path_regex`, `filter_process_regex`, `filter_detail_regex`) are compiled and executed against potentially millions of events with no timeout or complexity limit (`procmon-mcp.py:1027-1029`). A pathological regex like `(a+)+$` could hang the server. Consider using `re2` or adding a timeout wrapper.
- No input length limits on filter parameters: String filter parameters have no maximum length validation. Extremely long strings could cause memory or performance issues during filtering.
- `os.makedirs` in export (`procmon-mcp.py:2032`): While the path traversal check prevents writing outside CWD, the `os.makedirs` call allows creating arbitrary directory structures within CWD. This is low-risk but worth noting.
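Since Python's built-in `re` module offers no matching timeout, one pragmatic stopgap is to validate patterns before compiling them. The sketch below is illustrative only: the length cap and the nested-quantifier check are crude heuristics, not a complete ReDoS defense (the `re2` library, which guarantees linear-time matching, is the robust fix):

```python
import re

MAX_PATTERN_LEN = 256

# Crude heuristic: a quantifier applied directly to a quantified group,
# e.g. (a+)+ or (a*)*, is the classic catastrophic-backtracking shape.
_NESTED_QUANTIFIER = re.compile(r"\([^)]*[+*][^)]*\)[+*{]")

def compile_user_regex(pattern: str):
    """Validate and compile a user-supplied pattern, rejecting inputs
    that are overlong or match a known-dangerous quantifier shape."""
    if len(pattern) > MAX_PATTERN_LEN:
        raise ValueError("Pattern too long.")
    if _NESTED_QUANTIFIER.search(pattern):
        raise ValueError(
            "Pattern rejected: nested quantifiers risk catastrophic backtracking.")
    return re.compile(pattern)
```

This also addresses the missing length-limit finding above; the heuristic will not catch every pathological pattern, so it should be treated as defense in depth, not a guarantee.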
Strengths:
- String interning significantly reduces memory for repetitive data.
- Pre-built indices for `pname_id` and `op_id` enable O(1) lookup of candidate event sets before applying further filters.
- Stream-based XML parsing (`iterparse`) keeps peak memory manageable during loading.
- Progress reporting during long operations provides good user feedback.
- The optimization decision to use direct element iteration over lxml XPath (`procmon-mcp.py:163-169`) is well-documented with profiling justification.
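The streaming pattern described above can be sketched generically (element names are illustrative, not the project's actual schema):

```python
import io
import xml.etree.ElementTree as ET

def stream_events(source):
    """Yield (operation, path) pairs from a Procmon-style XML stream,
    clearing each element after use so peak memory stays flat instead of
    growing with the full document tree."""
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "event":
            op = elem.findtext("Operation", default="")
            path = elem.findtext("Path", default="")
            yield op, path
            elem.clear()  # drop this event's children once consumed

xml_doc = b"""<procmon><eventlist>
  <event><Operation>ReadFile</Operation><Path>C:\\Windows\\a.dll</Path></event>
  <event><Operation>RegQueryValue</Operation><Path>HKLM\\X</Path></event>
</eventlist></procmon>"""

events = list(stream_events(io.BytesIO(xml_doc)))
```

Note this sketch filters by tag in Python rather than passing `tag=` to `iterparse`, sidestepping the stdlib compatibility issue flagged in the concerns above.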
Improvement Opportunities:
- `count_events_by_process` (`procmon-mcp.py:1620-1662`) iterates all events and resolves string names, when it could use the existing `pname_id_index` directly:

  ```python
  # Could be:
  for pname_id, event_indices in log_data.pname_id_index.items():
      name = log_data.get_string(IK_PROCESS_NAME, pname_id) or 'Unknown'
      event_counts[name] = len(event_indices)
  ```

  This would be O(unique_names) instead of O(total_events).
- `get_process_lifetime` (`procmon-mcp.py:1816-1852`) performs a full linear scan of all events. It could use the `pname_id_index` to narrow the search to events for the specific process, then further filter by operation using `op_id_index` intersection.
- Sorting `candidate_indices` (`procmon-mcp.py:1103`): Converting a set to a sorted list for "cache locality" adds O(n log n) overhead. For large candidate sets, this overhead may exceed any cache benefit. Profiling should validate whether this helps.
- `find_file_access` does a full linear scan with no index support. For a frequently used analysis tool, a path-based index (even a simple substring trie) could help.
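A path index of the kind suggested above might look like this (a simplified inverted index over plain path strings, not the project's event structures):

```python
import re
from collections import defaultdict

def build_path_index(paths):
    """Map each lowercase path token (split on slashes, backslashes, and
    dots) to the set of event indices whose path contains it, so token
    queries avoid a full scan of all events."""
    index = defaultdict(set)
    for i, path in enumerate(paths):
        for token in re.split(r"[\\/.]+", path.lower()):
            if token:
                index[token].add(i)
    return index

def find_by_token(index, token):
    """Return sorted event indices whose path contains the token."""
    return sorted(index.get(token.lower(), ()))

paths = [
    r"C:\Windows\System32\kernel32.dll",
    r"C:\Users\bob\report.docx",
    r"C:\Windows\explorer.exe",
]
idx = build_path_index(paths)
```

Token lookup then replaces a linear substring scan for whole-component queries; arbitrary substring matches would still need a scan or a suffix structure.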
Current state: The project has a single integration test file (`tests/procmon-mcp-tester.py`) that:
- Requires a running MCP server with loaded data
- Only supports SSE transport (not stdio)
- Performs smoke tests by calling each tool and printing results
- Has no assertions or pass/fail criteria
- Has no test data fixtures
Recommendations:
- Add unit tests for core components: `StringInterner`, `ProcessInfo.from_xml_element`, `_parse_timestamp_str`, `_strip_namespace`, `_find_text_ignore_ns`, time filter parsing.
- Add integration tests with small sample XML fixtures that can run without a full MCP server.
- Add a test runner configuration (pytest) and consider CI integration.
- Create sample Procmon XML test files (small, synthetic) for reproducible testing.
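A synthetic fixture could be generated programmatically; the element names below are assumptions modeled on a generic Procmon-style export, not a verified schema:

```python
import xml.etree.ElementTree as ET

def make_fixture(path, n_events=3):
    """Write a tiny Procmon-like XML file for tests. The layout
    (procmon/processlist/eventlist) is a simplified assumption and may
    need adjusting against a real capture."""
    root = ET.Element("procmon")
    plist = ET.SubElement(root, "processlist")
    proc = ET.SubElement(plist, "process")
    ET.SubElement(proc, "ProcessId").text = "1234"
    ET.SubElement(proc, "ProcessName").text = "test.exe"
    elist = ET.SubElement(root, "eventlist")
    for i in range(n_events):
        ev = ET.SubElement(elist, "event")
        ET.SubElement(ev, "PID").text = "1234"
        ET.SubElement(ev, "Operation").text = "ReadFile"
        ET.SubElement(ev, "Path").text = f"C:\\tmp\\file{i}.txt"
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)
    return path
```

A pytest fixture wrapping this generator would give every test a fresh, deterministic input file without shipping real capture data in the repository.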
Missing files:
- No `requirements.txt` or `pyproject.toml` — dependencies are only documented in the README. Users must manually install `mcp[cli]`, `lxml`, and `psutil`.
- No `.gitignore` — risks committing `.prof` files, `__pycache__`, IDE configs, exported CSV/JSON files, etc.
- No `setup.py` or packaging metadata — the project can't be installed via pip.
- No Dockerfile — could simplify deployment, especially for SSE transport mode.
- No license file.
The timestamp rollover logic (`procmon-mcp.py:566-568`) advances the date when a parsed time is less than the previous time. This is documented but fragile:
- A single out-of-order event (common in multi-threaded Procmon captures) could trigger a false rollover, advancing the date permanently for all subsequent events.
- There is no mechanism to detect or recover from false rollovers.
- Consider requiring a minimum backward gap before triggering a rollover (e.g., the new time must be more than an hour earlier than the previous one), so small out-of-order jitter is tolerated.
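The suggested guard might look like this (the function name and threshold are illustrative, not taken from the project):

```python
from datetime import datetime, time, timedelta

# Only treat a backward jump larger than this as a midnight rollover;
# smaller jumps are assumed to be out-of-order events and stay on the
# current date. The one-hour value is an illustrative choice.
ROLLOVER_THRESHOLD = timedelta(hours=1)

def resolve_timestamp(prev: datetime, new_time: time) -> datetime:
    """Combine a parsed time-of-day with the running date, advancing the
    date only when the time goes backward by more than the threshold."""
    candidate = datetime.combine(prev.date(), new_time)
    if prev - candidate > ROLLOVER_THRESHOLD:
        candidate += timedelta(days=1)
    return candidate

prev = datetime(2024, 5, 1, 23, 59, 58)
# Slightly out-of-order event: stays on the same day.
same_day = resolve_timestamp(prev, time(23, 59, 55))
# Large backward jump near midnight: treated as a genuine rollover.
next_day = resolve_timestamp(prev, time(0, 0, 3))
```

This still misclassifies an event genuinely more than an hour out of order, so the threshold is a trade-off rather than a fix; it does prevent the common case of a few seconds of multi-threaded jitter permanently advancing the date.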
| # | Finding | Location |
|---|---------|----------|
| 1 | `StopAsyncIteration` raised inside async generator causes `RuntimeError` | `procmon-mcp.py:1018,1021` |
| 2 | `iterparse` `tag` parameter not supported in stdlib before Python 3.13 | `procmon-mcp.py:408,498` |
| 3 | No authentication on SSE transport | `procmon-mcp.py:2187-2198` |
| # | Finding | Location |
|---|---------|----------|
| 4 | ReDoS risk with user-supplied regex patterns | `procmon-mcp.py:1027-1029` |
| 5 | Fragile midnight rollover detection | `procmon-mcp.py:566-568` |
| 6 | No `requirements.txt` or packaging metadata | Project root |
| 7 | No `.gitignore` file | Project root |
| 8 | No unit tests | `tests/` |
| # | Finding | Location |
|---|---------|----------|
| 9 | Duplicate `io` import | `procmon-mcp.py:7,26` |
| 10 | `exit()` instead of `sys.exit()` | Multiple locations |
| 11 | CSV export mutates source dictionaries | `procmon-mcp.py:2113-2115` |
| 12 | README references "GhidraMCP" instead of "ProcmonMCP" | `README.md:97` |
| 13 | Semicolons reduce readability | Throughout |
| 14 | `count_events_by_process` could use existing index | `procmon-mcp.py:1641` |