[Core] Refactor NodeHead module by optimizing flow insight dependency imports to load only when enabled #751

Open
daiping8 wants to merge 2 commits into antgroup:main from daiping8:flow_import

@daiping8

Motivation

NodeHead, one of the dashboard subprocesses, currently imports at process startup a batch of dependencies that are needed only in specific feature scenarios (notably flow/insight and some autoscaler helper logic). Loading them lazily reduces the base memory footprint of each NodeHead subprocess and shortens import time during startup; across a cluster, these savings scale with the number of subprocesses.

Changes

  • Move the heavy flow/insight-related dependencies (ray.util.insight / flow_insight / get_resource_usage / PROMPT_TEMPLATE, etc.) from the module top level into _emit_node_physical_stats(), importing them only when insight_monitor_address is detected and the feature is enabled.
  • Move the autoscaler-related utility functions (parse_usage, LoadMetricsSummary, get_per_node_breakdown_as_dict) into the code paths that actually use them (autoscaler v2 / legacy branches), removing unnecessary startup-time imports.
  • Refactor _to_service_state into a local function, to_service_state, inside _emit_node_physical_stats(), so that no top-level symbol depends on the optional dependencies.
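
The lazy-import pattern described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: emit_node_physical_stats and its parameters are hypothetical stand-ins, the state mapping is invented for the example, and the standard-library json module stands in for the optional flow/insight dependencies.

```python
from typing import Optional


def emit_node_physical_stats(
    insight_monitor_address: Optional[str], raw_status: str
) -> Optional[str]:
    """Hypothetical stand-in for NodeHead._emit_node_physical_stats()."""
    if not insight_monitor_address:
        # Feature disabled: return before any optional dependency is imported.
        return None

    # Deferred import: the cost is paid on the first call that reaches this
    # line; later calls reuse the cached module. `json` is only a stand-in
    # for the heavy optional dependencies named in the PR description.
    import json

    def to_service_state(status: str) -> str:
        # Local helper, so no module-level symbol depends on the optional
        # import. The mapping below is an assumption made for illustration.
        known = {"running": "RUNNING", "dead": "DEAD"}
        return known.get(status.lower(), "UNKNOWN")

    payload = {
        "address": insight_monitor_address,
        "state": to_service_state(raw_status),
    }
    return json.dumps(payload)
```

A process that never sets the monitor address never executes the import statement, which is where the per-subprocess saving would come from under these assumptions.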

Unit Tests

  • test_node_head_get_nodes_logical_resources_autoscaler_v2_smoke
    • Force the autoscaler v2 branch (mock is_autoscaler_v2=True)
    • Mock ClusterStatusParser.from_get_cluster_status_reply and parse_usage
    • Verify that the return value matches expectations, guarding against a regression in which parse_usage is undefined (NameError)
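
The shape of such a smoke test might look like the sketch below. It is an assumption-laden illustration rather than the test added in this PR: get_nodes_logical_resources is a simplified hypothetical stand-in, and json.loads plays the role of parse_usage. The point it shows is that, once an import moves inside a function, the mock is applied to the name in its defining module, because the function-level import resolves that name at call time.

```python
from unittest import mock


def get_nodes_logical_resources(is_autoscaler_v2: bool, reply: str) -> dict:
    """Hypothetical stand-in for the NodeHead code path under test."""
    if not is_autoscaler_v2:
        return {}
    # Lazy import, mirroring the refactor; json.loads stands in for
    # parse_usage. If this line were missing, the call below would raise
    # NameError, which is the regression the smoke test is meant to catch.
    from json import loads as parse_usage

    return parse_usage(reply)


def test_autoscaler_v2_smoke() -> None:
    # Patch the function where it is defined: the import inside
    # get_nodes_logical_resources runs during the call and picks up the mock.
    with mock.patch("json.loads", return_value={"CPU": "1.0/4.0"}) as fake_parse:
        result = get_nodes_logical_resources(True, "ignored")

    fake_parse.assert_called_once_with("ignored")
    assert result == {"CPU": "1.0/4.0"}


test_autoscaler_v2_smoke()
```

With a top-level import, the patch target would instead be the name as bound in the consuming module, so moving an import can silently change which patch target is effective.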

Impact

NodeHead RSS is reduced by approximately 10 MB.

Statistics Script

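# Sum the resident set size (RSS) of all ray-dashboard processes.
# In `ps aux` output, column 2 is the PID, column 6 is RSS in KiB, and column 11 is the command.
ps aux | grep -E 'ray-dashboard' | grep -v grep | awk '{
    rss_mb = $6 / 1024;
    total += rss_mb;
    printf "PID: %-6s RSS: %-8.2f MB CMD: %s\n", $2, rss_mb, $11
} END {
    printf "=====================================\n"
    printf "Total RSS Memory: %-8.2f MB\n", total
}'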

…ate conversion logic inline. Optimize flow insight dependency imports to load only when enabled. This improves code clarity and performance.

Change-Id: I6f2ab36769aad55df6914d18345e8ab572e31fd7

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refactors the NodeHead module to lazily import dependencies for flow insight and autoscaler logic. This is a great optimization to reduce the memory footprint and startup time of the dashboard subprocesses. The changes correctly move imports into the functions where they are used and add checks to only load dependencies when the features are enabled.

I've found one critical issue in the legacy autoscaler code path where a None value could be passed to a function that doesn't expect it, leading to a crash. I've provided a suggestion to fix this.

Overall, the refactoring is well-executed and improves the performance of the dashboard.

…ing logical resources are only retrieved when usage data is available. Added a smoke test for the autoscaler v2 branch to prevent NameError regressions. This improves stability and test coverage.

Change-Id: I7a588cbcb6ebb347b5b89c0416e28f1e266443bc
@github-actions

github-actions bot commented Feb 3, 2026

This pull request has been automatically marked as stale because it has not had
any activity for 14 days. It will be closed in another 14 days if no further activity occurs.
Thank you for your contributions.

You can always ask for help on our discussion forum or Ray's public slack channel.

If you'd like to keep this open, just leave any comment, and the stale label will be removed.

@github-actions github-actions bot added the stale label Feb 3, 2026
@daiping8
Author

daiping8 commented Feb 4, 2026

keep alive

@github-actions github-actions bot added unstale and removed stale labels Feb 4, 2026