Conversation
Summary of ChangesHello @guapisolo, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces comprehensive support for the Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request adds scripts to run the gpt-oss-20b model, introducing a model configuration file and a run script. However, the run script contains several security issues related to improper handling of environment variables and insecure default configurations, specifically unquoted variables in shell commands, direct string concatenation for JSON construction leading to command and JSON injection vulnerabilities, and exposing the Ray dashboard on all network interfaces, which increases the attack surface. Beyond security, there are also issues with a messy process cleanup block, the use of an environment variable without a safe default, and an undefined argument array, which impact the script's robustness and clarity.
| \"env_vars\": { | ||
| \"PYTHONPATH\": \"/root/Megatron-LM/\", | ||
| \"CUDA_DEVICE_MAX_CONNECTIONS\": \"1\", | ||
| \"NCCL_NVLS_ENABLE\": \"${HAS_NVLINK}\" |
There was a problem hiding this comment.
The environment variable HAS_NVLINK is directly embedded into a JSON string without sanitization, making it vulnerable to JSON injection if it contains a double quote, potentially leading to Remote Code Execution (RCE). Additionally, it lacks a default value, which could lead to unexpected behavior if unset. Consider using a proper JSON processor like jq and providing a safe default, for example 0, to explicitly disable NVLS when the variable is not present.
| \"NCCL_NVLS_ENABLE\": \"${HAS_NVLINK}\" | |
| "NCCL_NVLS_ENABLE": "${HAS_NVLINK:-0}" |
|
|
||
| # launch the master node of ray in container | ||
| export MASTER_ADDR=${MASTER_ADDR:-"127.0.0.1"} | ||
| ray start --head --node-ip-address ${MASTER_ADDR} --num-gpus 8 --disable-usage-stats --dashboard-host=0.0.0.0 --dashboard-port=8265 |
There was a problem hiding this comment.
The environment variable MASTER_ADDR is used unquoted in a shell command. In Bash, unquoted variables undergo word splitting and globbing, which can lead to command injection if the variable contains characters like ;, &, or |. For example, if MASTER_ADDR is set to 127.0.0.1; id, the script will execute the id command. Always wrap shell variables in double quotes to prevent this.
| ray start --head --node-ip-address ${MASTER_ADDR} --num-gpus 8 --disable-usage-stats --dashboard-host=0.0.0.0 --dashboard-port=8265 | |
| ray start --head --node-ip-address "${MASTER_ADDR}" --num-gpus 8 --disable-usage-stats --dashboard-host=0.0.0.0 --dashboard-port=8265 |
|
|
||
| # launch the master node of ray in container | ||
| export MASTER_ADDR=${MASTER_ADDR:-"127.0.0.1"} | ||
| ray start --head --node-ip-address ${MASTER_ADDR} --num-gpus 8 --disable-usage-stats --dashboard-host=0.0.0.0 --dashboard-port=8265 |
There was a problem hiding this comment.
The Ray dashboard is configured to listen on 0.0.0.0 (all interfaces). This exposes the dashboard to any network the machine is connected to. If the environment is not properly isolated, an attacker could access the dashboard to gain information about the cluster or potentially exploit vulnerabilities in the dashboard itself. It is a security best practice to restrict the dashboard to 127.0.0.1 or a specific internal management IP.
| pkill -9 sglang | ||
| sleep 3 | ||
| ray stop --force | ||
| pkill -9 ray | ||
| pkill -9 python | ||
| sleep 3 | ||
| pkill -9 ray | ||
| pkill -9 python |
There was a problem hiding this comment.
This process cleanup block is repetitive and can be simplified. The current implementation has redundant pkill commands and an unusual sleep. Reordering and simplifying the commands will make the script cleaner and more maintainable.
| pkill -9 sglang | |
| sleep 3 | |
| ray stop --force | |
| pkill -9 ray | |
| pkill -9 python | |
| sleep 3 | |
| pkill -9 ray | |
| pkill -9 python | |
| ray stop --force | |
| pkill -9 sglang | |
| pkill -9 ray | |
| pkill -9 python |
| ${GRPO_ARGS[@]} \ | ||
| ${WANDB_ARGS[@]} \ | ||
| ${PERF_ARGS[@]} \ | ||
| ${EVAL_ARGS[@]} \ |
There was a problem hiding this comment.
0f224f6 to
03b71b2
Compare
39e0d17 to
6b75c14
Compare
Co-Authored-By: Wenjie Ma <55629401+euclidgame@users.noreply.github.com>
6b75c14 to
5e52f41
Compare
Thanks @euclidgame for the base script.
Wandb log link: https://wandb.ai/zyzshizhui_/miles-mgt-oss/runs/apa89g0h?nw=nwuserzyzshizhui_