fix the numactl command control failure #642
Open
pengjunjie2100 wants to merge 1252 commits into ztxz16:master from
C++ support for the Qwen3 template
Improve scaling calculation in fastllm-cuda.cu and bug fix
e1b5c57 to dce3e6c
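I ran into the following case while testing on a server configured as NPS2. The machine has 4 NUMA nodes in total, and I need to use numactl -N 0,1 -m 0,1 to restrict the run to nodes 0 and 1, expecting only 2 compute servers to start. However, the current implementation calls numa_get_mems_allowed() to obtain the available NUMA nodes, and that call still reports all 4 nodes. As a result, 4 compute servers are still started, the numactl restriction has no effect, and the behavior is not what was expected.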

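I suggest calling numa_get_membind() instead. With that call, the -m option of numactl takes effect and only the nodes the user specified are used; if numactl -m is not used, the total set of nodes is obtained as before.
After the change, running with numactl -N 0,1 -m 0,1 starts only 2 compute servers, as expected. Testing showed no impact on other functionality.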
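To illustrate the idea, here is a minimal sketch of deriving the compute server count from a node mask. It is not the code in this PR: NodeMask, SelectedNodes, and ComputeServerCount are hypothetical names, and a plain 64-bit integer stands in for libnuma's struct bitmask so the example builds without libnuma. With libnuma, the mask would come from numa_get_membind() and each bit would be tested with numa_bitmask_isbitset() for nodes 0 through numa_max_node().

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for libnuma's `struct bitmask` (assumption for this sketch):
// bit i set means NUMA node i may be used.
using NodeMask = std::uint64_t;

// Collect the IDs of the nodes whose bit is set, considering nodes 0..maxNode.
// With libnuma, the per-bit check would be numa_bitmask_isbitset(mask, i).
std::vector<int> SelectedNodes(NodeMask mask, int maxNode) {
    assert(maxNode >= 0 && maxNode < 64);  // this model covers at most 64 nodes
    std::vector<int> nodes;
    for (int i = 0; i <= maxNode; ++i) {
        if (mask & (NodeMask{1} << i)) {
            nodes.push_back(i);
        }
    }
    return nodes;
}

// Hypothetical policy: start one compute server per selected node.
int ComputeServerCount(NodeMask mask, int maxNode) {
    return static_cast<int>(SelectedNodes(mask, maxNode).size());
}
```

On the 4-node machine described above, a mask with all four bits set (0xF) gives 4 compute servers, while a mask restricted to nodes 0 and 1 (0x3), as numactl -m 0,1 would request, gives 2.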

