[Feat] [history server] Correctly use glob in logs/ endpoint#4526
[Feat] [history server] Correctly use glob in logs/ endpoint#4526machichima wants to merge 16 commits intoray-project:masterfrom
glob in logs/ endpoint#4526Conversation
Signed-off-by: machichima <nary12321@gmail.com>
Signed-off-by: machichima <nary12321@gmail.com>
Signed-off-by: machichima <nary12321@gmail.com>
Signed-off-by: machichima <nary12321@gmail.com>
There was a problem hiding this comment.
Pull request overview
This PR improves the /api/v0/logs endpoint in the history server to correctly implement glob pattern matching for log file filtering. Previously, the endpoint incorrectly used strings.TrimSuffix on the glob parameter instead of proper pattern matching. The fix introduces the doublestar/v4 library for robust glob support and updates the endpoint to properly parse glob patterns, separate the base directory from the pattern, and filter files accordingly.
Changes:
- Added
doublestar/v4dependency for proper glob pattern matching - Updated
/api/v0/logsendpoint to correctly parse and apply glob patterns, extracting base directory paths when present - Added comprehensive E2E test coverage for glob pattern matching with both matching and non-matching patterns
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| historyserver/go.mod | Added doublestar/v4 dependency for glob pattern support |
| historyserver/go.sum | Updated checksums for new dependency |
| historyserver/pkg/historyserver/router.go | Updated getNodeLogs to properly parse glob patterns using SplitPattern and extract base directories |
| historyserver/pkg/historyserver/reader.go | Implemented glob filtering in _getNodeLogs using doublestar.Match |
| historyserver/test/support/constant.go | Added EndpointLogs constant for test usage |
| historyserver/test/e2e/historyserver_test.go | Added comprehensive test coverage for glob pattern functionality |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| ws.Route(ws.GET("/v0/logs").To(s.getNodeLogs).Filter(s.CookieHandle). | ||
| Doc("get appliations").Param(ws.QueryParameter("node_id", "node_id")). | ||
| Doc("get appliations"). |
There was a problem hiding this comment.
Spelling error: "appliations" should be "applications".
| var folder, glob string | ||
| if req.QueryParameter("folder") != "" { | ||
| folder = req.QueryParameter("folder") | ||
| } | ||
| if req.QueryParameter("glob") != "" { | ||
| folder = req.QueryParameter("glob") | ||
| folder = strings.TrimSuffix(folder, "*") | ||
| glob = req.QueryParameter("glob") | ||
| // SplitPattern splits e.g. "logs/raylet*" into base="logs/" and pattern="raylet*", | ||
| // so we can use base as the storage directory prefix and pattern for matching. | ||
| // For a flat pattern like "raylet*", base is "." which we treat as no subdirectory. | ||
| base, pattern := doublestar.SplitPattern(glob) | ||
| glob = pattern | ||
| if base != "." { | ||
| folder = base | ||
| } | ||
| } | ||
| data, err := s._getNodeLogs(clusterNameID+"_"+clusterNamespace, sessionName, req.QueryParameter("node_id"), folder) | ||
| data, err := s._getNodeLogs(clusterNameID+"_"+clusterNamespace, sessionName, req.QueryParameter("node_id"), folder, glob) |
There was a problem hiding this comment.
The node_id, folder, and glob parameters should be validated to prevent path traversal attacks, similar to the validation performed in getNodeLogFile (lines 988-995). Add validation using fs.ValidPath for these parameters to prevent malicious inputs like "../../../etc/passwd".
| // For a flat pattern like "raylet*", base is "." which we treat as no subdirectory. | ||
| base, pattern := doublestar.SplitPattern(glob) | ||
| glob = pattern | ||
| if base != "." { |
There was a problem hiding this comment.
When both the folder and glob query parameters are provided, the folder extracted from the glob pattern will silently overwrite the explicit folder parameter. This could lead to unexpected behavior. Consider either: (1) returning an error when both parameters are provided with conflicting values, (2) merging the folder and glob base path (e.g., path.Join(folder, base)), or (3) documenting that the glob parameter takes precedence and ignoring the folder parameter when glob contains a path.
| if base != "." { | |
| if base != "." { | |
| // If both folder and glob specify a folder and they conflict, return a clear error. | |
| if folder != "" && folder != base { | |
| resp.WriteErrorString(http.StatusBadRequest, "conflicting 'folder' and 'glob' query parameters") | |
| return | |
| } |
There was a problem hiding this comment.
The folder parameter is not supported in Ray Dashboard API, deleted it to make history server endpoint consistent with Ray Dashboard API
| } | ||
|
|
||
| // testNodeLogsEndpointDeadCluster verifies that the /api/v0/logs endpoint correctly lists and filters | ||
| // log files for a dead cluster using the glob and folder query parameters. |
There was a problem hiding this comment.
The folder parameter hasn't been covered in this e2e tests.
Since SplitPattern can implicitly set folder from a glob, would it be better adding a test case that uses folder directly to verify the directory scoping works correctly ?
Also a case where glob contains a path prefix to ensure SplitPattern correctly populates folder from it.
Signed-off-by: machichima <nary12321@gmail.com>
Signed-off-by: machichima <nary12321@gmail.com>
Signed-off-by: machichima <nary12321@gmail.com>
Signed-off-by: machichima <nary12321@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
|
cc @rueian to merge, tks! |
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: machichima <nary12321@gmail.com>
Signed-off-by: machichima <nary12321@gmail.com>
Signed-off-by: Han-Ju Chen (Future-Outlier) <eric901201@gmail.com>
|
|
||
| const ( | ||
| EndpointLogs = "/api/v0/logs" | ||
| EndpointLogsFile = "/api/v0/logs/file" |
There was a problem hiding this comment.
Line 5-10 seem like a duplicate of 11-17?
There was a problem hiding this comment.
fixed, I didn't solve the merge conflict well...
Why are these changes needed?
Previous the
logs/endpoint does not doglobpattern matching on the file path, instead just dotrimsuffix.This PR makes the
logs/endpoint respect theglobparameter and use it correctly.Manual Test Result
On dead cluster:
Related issue number
Checks