Catd graphql play #2100
✅ Deploy Preview for olmv1 ready!
Codecov Report ❌
| Metric | main | #2100 | +/- |
|---|---|---|---|
| Coverage | 74.30% | 71.24% | -3.07% |
| Files | 91 | 94 | +3 |
| Lines | 7083 | 7476 | +393 |
| Hits | 5263 | 5326 | +63 |
| Misses | 1405 | 1719 | +314 |
| Partials | 415 | 431 | +16 |
closing this as stale - please reopen if needed
[APPROVALNOTIFIER] This PR is NOT APPROVED
Force-pushed from c7edbd2 to 559b18f
Pull Request Overview
This pull request refactors the catalogd storage layer to introduce GraphQL support for querying catalog data. The changes extract HTTP handler logic into a separate server package, introduce a service layer for GraphQL schema management with caching, and replace direct struct initialization with a constructor pattern for LocalDirV1.
- Introduces a new GraphQL endpoint at `/api/v1/graphql` with dynamic schema generation
- Refactors HTTP handlers into a dedicated `server` package with cleaner separation of concerns
- Adds a GraphQL service layer with schema caching to improve performance
- Updates `LocalDirV1` to use a constructor pattern (`NewLocalDirV1`) for proper initialization
Reviewed Changes
Copilot reviewed 13 out of 14 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| internal/catalogd/storage/localdir.go | Refactored to use constructor pattern, added GraphQL service integration, moved HTTP handlers to server package |
| internal/catalogd/storage/localdir_test.go | Updated all test instantiations to use new NewLocalDirV1 constructor |
| internal/catalogd/storage/http_preconditions_check.go | Removed (moved to server package with simplified implementation) |
| internal/catalogd/server/handlers.go | New file implementing HTTP handlers extracted from storage layer |
| internal/catalogd/server/http_helpers.go | New file with simplified HTTP precondition checking |
| internal/catalogd/service/graphql_service.go | New GraphQL service with caching for schema generation |
| internal/catalogd/graphql/graphql.go | New dynamic GraphQL schema generation implementation |
| internal/catalogd/graphql/graphql_test.go | Tests for GraphQL schema discovery |
| internal/catalogd/graphql/discovery_test.go | Additional comprehensive tests for schema discovery edge cases |
| internal/catalogd/graphql/sample-queries.txt | Documentation of sample GraphQL queries |
| internal/catalogd/graphql/README.md | Documentation for GraphQL integration |
| cmd/catalogd/main.go | Updated to use NewLocalDirV1 constructor |
| go.mod, go.sum | Added graphql-go/graphql dependency |
| // Check If-Modified-Since | ||
| if r.Method == http.MethodGet || r.Method == http.MethodHead { | ||
| if t, err := time.Parse(http.TimeFormat, r.Header.Get("If-Modified-Since")); err == nil { | ||
| // The Date-Modified header truncates sub-second precision, so |
Copilot
AI
Oct 30, 2025
Corrected spelling of 'Date-Modified' to 'Last-Modified'. The HTTP header name is 'Last-Modified', not 'Date-Modified'.
| // Check If-Unmodified-Since | ||
| if r.Method != http.MethodGet && r.Method != http.MethodHead { | ||
| if t, err := time.Parse(http.TimeFormat, r.Header.Get("If-Unmodified-Since")); err == nil { | ||
| // The Date-Modified header truncates sub-second precision, so |
Copilot
AI
Oct 30, 2025
Corrected spelling of 'Date-Modified' to 'Last-Modified'. The HTTP header name is 'Last-Modified', not 'Date-Modified'.
| // Allow POST requests only for GraphQL endpoint | ||
| if r.URL.Path != "" && len(r.URL.Path) >= 7 && r.URL.Path[len(r.URL.Path)-7:] != "graphql" && r.Method == http.MethodPost { | ||
| http.Error(w, http.StatusText(http.StatusMethodNotAllowed), http.StatusMethodNotAllowed) | ||
| return | ||
| } |
Copilot
AI
Oct 30, 2025
The string suffix check using `r.URL.Path[len(r.URL.Path)-7:]` is fragile and relies on a magic number (7). For paths shorter than 7 characters the length guard makes the whole condition false, so a POST to such a path is not rejected; the check also matches any path that merely ends in the letters "graphql". Consider using `strings.HasSuffix(r.URL.Path, "graphql")` or checking against the actual GraphQL path pattern for more robust route matching.
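A minimal sketch of the suggested suffix check, assuming the GraphQL route always ends in `/graphql`. The names `graphqlSuffix` and `isGraphQLPath` are hypothetical and do not appear in the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// graphqlSuffix is a hypothetical constant; the real route pattern may differ.
const graphqlSuffix = "/graphql"

// isGraphQLPath reports whether path ends with the GraphQL suffix.
// strings.HasSuffix is safe for any path length, including the empty string,
// so no manual length guard or magic number is needed.
func isGraphQLPath(path string) bool {
	return strings.HasSuffix(path, graphqlSuffix)
}

func main() {
	paths := []string{"", "/x", "/notgraphql", "/catalogs/foo/api/v1/graphql"}
	for _, p := range paths {
		fmt.Printf("%q is GraphQL path: %v\n", p, isGraphQLPath(p))
	}
}
```

Including the leading slash in the suffix avoids matching paths that only happen to end in the letters "graphql".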
internal/catalogd/server/handlers.go
Outdated
| code = http.StatusInternalServerError | ||
| } | ||
| // Log the actual error for debugging | ||
| fmt.Printf("HTTP Error %d: %v\n", code, err) |
Copilot
AI
Oct 30, 2025
Using fmt.Printf for error logging bypasses structured logging and may not be captured in production logging systems. Consider using a proper logger (e.g., from a logging library or context) to ensure errors are properly tracked and monitored.
| fmt.Printf("HTTP Error %d: %v\n", code, err) | |
| log.Printf("HTTP Error %d: %v\n", code, err) |
| m sync.RWMutex | ||
| // this singleflight Group is used in `getIndex()`` to handle concurrent HTTP requests | ||
| // optimally. With the use of this slightflight group, the index is loaded from disk | ||
| // this singleflight Group is used in `getIndex()` to handle concurrent HTTP requests |
Copilot
AI
Oct 30, 2025
The comment refers to getIndex() but the method has been renamed to GetIndex(). Update the comment to reference the correct method name.
| // this singleflight Group is used in `getIndex()` to handle concurrent HTTP requests | |
| // this singleflight Group is used in `GetIndex()` to handle concurrent HTTP requests |
internal/catalogd/graphql/graphql.go
Outdated
| // appendUnique adds a value to slice if not already present | ||
| func appendUnique(slice []interface{}, value interface{}) []interface{} { | ||
| for _, existing := range slice { | ||
| if reflect.DeepEqual(existing, value) { | ||
| return slice | ||
| } | ||
| } |
Copilot
AI
Oct 30, 2025
The appendUnique function uses reflect.DeepEqual which can be expensive, especially when called repeatedly during schema discovery. Consider using a simpler equality check or a map-based approach if the performance impact is significant for large catalogs.
| // appendUnique adds a value to slice if not already present | |
| func appendUnique(slice []interface{}, value interface{}) []interface{} { | |
| for _, existing := range slice { | |
| if reflect.DeepEqual(existing, value) { | |
| return slice | |
| } | |
| } | |
| // appendUnique adds a value to slice if not already present, using JSON string as key for uniqueness | |
| func appendUnique(slice []interface{}, value interface{}) []interface{} { | |
| seen := make(map[string]struct{}, len(slice)) | |
| for _, existing := range slice { | |
| key, err := json.Marshal(existing) | |
| if err != nil { | |
| continue // skip values that can't be marshaled | |
| } | |
| seen[string(key)] = struct{}{} | |
| } | |
| valueKey, err := json.Marshal(value) | |
| if err != nil { | |
| return slice // skip value if it can't be marshaled | |
| } | |
| if _, exists := seen[string(valueKey)]; exists { | |
| return slice | |
| } |
Signed-off-by: grokspawn <jordan@nimblewidget.com>
Force-pushed from b9a05a8 to 58fbad3
Signed-off-by: grokspawn <jordan@nimblewidget.com>
Pull Request Overview
Copilot reviewed 15 out of 16 changed files in this pull request and generated 12 comments.
| // Allow POST requests only for GraphQL endpoint | ||
| if !strings.HasSuffix(r.URL.Path, "graphql") && r.Method == http.MethodPost { | ||
| http.Error(w, http.StatusText(http.StatusMethodNotAllowed), http.StatusMethodNotAllowed) | ||
| return | ||
| } | ||
| if !allowedMethodSet.Has(r.Method) { | ||
| http.Error(w, http.StatusText(http.StatusMethodNotAllowed), http.StatusMethodNotAllowed) | ||
| return | ||
| } |
Copilot
AI
Nov 17, 2025
The allowedMethodsHandler function allows POST for all handlers if the path contains "graphql", but this check uses strings.HasSuffix(r.URL.Path, "graphql") which will match any path ending with "graphql". This logic is inverted - POST should only be allowed for GraphQL endpoints, not blocked. The current implementation at line 211 blocks POST for non-GraphQL endpoints (correct), but line 215 then checks if the method is in the allowed set which includes POST (line 69), creating confusion. Consider simplifying the logic to be more explicit about which methods are allowed for which endpoints.
| // Allow POST requests only for GraphQL endpoint | |
| if !strings.HasSuffix(r.URL.Path, "graphql") && r.Method == http.MethodPost { | |
| http.Error(w, http.StatusText(http.StatusMethodNotAllowed), http.StatusMethodNotAllowed) | |
| return | |
| } | |
| if !allowedMethodSet.Has(r.Method) { | |
| http.Error(w, http.StatusText(http.StatusMethodNotAllowed), http.StatusMethodNotAllowed) | |
| return | |
| } | |
| // Only allow POST for the GraphQL endpoint | |
| if r.Method == http.MethodPost { | |
| // Match exact GraphQL endpoint path (adjust as needed for your routing) | |
| if !(strings.HasSuffix(r.URL.Path, "/graphql") && allowedMethodSet.Has(http.MethodPost)) { | |
| http.Error(w, http.StatusText(http.StatusMethodNotAllowed), http.StatusMethodNotAllowed) | |
| return | |
| } | |
| } else if !allowedMethodSet.Has(r.Method) { | |
| http.Error(w, http.StatusText(http.StatusMethodNotAllowed), http.StatusMethodNotAllowed) | |
| return | |
| } |
| // Get or build the schema | ||
| // TODO: prevent cache rebuild on this callpath | ||
| dynamicSchema, err := s.GetSchema(catalog, catalogFS) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("failed to get GraphQL schema: %w", err) |
Copilot
AI
Nov 17, 2025
The TODO comment "prevent cache rebuild on this callpath" suggests a known performance issue where the schema cache might be rebuilt unnecessarily. This should be addressed before merging, or tracked with a specific issue reference.
| // Get or build the schema | |
| // TODO: prevent cache rebuild on this callpath | |
| dynamicSchema, err := s.GetSchema(catalog, catalogFS) | |
| if err != nil { | |
| return nil, fmt.Errorf("failed to get GraphQL schema: %w", err) | |
| // Get the schema from cache if available, otherwise build and cache it | |
| s.schemaMux.RLock() | |
| dynamicSchema, ok := s.schemaCache[catalog] | |
| s.schemaMux.RUnlock() | |
| if !ok { | |
| var err error | |
| dynamicSchema, err = s.GetSchema(catalog, catalogFS) | |
| if err != nil { | |
| return nil, fmt.Errorf("failed to get GraphQL schema: %w", err) | |
| } |
| test func(*testing.T, *LocalDirV1, fs.FS) | ||
| cleanup func(*testing.T, *LocalDirV1) | ||
| }{ | ||
| { |
Copilot
AI
Nov 17, 2025
The test case name "TestLocalDirStoraget" has a typo - it should be "TestLocalDirStorage" (stray trailing 't').
| s.graphqlSvc.InvalidateCache(catalog) | ||
|
|
||
| // Pre-warm the GraphQL schema cache | ||
| if _, err := s.graphqlSvc.GetSchema(catalog, fsys); err != nil { |
Copilot
AI
Nov 17, 2025
Pre-warming the GraphQL schema cache during Store() operation (lines 127-130) happens after the catalog has been successfully stored. If schema building fails, it returns an error, but the catalog data has already been persisted. This could leave the system in an inconsistent state where catalog data exists but has no GraphQL schema. Consider whether this error should be non-fatal or if the catalog storage should be rolled back on schema build failure.
| if _, err := s.graphqlSvc.GetSchema(catalog, fsys); err != nil { | |
| if _, err := s.graphqlSvc.GetSchema(catalog, fsys); err != nil { | |
| // Rollback: remove catalogDir if schema build fails | |
| _ = os.RemoveAll(catalogDir) |
| if t, err := time.Parse(http.TimeFormat, r.Header.Get("If-Unmodified-Since")); err == nil { | ||
| // The Last-Modified header truncates sub-second precision, so | ||
| // use ModTime >= t+1s instead of ModTime > t to check for modified. | ||
| if modtime.After(t.Add(-time.Second)) { |
Copilot
AI
Nov 17, 2025
The checkPreconditions function has incorrect logic for checking If-Unmodified-Since. Line 28 checks modtime.After(t.Add(-time.Second)) which adds negative one second to the request time, but the comment says "use ModTime >= t+1s instead of ModTime > t". The logic should be modtime.After(t) or !modtime.Before(t.Add(time.Second)) to properly implement the RFC 7232 semantics. The current implementation will incorrectly trigger precondition failures.
| if modtime.After(t.Add(-time.Second)) { | |
| if !modtime.Before(t.Add(time.Second)) { |
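An equivalent way to express the comparison is to truncate the modification time to whole seconds first and then compare directly against the header value. The sketch below isolates that comparison; `unmodifiedSince` and the `at` helper are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// unmodifiedSince reports whether a resource last modified at modtime
// satisfies an If-Unmodified-Since value t. HTTP dates carry whole seconds
// only, so modtime is truncated before comparing.
func unmodifiedSince(modtime, t time.Time) bool {
	return !modtime.Truncate(time.Second).After(t)
}

// at is a small helper that builds a time from Unix seconds and nanoseconds.
func at(sec, nsec int64) time.Time {
	return time.Unix(sec, nsec)
}

func main() {
	header := at(100, 0)
	fmt.Println(unmodifiedSince(at(100, 0), header))   // same second
	fmt.Println(unmodifiedSince(at(100, 500), header)) // sub-second later
	fmt.Println(unmodifiedSince(at(101, 0), header))   // a full second later
}
```

With the original `modtime.After(t.Add(-time.Second))`, a resource whose modification time equals the header value would be reported as modified, which is the failure the comment describes.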
|
|
||
| // Collect all metas from the catalog filesystem | ||
| // WalkMetasFS walks the filesystem concurrently, so we need to protect the metas slice | ||
| err := declcfg.WalkMetasFS(context.Background(), catalogFS, func(path string, meta *declcfg.Meta, err error) error { | ||
| if err != nil { | ||
| return err | ||
| } | ||
| if meta != nil { | ||
| metasMux.Lock() | ||
| metas = append(metas, meta) | ||
| metasMux.Unlock() |
Copilot
AI
Nov 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The buildSchemaFromFS function uses declcfg.WalkMetasFS which walks the filesystem concurrently, requiring mutex protection for the metas slice (lines 91, 100-102). However, there's no protection for potential errors during concurrent execution. If one goroutine encounters an error, others may still be appending to the metas slice, leading to potential inconsistencies. Consider if error handling needs improvement here.
| // Collect all metas from the catalog filesystem | |
| // WalkMetasFS walks the filesystem concurrently, so we need to protect the metas slice | |
| err := declcfg.WalkMetasFS(context.Background(), catalogFS, func(path string, meta *declcfg.Meta, err error) error { | |
| if err != nil { | |
| return err | |
| } | |
| if meta != nil { | |
| metasMux.Lock() | |
| metas = append(metas, meta) | |
| metasMux.Unlock() | |
| var walkErr error | |
| // Collect all metas from the catalog filesystem | |
| // WalkMetasFS walks the filesystem concurrently, so we need to protect the metas slice and error | |
| err := declcfg.WalkMetasFS(context.Background(), catalogFS, func(path string, meta *declcfg.Meta, err error) error { | |
| metasMux.Lock() | |
| defer metasMux.Unlock() | |
| if err != nil { | |
| // Set shared error so other goroutines can check | |
| if walkErr == nil { | |
| walkErr = err | |
| } | |
| return err | |
| } | |
| // If an error has already occurred, skip further mutation | |
| if walkErr != nil { | |
| return walkErr | |
| } | |
| if meta != nil { | |
| metas = append(metas, meta) |
| // If we have an empty part after having content, it means there was a trailing separator | ||
| // Add a capitalized version of the last word | ||
| if hasContent && i == len(parts)-1 { | ||
| // Get the base word (first non-empty part) | ||
| for _, p := range parts { | ||
| if p != "" { | ||
| result += strings.ToUpper(string(p[0])) + strings.ToLower(p[1:]) | ||
| break | ||
| } | ||
| } | ||
| } |
Copilot
AI
Nov 17, 2025
The field remapping logic in remapFieldName has complex handling for trailing underscores (lines 68-78) that seems incorrect. When an empty part is found at the end of the parts array (line 68), it tries to capitalize and append a word, but this logic appears to duplicate parts of field names rather than properly handle trailing separators. For example, a field ending with an underscore might get unexpected capitalization. This needs clarification or simplification.
| // If we have an empty part after having content, it means there was a trailing separator | |
| // Add a capitalized version of the last word | |
| if hasContent && i == len(parts)-1 { | |
| // Get the base word (first non-empty part) | |
| for _, p := range parts { | |
| if p != "" { | |
| result += strings.ToUpper(string(p[0])) + strings.ToLower(p[1:]) | |
| break | |
| } | |
| } | |
| } | |
| // Skip empty parts (e.g., from trailing or consecutive underscores) |
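A simplified version of the remapping that just skips empty parts might look like the sketch below. It assumes underscore is the only separator and that the intended output is lower camel case; the function name `toCamelCase` is hypothetical, and the real `remapFieldName` may need to handle more cases.

```go
package main

import (
	"fmt"
	"strings"
)

// toCamelCase converts an underscore-separated name to lower camel case.
// Empty parts produced by leading, trailing, or consecutive underscores are
// skipped. Field names are assumed to be ASCII.
func toCamelCase(name string) string {
	var b strings.Builder
	for _, part := range strings.Split(name, "_") {
		if part == "" {
			continue
		}
		if b.Len() == 0 {
			// The first non-empty part stays lower case.
			b.WriteString(strings.ToLower(part))
			continue
		}
		b.WriteString(strings.ToUpper(part[:1]))
		b.WriteString(strings.ToLower(part[1:]))
	}
	return b.String()
}

func main() {
	for _, n := range []string{"default_channel", "name_", "__skip__range_"} {
		fmt.Printf("%q -> %q\n", n, toCamelCase(n))
	}
}
```

One trade-off of skipping empty parts: `name` and `name_` map to the same output, so if both can occur in one schema the caller would need a collision check.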
| // Check for unexpected query parameters | ||
| expectedParams := map[string]bool{ | ||
| "schema": true, | ||
| "package": true, | ||
| "name": true, | ||
| } | ||
|
|
||
| for param := range r.URL.Query() { | ||
| if !expectedParams[param] { | ||
| httpError(w, errInvalidParams) | ||
| return | ||
| } | ||
| } |
Copilot
AI
Nov 17, 2025
[nitpick] The check for unexpected query parameters in handleV1Metas validates against a hardcoded map (lines 89-93), but this validation logic could be more maintainable. Consider extracting this to a constant or using a more descriptive approach. This is a minor maintainability improvement.
| // checkPreconditions checks HTTP preconditions (If-Modified-Since, If-Unmodified-Since) | ||
| // Returns true if the request has already been handled (e.g., 304 Not Modified response sent) | ||
| func checkPreconditions(w http.ResponseWriter, r *http.Request, modtime time.Time) (done bool) { | ||
| // Check If-Modified-Since | ||
| if r.Method == http.MethodGet || r.Method == http.MethodHead { | ||
| if t, err := time.Parse(http.TimeFormat, r.Header.Get("If-Modified-Since")); err == nil { | ||
| // The Last-Modified header truncates sub-second precision, so | ||
| // use ModTime < t+1s instead of ModTime <= t to check for unmodified. | ||
| if modtime.Before(t.Add(time.Second)) { | ||
| w.WriteHeader(http.StatusNotModified) | ||
| return true | ||
| } | ||
| } | ||
| } | ||
|
|
||
| // Check If-Unmodified-Since | ||
| if r.Method != http.MethodGet && r.Method != http.MethodHead { | ||
| if t, err := time.Parse(http.TimeFormat, r.Header.Get("If-Unmodified-Since")); err == nil { | ||
| // The Last-Modified header truncates sub-second precision, so | ||
| // use ModTime >= t+1s instead of ModTime > t to check for modified. | ||
| if modtime.After(t.Add(-time.Second)) { | ||
| w.WriteHeader(http.StatusPreconditionFailed) | ||
| return true | ||
| } | ||
| } | ||
| } | ||
|
|
||
| return false | ||
| } |
Copilot
AI
Nov 17, 2025
The simplified checkPreconditions function removes support for ETag-based conditional requests (If-None-Match, If-Match) that were present in the deleted http_preconditions_check.go file. This is a breaking change that removes RFC 7232 compliance. The old implementation had comprehensive ETag support, while the new one only handles time-based preconditions. If this is intentional, it should be documented; otherwise, ETag support should be preserved.
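For reference, the core of an `If-None-Match` check is small. The sketch below is a simplified illustration, not the deleted implementation: it splits the header on commas (which would mis-handle an entity tag containing a comma), uses weak comparison, and the names `weakMatch` and `ifNoneMatchSatisfied` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// weakMatch compares two entity tags while ignoring the weak indicator "W/".
func weakMatch(a, b string) bool {
	return strings.TrimPrefix(a, "W/") == strings.TrimPrefix(b, "W/")
}

// ifNoneMatchSatisfied reports whether the request should proceed normally.
// It returns false when the header is "*" or lists a tag matching etag, in
// which case a GET or HEAD handler would answer 304 Not Modified.
func ifNoneMatchSatisfied(header, etag string) bool {
	if header == "" {
		return true
	}
	for _, candidate := range strings.Split(header, ",") {
		candidate = strings.TrimSpace(candidate)
		if candidate == "*" || weakMatch(candidate, etag) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(ifNoneMatchSatisfied(`"v1", W/"v2"`, `"v2"`))
	fmt.Println(ifNoneMatchSatisfied(`"v1"`, `"v3"`))
}
```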
| GraphQLCatalogQueries = featuregate.Feature("GraphQLCatalogQueries") | ||
| ) | ||
|
|
||
| var catalogdFeatureGates = map[featuregate.Feature]featuregate.FeatureSpec{ | ||
| APIV1MetasHandler: {Default: false, PreRelease: featuregate.Alpha}, | ||
| GraphQLCatalogQueries: {Default: false, PreRelease: featuregate.Alpha}, |
Copilot
AI
Nov 17, 2025
The feature gate GraphQLCatalogQueries is defined in features.go but is never checked before enabling the GraphQL endpoint. The endpoint is unconditionally registered in handlers.go line 67, regardless of the feature gate setting. This means the feature gate has no effect. The feature gate should be checked in the handler registration logic or when creating the CatalogHandlers.
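Gating the registration could be as simple as deciding the route list from the gate's value before wiring the mux. The sketch below uses a plain boolean in place of the real feature gate lookup; `routes`, `newMux`, and the path strings are hypothetical placeholders rather than the PR's actual routing code.

```go
package main

import (
	"fmt"
	"net/http"
)

// routes returns the paths to register. graphQLEnabled stands in for the
// result of checking the GraphQLCatalogQueries feature gate.
func routes(graphQLEnabled bool) []string {
	paths := []string{"/api/v1/all"}
	if graphQLEnabled {
		paths = append(paths, "/api/v1/graphql")
	}
	return paths
}

// newMux registers a placeholder handler for every enabled route, so a
// disabled gate leaves the GraphQL path unregistered (404 from the mux).
func newMux(graphQLEnabled bool) *http.ServeMux {
	mux := http.NewServeMux()
	for _, p := range routes(graphQLEnabled) {
		mux.HandleFunc(p, func(w http.ResponseWriter, r *http.Request) {
			fmt.Fprintln(w, "ok")
		})
	}
	return mux
}

func main() {
	_ = newMux(false)
	fmt.Println(routes(false))
	fmt.Println(routes(true))
}
```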
PR needs rebase.