feat: Add multi-platform support (AWS/EKS, Generic K8s) and in-cluster deployment#2
Merged
…r deployment
This commit adds comprehensive multi-platform support, enabling k8s-node-proxy
to run on AWS/EKS, generic Kubernetes clusters, and as an in-cluster deployment,
while maintaining full backward compatibility with existing GCP/GKE deployments.
## Platform Support
### AWS/EKS Support
- Add platform detection module (`internal/platform/`) for automatic cloud provider identification
- Implement EKS-specific service discovery using AWS SDK v2
- Implement EKS-specific node discovery with AWS authentication
- Add EKS server factory with cluster metadata fetching
- Support AWS credentials via default credential chain
### Generic Kubernetes Support
- Add generic service discovery supporting three authentication methods:
* Kubeconfig file (KUBECONFIG environment variable)
* Direct credentials (K8S_ENDPOINT, K8S_TOKEN, K8S_CA_CERT)
* In-cluster configuration (service account token)
- Add generic node discovery compatible with any Kubernetes cluster
- Implement automatic in-cluster detection for pod-based deployments
- Add generic server factory for non-cloud-specific clusters
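As a rough illustration of how the three authentication methods could be chosen between, here is a minimal Go sketch. The order of precedence (kubeconfig, then direct credentials, then in-cluster) is an assumption taken from the order of the list above, and `AuthMethod` and `selectAuth` are hypothetical names, not identifiers from this PR. The `KUBERNETES_SERVICE_HOST` check reflects the variable Kubernetes sets inside pods.

```go
package main

import (
	"fmt"
	"os"
)

// AuthMethod is a hypothetical label for the chosen authentication path.
type AuthMethod string

const (
	AuthKubeconfig AuthMethod = "kubeconfig"
	AuthDirect     AuthMethod = "direct"
	AuthInCluster  AuthMethod = "in-cluster"
	AuthNone       AuthMethod = "none"
)

// selectAuth picks an authentication method from the environment.
// The precedence used here is an assumption, not confirmed by the PR.
func selectAuth(getenv func(string) string, inCluster bool) AuthMethod {
	if getenv("KUBECONFIG") != "" {
		return AuthKubeconfig
	}
	// Direct credentials are only usable when all three values are present.
	if getenv("K8S_ENDPOINT") != "" && getenv("K8S_TOKEN") != "" && getenv("K8S_CA_CERT") != "" {
		return AuthDirect
	}
	if inCluster {
		return AuthInCluster
	}
	return AuthNone
}

func main() {
	// Kubernetes injects KUBERNETES_SERVICE_HOST into every pod.
	inCluster := os.Getenv("KUBERNETES_SERVICE_HOST") != ""
	fmt.Println(selectAuth(os.Getenv, inCluster))
}
```

Passing `getenv` as a parameter keeps the selection logic testable without mutating the process environment.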
### In-Cluster Deployment
- Enable running k8s-node-proxy as a Kubernetes deployment
- Automatic service account token detection and authentication
- No configuration required when deployed as a pod
## Architecture Changes
### Code Organization
- Extract common node discovery utilities to `internal/nodes/common.go`
- Create platform-specific server factories:
* `cmd/server/generic_server.go` - Generic Kubernetes
* `cmd/server/eks_server.go` - AWS/EKS
* Modified `cmd/server/main.go` - Platform routing
- Maintain separate implementations per platform (conservative refactoring approach)
### Discovery Improvements
- Refactor node discovery to share health monitoring logic
- Add context-based timeouts to prevent handler hangs
- Improve error handling in homepage handlers with graceful fallbacks
- Extract common failover logic while maintaining platform-specific implementations
## Testing Infrastructure
### E2E Test Improvements
- Remove outdated placeholder tests (eks_basic_test.go)
- Fix proxy handler tests to properly separate IP and port handling
- Fix test logic bugs (failover threshold test)
- Update tests to support in-cluster configuration
- Add comprehensive mock infrastructure for offline testing
### Integration Testing
- Add kind-based integration test script (test/e2e/kind_integration_test.sh)
- Create GitHub Actions CI/CD workflow (.github/workflows/test.yml)
- Add Makefile targets for different test levels:
* `make test` - All tests (unit + e2e mocks)
* `make test-unit` - Unit tests only
* `make test-e2e` - E2E tests with mocks
* `make test-e2e-kind` - Full integration with real cluster
### Test Coverage
- All unit tests passing with >80% coverage
- E2E tests passing with mocked services
- Race detector clean
## Build System
### Makefile Improvements
- Fix build targets to include all source files (`./cmd/server` instead of single file)
- Add comprehensive test targets
- Add cluster management targets (setup/teardown)
### Documentation
- Add comprehensive Contributing section to README
- Include platform support matrix
- Add development workflow guide
- Document testing strategy
- Add code structure overview
## Configuration
### Environment Variables
Platform detection priority:
1. `PROJECT_ID`/`GOOGLE_CLOUD_PROJECT` → GCP/GKE
2. `AWS_REGION` → AWS/EKS
3. `KUBECONFIG` or K8S_* vars → Generic
4. Service account token → In-cluster
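The priority order above could be expressed along these lines. This is a sketch, not the code in `internal/platform/`: the `Platform` type and `detectPlatform` function are hypothetical names, and checking only `K8S_ENDPOINT` to represent the "K8S_* vars" is an assumption. The token path is the standard location where Kubernetes mounts a pod's service account token.

```go
package main

import (
	"fmt"
	"os"
)

// Platform is a hypothetical label for the detected environment.
type Platform string

const (
	PlatformGKE       Platform = "gke"
	PlatformEKS       Platform = "eks"
	PlatformGeneric   Platform = "generic"
	PlatformInCluster Platform = "in-cluster"
	PlatformUnknown   Platform = "unknown"
)

// Standard mount path for a pod's service account token.
const serviceAccountTokenPath = "/var/run/secrets/kubernetes.io/serviceaccount/token"

// detectPlatform applies the documented priority order. The first matching
// case wins, so GCP variables take precedence over AWS ones, and so on.
func detectPlatform(getenv func(string) string, fileExists func(string) bool) Platform {
	switch {
	case getenv("PROJECT_ID") != "" || getenv("GOOGLE_CLOUD_PROJECT") != "":
		return PlatformGKE
	case getenv("AWS_REGION") != "":
		return PlatformEKS
	case getenv("KUBECONFIG") != "" || getenv("K8S_ENDPOINT") != "": // K8S_ENDPOINT check is an assumption
		return PlatformGeneric
	case fileExists(serviceAccountTokenPath):
		return PlatformInCluster
	default:
		return PlatformUnknown
	}
}

// fileExists reports whether path can be stat'ed.
func fileExists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

func main() {
	fmt.Println(detectPlatform(os.Getenv, fileExists))
}
```

One consequence of this ordering is that a machine with both `AWS_REGION` and `KUBECONFIG` set is treated as AWS/EKS, which is worth keeping in mind on developer workstations.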
Required per platform:
- GCP/GKE: `PROJECT_ID`, `NAMESPACE`
- AWS/EKS: `AWS_REGION`, `CLUSTER_NAME`, `NAMESPACE`
- Generic: `KUBECONFIG`, `NAMESPACE`
- In-cluster: `NAMESPACE` only
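A startup check for the required variables might look something like the sketch below. The `requiredVars` table copies the list above literally; the platform keys and the `missingVars` helper are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// requiredVars mirrors the "Required per platform" list. The map keys are
// illustrative labels, not identifiers from the PR.
var requiredVars = map[string][]string{
	"gke":        {"PROJECT_ID", "NAMESPACE"},
	"eks":        {"AWS_REGION", "CLUSTER_NAME", "NAMESPACE"},
	"generic":    {"KUBECONFIG", "NAMESPACE"},
	"in-cluster": {"NAMESPACE"},
}

// missingVars returns the required variables that are unset or empty
// for the given platform.
func missingVars(platform string, getenv func(string) string) ([]string, error) {
	vars, ok := requiredVars[platform]
	if !ok {
		return nil, fmt.Errorf("unknown platform %q", platform)
	}
	var missing []string
	for _, name := range vars {
		if getenv(name) == "" {
			missing = append(missing, name)
		}
	}
	return missing, nil
}

func main() {
	missing, err := missingVars("eks", os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if len(missing) > 0 {
		fmt.Printf("missing required variables: %s\n", strings.Join(missing, ", "))
		return
	}
	fmt.Println("configuration complete")
}
```

Reporting every missing variable at once, rather than failing on the first, saves a restart cycle per variable when configuring a new deployment.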
## Breaking Changes
None. All changes are backward compatible with existing GCP/GKE deployments.
## Dependencies
- Add AWS SDK v2 packages (config, eks, sts)
- Update to Go 1.25.1
- Update Kubernetes client-go to v0.34.1
## Known Issues
- Integration test script needs refinement for kubectl run cleanup
- Homepage handler may time out on first request (resolved with fallback to cached data)
## Files Changed
Modified (10 files):
- .gitignore, Makefile, README.md
- cmd/server/main.go
- go.mod, go.sum
- internal/nodes/discovery.go, internal/nodes/discovery_test.go
- internal/proxy/handler.go
- internal/server/portmanager.go
Added (30+ files):
- cmd/server/eks_server.go, cmd/server/generic_server.go
- internal/platform/* (detector, AWS auth)
- internal/nodes/* (EKS, generic discovery + tests)
- internal/services/* (EKS, generic discovery + tests)
- test/e2e/* (integration tests, mocks, helpers)
- .github/workflows/* (CI/CD)
Pull Request Overview
This PR transforms k8s-node-proxy from a GCP/GKE-specific tool into a universal Kubernetes NodePort proxy supporting multiple cloud platforms and deployment models.
- Adds AWS/EKS support with IAM authentication and automatic platform detection
- Implements Generic Kubernetes support for any cluster via kubeconfig or in-cluster deployment
- Maintains full backward compatibility with existing GCP/GKE deployments
Reviewed Changes
Copilot reviewed 41 out of 43 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `cmd/server/main.go` | Platform detection factory pattern routing to platform-specific servers |
| `cmd/server/generic_server.go` | Generic Kubernetes server implementation with kubeconfig support |
| `cmd/server/eks_server.go` | AWS EKS-specific server with IAM authentication |
| `internal/platform/` | Platform detection system and AWS authentication utilities |
| `internal/services/` | Service discovery interfaces for EKS and Generic platforms |
| `internal/nodes/` | Node discovery implementations across all platforms |
| `internal/proxy/handler.go` | Abstracted interface for multi-platform node discovery |
| `test/` | Comprehensive test suite with mocks and integration tests |
| `go.mod` | Updated dependencies for AWS SDK and IAM authenticator |
- Fix `containsPath` logic to use `strings.Contains` instead of broken string slicing
- Fix package comment formatting in discovery interfaces
- Use `time.Duration` constant instead of magic number in test helper
Major improvements:
- Unified homepage template across all platforms (GKE, EKS, Generic K8s)
- Moved shared HTML template to `internal/server/homepage.go`
- Removed duplicate templates from platform-specific server files

Security improvements:
- Service port now explicitly blocks all non-management endpoints
- Only allows `/` (homepage) and `/health` on management port
- Prevents accidental proxying on service port

Lifecycle improvements:
- Trigger initial node selection on startup with timeout
- Improved shutdown logging for health monitoring
- Better context handling in health check loops
- Added defer statements for cleanup tracking

Health endpoint improvements:
- Simplified health response to include current node name
- Removed unused `/info` endpoint
- Consistent health check format across platforms

Code cleanup:
- Removed code duplication (net -41 lines)
- Improved code comments and removed unnecessary ones
- Better separation of concerns

Testing:
- Added concurrent request test (1000 requests)
- Enhanced integration test script with better validation
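The endpoint restriction described under the security improvements could be implemented with a small wrapper like the one below. This is a sketch only: `isManagementPath` and `managementOnly` are hypothetical names, and answering blocked paths with 404 is an assumption, since the commit message does not say which status is returned.

```go
package main

import (
	"fmt"
	"net/http"
)

// isManagementPath reports whether path is one of the two endpoints the
// commit message says remain reachable: the homepage and the health check.
func isManagementPath(path string) bool {
	return path == "/" || path == "/health"
}

// managementOnly rejects every request whose path is not an allowed
// management endpoint, so nothing else can fall through to a proxy.
func managementOnly(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !isManagementPath(r.URL.Path) {
			http.NotFound(w, r) // 404 is an assumed choice of status
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "homepage")
	})

	handler := managementOnly(mux)
	_ = handler // a real server would pass this to http.ListenAndServe

	fmt.Println(isManagementPath("/health"), isManagementPath("/api/v1"))
}
```

The explicit check matters because the `/` pattern in `http.ServeMux` matches every path that has no more specific handler; without the wrapper, a request such as `/api/v1` would reach the homepage handler.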
Pull request overview
Copilot reviewed 44 out of 46 changed files in this pull request and generated no new comments.
This PR transforms k8s-node-proxy from a GCP/GKE-specific tool into a universal Kubernetes NodePort proxy that works across cloud providers and deployment models.
🎯 What's New
Multi-Platform Support
Automatic Platform Detection
The proxy now intelligently detects the environment and configures itself automatically:
Production-Ready Testing
📊 Impact
43 files changed, 7,193 additions(+), 166 deletions(-)
New Files:
🏗️ Architecture Changes
Before: Single GCP-specific server
main.go → GKE discovery → Node monitoring → Proxy
After: Platform-agnostic factory pattern
main.go → Platform Detection → Factory → [GCP|EKS|Generic] Server → Node monitoring → Proxy
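The "After" flow can be pictured with a minimal factory sketch. The `Server` interface, the concrete types, and `newServer` are hypothetical and heavily simplified; the actual factories live in `cmd/server/` and are not reproduced here.

```go
package main

import (
	"fmt"
	"os"
)

// Server is a hypothetical minimal interface shared by all platform servers.
type Server interface {
	Name() string
}

type gkeServer struct{}
type eksServer struct{}
type genericServer struct{}

func (gkeServer) Name() string     { return "GCP/GKE" }
func (eksServer) Name() string     { return "AWS/EKS" }
func (genericServer) Name() string { return "Generic Kubernetes" }

// newServer is the factory step: it maps a detected platform label to a
// concrete server. The labels are illustrative.
func newServer(platform string) (Server, error) {
	switch platform {
	case "gke":
		return gkeServer{}, nil
	case "eks":
		return eksServer{}, nil
	case "generic", "in-cluster":
		return genericServer{}, nil
	default:
		return nil, fmt.Errorf("unsupported platform %q", platform)
	}
}

func main() {
	srv, err := newServer("eks")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("starting", srv.Name(), "server")
}
```

Routing in-cluster mode to the generic server is an assumption based on the description of in-cluster configuration as one of the generic authentication methods.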
Key Design Patterns:
🔧 Platform-Specific Details
AWS/EKS
New environment variables:
Authentication:
Files: cmd/server/eks_server.go, internal/platform/aws_auth.go, internal/nodes/eks_discovery.go
Generic Kubernetes
New environment variables:
Authentication:
Files: cmd/server/generic_server.go, internal/nodes/generic_discovery.go
In-Cluster Mode
Detection:
Requirements:
🧪 Testing Infrastructure
Unit Tests (make test-unit) - 30 seconds
E2E Tests (make test-e2e) - 1 minute
Integration Tests (make test-e2e-kind) - 10-15 minutes
GitHub Actions
Jobs:
✓ Unit tests with race detector
✓ E2E tests with mocks
✓ Integration tests with kind
✓ Build verification
Manually trigger: gh workflow run test.yml
📝 Configuration Examples
External VM with IAM role
In-cluster deployment
External machine with kubeconfig
Existing deployments work without modification
None. This PR is fully backward compatible. Existing GCP/GKE deployments will continue to work without any changes.
🚀 Migration Guide
For existing GCP users: No action required.
To switch platforms:
📚 Updated Documentation
✅ Checklist
🔍 Review Focus Areas
For reviewers, please pay special attention to:
🙏 Testing Requests
If you have access to AWS/EKS or other Kubernetes platforms, please test this branch and report results!
Clone and test