
[pull] master from openyurtio:master#38

Draft
pull[bot] wants to merge 163 commits into LavenderQAQ:master from openyurtio:master

Conversation


@pull pull bot commented Jun 18, 2024

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

@pull pull bot added the ⤵️ pull label Jun 18, 2024
rambohe and others added 17 commits June 26, 2024 14:01
1. use the entire user agent for cloud working mode, so requests posted by the controller clients of each yurt-manager can be monitored in yurthub metrics
2. wrap the response in order to record the traffic for any resource request, instead of only for requests that are cached.
* feat: add autonomy controller
Signed-off-by: rambohe-ch <linbo.hlb@alibaba-inc.com>
Signed-off-by: rambohe-ch <linbo.hlb@alibaba-inc.com>
* fix: upload autonomy status correctly
Signed-off-by: rambohe-ch <linbo.hlb@alibaba-inc.com>
…t will be used (#2098)

Signed-off-by: rambohe-ch <linbo.hlb@alibaba-inc.com>
…2099)

Signed-off-by: rambohe-ch <linbo.hlb@alibaba-inc.com>
Signed-off-by: rambohe-ch <linbo.hlb@alibaba-inc.com>
@LavenderQAQ LavenderQAQ marked this pull request as draft August 3, 2024 07:12
zyjhtangtang and others added 30 commits September 9, 2025 12:34
Co-authored-by: xiaomi <xiaomi@xiaomideMacBook-Pro-2.local>
Co-authored-by: bingchang.tbc <bingchang.tbc@alibaba-inc.com>
* Initial version: completes the switch from static-pod deployment to systemd-service deployment, and passes the deployment test

* Add yurthub download and place it in the working directory

* Added the section about yurthub systemd service in yurtadm reset

* Add unit tests for main functions: TestCheckAndInstallYurthub, TestCreateYurthubSystemdService, TestCheckYurthubServiceHealth

* Modified the code according to the comments, carried out deployment testing, and modified the previous unit test content

* Modify the previous comments

* Make some changes and add some tests

* Adjust some comments

* Modify unit tests

* Add unit tests
update and re-generate config

Co-authored-by: zhihao jian <zhihao.jian@shopee.com>
Co-authored-by: LiuDui <duiliu333@gmail.com>
Co-authored-by: bingchang.tbc <bingchang.tbc@alibaba-inc.com>
Co-authored-by: bingchang.tbc <bingchang.tbc@alibaba-inc.com>
Co-authored-by: bingchang.tbc <bingchang.tbc@alibaba-inc.com>
Co-authored-by: bingchang.tbc <bingchang.tbc@alibaba-inc.com>
Co-authored-by: bingchang.tbc <bingchang.tbc@alibaba-inc.com>
…fied the multiplexer cache. (#2481)

Co-authored-by: bingchang.tbc <bingchang.tbc@alibaba-inc.com>
Co-authored-by: kyrie <qinshaolong.qsl@alibaba-inc.com>
* Adds pod image prepull info to upgrade condition

Enhances pod upgrade handling by including updated daemonset
image details, image pull secrets, and service account info in
the pod condition message. Facilitates image prepull for OTA
users and supports more robust upgrade workflows.

* Adds image pre-pull API for pods via OTA update

Introduces an endpoint to trigger image pre-pull requests for pods,
enabling faster startup during upgrades. Enhances OTA update workflow
by allowing users to initiate image pulls, improving efficiency and
reducing pod readiness latency.

* Adds image pull job controller for DaemonSet pods

Introduces a new controller to automate image pre-pulling for pods managed by DaemonSets. Improves pod startup reliability by creating and tracking image pull jobs, updating pod conditions based on job outcomes, and ensuring only relevant events trigger reconciliation.

Enhances maintainability and observability for image readiness in distributed environments.

* Image preheating controller.

---------

Co-authored-by: luchen <cl382465@alibaba-inc.com>
Co-authored-by: bingchang.tbc <bingchang.tbc@alibaba-inc.com>
Signed-off-by: promalert <promalert@outlook.com>
Signed-off-by: Kartik Angiras <angiraskartik@gmail.com>
Remove deprecated rand.Seed() calls and unused random number generator
variables from cmd/yurt-manager, cmd/yurthub, and cmd/yurt-node-servant.
The rand.New() function already seeds the generator, making the explicit
Seed() call redundant and deprecated in future Go versions.

This prevents compilation failures in future Go versions.
…ate/write failure (#2507)

* fix: restore from backup and return error on ReplaceComponentList create/write failure

ReplaceComponentList used to continue on CreateDir/CreateFile failure and
always deleted the tmp backup at the end, causing data loss when some
writes failed while the function could still return success.

- On CreateDir(absPath), CreateDir(filepath.Dir(path)), or CreateFile
  failure: restore by Rename(tmpPath, absPath) and return the error.
- Delete tmp dir only when the full replacement succeeds.
- Add restoreReplaceFromBackup helper (same style as recoverFile/recoverDir).

* test: add restoreReplaceFromBackup unit tests for coverage

- should move tmpPath to absPath and return restoreErr (success path)
- should return combined error when Rename fails (failure path)
Covers the new helper at 100% to address codecov patch coverage.
)

When the NodeAutonomy condition status changes, LastTransitionTime
was never updated due to a copy-paste bug where LastHeartbeatTime
was set twice instead of setting LastTransitionTime on the second line.

This caused stale transition timestamps in the node condition, breaking
observability, alerting, and debugging for edge node autonomy state
changes.

Signed-off-by: Aman-Cool <aman017102007@gmail.com>
* fix: race condition in cache manager's inMemoryCache

Fixed critical race condition where inMemoryCache map was accessed
concurrently without proper synchronization. The issue occurred when:

1. saveWatchObject runs in a goroutine and calls updateInMemoryCache
2. queryInMemoryCache returns pointers directly from the map
3. Objects stored in cache could be modified concurrently

Changes:
- Deep copy objects when storing in inMemoryCacheFor to ensure independence
- Deep copy objects when retrieving in queryInMemoryCache while holding lock
- Added nil safety checks for defensive programming
- Prevents data corruption and ensures thread-safe cache access

The fix ensures that:
- Cached objects are independent copies, preventing concurrent modification
- Lock is held during copy operation to prevent map entry replacement
- Nil objects are handled gracefully

This follows Kubernetes best practices using DeepCopyObject() method
which is the standard pattern for creating independent object copies.

Priority: CRITICAL
Impact: Data corruption, panics, inconsistent cache state

* test: add test for in-memory cache deep copy

Added test to verify deep copy behavior in in-memory cache:
- Objects stored in cache are independent copies
- Objects retrieved from cache are independent copies
- Modifying original object does not affect cached copy

Follows existing test patterns with table-driven approach.

* test: make flannel edge-autonomy test more robust

Check if flannel container is already stopped before attempting to stop it.
This handles cases where the container may have been stopped by a previous
test run or due to timing issues, preventing flaky test failures.

The fix follows the same pattern as other container stop operations in
the test file, using direct execution rather than Eventually wrapper.

* fix: correct indentation in flannel test

Fix gci linter error by removing extra tab indentation on line 74.

* fix: address review feedback for inMemoryCache nil handling

- Don't store nil objects in cache, just return instead
- This ensures queryInMemoryCache returns ErrInMemoryCacheMiss for non-existent keys
- Simplify queryInMemoryCache by removing unnecessary nil check
- Fix indentation in flannel e2e test (gci linter error)
Add a length check before accessing Pod ownerReferences and return a 400 error when missing.
…2515)

* Bugfix: fix nil pointer in local proxy (localDelete/localPost)

Add nil checks for RequestInfo in localDelete, localPost, localReqCache,
and WithFakeTokenInject. The code previously ignored the 'ok' return value
from apirequest.RequestInfoFrom(), which could cause panics when RequestInfo
was not present in the request context.

- localDelete: return InternalError when info is nil
- localPost: return InternalError when info is nil
- localReqCache: return InternalError when reqInfo is nil in NotFound path
- faketoken: delegate to inner handler when info is nil

* Test: add unit tests for nil RequestInfo defensive paths

Add coverage for nil RequestInfo branches in localDelete, localPost, and
WithFakeTokenInject to satisfy Codecov. Tests follow existing patterns:
table-driven with testcases in local_test.go, t.Run wrapper in faketoken_test.go.
* fix: guard nil request info in autonomy proxy

* test: cover autonomy proxy nil RequestInfo path
* Upgrade Kubernetes dependencies to v1.34.0

* fix: add package-level nolint for remaining SA1019 deprecation warnings

Add package-level //nolint:staticcheck directives to suppress SA1019
warnings for corev1.Endpoints and corev1.EndpointSubset in files that
intentionally use these deprecated APIs for backward compatibility.

Files updated:
- cmd/yurthub/app/config/config.go
- pkg/yurthub/locallb/locallb.go
- pkg/yurthub/locallb/locallb_test.go
- pkg/yurtmanager/controller/raven/gatewayinternalservice/gateway_internal_service_controller.go
- pkg/yurtmanager/controller/raven/gatewaypublicservice/gateway_public_service_controller.go
- pkg/yurtmanager/controller/servicetopology/adapter/endpoints_adapter.go
- pkg/yurttunnel/informers/serverinformer.go
- pkg/yurttunnel/server/serveraddr/addr.go

Signed-off-by: xenonnn4w <xenonnn4w@gmail.com>
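A package-level suppression of this kind is a comment directive placed directly above the package clause, roughly like the fragment below (package name chosen from the file list above; the exact comment wording in the repo may differ):

```go
// Package locallb intentionally uses the deprecated corev1.Endpoints and
// corev1.EndpointSubset APIs for backward compatibility.
//
//nolint:staticcheck
package locallb
```

Note that `//nolint` directives are honored by golangci-lint rather than by `staticcheck` run standalone.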

* Regenerate CRD manifests

* fix: update expected kubeconfig in token test for k8s v1.34.0

clientcmd.Write() in k8s v1.34.0 no longer emits empty preferences field

* fix: race condition in PlatformAdmin and YurtAppSet readiness checks

* fix: conditionally add SELinux ,z mount flag for Podman only

---------

Signed-off-by: xenonnn4w <xenonnn4w@gmail.com>
* docs: add label-driven yurthub

* docs: update yurthub install logic

* docs: update controller job flag names

* docs: update trigger logic and add not-pause container restart
Co-authored-by: bingchang.tbc <bingchang.tbc@alibaba-inc.com>
