Skip to content

Add TimeSync support for nodes with TimeSynchronization cluster#440

Open
markvp wants to merge 5 commits intomatter-js:mainfrom
markvp:feat/245-time-sync
Open

Add TimeSync support for nodes with TimeSynchronization cluster#440
markvp wants to merge 5 commits intomatter-js:mainfrom
markvp:feat/245-time-sync

Conversation

@markvp
Copy link
Copy Markdown
Contributor

@markvp markvp commented Mar 24, 2026

Summary

  • Adds TimeSyncManager that proactively syncs UTC time on nodes with the TimeSynchronization cluster (0x38)
  • Syncs time on three triggers: node connect/reconnect (immediate), timeFailure event (reactive), and periodic resync every 12 hours
  • Follows the existing CustomClusterPoller pattern for consistency

Context

Devices like IKEA ALPSTUGA lack battery backup and lose their time after power loss. The controller previously only set time during commissioning, causing timeSynchronization.timeFailure events after reconnection.

The setUtcTime command is sent with MillisecondsGranularity and TimeSource.Admin. TlvEpochUs handles automatic conversion from Unix epoch to Matter epoch (2000-01-01).

Changes

  • New: packages/ws-controller/src/controller/TimeSyncManager.tsTimeSyncManager class and hasTimeSyncCluster() utility
  • New: packages/ws-controller/test/controller/TimeSyncManagerTest.ts — 12 unit tests
  • New: packages/ws-controller/test/tsconfig.json — enables test compilation for ws-controller
  • Modified: packages/ws-controller/src/controller/ControllerCommandHandler.ts — integrates TimeSyncManager at all lifecycle points
  • Modified: packages/ws-controller/tsconfig.json — adds test reference

Test plan

  • npm run format passes
  • npm run lint passes (0 warnings, 0 errors)
  • npm run build passes (including test type validation)
  • npm test — all 12 ws-controller unit tests pass
  • Manual test: commission an ALPSTUGA, power-cycle it, verify time sync on reconnect

Closes #245

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings March 24, 2026 11:12
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds proactive UTC time synchronization for Matter nodes that implement the TimeSynchronization cluster, ensuring devices that lose time after power loss are resynced on reconnect, on timeFailure, and periodically.

Changes:

  • Introduce TimeSyncManager with cluster detection and periodic resync scheduling
  • Integrate TimeSyncManager into controller lifecycle (connect, events, removal, close)
  • Add ws-controller test compilation config and unit tests for sync triggers

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
packages/ws-controller/tsconfig.json Adds test project reference so ws-controller tests are typechecked/compiled.
packages/ws-controller/test/tsconfig.json New tsconfig for ws-controller tests, wiring in shared test settings and references.
packages/ws-controller/test/controller/TimeSyncManagerTest.ts New unit tests for cluster detection + immediate/reactive sync behavior.
packages/ws-controller/src/controller/TimeSyncManager.ts Implements the TimeSyncManager (register/unregister, event-driven sync, periodic resync).
packages/ws-controller/src/controller/ControllerCommandHandler.ts Wires TimeSyncManager into node lifecycle/events and adds setUtcTime command sender.

Comment thread packages/ws-controller/src/controller/TimeSyncManager.ts Outdated
Comment thread packages/ws-controller/src/controller/TimeSyncManager.ts Outdated
Comment thread packages/ws-controller/src/controller/TimeSyncManager.ts Outdated
Comment thread packages/ws-controller/src/controller/TimeSyncManager.ts Outdated
@piitaya
Copy link
Copy Markdown

piitaya commented Mar 24, 2026

How will it work when multiple matter controller wants to sync time. For example, if I connected my IKEA ALPSTUGA to 2 matters server with different time. I know it's an edge case but we have concurrency issue because each control may have different logic to sync time.

*/
handleEvent(nodeId: NodeId, data: DecodedEventReportValue<any>): void {
const { path } = data;
if (path.clusterId === TIME_SYNC_CLUSTER_ID && path.eventId === TIME_FAILURE_EVENT_ID) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The event can also be received because of the update on a reconnect so we need to ensure we do not execute the sync to often in parallel or such

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also see above directly subscribe to the event and not check any triggered event. thats too inefficient

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: syncNode() now tracks in-flight syncs via Map<NodeId, Promise<void>>. Calling syncNode() when a sync is already in-progress for that node is a no-op (with a debug log). The promise is cleaned up in a .finally() block.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The event handler is now in a per-node ObserverGroup with direct filtering (clusterId + eventId) before calling syncNode(). The generic handleEvent method is removed.

/**
* Send a setUtcTime command to a node's TimeSynchronization cluster.
*/
async #syncNodeTime(nodeId: NodeId): Promise<void> {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think thats more needed for it ... check home-assistant/core#166133 which basically implements anything correctly ... and we might also need a trigger on DST changes?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reference. The HA PR uses ObserverGroup for event cleanup (which we've now adopted). Regarding DST changes: the 24-hour periodic resync is sufficient for DST correction. Pushing UTC epoch time to Matter nodes means DST is irrelevant on the device side — the device converts UTC to local time itself. No additional DST trigger is needed.

async #syncNodeTime(nodeId: NodeId): Promise<void> {
const client = this.#nodes.clusterClientByIdFor(nodeId, EndpointNumber(0), TimeSynchronization.Cluster.id);
await client.commands.setUtcTime({
utcTime: Date.now() * 1000,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have matter.js Time.nowMs that can be used here

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: now uses Time.nowMs instead of Date.now().

node.events.eventTriggered.on(data => this.events.eventChanged.emit(nodeId, data));
node.events.eventTriggered.on(data => {
this.events.eventChanged.emit(nodeId, data);
this.#timeSyncManager.handleEvent(nodeId, data);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we only care about one event then it is better to subscribe for this one event using node.eventsOf(TimeSynchronizationClient).xxy.on(...) and add this to an ObserverGroup to clean up on close

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adopted ObserverGroup per node. Note: node.eventsOf(TimeSynchronizationClient) is not available because TimeSynchronizationClient is in @matter/node which is not a dependency of ws-controller. Instead, node.events.eventTriggered is filtered directly by clusterId === TIME_SYNC_CLUSTER_ID && eventId === TIME_FAILURE_EVENT_ID. The ObserverGroup per node ensures cleanup on decommission (close() in decommissionNode) and on manager stop.

* Syncs UTC time on three triggers:
* 1. Node connects/reconnects (immediate)
* 2. timeFailure event from the node (reactive)
* 3. Periodic resync every 12 hours
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think 24h is sufficient because it is an intermediate solution anyway

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to Hours(24).

const RESYNC_INTERVAL_MS = 12 * 60 * 60 * 1000;

// Maximum initial delay in milliseconds (random 0-60s to stagger startup)
const MAX_INITIAL_DELAY_MS = 60_000;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use "Seconds(60)" and also no _MS and better use 30-60mins because we need to ensure to not add additional traffic while the server tries to connect all nodes initially ... we have a bit of time when we do all this normally. We should also prevent initial "hey i connected" triggers from being executed while we startup ...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: startup delay is now Minutes(30) + Math.random() * Minutes(30) (random 30–60 min). Startup protection is implemented via a #startupComplete flag that is set to true only after the first periodic resync cycle fires. During the startup window, registerNode does not trigger immediate syncs. The first periodic cycle handles all registered nodes and sets the flag.

}

// Sync time immediately on connect/reconnect
this.#syncNode(nodeId);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above: we need an initial startup protection delay!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added: #startupComplete flag blocks immediate syncs until the first periodic cycle completes (30–60 min after startup).

*/
stop(): void {
this.#closed = true;
this.#currentDelayPromise?.cancel(new Error("Close"));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should track the current "time sync promise" and await this here if any is in progress and so make this async to have a clean stop

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: stop() is now async. It cancels the delay promise, stops the timer, then await Promise.allSettled(#inFlightSyncs.values()) before clearing.

logger.debug(`Node ${nodeId} not connected, skipping time sync`);
return;
}
this.#connector.syncTime(nodeId).then(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as said above. track that promise and cleanup so that we can await it on close and it is not hidden

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: #inFlightSyncs: Map<NodeId, Promise<void>> tracks all in-flight immediate syncs. stop() awaits all of them via Promise.allSettled() before clearing.

/**
* Manages time synchronization for nodes with the TimeSynchronization cluster.
*/
export class TimeSyncManager {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the whole class adds a lot of code duplication with the CustomClusterPoller.ts ... because this is done by AI anyway please refactor to use a common "NodeProcessor" class as basis and try to streamline the two use cases as sub classes

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: extracted NodeProcessor abstract base class (packages/ws-controller/src/controller/NodeProcessor.ts). Both TimeSyncManager and CustomClusterPoller now extend it. The base handles timer lifecycle, node registration/unregistration, the per-node processing loop (with 2s inter-node delay and re-entrancy guard), and async stop(). Subclasses implement shouldProcess(nodeId), processNode(nodeId), and the optional onCycleComplete() hook.

@Apollon77
Copy link
Copy Markdown
Collaborator

@piitaya I completely agree :-) Thats why also the real official solution is a bit more effort :-) ... and I need to real all spec on it

@agners
Copy link
Copy Markdown
Contributor

agners commented Mar 25, 2026

@piitaya I completely agree :-) Thats why also the real official solution is a bit more effort :-) ... and I need to real all spec on it

In Home Assistant the Supervisor exposes if the time go synchronized through NTP (see /supervisor/info endpoint in the Supervisor developer docs). I'd only sync time if that is true, since otherwise a freshly started Raspberry Pi (which has no RTC) may sync a wrong time to devices after power outage 🥶 ).

Currently we don't have an event when synchronization state changes 😢 . So we'd probably have to poll the flag for a while if it is false on startup.

@Apollon77
Copy link
Copy Markdown
Collaborator

Ok, so from what I get ... the interestung question is how to determine IF a host time is synced with any relevant source and so a valid source for it ... and when that happened ... that's maybe a tough one

@Apollon77
Copy link
Copy Markdown
Collaborator

Sooo, we had a discussion yesterday because we should know if the host is synced time wise:

  • add a commandline/ENV-var option to enable time sync (because server is considered) synced
  • Any only ten enable it here
  • The HA app/addon should check the above mentioned supervisor ntp information and set this cli flag accordingly

@markvp
Copy link
Copy Markdown
Contributor Author

markvp commented Apr 11, 2026

@Apollon77 do you want me to work on this or do you want to close it. IF you want me to work on it, please let me know what you would like me to do / not do.

@Apollon77
Copy link
Copy Markdown
Collaborator

Give all my comments to your AI and I guess he can work with it. If you like feel free to continue. If now I will take it over somewhen ;-)

@codersaur
Copy link
Copy Markdown

Would be great to get this one finally over the line. FWIW, having an ENV var to enable the time sync would work for me.

@markvp
Copy link
Copy Markdown
Contributor Author

markvp commented May 3, 2026

Review addressed

All feedback from @Apollon77 and @copilot-pull-request-reviewer has been incorporated in the latest commit. Summary of changes:

New: --enable-time-sync opt-in flag

  • Added --enable-time-sync CLI flag and ENABLE_TIME_SYNC env var (default: off). Time sync must only be enabled when the host's NTP is reliable.

New: NodeProcessor abstract base class

  • Extracted shared timer lifecycle, node registration, and per-node processing loop into NodeProcessor. Both TimeSyncManager and CustomClusterPoller now extend it, eliminating ~80 lines of duplication in each.

TimeSyncManager changes

  • Resync interval: Hours(24) (was 12 * 60 * 60 * 1000 with _MS naming)
  • Startup window: Minutes(30) + Math.random() * Minutes(30) (was random 0–60s)
  • Startup protection: #startupComplete flag blocks immediate syncs until the first periodic cycle fires; only then do reconnects trigger an immediate sync
  • In-flight dedup: #inFlightSyncs: Map<NodeId, Promise<void>>syncNode() is a no-op if a sync is already running for that node
  • Async stop(): cancels delay, stops timer, then await Promise.allSettled(inFlightSyncs) before clearing
  • handleEvent() removed — caller now calls syncNode() directly

ControllerCommandHandler changes

  • ObserverGroup per node: all node.events.* subscriptions are registered in a per-node ObserverGroup, closed on decommission and on manager stop
  • eventTriggered handler filters directly by TIME_SYNC_CLUSTER_ID + TIME_FAILURE_EVENT_ID before calling syncNode()
  • Note: node.eventsOf(TimeSynchronizationClient) is not used because @matter/node is not a dependency of ws-controller; cluster/event ID filtering achieves the same result
  • Time.nowMs replaces Date.now()

Tests rewritten

  • Tests now cover: startup protection, immediate sync after completeStartup(), in-flight deduplication, post-unregister/post-stop guards, error resilience

…n cluster

Introduces opt-in time synchronization via --enable-time-sync / ENABLE_TIME_SYNC.
Adds NodeProcessor abstract base class for timer-driven periodic node processing,
and TimeSyncManager which extends it to sync UTC time using PeerAddress (matching
the CustomClusterPoller upstream migration). Syncs on reconnect after startup window
and periodically every 24 hours.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@markvp markvp force-pushed the feat/245-time-sync branch from bf8c31b to e72bf8d Compare May 3, 2026 03:14
markvp and others added 4 commits May 2, 2026 20:19
…rClientByIdFor

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eliminates duplicated timer/loop boilerplate (~50 lines) by extending the
shared NodeProcessor abstraction. Per-node attribute paths stay in the subclass;
shouldProcess, processNode, and onCycleComplete implement the polling logic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Tests cover: registration (startup window guard, connectivity check, cluster
detection), syncNode (immediate fire, deduplication, in-flight tracking),
unregisterNode, periodic resync via NodeProcessor timer (startup delay,
24h cycle, skip disconnected/in-flight peers), and stop() awaiting in-flight
syncs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Math.random() produces a float; @matter/testing's MockTimer requires
integer milliseconds and throws on fractional values.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@markvp
Copy link
Copy Markdown
Contributor Author

markvp commented May 6, 2026

@Apollon77 please can you confirm if my updates address all your concerns or if there are any other changes required? thanks for your patience while I try to get this right :)

@Apollon77
Copy link
Copy Markdown
Collaborator

@markvp Yes I have seen the updates and did not had time so far to review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add TimeSync support

6 participants