Backport 5.8: AP-25788: Dump log buffer in shutdown hook if not yet initialized#81
Conversation
AP-25788 (NodeLogger may swallow buffered log messages)
There was a problem hiding this comment.
Pull request overview
Backports AP-25788 to prevent buffered startup log messages from being lost if the JVM shuts down before KNIME logging finishes initialization by introducing a failsafe shutdown-hook dump path.
Changes:
- Refactors
LogBufferdraining to return a structured snapshot (BufferContents) that includes evicted-entry metadata. - Adds failsafe shutdown-hook handling in
KNIMELoggerto dump buffered startup logs to stdout/stderr with configurable minimum level. - Introduces new environment variables in
KNIMEConstantsto control failsafe logging target and minimum level.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
org.knime.core/src/eclipse/org/knime/core/node/logging/LogBuffer.java |
Adds BufferContents and a drain() API to support snapshot-style draining for shutdown dumping. |
org.knime.core/src/eclipse/org/knime/core/node/logging/KNIMELogger.java |
Installs a shutdown hook and implements failsafe emission of buffered logs when initialization never completes. |
org.knime.core/src/eclipse/org/knime/core/node/KNIMEConstants.java |
Adds env var constants for failsafe logging configuration. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| private static void installFailsafeLoggingHandlers() { | ||
| if (isShutdownHookInstalled.compareAndSet(false, true)) { | ||
| try { | ||
| Runtime.getRuntime().addShutdownHook(new Thread( | ||
| () -> drainBufferedMessagesTo(FAILSAFE_LOGGING_TARGET, | ||
| "JVM is shutting down before KNIME logging could finish initialization"), | ||
| "KNIME startup log dump")); | ||
| } catch (IllegalStateException | SecurityException ex) { | ||
| isShutdownHookInstalled.set(false); | ||
| FAILSAFE_LOGGING_TARGET.println("Unable to register KNIME startup log shutdown hook:"); | ||
| ex.printStackTrace(FAILSAFE_LOGGING_TARGET); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| /** | ||
| * Drains buffered startup log messages to the given stream. | ||
| * | ||
| * @param out the target stream | ||
| * @param reason a human-readable reason included in the dump header | ||
| */ | ||
| private static void drainBufferedMessagesTo(final PrintStream out, final String reason) { | ||
| drainBufferedMessagesTo(out, reason, FAILSAFE_LOGGING_MIN_LEVEL); | ||
| } | ||
|
|
||
| private static void drainBufferedMessagesTo(final PrintStream out, final String reason, final Level minLevel) { | ||
| Objects.requireNonNull(out); | ||
| final LogBuffer.BufferContents contents; | ||
| synchronized (BUFFER) { | ||
| if (isFailsafeBufferDumped) { | ||
| return; | ||
| } | ||
| isFailsafeShutdownLoggingActive = true; | ||
| isFailsafeBufferDumped = true; | ||
| contents = BUFFER.drain(); | ||
| } | ||
| if (contents.isEmpty()) { | ||
| return; | ||
| } | ||
|
|
||
| final var matchingMessages = Arrays.stream(contents.messages()) | ||
| .filter(message -> message.level().isGreaterOrEqual(minLevel)) | ||
| .toList(); | ||
| final var shouldEmitEvictionNotice = contents.evictedEntries() > 0 | ||
| && contents.evictionMessageLevel().isGreaterOrEqual(minLevel); | ||
| if (matchingMessages.isEmpty() && !shouldEmitEvictionNotice) { | ||
| return; | ||
| } | ||
|
|
||
| final var total = contents.messages().length + contents.evictedEntries(); | ||
| final var countMessages = total > 1 ? "%d messages were".formatted(total) : "1 message was"; | ||
| out.printf("KNIME startup failsafe logging dump: %s. %s buffered. Minimum level: %s.%n", | ||
| reason, countMessages, minLevel); | ||
| if (shouldEmitEvictionNotice) { | ||
| emitToFailsafeLoggingTarget(KNIMELogger.class.getName(), contents.evictionMessageLevel(), | ||
| "[*** Log incomplete: log buffer did wrap around -- %d messages were evicted from buffer in total ***]" | ||
| .formatted(contents.evictedEntries()), | ||
| null, Instant.now()); | ||
| } | ||
| for (final var message : matchingMessages) { | ||
| emitToFailsafeLoggingTarget(message.name(), message.level(), message.message(), message.cause(), | ||
| message.instant()); | ||
| } | ||
| out.println("End of KNIME startup failsafe logging dump."); | ||
| out.flush(); | ||
| } |
There was a problem hiding this comment.
The new failsafe shutdown-hook dumping logic (level filtering, eviction notice emission, and the isFailsafeBufferDumped/isFailsafeShutdownLoggingActive state transitions) isn’t covered by tests. Since there are already KNIMELogger/initialization tests in this repo, it would be good to add coverage to ensure only messages >= min level are emitted and eviction notices respect the configured min level.
| out.printf("KNIME startup failsafe logging dump: %s. %s buffered. Minimum level: %s.%n", | ||
| reason, countMessages, minLevel); | ||
| if (shouldEmitEvictionNotice) { | ||
| emitToFailsafeLoggingTarget(KNIMELogger.class.getName(), contents.evictionMessageLevel(), | ||
| "[*** Log incomplete: log buffer did wrap around -- %d messages were evicted from buffer in total ***]" | ||
| .formatted(contents.evictedEntries()), | ||
| null, Instant.now()); | ||
| } | ||
| for (final var message : matchingMessages) { | ||
| emitToFailsafeLoggingTarget(message.name(), message.level(), message.message(), message.cause(), | ||
| message.instant()); | ||
| } |
There was a problem hiding this comment.
drainBufferedMessagesTo(PrintStream out, ...) prints the header/footer to the provided out, but the actual formatted log events are written via emitToFailsafeLoggingTarget(...) to FAILSAFE_LOGGING_TARGET. This makes the out parameter ineffective (and can split the dump across two streams if a different out is ever passed). Consider either routing event output to the passed-in stream (e.g., pass out through) or removing the out parameter and always using the configured target consistently.
|
|
||
| private static void emitToFailsafeLoggingTarget(final String name, final Level level, final Object message, | ||
| final Throwable cause, final Instant instant) { | ||
| final var event = new LoggingEvent(name, Logger.getLogger(name), instant.toEpochMilli(), level, message, cause); |
There was a problem hiding this comment.
The LoggingEvent constructor’s first argument is the FQN of the calling class (used for location info), but this code passes the logger name instead. This can lead to incorrect caller/location data if the layout ever includes location patterns. Prefer using KNIMELogger.class.getName() (or the actual caller FQN) for the FQN parameter and keep the logger/category name in the Logger.getLogger(name) argument.
| final var event = new LoggingEvent(name, Logger.getLogger(name), instant.toEpochMilli(), level, message, cause); | |
| final var event = new LoggingEvent(KNIMELogger.class.getName(), Logger.getLogger(name), | |
| instant.toEpochMilli(), level, message, cause); |
| void drainTo(final Consumer<BufferedLogMessage> consumer) { | ||
| if (!m_logBuffer.isEmpty()) { | ||
| final var contents = drain(); | ||
| if (!contents.isEmpty()) { |
There was a problem hiding this comment.
drainTo(...) now delegates to drain(), but drain() is documented as requiring callers to synchronize on the LogBuffer instance. Since drainTo(...) doesn’t synchronize (and doesn’t mention this requirement), it’s easy to call it unsafely from elsewhere. Consider either synchronizing inside drainTo(...)/drain(), or explicitly documenting the synchronization requirement on drainTo(...) as well.
AP-25788 (NodeLogger may swallow buffered log messages)