Description
Summary:
In PersonKryoSerializer, a static final ThreadLocal is used to cache expensive Kryo instances. However, the code completely lacks a cleanup mechanism (no remove() is ever called).
Root Cause:
When a Hazelcast internal worker thread (e.g., partition operation thread or I/O thread) executes the write or read method, the Kryo instance is attached to the thread's ThreadLocalMap. Because the thread is pooled and long-lived, the Kryo instance is never garbage collected.
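The retention described above can be reproduced with plain JDK classes. The following is a minimal sketch, not Hazelcast code: `Cached` is a hypothetical stand-in for the Kryo instance, and a single-thread executor plays the role of a pooled worker thread. It shows that a value created by one task is still attached to the thread when a later task runs, and that only an explicit remove() detaches it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PooledThreadLocalDemo {
    // Hypothetical stand-in for an expensive per-thread object such as a Kryo instance.
    static final class Cached {}

    private static final ThreadLocal<Cached> CACHE = ThreadLocal.withInitial(Cached::new);

    // Runs a task on the given worker and unwraps checked exceptions.
    private static Cached runOn(ExecutorService worker, Callable<Cached> task) {
        try {
            return worker.submit(task).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    /** True if two tasks on the same pooled thread see the same cached instance. */
    static boolean valueSurvivesAcrossTasks() {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Cached first = runOn(worker, CACHE::get);
            Cached second = runOn(worker, CACHE::get);
            return first == second;
        } finally {
            worker.shutdown();
        }
    }

    /** True if calling remove() in a task makes the next task create a fresh instance. */
    static boolean removeDetachesValue() {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Cached first = runOn(worker, () -> {
                Cached c = CACHE.get();
                CACHE.remove(); // explicit cleanup: the thread no longer references c
                return c;
            });
            Cached second = runOn(worker, CACHE::get);
            return first != second;
        } finally {
            worker.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("survives across tasks: " + valueSurvivesAcrossTasks());
        System.out.println("remove() detaches:     " + removeDetachesValue());
    }
}
```

Calling remove() after every operation would release the reference, but it would also discard the cached instance each time, which defeats the purpose of caching it per thread.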
Impact (Critical):
- Memory Leak: Accumulation of Kryo instances in Hazelcast worker threads.
- ClassLoader Leak (Metaspace OOM): The Kryo instance registers the domain class: kryo.register(Person.class). If this serializer is deployed within a web container (e.g., Tomcat) or an OSGi environment, the Kryo instance holds a strong reference to Person.class, which in turn holds a reference to the WebappClassLoader. The unmanaged ThreadLocal forms a strong reference chain (Worker Thread -> ThreadLocalMap -> Kryo -> Person.class -> WebappClassLoader), preventing the application from being undeployed cleanly. Repeated redeployments will inevitably result in java.lang.OutOfMemoryError: Metaspace.
Code Snippet
Location: PersonKryoSerializer.java
// Definition: Static ThreadLocal holding application classes
private static final ThreadLocal<Kryo> KRYO_THREAD_LOCAL = new ThreadLocal<>() {
    @Override
    protected Kryo initialValue() {
        Kryo kryo = new Kryo();
        kryo.register(Person.class); // <--- Danger: Holds reference to WebappClassLoader
        return kryo;
    }
};
// Usage: get() is called, but remove() is NEVER called in write() or read()
Expected Behavior
ThreadLocals used in serialization components must not pin application classloaders.
Proposed Fix:
- Avoid Static ThreadLocal for stateful serializers: If the serializer must be loaded by a child classloader, manage the Kryo instance lifecycle properly.
- Use Object Pooling: Instead of ThreadLocal, consider using an object pool (like Apache Commons Pool) where instances are explicitly borrowed and returned/cleared in a try-finally block during the write/read operations.
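The borrow/return pattern from the second point could look roughly like the sketch below. It is a JDK-only illustration rather than a drop-in replacement: `SimpleInstancePool`, `withInstance`, `clear`, and `idleCount` are hypothetical names, and a real fix would more likely use an existing pool implementation. The key property is that instances are referenced by the pool object, which belongs to the serializer, and not by Hazelcast's worker threads.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical minimal pool; T would be Kryo in the serializer.
class SimpleInstancePool<T> {
    private final ConcurrentLinkedQueue<T> idle = new ConcurrentLinkedQueue<>();
    private final Supplier<T> factory;

    SimpleInstancePool(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Borrows an instance, applies the action, and always returns the instance to the pool. */
    <R> R withInstance(Function<T, R> action) {
        T instance = idle.poll();
        if (instance == null) {
            instance = factory.get(); // pool was empty: create a new instance
        }
        try {
            return action.apply(instance);
        } finally {
            idle.offer(instance); // returned even if the action throws
        }
    }

    /** Drops all idle instances, e.g. when the serializer is destroyed on undeploy. */
    void clear() {
        idle.clear();
    }

    /** Number of instances currently waiting in the pool. */
    int idleCount() {
        return idle.size();
    }
}
```

Under these assumptions, write() and read() would wrap their Kryo calls in withInstance(...), and the serializer's shutdown hook would call clear(). Once the serializer itself becomes unreachable, nothing on the worker threads keeps the Kryo instances or Person.class alive.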