De/serialization performance improvements and span computation improvements #1810
 * 3. DeferredSerializedValue: lazy caching + direct streaming
 * 4. Large composite types: deeply nested records with collections
 */
class JacksonStreamingTortureTest {
Any objection to calling it a "stress test"? "Torture" sounds a little harsh.
default void writeJson(T value, JsonGenerator gen) throws IOException {
  serializeValue(value).writeTo(gen);
}
Some context: the purpose of SerializedValue is to provide a format-agnostic interface (so you could just as easily write to Avro, ORC, Parquet, etc.).
I'm trying to decide whether or not writeJson fits that goal. I suppose I would defend it in a couple of ways:
- it's a default method, which means you don't have to implement it unless you need to for performance reasons
- JSON is currently the only format we use in practice
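The default-method argument can be illustrated with a minimal sketch. This is not the PR's code: StringBuilder stands in for Jackson's JsonGenerator, and Serialized, Mapper, BooleanMapper, and LongMapper are hypothetical names chosen for the example.

```java
class DefaultMethodSketch {
    /** Stand-in for SerializedValue: an intermediate value that can write itself out. */
    interface Serialized {
        void writeTo(StringBuilder out);
    }

    /** Stand-in for ValueMapper: only serializeValue has to be implemented. */
    interface Mapper<T> {
        Serialized serializeValue(T value);

        // Default path: build the intermediate value, then write it.
        default void writeJson(T value, StringBuilder out) {
            serializeValue(value).writeTo(out);
        }
    }

    /** Relies on the inherited default; no extra work for the implementer. */
    static final class BooleanMapper implements Mapper<Boolean> {
        @Override
        public Serialized serializeValue(Boolean value) {
            return out -> out.append(value.toString());
        }
    }

    /** Overrides writeJson to skip the intermediate value on a hot path. */
    static final class LongMapper implements Mapper<Long> {
        int intermediates = 0; // counts how often the slow path was taken

        @Override
        public Serialized serializeValue(Long value) {
            intermediates++;
            return out -> out.append(value.longValue());
        }

        @Override
        public void writeJson(Long value, StringBuilder out) {
            out.append(value.longValue());
        }
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        new BooleanMapper().writeJson(true, out);
        out.append(' ');
        LongMapper longs = new LongMapper();
        longs.writeJson(42L, out);
        System.out.println(out + " intermediates=" + longs.intermediates);
        // prints "true 42 intermediates=0"
    }
}
```

Existing mappers keep compiling unchanged, and only the mappers that show up in profiles need an override.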
A thought occurred to me: it seems to me the key innovation of this PR is not the ability to bypass the SerializedValue interface and write JSON directly, but the ability to serialize without building up a parallel in-memory data structure.
I think it's worth exploring an interface that does effectively the same thing as this PR at runtime, but abstracts away the "JSON" part at compile time. I think it would look very similar to writeJson, but we'd wrap JsonGenerator in an interface.
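One possible shape for such an interface, as a rough sketch only. ValueSink, JsonTextSink, and Point are hypothetical names, and the JSON sink below writes to a StringBuilder with minimal escaping so the example stays dependency-free; a real implementation would presumably delegate each call to Jackson's JsonGenerator.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ValueSinkSketch {
    /** Hypothetical format-agnostic sink; values stream events into it instead of building a tree. */
    interface ValueSink {
        void writeLong(long value);
        void writeDouble(double value);
        void writeString(String value);
        void startList();
        void endList();
        void startMap();
        void fieldName(String name);
        void endMap();
    }

    /** A domain value that streams itself without knowing the output format. */
    static final class Point {
        final String label; final double x; final double y;
        Point(String label, double x, double y) { this.label = label; this.x = x; this.y = y; }

        void writeTo(ValueSink sink) {
            sink.startMap();
            sink.fieldName("label"); sink.writeString(label);
            sink.fieldName("x"); sink.writeDouble(x);
            sink.fieldName("y"); sink.writeDouble(y);
            sink.endMap();
        }
    }

    /** JSON text implementation of the sink (sketch: escapes only quotes and backslashes). */
    static final class JsonTextSink implements ValueSink {
        private final StringBuilder out = new StringBuilder();
        // One entry per open container: true once it holds at least one element.
        private final Deque<Boolean> hasElements = new ArrayDeque<>();
        private boolean afterFieldName = false;

        private void separate() {
            if (!hasElements.isEmpty()) {
                if (hasElements.pop()) out.append(',');
                hasElements.push(true);
            }
        }
        private void beforeValue() {
            if (afterFieldName) { afterFieldName = false; } else { separate(); }
        }
        private void quote(String s) {
            out.append('"').append(s.replace("\\", "\\\\").replace("\"", "\\\"")).append('"');
        }

        public void writeLong(long v) { beforeValue(); out.append(v); }
        public void writeDouble(double v) { beforeValue(); out.append(v); }
        public void writeString(String v) { beforeValue(); quote(v); }
        public void startList() { beforeValue(); out.append('['); hasElements.push(false); }
        public void endList() { hasElements.pop(); out.append(']'); }
        public void startMap() { beforeValue(); out.append('{'); hasElements.push(false); }
        public void fieldName(String name) { separate(); quote(name); out.append(':'); afterFieldName = true; }
        public void endMap() { hasElements.pop(); out.append('}'); }

        String result() { return out.toString(); }
    }

    public static void main(String[] args) {
        JsonTextSink sink = new JsonTextSink();
        sink.startList();
        new Point("a", 1.5, 2.0).writeTo(sink);
        sink.writeLong(3);
        sink.endList();
        System.out.println(sink.result());
        // prints [{"label":"a","x":1.5,"y":2.0},3]
    }
}
```

Point.writeTo never mentions JSON, so a second ValueSink implementation for another format could reuse it unchanged while still avoiding the parallel in-memory structure.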
Description
1. Jackson Streaming Migration
- Replaces the javax.json parser with Jackson; testing shows Jackson is substantially faster.
- Adds writeJson(JsonGenerator) to ValueMapper and writeTo(JsonGenerator) to SerializedValue for direct unparsing, avoiding intermediate JsonValue or JsonNode allocations.
2. SerializedValue Optimization
- Adds double support to SerializedValue to prevent BigDecimal overhead for double-precision resource samples and params.
3. Simulation Engine Speedups
- SimulationEngine now pre-indexes serializable topics using an IdentityHashMap, which gives O(1) topic indexing. I think this is OK, but I'm curious to know what the Experts have to say.
- Speeds up computeSpanInfo with constant-time lookups.
4. Result I/O Improvements
- SimulationResultsWriter is refactored to stream large results directly to disk/network using JsonGenerator.
- PostgresProfileQueryHandler now streams jsonb data directly.
Note: SerializedValue.Visitor now requires implementation of onDouble(double).
Verification
- SerializationBenchmark: Micro-benchmarks comparing legacy vs. streaming paths.
- JacksonStreamingTortureTest: Verification of complex nested record serialization.
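For reviewers weighing the IdentityHashMap question, here is a small sketch of the pre-indexing idea next to a linear scan. Topic, indexByScan, and preIndex are hypothetical names for illustration and do not reflect the actual SimulationEngine code.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class TopicIndexSketch {
    /** Hypothetical stand-in for an engine topic; topics are told apart by reference. */
    static final class Topic {
        final String name;
        Topic(String name) { this.name = name; }
    }

    /** Baseline: scan the list on every lookup, O(n) per call. */
    static int indexByScan(List<Topic> topics, Topic topic) {
        for (int i = 0; i < topics.size(); i++) {
            if (topics.get(i) == topic) return i;
        }
        return -1;
    }

    /** Build the index once; later lookups are expected constant time on object identity. */
    static Map<Topic, Integer> preIndex(List<Topic> topics) {
        Map<Topic, Integer> index = new IdentityHashMap<>(topics.size());
        for (int i = 0; i < topics.size(); i++) {
            index.put(topics.get(i), i);
        }
        return index;
    }

    public static void main(String[] args) {
        List<Topic> topics = new ArrayList<>();
        Topic a = new Topic("a");
        Topic b = new Topic("b");
        topics.add(a);
        topics.add(b);

        Map<Topic, Integer> index = preIndex(topics);
        System.out.println(index.get(b));                           // prints 1
        System.out.println(indexByScan(topics, b));                 // prints 1
        // A distinct instance with the same name is a different key.
        System.out.println(index.getOrDefault(new Topic("b"), -1)); // prints -1
    }
}
```

IdentityHashMap compares keys with == rather than equals(), so this is only safe if every lookup uses the same topic instance that was registered; a reconstructed or copied topic would miss the index.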