Fix GL renderer thread-safety hazards and a STREAM upload CCE#22
Open
onefuncman wants to merge 12 commits intodolda2000:masterfrom
Open
Fix GL renderer thread-safety hazards and a STREAM upload CCE#22onefuncman wants to merge 12 commits intodolda2000:masterfrom
onefuncman wants to merge 12 commits intodolda2000:masterfrom
Conversation
Removed the shared GLRender prep field that all prepare(...) callers wrote into. Each prepare call now builds its own private GLRender and enqueues it into a new prepq queue, drained by process() ahead of submitted (preserving the prep-before-submit ordering). - Hazard 1 (multi-writer prep): prepare(GLObject), prepare(BGL.Request), and prepare(Consumer<GLRender>) each construct a fresh per-call GLRender via enqprep(). No more shared writer. - Hazard 2 (buf.ro published before data store written): in prepare(Model.Indices) and prepare(VertexArray.Buffer) STREAM/STATIC cases, the upload work is now enqueued into prepq before buf.ro is assigned. The happens-before chain (writer enqueues prep -> publishes ro under buf monitor -> reader reads ro -> reader submits) combined with process()'s snapshot order (submitted then prepq) guarantees any reader-submitted render referencing the new ro lands behind our prep in execution order. - Hazard 3 (disposal interleaves with in-flight prep): dispose() sets invalid (now volatile) up front, then drains submitted and prepq, aborting and disposing each. enqprep checks invalid under prepmon, so any prep submitted after dispose is rejected and aborted instead of leaking.
Fixes a ClassCastException introduced by f8ef8d6 (GLEnvironment: eliminate this.prep shared render). That commit reordered the STREAM prepare paths to assign buf.ro after enqueueing the upload prep, but fillbuf(DataBuffer, int, int) was deciding whether to return a StreamBuffer.Fill or a FillBuffers.Array based on the current buf.ro, so the cast `(StreamBuffer.Fill)buf.init.fill(buf, this)` started hitting FillBuffers.Array. Rather than re-couple buf.ro ordering to the Fill-subtype decision, remove the coupling entirely: - New runStreamFill(StreamBuffer, T, Filler) helper pre-allocates the StreamBuffer.Fill and runs the Filler against a proxy Environment whose fillbuf returns that pre-allocated Fill for the target buffer. Standard Fillers call env.fillbuf(tgt) and write into the result, so they end up writing straight into the Fill (zero-copy preserved). A non-standard Filler that bypasses env.fillbuf falls back to a pull(push().flip()) copy. - fillbuf(DataBuffer, int, int) becomes a pure factory that always returns a FillBuffers.Array. The buf.ro inspection is gone, so the cast at the prepare callsite is structurally guaranteed to receive the Fill type runStreamFill constructed. - Hazard-2 ordering from f8ef8d6 is restored: buf.ro is published after the prep is enqueued in both STREAM branches. Filler interface and its 31-ish in-repo implementors are unchanged. Testability + regression coverage: - StreamBuffer's get/put/dispose pool moves into a package-private StreamBuffer.Pool with an injectable IntFunction<SysBuffer> allocator. Same locking semantics, same external API on StreamBuffer. - New StreamBufferPoolTest covers reuse-after-put, two-concurrent-gets allocate two, freed-slot reuse, rewind on reuse, dispose releases all, harmless put of unknown buffer, NPE on null put. - doc/gl-test-coverage.md catalogues remaining GLEnvironment-side coverage gaps (cheap, medium, hard tiers) and recommends the next round of investment.
Pin the lowest-index-free-slot reuse policy and stress get/put from 8 threads to guard the synchronized-monitor invariants (mutual exclusion, bounded growth). Demote the Fill-lifecycle test from cheap to medium in the coverage doc -- it needs a test-only ctor since StreamBuffer's real ctor builds a GLBuffer(env) and calls env.prepare.
Move the proxy-Environment factory and copy fallback out of GLEnvironment.runStreamFill into a static StreamFiller.runWithPreallocated so the whole-range / partial-range / other-target / bypass branches can be exercised against fakes. The whole-range case is the direct guard for the ClassCastException fixed in fa07814.
Decouple Fill.compatible from rbuf.env via a dedicated compatEnv field and add a package-private StreamBuffer(int, Pool, Environment) ctor that skips the GLBuffer/env wiring. StreamBufferFillTest then drives Fill through compatible / dispose-idempotent / get-then-dispose-no- auto-put / pull / get-rewinds without standing up a GL context.
The submitted/prepq/invalid trio gets pulled into RenderQueue<R>, with drain() preserving the "snapshot submitted before prep" ordering that guarantees every render in a drain has its prep in the same drain (or a prior one). RenderQueueTest stresses that invariant from a producer thread across 200 trials and pins enqueue rejection after invalidate(), which the GLEnvironment.submit / enqprep wrappers now lean on.
The two STREAM branches in update() were sibling sites to the bug fixed in fa07814 -- they cast (StreamBuffer.Fill) on the result of fill.fill(buf, env), which only worked while GLEnvironment.fillbuf inspected buf.ro to choose between Fill and FillBuffers.Array. After fa07814, fillbuf is a pure Array factory, so these casts blew up on the first STREAM index/vertex update. Promote runStreamFill from private to package-private and reuse it here so the same proxy-based fast path covers GLRender.update too.
Initial sequse size jumps from 16 to 32k to match observed in-flight Sequence count during normal rendering, eliminating the warm-up resize churn (16 -> 32k goes through 11 doublings, each one used to fire a warning past 16384). Bump the warning watermark to 4x of steady-state (0x20000) so a real leak still surfaces but transient spikes don't spam warnings.
Fixes recurring `EXCEPTION_ACCESS_VIOLATION` in `glDrawElements` seen across NVIDIA (nvoglv64.dll) and AMD (atio6axx.dll) drivers, with varying `count` arguments. In every crash, `indices=0` and the driver vector-loaded from NULL -- meaning no `GL_ELEMENT_ARRAY_BUFFER` was bound to the active VAO at draw time, so the driver fell back to treating `indices` as a client pointer. Two independent layers, both kept: 1. `VaoBindState.applyto`: previously only rebound the EBO when the VAO itself changed. A `(vao=V, ebo=A) -> (vao=V, ebo=B)` transition silently dropped the EBO bind, leaving the VAO's internal EBO slot pointing at whichever buffer was current when the VAO was first populated. If that buffer was later deleted, the next draw read NULL. Now also rebinds when only the EBO changed, mirroring the unconditional `apply()` semantics. 2. `GLDrawList.SlotRender.draw`: emit `glBindBuffer(EBO)` directly into the per-draw `BufferBGL` immediately before `glDrawElements`. This makes the draw correct regardless of any state-tracker or VAO-internal-slot confusion. Bumps initial `BufferBGL` capacity to 2 (it grows anyway, just a hint).
…agnostics
Three small changes addressing the observed ~32k-span seqhead/seqtail
gap. In this ring, `sequnreg` only advances `seqtail` when the freed
slot is exactly at the tail, so a single never-disposed Sequence blocks
all subsequent tail advancement even if thousands of younger sequences
have been properly freed.
1. submit(): if the GLRender has no recorded commands (gl == null), the
early return used to drop it silently. The caller already handed
over ownership via submit, so the render's Sequence sat in the ring
until finalization. Now dispose before returning.
2. prepare(GLObject|BGL.Request|Consumer<GLRender>): the GLRender was
constructed before the user's work ran; any throw skipped enqprep
and leaked the render. Each overload now uses try/finally with a
success flag so a throw (including Error) disposes the partial
render.
3. Diagnostics: added seqstats() returning {span, alive} = {seqhead -
seqtail, count of sequse[]==true}. If span >> alive, the tail is
stuck. process() logs the pair at a coarse cadence (every ~30k
frames) when span > 1000.
Map of src/haven/render/gl/: GLEnvironment/GLRender/GLDrawList, the BGL buffered-GL indirection, frame lifecycle (prep -> submitted -> disposeall), RenderQueue's submitted-first drain invariant, GLObject refcounting with deferred dispose via the seqhead/seqtail ring, VAO/EBO binding (including the historical crash bug and both fix layers), STREAM buffer upload ordering, state diffing via Applier/curstate, the thread model with a submit call-site table, cache lifetimes, and a chronological list of bugs/fixes applied on this branch plus instrumentation points.
Add `compile-tests` and `test` ant targets so the GL refactor test sources under src/test/java can build and run. Main `javac` now excludes `test/**` since JUnit only belongs on the test classpath. Drops the bundled junit-platform-console-standalone jar in lib/.
Owner
|
Wow, that's certainly the pull-request of the decade for the Haven client. ;) It will probably take me a while to go through it and recognize the fundamentals of what you're doing, but I'll get on it. Thanks for the effort in the meantime! |
DerKamii
pushed a commit
to DerKamii/KamiClient
that referenced
this pull request
Apr 15, 2026
…weird crashes in the mines. Credits go to onefuncman.
Author
|
The fixes themselves are quite small number of lines impacted, a lot of the refactor is just for testability, Kami's commit linked here is a nicely boiled down version if you prefer that instead. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Addresses a cluster of thread-safety hazards in the GL renderer (
GLEnvironment,StreamBuffer) surfaced whilechasing intermittent crashes and seeing corrupted render output. Splits the work into small, reviewable commits and
ships them with a JUnit test suite covering the non-GL parts (pool, fill-proxy, render queue, seq-ring) plus an
architecture writeup under
doc/.What changed
GLEnvironment.prepGLRender — eachprepare(...)call now builds its own privateGLRenderand enqueues it via a new
prepq, drained byprocess()ahead ofsubmitted. Removes a multi-writer hazard whereconcurrent prepare calls could interleave on one shared render.
buf.ropublication — fixes aClassCastExceptionwherefillbufwas decidingStreamBuffer.FillvsFillBuffers.Arraybased onbuf.ro, which was published after prep enqueue. NewrunStreamFillpre-allocates theFilland runs theFilleragainst a proxyEnvironmentsofillbufbecomes apure factory. Zero-copy preserved for standard fillers; non-standard fillers fall back to
pull(push().flip()).Fillerinterface and all in-repo implementors unchanged.StreamBuffer.Poolwith an injectableIntFunction<SysBuffer>allocator so the pool lifecycle istestable outside a live GL context. Same locking semantics and external API.
StreamFillerhelper for the proxy/copy path to lock the CCE regression behind a test.RenderQueuefromGLEnvironmentand lock the prep/submit invariants behind tests.GLRender.updateSTREAM writes throughrunStreamFillso the refactored path is the only one.EBO binding.
submit/prepareand add a seq-ring for diagnostics when a leak is suspected.Tests
New
src/test/java/haven/render/gl/suite:StreamBufferPoolTest,StreamBufferFillTest,StreamFillerTest,RenderQueueTest. Run withant test— 26/26 pass. Test targets live alongside the existingbintarget and don'ttouch the main jar's classpath.
Docs
doc/gl-rendering.md— architecture and invariants of the GL layer as they stand after this PR.doc/gl-test-coverage.md— what's covered, remaining gaps (cheap/medium/hard tiers).Test plan
ant binbuilds clean.ant testpasses.