Deepdive: SIMPLE and libneo increasingly need GPU-friendly, cache-friendly evaluation over many points. The current APIs evaluate one point at a time.

Goal:
- Add multi-point (batch-over-points) evaluation entry points for 3D batch splines, covering:
  - value only
  - first derivatives
  - second derivatives
  - rmix second derivatives

Notes:
- Coordinate layout should be SoA (x(3,npts) or separate x1(:),x2(:),x3(:)) to avoid gather/scatter.
- Keep existing single-point APIs as wrappers/harnesses.

Acceptance:
- Unit tests validating equivalence to single-point evaluation.
- Provide a microbenchmark target showing improved throughput.
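
Illustrative sketch:
The structure asked for above (one batch entry point over separate coordinate arrays, with the single-point call reduced to a thin wrapper) could look roughly like the following. This is a language-neutral illustration in plain Python, not libneo code: the names `eval_der1_many` and `eval_der1` are hypothetical, and a tensor-product polynomial stands in for the per-cell spline kernel, which is an assumption made only to keep the example self-contained.

```python
from typing import List, Sequence, Tuple

# ASSUMPTION: coeffs[i][j][k] multiplies x1**i * x2**j * x3**k. This polynomial
# is a stand-in for the per-cell kernel of a 3D spline, not the libneo layout.
Coeffs = Sequence[Sequence[Sequence[float]]]
Batch = Tuple[List[float], List[float], List[float], List[float]]


def _pow_and_deriv(x: float, n: int) -> Tuple[float, float]:
    """Return (x**n, d/dx x**n) for an integer n >= 0."""
    return x ** n, (n * x ** (n - 1) if n > 0 else 0.0)


def eval_der1_many(coeffs: Coeffs,
                   x1: Sequence[float],
                   x2: Sequence[float],
                   x3: Sequence[float]) -> Batch:
    """Hypothetical batch entry point: value and first derivatives at all points.

    Coordinates arrive as three separate arrays and the results are returned
    the same way (y, dy/dx1, dy/dx2, dy/dx3), one array per quantity.
    """
    npts = len(x1)
    if len(x2) != npts or len(x3) != npts:
        raise ValueError("x1, x2, x3 must have the same length")
    y = [0.0] * npts
    dy1 = [0.0] * npts
    dy2 = [0.0] * npts
    dy3 = [0.0] * npts
    for p in range(npts):  # the loop over points lives inside the kernel
        for i, plane in enumerate(coeffs):
            a, da = _pow_and_deriv(x1[p], i)
            for j, row in enumerate(plane):
                b, db = _pow_and_deriv(x2[p], j)
                for k, c in enumerate(row):
                    g, dg = _pow_and_deriv(x3[p], k)
                    y[p] += c * a * b * g
                    dy1[p] += c * da * b * g
                    dy2[p] += c * a * db * g
                    dy3[p] += c * a * b * dg
    return y, dy1, dy2, dy3


def eval_der1(coeffs: Coeffs, x: Sequence[float]):
    """Hypothetical single-point API, kept as a wrapper over the batch call."""
    y, dy1, dy2, dy3 = eval_der1_many(coeffs, [x[0]], [x[1]], [x[2]])
    return y[0], (dy1[0], dy2[0], dy3[0])


if __name__ == "__main__":
    # Equivalence check in the spirit of the acceptance criteria:
    # f = 1 + 2*x1 + 3*x2*x3 + x1**2 * x3
    coeffs = [[[1.0, 0.0], [0.0, 3.0]],
              [[2.0, 0.0], [0.0, 0.0]],
              [[0.0, 1.0], [0.0, 0.0]]]
    xs1, xs2, xs3 = [2.0, 0.5, -1.0], [3.0, 0.25, 2.0], [4.0, 1.5, 0.0]
    y, d1, d2, d3 = eval_der1_many(coeffs, xs1, xs2, xs3)
    for p in range(len(xs1)):
        single = eval_der1(coeffs, (xs1[p], xs2[p], xs3[p]))
        assert single == (y[p], (d1[p], d2[p], d3[p]))
```

Because the wrapper only forwards to the batch routine, the equivalence test mainly guards against indexing mistakes; a stricter test would compare against the pre-existing single-point implementation. A microbenchmark would time a loop of single-point calls against one batch call over the same points. Pure Python will not show the throughput gain, so that measurement belongs in the compiled code.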