Releases: OpsiClear/gsmod
v0.1.7
Changed
- Style Consistency Improvements
- Unified `inplace` defaults: All methods now default to `inplace=True` for consistency
- GSTensorPro methods (35 methods)
- Filter.call(), FilterGPU.call(), FilterProcessor.apply() protocol
- Exception: `compose_with_transforms()` remains `inplace=False` (composition utility)
- Parameter naming unification: `factor` -> `scale` in scale methods, `value` -> `gamma` in adjust_gamma
- Dict/list comprehensions used for cleaner code in GSTensorPro (apply_color_preset, compute_histogram)
- TransformValues Non-Uniform Scale Support
- `TransformValues.scale` now stores 3-tuple `(sx, sy, sz)` for per-axis scaling
- `from_scale()` accepts both uniform (float) and per-axis (tuple) scales
- `from_matrix()` extracts per-axis scales from column norms
- `is_neutral()` properly checks 3-tuple scale against `[1.0, 1.0, 1.0]`
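As a rough illustration of extracting per-axis scales from column norms, the sketch below computes the Euclidean norm of each column of a 3x3 matrix. The helper name `scales_from_matrix` is hypothetical and is not part of gsmod; it only demonstrates the idea that, for a matrix of the form R @ diag(sx, sy, sz), each column norm recovers one scale factor.

```python
import math


def scales_from_matrix(m):
    """Return (sx, sy, sz) as the Euclidean norms of the columns of a 3x3 matrix.

    Hypothetical helper for illustration; not gsmod's from_matrix().
    """
    return tuple(
        math.sqrt(sum(m[row][col] ** 2 for row in range(3)))
        for col in range(3)
    )


# A 90-degree rotation about z composed with per-axis scale (2, 3, 4):
# M = R @ diag(2, 3, 4), so each column is a unit vector times one scale.
m = [
    [0.0, -3.0, 0.0],
    [2.0, 0.0, 0.0],
    [0.0, 0.0, 4.0],
]
print(scales_from_matrix(m))  # (2.0, 3.0, 4.0)
```

Because rotation preserves vector length, the column norms are unaffected by the rotation part and return only the scale factors.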
- Rotation/Scale Center Point Support
- New `center` parameter on TransformValues for rotation/scale center
- Factory methods `from_rotation_euler()`, `from_rotation_axis_angle()`, `from_scale()` accept `center`
- `to_matrix()` correctly applies T(center) @ SR @ T(-center) transformation
- LearnableTransform correctly handles center in forward pass
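The T(center) @ SR @ T(-center) composition can be sketched on a single point as follows. This is a minimal pure-Python illustration, assuming `linear` is the combined 3x3 scale/rotation matrix; the function name `apply_about_center` is hypothetical and does not reflect gsmod's internals.

```python
def apply_about_center(point, linear, center):
    """Apply a 3x3 linear map (scale and/or rotation) about `center`.

    Equivalent to T(center) @ SR @ T(-center) acting on `point`.
    Hypothetical helper for illustration only.
    """
    # T(-center): move the pivot to the origin
    local = [p - c for p, c in zip(point, center)]
    # SR: apply the combined scale/rotation matrix
    moved = [sum(linear[r][k] * local[k] for k in range(3)) for r in range(3)]
    # T(center): move the pivot back
    return tuple(m + c for m, c in zip(moved, center))


uniform_2x = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
center = (1.0, 1.0, 1.0)
print(apply_about_center((2.0, 1.0, 1.0), uniform_2x, center))  # (3.0, 1.0, 1.0)
print(apply_about_center(center, uniform_2x, center))           # (1.0, 1.0, 1.0)
```

The center itself is a fixed point of the transformation, which is a quick sanity check for any implementation of center-aware rotation or scaling.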
- Transform Pipeline Enhancements
- New `Transform.from_srt()` factory for standard Scale-Rotate-Translate order
- New `rotate_euler_deg()` method for degree-based rotation
- Improved docstrings with center point usage examples
- Docstring Format Standardized
- All docstrings now use Sphinx reST format (`:param:`, `:returns:`) consistently
- Removed Google-style Args/Returns in favor of Sphinx style
- TransformConfig/LearnableTransform Updates
- `TransformConfig.scale` now supports float or 3-tuple
- LearnableTransform handles center point for rotation/scale around arbitrary points
- Numba Kernel Optimizations
- Filter kernels refactored with dict/list comprehensions
- Color kernels use cleaner iteration patterns
Fixed
- LearnableTransform now correctly applies SRT order with center support
v0.1.6
Changed
- Enhanced SH-Aware Color Processing (CPU and GPU)
- Brightness, Contrast, Saturation, Temperature, Gamma, Hue Shift now apply to BOTH sh0 (DC) and shN (higher-order SH coefficients)
- Improved SH mode handling: brightness/contrast use corrected formulas for rendered RGB behavior
- Vibrance now correctly converts SH->RGB->SH (required for adaptive saturation)
- Split toning (shadow/highlight tints) remains DC-only (additive operations)
- Conditional clamping: only applied when is_sh0_rgb=True (SH coefficients can be negative or >1)
- Automatic SH->RGB->SH conversion for operations requiring RGB format
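To make the conditional clamping concrete, here is a small sketch on a single DC value. It assumes the common 3D Gaussian Splatting convention `rgb = sh0 * C0 + 0.5` with `C0 = 1 / (2 * sqrt(pi))`, and it models brightness as a plain multiplication; both are assumptions for illustration, and gsmod's actual kernels and formulas may differ. All function names here are hypothetical.

```python
SH_C0 = 0.28209479177387814  # zeroth-order real SH basis constant, 1 / (2 * sqrt(pi))


def sh0_to_rgb(sh0):
    """Convert a DC SH coefficient to an RGB channel value (assumed convention)."""
    return sh0 * SH_C0 + 0.5


def rgb_to_sh0(rgb):
    """Inverse of sh0_to_rgb."""
    return (rgb - 0.5) / SH_C0


def adjust_brightness(value, factor, is_sh0_rgb):
    """Scale brightness; clamp only when the value is stored as RGB.

    Illustrative only: brightness is modelled as a simple multiply.
    """
    if is_sh0_rgb:
        # RGB storage: the result must stay inside [0, 1]
        return min(max(value * factor, 0.0), 1.0)
    # SH storage: go SH -> RGB -> SH and skip the clamp,
    # because SH coefficients may legitimately leave [0, 1]
    return rgb_to_sh0(sh0_to_rgb(value) * factor)


print(adjust_brightness(0.8, 1.5, is_sh0_rgb=True))  # 1.0 (clamped)
sh_result = adjust_brightness(rgb_to_sh0(0.8), 1.5, is_sh0_rgb=False)
print(sh0_to_rgb(sh_result) > 1.0)  # True (not clamped)
```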
- Dependencies Updated
- gsply requirement updated from `0.2.11` to `>=0.2.13`
- torch moved to optional dependency (install separately: `pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu128`)
- Code Consolidation
- `_axis_angle_to_rotation_matrix_numpy()` moved to `shared/rotation.py` (single canonical implementation)
- `filter/api.py` and `filter/apply.py` now import from shared module
- Documentation Updated
- CLAUDE.md: Updated module references (`color/` module structure, distributed `*/kernels.py` files)
- WORKFLOWS.md: Updated API references to current exports (GSDataPro, ColorValues, Filter, Pipeline)
Fixed
- CRITICAL: Format Tracking in GSTensorPro
- Fixed format auto-detection incorrectly changing format during initialization
- `_format` now properly passed in all factory methods (from_gsdata, from_gstensor, clone, to)
- Prevents LINEAR scales from being incorrectly detected as PLY log-scales
- Affects: All GSTensorPro creation and conversion operations
- Numerical Stability Improvements
- LUT kernels: clamp to [0,1] before conversion to int() to avoid NaN/Inf issues
- Ellipsoid filter: guard against division by zero with minimum radius (1e-7)
- Logit conversion: use log subtraction form for better numerical stability
- Consistent epsilon values: standardized to 1e-7 across codebase
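The log subtraction form of the logit, combined with the standardized epsilon, might look like the sketch below. The function name and the clamping strategy are assumptions for illustration; only the `1e-7` epsilon and the "log subtraction" idea come from the notes above.

```python
import math

EPS = 1e-7  # epsilon value named in the release notes


def stable_logit(p, eps=EPS):
    """Logit computed as log(p) - log(1 - p) instead of log(p / (1 - p)).

    Hypothetical helper: the input is clamped to [eps, 1 - eps] so that
    p = 0 and p = 1 yield large finite values rather than -inf / +inf.
    """
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p) - math.log(1.0 - p)


print(stable_logit(0.5))                 # 0.0
print(math.isfinite(stable_logit(0.0)))  # True
print(math.isfinite(stable_logit(1.0)))  # True
```

The subtraction form avoids forming the ratio `p / (1 - p)`, which can overflow or lose precision when `p` is very close to 1.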
- FilterValues.is_neutral() Bug Fix
- Now correctly checks `invert` parameter (invert=True is NOT neutral)
- GPU Opacity/Scale Filtering
- Improved threshold conversion with proper clamping to avoid boundary issues
- Optimized: compute max_scales once when needed for both min/max filters
Performance
- GPU scale filtering: eliminated redundant max_scales computation (2x evaluation -> 1x)
v0.1.5
Added
- Spherical Harmonics (SH) Color Support (`gsmod.color.sh_kernels`, `gsmod.color.sh_utils`)
- Full SH-aware color operations matching GPU ground truth behavior
- Brightness and Saturation: Apply to BOTH sh0 (DC) and shN (higher-order SH bands)
- All other operations: Apply to sh0 ONLY (contrast, gamma, temperature, tint, hue, vibrance, fade)
- Numba-optimized kernels: `apply_scale_to_sh_numba()`, `apply_matrix_to_sh_numba()`, `apply_contrast_to_sh_numba()`, etc.
- SH utilities: `sh_to_rgb()`, `compute_luminance()`, matrix builders for saturation/temperature/hue
- CPU/GPU consistency: CPU now matches GPU SH handling exactly
- Triton GPU Kernels (`gsmod.torch.triton_kernels`)
- Fused GPU kernels for maximum performance on NVIDIA GPUs
- Kernels: `triton_adjust_brightness()`, `triton_adjust_contrast()`, `triton_adjust_gamma()`, `triton_adjust_saturation()`, `triton_adjust_temperature()`
- Graceful fallback to PyTorch operations if Triton unavailable
- Block-based parallelization optimized for GPU architecture
- GSTensorPro.from_gstensor() Factory Method
- Convert gsply GSTensor to GSTensorPro while preserving format state
- Recommended workflow: load with gsply, convert formats, then wrap with GSTensorPro
- Preserves `is_sh0_rgb`, `is_scales_ply`, `is_opacities_ply` format tracking
- Optional device/dtype conversion during wrapping
- Format-Aware CPU Filtering (`gsmod.filter.apply`)
- CPU filters now handle PLY format (logit opacities, log scales) correctly
- Automatic threshold conversion: linear -> logit/log when filtering PLY format data
- Matches GPU FilterGPU behavior for CPU/GPU consistency
- Uses `is_opacities_ply` and `is_scales_ply` format properties from gsply 0.2.11
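The threshold conversion described above can be sketched as follows: rather than converting every stored value back to linear space, the user-facing linear threshold is converted once into the storage space. The function names below are hypothetical; only the flag names `is_opacities_ply` and `is_scales_ply` come from the release notes.

```python
import math


def opacity_threshold_for(threshold, is_opacities_ply, eps=1e-7):
    """Return the threshold in the space the opacities are stored in.

    Hypothetical helper: PLY-format opacities are logits, so a linear
    threshold is mapped through logit(t) = log(t) - log(1 - t).
    """
    if not is_opacities_ply:
        return threshold
    t = min(max(threshold, eps), 1.0 - eps)
    return math.log(t) - math.log(1.0 - t)


def scale_threshold_for(threshold, is_scales_ply, eps=1e-7):
    """PLY-format scales are stored as logs, so take log of the threshold."""
    if not is_scales_ply:
        return threshold
    return math.log(max(threshold, eps))


linear_opacities = [0.05, 0.2, 0.6, 0.95]
ply_opacities = [math.log(o) - math.log(1.0 - o) for o in linear_opacities]

mask_linear = [o >= opacity_threshold_for(0.1, False) for o in linear_opacities]
mask_ply = [o >= opacity_threshold_for(0.1, True) for o in ply_opacities]
print(mask_linear)  # [False, True, True, True]
print(mask_ply)     # [False, True, True, True]
```

Because logit and log are monotonically increasing, comparing in storage space selects the same elements as comparing in linear space.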
- Transform Log-Space Kernel (`gsmod.transform.kernels`)
- Added `elementwise_add_scalar_numba()` for log-space scale transforms
- Enables efficient scale operations in PLY format (log space) where multiplication becomes addition
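The identity behind the log-space kernel is log(s * f) = log(s) + log(f). A minimal pure-Python sketch follows, assuming scales are stored as natural logs; the function name `scale_log_space` is hypothetical and stands in for the Numba kernel named above.

```python
import math


def scale_log_space(log_scales, factor):
    """Scale log-encoded values by `factor` using a single scalar addition.

    Hypothetical helper: since log(s * f) = log(s) + log(f), multiplying
    linear scales by `factor` is the same as adding log(factor) in log space.
    """
    offset = math.log(factor)
    return [s + offset for s in log_scales]


linear = [0.01, 0.5, 2.0]
log_scales = [math.log(s) for s in linear]

# Decode the result back to linear space to compare with direct multiplication
scaled = [math.exp(s) for s in scale_log_space(log_scales, 3.0)]
expected = [s * 3.0 for s in linear]
print(all(math.isclose(a, b) for a, b in zip(scaled, expected)))  # True
```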
- New Benchmarks
- `benchmarks/benchmark_sh_color.py`: NumPy vs Numba performance for SH operations
- `benchmarks/benchmark_triton.py`: PyTorch vs Triton kernel performance comparison
- `benchmarks/benchmark_gpu_vs_cpu.py`: Comprehensive GPU vs CPU performance analysis
Changed
- Color Application Refactored (`gsmod.color.apply`)
- `apply_color_values()` now accepts `shN` parameter and returns tuple `(sh0, shN)`
- Matches GPU ground truth: brightness/saturation on both sh0+shN, all else on sh0 only
- Removed LUT-based implementation in favor of direct operations for better SH support
- Breaking change: Return type changed from `np.ndarray` to `tuple[np.ndarray, np.ndarray | None]`
- GPU Color Methods Enhanced (`gsmod.torch.gstensor_pro`)
- All color adjustment methods now use Triton kernels when available
- Shadows/highlights now use smoothstep curves matching supersplat shader
- Better visual quality with smooth shadow/highlight transitions
- Temperature and tint operations preserve format awareness
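For readers unfamiliar with smoothstep, the sketch below shows the standard cubic Hermite form and one plausible way to derive shadow and highlight weights from luminance. The edge values (0.0, 0.5, 1.0) and the weight functions are illustrative assumptions; the notes above do not state the exact curve parameters used by gsmod or supersplat.

```python
def smoothstep(edge0, edge1, x):
    """Standard smoothstep: clamp to [0, 1], then apply 3t^2 - 2t^3."""
    t = min(max((x - edge0) / (edge1 - edge0), 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


def shadow_weight(luminance):
    """Assumed weighting: strongest in dark regions, fading out by mid-gray."""
    return 1.0 - smoothstep(0.0, 0.5, luminance)


def highlight_weight(luminance):
    """Assumed weighting: zero below mid-gray, strongest in bright regions."""
    return smoothstep(0.5, 1.0, luminance)


print(smoothstep(0.0, 1.0, 0.5))  # 0.5
print(shadow_weight(0.0))         # 1.0
print(highlight_weight(1.0))      # 1.0
```

The curve has zero slope at both edges, which is what removes the visible banding a hard threshold or linear ramp would produce.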
- Dependencies Updated
- gsply requirement updated from `0.2.10` to `0.2.11`
- Supports latest format query properties and performance improvements
Performance
- SH color operations: 10-30x faster with Numba kernels vs pure NumPy
- GPU color operations: Additional 10-20% speedup with Triton kernels (when available)
- CPU-GPU consistency: No performance penalty for matching behavior
Fixed
- CPU color operations now correctly handle SH formats matching GPU behavior
- Shadow/highlight curves now use proper smoothstep interpolation (matches supersplat)
- Format-aware filtering eliminates opacity/scale threshold bugs in PLY format
Documentation
- Updated CLAUDE.md with SH color operation semantics
- Documented brightness/saturation special case (both sh0 and shN)
- Added GPU ground truth reference in color module docstrings
v0.1.4
[0.1.4] - 2025-11-26
Added
- Auto-Correction Module (`gsmod.color.auto`)
- Industry-standard automatic color correction algorithms (Photoshop/Lightroom/iOS Photos style)
- `auto_enhance()`: Combined enhancement (exposure + contrast + white balance), like iOS Photos Auto
- `auto_contrast()`: Percentile-based histogram stretching (0.1% clipping), like Photoshop Auto Contrast
- `auto_exposure()`: 18% gray midtone targeting (0.45 in gamma space)
- `auto_white_balance()`: Gray World and White Patch methods
- `compute_optimal_parameters()`: Minimal adjustments to reach target statistics
- `AutoCorrectionResult`: Dataclass with computed adjustments, converts to ColorValues via `.to_color_values()`
- Self-referential analysis (no target histogram required)
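Percentile-based histogram stretching can be sketched in a few lines. This is a generic illustration of the technique with a 0.1% clip on each end, not gsmod's `auto_contrast()`; the function name `stretch_contrast` and its signature are hypothetical, and the real function computes adjustment parameters rather than remapping values directly.

```python
def stretch_contrast(values, clip_percent=0.1):
    """Linearly remap values so the clipped low/high percentiles hit 0 and 1.

    Hypothetical helper: `clip_percent` of the samples are ignored at each
    end so that a few outliers do not dominate the stretch.
    """
    ordered = sorted(values)
    n = len(ordered)
    k = int(n * clip_percent / 100.0)  # number of samples clipped per side
    lo, hi = ordered[k], ordered[n - 1 - k]
    if hi <= lo:
        return list(values)  # flat input: nothing to stretch
    return [min(max((v - lo) / (hi - lo), 0.0), 1.0) for v in values]


stretched = stretch_contrast([0.2, 0.4, 0.6])
print(min(stretched), max(stretched))  # 0.0 1.0
```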
- Perceptual Loss Functions (`gsmod.histogram.loss`)
- `PerceptualColorLoss`: Comprehensive loss addressing flat histogram problem
- Contrast preservation (penalizes reduction below threshold)
- Dynamic range matching (5th/95th percentiles)
- Parameter regularization (keeps values near neutral)
- `ContrastPreservationLoss`: Standalone contrast preservation loss
- `ParameterBoundsLoss`: Soft penalty for extreme parameter values
- `create_balanced_loss()`: Factory function for balanced defaults
Changed
- Filter Atomic Class Architecture (`gsmod.filter.atomic.Filter`)
- Rewritten to use `FilterValues` internally for fused kernel path
- AND operations (`Filter & Filter`) now merge FilterValues for single kernel execution
- OR/NOT operations correctly fall back to mask combination approach
- Added internal helpers: `_from_values()` and `_from_mask_fn()`
- Factory methods simplified to construct FilterValues directly
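One way to see why AND can be merged into a single set of values while OR cannot: the intersection of two ranges is always one range, but a union may be two disjoint ranges. The sketch below uses a hypothetical `Thresholds` dataclass as a stand-in; it is not gsmod's `FilterValues` and covers only opacity bounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical stand-in for a filter-values record (opacity bounds only)."""

    min_opacity: float = 0.0
    max_opacity: float = 1.0

    def __and__(self, other):
        # AND of two ranges is their intersection, which is still one range,
        # so the combined filter can be evaluated in a single pass.
        return Thresholds(
            min_opacity=max(self.min_opacity, other.min_opacity),
            max_opacity=min(self.max_opacity, other.max_opacity),
        )

    def mask(self, opacities):
        return [self.min_opacity <= o <= self.max_opacity for o in opacities]


opacities = [0.05, 0.3, 0.7, 0.99]
a = Thresholds(min_opacity=0.1)
b = Thresholds(max_opacity=0.9)

merged = a & b  # one record, one pass over the data
two_pass = [x and y for x, y in zip(a.mask(opacities), b.mask(opacities))]
print(merged.mask(opacities) == two_pass)  # True
```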
- Pipeline Operation Merging (`gsmod.pipeline.Pipeline`)
- Added `_merge_operations()` method to merge consecutive same-type operations
- Color, transform, and filter operations merged using their `+` operator
- Reduces number of kernel calls for better performance
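Merging consecutive same-type operations with `+` can be sketched with `itertools.groupby`. The `Translate` and `Brightness` classes and the `merge_operations` function below are hypothetical stand-ins for illustration, not gsmod's value classes or its `_merge_operations()` implementation.

```python
from dataclasses import dataclass
from functools import reduce
from itertools import groupby
from operator import add


@dataclass(frozen=True)
class Translate:
    """Hypothetical transform op: translations compose by vector addition."""

    dx: float = 0.0
    dy: float = 0.0
    dz: float = 0.0

    def __add__(self, other):
        return Translate(self.dx + other.dx, self.dy + other.dy, self.dz + other.dz)


@dataclass(frozen=True)
class Brightness:
    """Hypothetical color op: brightness factors compose by multiplication."""

    factor: float = 1.0

    def __add__(self, other):
        return Brightness(self.factor * other.factor)


def merge_operations(ops):
    """Collapse each run of consecutive same-type ops into one op via `+`."""
    return [reduce(add, group) for _, group in groupby(ops, key=type)]


ops = [
    Translate(1.0, 0.0, 0.0),
    Translate(0.0, 2.0, 0.0),
    Brightness(2.0),
    Brightness(0.75),
    Translate(0.0, 0.0, 3.0),
]
merged = merge_operations(ops)
print(len(ops), "->", len(merged))  # 5 -> 3
```

Only adjacent operations are merged, so the final `Translate` stays separate and execution order is preserved.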
Performance
- Filter AND operations: 2.8x faster via merged FilterValues (single fused kernel)
- Pipeline transform merge: 3.5x faster when consecutive transforms combined
- Pipeline full chain: 1.6x faster overall with operation merging
- Filter.get_mask(): 2.5x faster after removing logger.debug overhead
Documentation
- Documented expected ~2% color merge difference due to LUT quantization
- This is mathematically correct behavior (quantization artifacts)
- Performance benefit outweighs minor precision difference
Fixed
- Filter.to_values() now correctly supports AND combinations (returns merged FilterValues)
- OR and NOT combinations correctly raise ValueError (cannot be represented as single FilterValues)
v0.1.3 feature update
[0.1.3] - 2025-11-26
Added
- Unified Pipeline Class (`gsmod.pipeline.Pipeline`)
- CPU pipeline class matching GPU PipelineGPU interface
- Fluent API for chaining color, transform, and filter operations
- Method chaining: `.brightness()`, `.saturation()`, `.translate()`, `.scale()`, `.min_opacity()`, etc.
- Operations accumulated and executed in order when called
- Single unified interface for all processing operations
- Filter Atomic Class (`gsmod.filter.atomic.Filter`)
- Immutable filter class with boolean operators (&, |, ~)
- Factory methods: `min_opacity()`, `max_opacity()`, `min_scale()`, `max_scale()`, `sphere()`, `box()`, `ellipsoid()`, `frustum()`
- Combine filters with logical operators for complex patterns
- Direct mask computation via `.get_mask()` method
- Apply filters via callable interface: `filter(data, inplace=False)`
- Extended GSDataPro Filter Methods
- Individual filter methods: `filter_min_opacity()`, `filter_max_opacity()`, `filter_min_scale()`, `filter_max_scale()`
- Geometry filters: `filter_within_sphere()`, `filter_outside_sphere()`, `filter_within_box()`, `filter_outside_box()`
- Advanced geometry: `filter_within_ellipsoid()`, `filter_outside_ellipsoid()`, `filter_within_frustum()`, `filter_outside_frustum()`
- Transform methods: `translate()`, `scale_uniform()`, `scale_nonuniform()`, `rotate_quaternion()`, `rotate_euler()`, `rotate_axis_angle()`, `transform_matrix()`
- Color adjustment: `adjust_brightness()` and other individual color methods
- GPU Filter Enhancements (`gsmod.torch.filter.FilterGPU`)
- Rotated box filtering: `within_rotated_box()`, `outside_rotated_box()`
- Ellipsoid filtering: `within_ellipsoid()`, `outside_ellipsoid()`
- Frustum filtering: `within_frustum()`, `outside_frustum()`
- Optimized kernels: `_filter_rotated_box()`, `_filter_ellipsoid()`, `_filter_frustum()`
- Axis-angle to rotation matrix conversion: `_axis_angle_to_rotation_matrix()`
Changed
- Filter Architecture Simplified
- Atomic Filter class now uses single fused kernel path
- Removed redundant kernel implementations
- Improved performance through kernel consolidation
- Full CPU-GPU Parity
- All filter operations now available on both CPU and GPU
- Consistent API between GSDataPro and GSTensorPro
- Unified behavior across backends
Fixed
- Test Suite Stability
- All 498 tests passing (with 55 skipped GPU tests when CUDA unavailable)
- Improved test coverage to 61% overall
- Enhanced equivalence tests between CPU and GPU implementations
Performance
- Filter atomic operations use optimized Numba kernels for 40-100x speedup
- Unified Pipeline reduces method call overhead
- GPU filters maintain 100-180x speedup over CPU for large datasets
v0.1.2 Clean up
Changed
- Dependencies
- Updated gsply requirement from `>=0.2.8` to `==0.2.10` (exact version pin)
- Style Harmonization with gsply
- All docstrings now use `:returns:` instead of `:return:` (consistent with gsply)
- Enhanced module-level docstrings with performance notes and examples
- `color/__init__.py`: Added performance metrics (1,091M colors/sec)
- `transform/__init__.py`: Added performance metrics (698M Gaussians/sec)
- `filter/__init__.py`: Added performance metrics (46M Gaussians/sec)
- `torch/__init__.py`: Added GPU benchmark details (183x speedup, 1.09B Gaussians/sec)
- Configuration Class Naming
- Renamed `GsproConfig` to `GsmodConfig` for consistency with project name
- Updated all references in codebase (config module, AGENTS.md, benchmarks, examples)
- Export name updated: import `GsmodConfig` from `gsmod.config`
- Configuration Class Documentation
- Fixed misleading "will be deprecated" comment on ColorGradingConfig
- Clarified that config classes (ColorGradingConfig, TransformConfig, etc.) are canonical and actively used
- Reorganized torch module imports for better clarity
Removed
- Deprecated Backward Compatibility Aliases (Breaking Change)
- Removed `LearnableColorGrading` (use `LearnableColor` instead)
- Removed `SoftFilter` (use `LearnableFilter` instead)
- Removed `GSTensorProLearn` (use `LearnableGSTensor` instead)
- Removed `SoftFilterConfig` (use `LearnableFilterConfig` instead)
- Impact: Zero usage found in codebase, tests, benchmarks, or documentation
- Migration: Update imports to use new standardized names with "Learnable" prefix
- Property Aliases from ColorValues (Breaking Change)
- Removed `black_level`, `white_level` (use `brightness`, `contrast`)
- Removed `lift`, `gain` (use `shadows`, `highlights`)
- Removed `exposure` (use `brightness`)
- Removed `midtones` (use `gamma`)
- Removed `vibrancy` (use `vibrance`)
- Removed `blacks`, `whites` (use `shadows`, `highlights`)
- Impact: Zero usage found in codebase
- Rationale: Simplifies API, aligns with gsply's design philosophy of no property aliases
Fixed
- Removed confusing "legacy" comment from Color pipeline import
- Improved clarity of module organization and export structure
v0.1.1 improved method coverage
Added
- Opacity adjustment module with format-aware opacity scaling
- OpacityValues config dataclass with fade() and boost() factory methods
- GaussianProcessor unified interface for auto-dispatching CPU/GPU operations
- Shared rotation utilities module for code reuse between backends
- Enhanced protocol definitions with ColorProcessor, TransformProcessor, FilterProcessor
- Opacity support in GSDataPro and GSTensorPro classes
- GSDataPro: Primary API with direct `.color()`, `.filter()`, `.transform()` methods
- GSTensorPro: GPU tensor wrapper with same API as GSDataPro
- Config values: `ColorValues`, `FilterValues`, `TransformValues` dataclasses
- Presets: Built-in color, filter, and transform presets
- Histogram computation with `HistogramResult`
- Scene composition utilities: `concatenate`, `compose_with_transforms`, `merge_scenes`
- Format verification with `FormatVerifier`
- GPU acceleration with up to 183x speedup
Changed
- Unified parameter semantics between CPU and GPU pipelines
- Filter radius now uses absolute world units (not relative to scene bounds)
- Improved Numba JIT compilation with automatic warmup
- Rotation utilities moved to shared module for better code organization
- Format property access now uses public API (is_opacities_ply, is_scales_ply)
- Improved format tracking in GSDataPro and GSTensorPro
- Fixed TransformValues.is_neutral() with robust float comparison
Performance
- Color: 1,091M Gaussians/sec
- Transform: 698M Gaussians/sec
- GPU: 1.09B Gaussians/sec on RTX 3090 Ti
Initial release
See readme