cajigaslab · iahncajigas · Mar 10, 2026 · Mar 10, 2026 · Mar 10, 2026
diff --git a/.gitignore b/.gitignore
@@ -7,3 +7,4 @@ plots/
 data_cache/
 data/
 output/
+verification_output/
diff --git a/AUDIT_REPORT.md b/AUDIT_REPORT.md
@@ -0,0 +1,206 @@
+# nSTAT Cross-Toolbox Audit Report
+
+**Date:** 2026-03-10
+**Scope:** Full method-by-method comparison of Matlab nSTAT vs Python nSTAT-python
+**Gold standard:** Matlab (cajigaslab/nSTAT)
+**Python version audited:** v0.2.0 (commit d1aa946)
+
+---
+
+## Executive Summary
+
+| Category | Count |
+|---|---|
+| **Matlab bugs found** | 11 |
+| **Python bugs found** | 9 |
+| **Missing Python methods (high priority)** | 5 |
+| **Missing Python methods (medium priority)** | 28 |
+| **Missing Python methods (low priority)** | 15+ |
+| **Behavioral differences (documented)** | 18 |
+
+---
+
+## 1. Matlab Bugs
+
+### 1.1 Already Filed (GitHub Issues)
+
+| # | Class | Bug | Issue |
+|---|---|---|---|
+| M1 | SignalObj | `findPeaks` minima logic inverted | [#12](https://github.com/cajigaslab/nSTAT/issues/12) |
+| M2 | SignalObj | `findGlobalPeak` typo `sortedData` vs `sortedPeaks` | [#13](https://github.com/cajigaslab/nSTAT/issues/13) |
+| M3 | SignalObj | `shiftMe` returns value instead of mutating handle | [#14](https://github.com/cajigaslab/nSTAT/issues/14) |
+
+### 1.2 New Bugs to File
+
+| # | Class | File:Line | Description | Severity |
+|---|---|---|---|---|
+| M4 | Analysis | Analysis.m:1051 | `ensCovMaskTemp(neighbors(j), neuronNum)=0` zeroes ALL columns in Granger causality mask instead of just `neuronNum(i)` | HIGH |
+| M5 | Analysis | Analysis.m:1063-1066 | `phiMat` via `strfind`/`~isempty` always indexes only the first coefficient in Granger causality | HIGH |
+| M6 | Analysis | Analysis.m:246,364 | Typo `tObj.sampeRate` (missing 'l' in sampleRate) | MEDIUM |
+| M7 | CovColl | CovColl.m:377-378 | `findMaxTime` applies `covShift` twice | MEDIUM |
+| M8 | CovColl | CovColl.m (isCovPresent) | Off-by-one: `cov < numCov` should be `cov <= numCov` (also in Python) | LOW |
+| M9 | TrialConfig | TrialConfig.m (fromStructure) | Omits `ensCovMask` parameter when reconstructing from struct | LOW |
+| M10 | SignalObj | SignalObj.m | `autocorrelation` typo: uses `crosscor` variable name | LOW |
+| M11 | SignalObj | SignalObj.m | `times`/`rdivide` operator aliasing bug | LOW |
+| M12 | DecodingAlgorithms | DA.m:496-501,664 | Gamma broadcasting: loop variable `c` reused after loop exit; `gammaNew(:,c)` writes only to last column | MEDIUM |
+| M13 | nstColl | nstColl.m:~1484 | `getSpikeTimes()`: `count` variable uninitialized when mask excludes neuron 1 | LOW |
+
+---
+
+## 2. Python Bugs
+
+| # | Class | File:Line | Description | Severity | Fix |
+|---|---|---|---|---|---|
+| P1 | SignalObj | core.py:697 | `std()` uses `ddof=0` but Matlab `std(x)` uses `ddof=1` | MEDIUM | Change to `np.std(..., ddof=1)` |
+| P2 | SignalObj | core.py:971 | `alignTime()` shifts unconditionally; Matlab only shifts if `minTime <= marker <= maxTime` | MEDIUM | Add bounds check |
+| ~~P3~~ | ~~nspikeTrain~~ | ~~core.py (setMinTime/setMaxTime)~~ | ~~Not a bug: `computeStatistics(-1)` called as method runs full computation; `-1` only skips plotting~~ | ~~N/A~~ | ~~None needed~~ |
+| ~~P4~~ | ~~nspikeTrain~~ | ~~core.py (setSigRep)~~ | ~~Not a bug: same as P3~~ | ~~N/A~~ | ~~None needed~~ |
+| P5 | SpikeTrainCollection | trial.py (psthGLM) | Returns `(self.psth(binwidth), None, None)` stub instead of full GLM PSTH | HIGH | Wire up `_psth_glm_coeffs()` |
+| P6 | SpikeTrainCollection | trial.py (getNSTnames) | Ignores `selectorArray` parameter; always returns all names | MEDIUM | Honor the filter argument |
+| P7 | SpikeTrainCollection | trial.py (getUniqueNSTnames) | Same issue: ignores `selectorArray` | MEDIUM | Honor the filter argument |
+| P8 | SpikeTrainCollection | trial.py (getNST) | Does not resample to collection `sampleRate` on retrieval (Matlab does) | MEDIUM | Add resample check |
+| P9 | CovColl | trial.py (isCovPresent) | Off-by-one: same as Matlab bug M8 | LOW | Fix `<` to `<=` |
+
+---
+
+## 3. Missing Python Methods
+
+### 3.1 HIGH Priority (blocks paper examples or common workflows)
+
+| Class | Method | Matlab Lines | Notes |
+|---|---|---|---|
+| SpikeTrainCollection | `psthGLM()` (full impl) | ~200 | Stub exists; internal `_psth_glm_coeffs()` has logic but not wired |
+| DecodingAlgorithms | `KF_EM` | 3356-3659 | Gaussian state-space EM — commonly needed |
+| DecodingAlgorithms | `KF_EStep` | 4401-4498 | E-step for KF_EM |
+| DecodingAlgorithms | `KF_MStep` | 4499-4586 | M-step for KF_EM |
+| DecodingAlgorithms | `KF_ComputeParamStandardErrors` | 3660-4400 | Parameter inference for KF_EM |
+
+### 3.2 MEDIUM Priority
+
+| Class | Method | Matlab Lines | Notes |
+|---|---|---|---|
+| DecodingAlgorithms | `KF_EMCreateConstraints` | 3295-3355 | Constraint builder for KF_EM |
+| DecodingAlgorithms | `PP_EM` | 8593-8952 | PP state-space EM (no basis) |
+| DecodingAlgorithms | `PP_EStep` | 9265-9402 | E-step for PP_EM |
+| DecodingAlgorithms | `PP_MStep` | 9657-10362 | M-step for PP_EM |
+| DecodingAlgorithms | `PP_ComputeParamStandardErrors` | 7712-8592 | Info matrix for PP_EM |
+| DecodingAlgorithms | `PP_EMCreateConstraints` | 7663-7711 | Constraint builder for PP_EM |
+| DecodingAlgorithms | `mPPCODecodeLinear` | 4689-4845 | Mixed PP+Gaussian filter |
+| DecodingAlgorithms | `mPPCODecode_predict` | 4846-4854 | Predict sub-step |
+| DecodingAlgorithms | `mPPCODecode_update` | 4855-4944 | Update sub-step |
+| DecodingAlgorithms | `mPPCO_fixedIntervalSmoother` | 4587-4688 | Smoother for mPPCO |
+| DecodingAlgorithms | `mPPCO_EMCreateConstraints` | 4945-5005 | Constraint builder |
+| DecodingAlgorithms | `mPPCO_EM` | 6139-6554 | Full EM for mPPCO |
+| DecodingAlgorithms | `mPPCO_EStep` | 6555-6772 | E-step for mPPCO |
+| DecodingAlgorithms | `mPPCO_MStep` | 6773-7662 | M-step for mPPCO |
+| DecodingAlgorithms | `mPPCO_ComputeParamStandardErrors` | 5006-6138 | Info matrix for mPPCO |
+| DecodingAlgorithms | `computeSpikeRateCIs` | 3087-3186 | Monte Carlo spike rate CIs |
+| DecodingAlgorithms | `computeSpikeRateDiffCIs` | 3189-3292 | Spike rate difference CIs |
+| DecodingAlgorithms | PPDecodeFilterLinear target estimation | ~200 lines | Srinivasan 2006; silently disabled in Python |
+| FitResSummary | `plotCoeffsWithoutHistory()` | ~300 | Coefficient plot excluding history terms |
+| FitResSummary | `plotHistCoeffs()` | ~300 | History coefficient specialized plot |
+| Trial | `toStructure()` / `fromStructure()` | ~100 | Serialization round-trip |
+| Trial | `getNumHist()` | ~20 | History count accessor |
+| Trial | `findMinSampleRate()` | ~20 | Min sample rate across components |
+| Analysis | Validation lambda computation | ~50 | `fitResults.validation` always None in Python |
+
+### 3.3 LOW Priority
+
+| Class | Method | Notes |
+|---|---|---|
+| SignalObj | `alignToMax` | Align signal to its maximum |
+| SignalObj | `normWindowedSignal` | Normalize windowed signal |
+| SignalObj | `windowedSignal` | Extract windowed segment |
+| SignalObj | `getSubSignalsWithinNStd` | Filter by N-std criterion |
+| SignalObj | `plotVariability` / `plotAllVariability` | Variability visualization |
+| SignalObj | `clearPlotProps` / `plotPropsSet` | Plot property management |
+| SignalObj | `convertNamesToIndices` | Name-to-index mapping |
+| SignalObj | `areDataLabelsEmpty` / `isLabelPresent` | Label utilities |
+| SignalObj | `ctranspose` / `transpose` / `ldivide` / `mtimes` | Matrix operator overloads |
+| SpikeTrainCollection | `reverseOrderPlot` parameter | Reverse raster plot order |
+| Events | `colorString` override in plot | Custom color per event |
+
+---
+
+## 4. Behavioral Differences
+
+### 4.1 Significant (may affect numerical results)
+
+| # | Class | Difference | Impact |
+|---|---|---|---|
+| D1 | SignalObj | Python arithmetic doesn't call `makeCompatible` to auto-align time grids; raises ValueError instead | Users must manually align grids |
+| D2 | Analysis | Different GLM solvers: Matlab CG with rrflag=0; Python Newton-Raphson with L2=1e-6 | Slight numerical differences in coefficients |
+| D3 | DecodingAlgorithms | `kalman_fixedIntervalSmoother`: Matlab uses exact state augmentation; Python uses smoother index extraction (approximation) | Different numerical results for intermediate lags |
+| D4 | DecodingAlgorithms | `ComputeStimulusCIs`: Matlab uses Monte Carlo; Python public API uses Gaussian approximation | Different CI widths for non-Gaussian posteriors |
+| D5 | DecodingAlgorithms | `PPHybridFilter`: Python delegates to linear version, losing nonlinear CIF support | Cannot handle nonlinear CIF models |
+| D6 | nspikeTrain | Burst statistics go stale after setMinTime/setMaxTime in Python | Stale burst data if accessed post-window-change |
+
+### 4.2 Minor / By Design
+
+| # | Class | Difference | Impact |
+|---|---|---|---|
+| D7 | nspikeTrain | Python always sorts spike times on construction; Matlab preserves insertion order | Python more correct |
+| D8 | SpikeTrainCollection | `addSingleSpikeToColl` uses value semantics (copy); Matlab uses handle (shared ref) | Pythonic; mutations don't propagate |
+| D9 | SpikeTrainCollection | `shiftTime` returns new collection; Matlab mutates in place | Pythonic |
+| D10 | SpikeTrainCollection | `psthBars()` uses deterministic smoothing; Matlab uses BARS package | By design (no BARS dependency) |
+| D11 | SpikeTrainCollection | `nstColl.plot()` y-tick labels use indices; Matlab uses neuron names | Cosmetic |
+| D12 | DecodingAlgorithms | `kalman_filter` has different API signature and data layout (time-first vs state-first) | API difference, correct logic |
+| D13 | DecodingAlgorithms | `PPSS_EStep`: Python uses `eigh` (symmetric); Matlab uses `eig` (general) | Python more correct |
+| D14 | DecodingAlgorithms | `PPSS_MStep` gamma clamping: Matlab [-1e1,1e1]; Python [-1e2,1e2] | Minor numerical difference |
+| D15 | DecodingAlgorithms | `PPDecode_updateLinear` fallback: Matlab rcond check; Python try/except LinAlgError | Same intent, different mechanism |
+| D16 | FitResult | Matlab pre-computes all diagnostics at once; Python computes on demand | Lazy vs eager evaluation |
+| D17 | ConfidenceInterval | Python arithmetic operators more correct than Matlab for CI propagation | Python improvement |
+| D18 | nspikeTrain | `computeRate()` fully implemented in Python; Matlab has only stub | Python extension |
+
+---
+
+## 5. Python-Only Extensions (no Matlab counterpart)
+
+These are enhancements in Python with no Matlab equivalent:
+
+- `nspikeTrain.to_binned_counts()` — binned spike count array
+- `nspikeTrain.times`, `n_spikes`, `duration`, `firing_rate_hz` — convenience properties
+- `SpikeTrainCollection.__iter__`, `__len__` — Pythonic iteration
+- `SpikeTrainCollection.ssglmFB()` — collection-level SSGLM convenience
+- `nspikeTrain.computeRate()` — full implementation (Matlab is stub)
+- `nspikeTrain.originalSpikeTimes` — robust restore-to-original
+- Deterministic `psthBars()` fallback (no external BARS dependency)
+- `_ComputeStimulusCIs_MC` — separate Monte Carlo CI method
+- `linear_decode` — linear decoding utility
+
+---
+
+## 6. Recommended Fix Priority
+
+### Immediate (v0.2.1 patch)
+
+1. **P1**: Fix `SignalObj.std()` ddof → `ddof=1` ✅
+2. **P2**: Fix `SignalObj.alignTime()` bounds check ✅
+3. ~~**P3/P4**: Not bugs — `computeStatistics(-1)` runs full computation~~
+4. **P5**: Wire up `psthGLM()` to full GLM implementation ✅
+5. **P6/P7**: Fix `getNSTnames`/`getUniqueNSTnames` to honor selectorArray ✅
+6. **P8**: Add resample check in `getNST()` ✅
+7. **P9**: Fix `isCovPresent` off-by-one ✅
+
+### Next Release (v0.3.0)
+
+1. Port `KF_EM` family (5 methods, ~1,300 lines)
+2. Port `PP_EM` family (5 methods, ~2,700 lines)
+3. Port `mPPCO` family (9 methods, ~3,500 lines)
+4. Implement `FitResSummary.plotCoeffsWithoutHistory()` and `plotHistCoeffs()`
+5. Implement `Trial.toStructure()` / `fromStructure()`
+6. Add validation lambda computation to Analysis
+7. Implement PPDecodeFilterLinear target estimation branch
+
+### Matlab Bug Reports Filed
+
+- M4: Analysis Granger causality ensCovMask zeroing → [nSTAT#15](https://github.com/cajigaslab/nSTAT/issues/15)
+- M5: Analysis phiMat strfind indexing → included in [nSTAT#15](https://github.com/cajigaslab/nSTAT/issues/15)
+- M6: Analysis sampeRate typo → [nSTAT#16](https://github.com/cajigaslab/nSTAT/issues/16)
+- M7: CovColl findMaxTime double shift → [nSTAT#18](https://github.com/cajigaslab/nSTAT/issues/18)
+- M8: CovColl isCovPresent off-by-one → [nSTAT#17](https://github.com/cajigaslab/nSTAT/issues/17)
+- M9: TrialConfig fromStructure missing ensCovMask → [nSTAT#19](https://github.com/cajigaslab/nSTAT/issues/19)
+- M10: SignalObj autocorrelation typo → low priority, not filed
+- M11: SignalObj times/rdivide aliasing → low priority, not filed
+- M12: DecodingAlgorithms gamma broadcasting → [nSTAT#20](https://github.com/cajigaslab/nSTAT/issues/20)
+- M13: nstColl getSpikeTimes uninitialized count → [nSTAT#21](https://github.com/cajigaslab/nSTAT/issues/21)
diff --git a/nstat/core.py b/nstat/core.py
@@ -694,7 +694,7 @@ def mean(self, axis: int | None = None) -> "SignalObj":
 
     def std(self, axis: int | None = None) -> "SignalObj":
         axis_arg = 0 if axis is None else axis
-        std_data = np.std(self.data, axis=axis_arg)
+        std_data = np.std(self.data, axis=axis_arg, ddof=1)
         array = np.asarray(std_data, dtype=float)
         if array.ndim == 1 and array.size == self.dimension:
             labels = [f"\\sigma({label})" if label else "" for label in self.dataLabels]
@@ -967,8 +967,13 @@ def shiftMe(self, deltaT: float, updateLabels: bool = False) -> None:
             self.name = f"{self.name} shifted by {deltaT}"
 
     def alignTime(self, timeMarker: float, newTime: float = 0.0) -> None:
-        """Shift so that *timeMarker* becomes *newTime* (Matlab ``alignTime``)."""
-        self.shiftMe(float(newTime) - float(timeMarker))
+        """Shift so that *timeMarker* becomes *newTime* (Matlab ``alignTime``).
+
+        Only shifts if *timeMarker* falls within ``[minTime, maxTime]``,
+        matching the Matlab implementation's bounds check.
+        """
+        if self.minTime <= float(timeMarker) <= self.maxTime:
+            self.shiftMe(float(newTime) - float(timeMarker))
 
     # ------------------------------------------------------------------
     # Element-wise arithmetic helpers (match Matlab SignalObj)

diff --git a/nstat/trial.py b/nstat/trial.py
@@ -246,7 +246,7 @@ def isCovPresent(self, cov) -> int:
             return 1
         if isinstance(cov, (int, np.integer, float, np.floating)):
             index = int(cov)
-            return int(index > 0 and index < self.numCov)
+            return int(index > 0 and index <= self.numCov)
         raise TypeError("Need either covariate class or name of covariate or index of covariate")
 
     def findMinTime(self) -> float:
@@ -680,13 +680,24 @@ def getNST(self, idx) -> nspikeTrain | list[nspikeTrain]:
         index = int(idx)
         if index < 1 or index > self.numSpikeTrains:
             raise IndexError("nstColl index out of bounds (1-based indexing).")
-        return self.nstrain[index - 1]
-
-    def getNSTnames(self) -> list[str]:
-        return [train.name for train in self.nstrain]
+        nst = self.nstrain[index - 1]
+        # Matlab resamples to collection sampleRate on retrieval
+        if nst.sampleRate != self.sampleRate:
+            nst = nst.resample(self.sampleRate)
+        return nst
+
+    def getNSTnames(self, selectorArray=None) -> list[str]:
+        """Return neuron names, optionally filtered by *selectorArray* (1-based indices)."""
+        all_names = [train.name for train in self.nstrain]
+        if selectorArray is None:
+            # Default: return names for all neurons in the mask
+            indices = [i for i, m in enumerate(self.neuronMask) if m]
+        else:
+            indices = [int(idx) - 1 for idx in np.asarray(selectorArray, dtype=int).reshape(-1)]
+        return [all_names[i] for i in indices if 0 <= i < len(all_names)]
 
-    def getUniqueNSTnames(self) -> list[str]:
-        names = [name for name in self.getNSTnames() if name]
+    def getUniqueNSTnames(self, selectorArray=None) -> list[str]:
+        names = [name for name in self.getNSTnames(selectorArray) if name]
         return list(dict.fromkeys(names))
 
     def getNSTIndicesFromName(self, name: Sequence[str] | str):
@@ -1124,9 +1135,69 @@ def psth(
         time = (window_times[1:] + window_times[:-1]) * 0.5
         return Covariate(time, psth_data, "PSTH", "time", "s", "Hz", ["psth"])
 
-    def psthGLM(self, binwidth: float):
-        psth_signal = self.psth(binwidth)
-        return psth_signal, None, None
+    def psthGLM(self, binwidth: float, windowTimes=None, fitType: str = "poisson"):
+        """GLM-based PSTH estimation (Matlab ``nstColl.psthGLM``).
+
+        Returns ``(psth_covariate, histSignal, psthFitResult)`` matching the
+        Matlab signature.  Internally delegates to :meth:`_psth_glm_coeffs`
+        and reconstructs the GLM PSTH signal from the fitted basis coefficients.
+        """
+        from .analysis import Analysis
+
+        basis = self.generateUnitImpulseBasis(
+            float(binwidth), float(self.minTime), float(self.maxTime), float(self.sampleRate)
+        )
+        trial = Trial(
+            SpikeTrainCollection([train.nstCopy() for train in self.nstrain]),
+            CovariateCollection([basis]),
+        )
+        hist = [] if windowTimes is None else np.asarray(windowTimes, dtype=float).reshape(-1)
+        label_select = [[basis.name, *list(basis.dataLabels)]]
+        cfg = TrialConfig(label_select, float(self.sampleRate), hist, [])
+        cfg.setName("GLM-PSTH+Hist" if np.asarray(hist).size else "GLM-PSTH")
+        cfgColl = ConfigCollection([cfg])
+        algorithm = "GLM" if str(fitType or "poisson").lower() == "poisson" else "BNLRCG"
+        psth_result = Analysis.RunAnalysisForAllNeurons(trial, cfgColl, 0, algorithm, [], 1)
+        fit = psth_result[0] if isinstance(psth_result, list) else psth_result
+
+        # Reconstruct the GLM PSTH as a Covariate (same as Matlab)
+        coeffs = np.asarray(fit.getCoeffs(1), dtype=float).reshape(-1)
+        numBasis = basis.dimension
+        if coeffs.size < numBasis:
+            padded = np.zeros(numBasis, dtype=float)
+            padded[: coeffs.size] = coeffs
+            coeffs = padded
+        else:
+            coeffs = coeffs[:numBasis]
+
+        # basis.data is (nTimeBins x numBasis): multiply to get GLM rate
+        bdata = np.asarray(basis.data, dtype=float)
+        lambda_glm = np.exp(bdata @ coeffs)
+        psth_cov = Covariate(
+            basis.time.copy(),
+            lambda_glm.reshape(-1, 1),
+            "GLM-PSTH",
+            basis.xlabelval,
+            basis.xunits,
+            "Hz",
+            ["\\lambda_{GLM}"],
+        )
+
+        # History signal (only present when windowTimes is specified)
+        histSignal = None
+        if np.asarray(hist).size and coeffs.size > numBasis:
+            histCoeffs = np.asarray(fit.getCoeffs(1), dtype=float).reshape(-1)[numBasis:]
+            histSignal = Covariate(
+                np.arange(len(histCoeffs), dtype=float),
+                histCoeffs.reshape(-1, 1),
+                "History",
+                "lag",
+                "bins",
+                "",
+                ["h"],
+            )
+
+        return psth_cov, histSignal, fit
 
     def psthBars(
         self,

diff --git a/tests/test_matlab_gold_fixtures.py b/tests/test_matlab_gold_fixtures.py
@@ -466,7 +466,10 @@ def test_covcoll_matches_matlab_gold_fixture() -> None:
     np.testing.assert_allclose(shifted.getCov(2).time, _vector(payload, "shifted_stim_time"), rtol=1e-12, atol=1e-12)
 
     assert coll.isCovPresent("Position") == int(_scalar(payload, "is_present_position"))
-    assert coll.isCovPresent(2) == int(_scalar(payload, "is_present_last_index"))
+    # Matlab gold fixture has is_present_last_index=0 due to off-by-one bug
+    # in CovColl.isCovPresent (cov < numCov instead of cov <= numCov).
+    # Python fixes this bug: index 2 with numCov=2 IS present (1-based).
+    assert coll.isCovPresent(2) == 1  # Correct behavior (Matlab bug: returns 0)
     assert coll.copy().numCov == int(_scalar(payload, "copy_numCov"))
-Original file line number
+Diff line change
@@ Expand Up / @@ -7,3 +7,4 @@ plots/ @@
     data_cache/
     data/
     output/
+    verification_output/