Rewrite core library, add outlier rejection, characterization harness, and full tooling #2
rfrenchseti merged 17 commits into main from
Conversation
Warning: Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 24 minutes and 19 seconds.
⌛ How to resolve this issue? After the wait time has elapsed, a review can be triggered using the
We recommend that you space out your commits to avoid hitting the rate limit.
🚦 How do rate limits work? CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source, and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID:
📒 Files selected for processing (9)
Walkthrough
This PR migrates to a src-layout, introduces a characterize_gauss_fit analysis suite and a packaged psf_gui, replaces the PSF implementation with a new PSF abstract API and refactored Gaussian/HST modules, expands tests and CI, and adds extensive docs, repository rules, tooling config, and helper scripts.
Changes
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
Actionable comments posted: 57
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/psfmodel/hst.py (1)
497-505: ⚠️ Potential issue | 🟠 Major
Always restore cwd and temp files in a `finally`.
Any exception after Line 498, including the new `RuntimeError` at Line 572, exits before the cleanup at Lines 580-584. That leaves the process cwd stuck in `TINY_TIM_DIR` and leaks temp artifacts, which can break later library calls in the same process.
Also applies to: 571-584
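The cleanup pattern this comment asks for can be sketched as follows; `run_in_dir` and its arguments are illustrative stand-ins, not the actual hst.py code:

```python
import contextlib
import os


def run_in_dir(target_dir, work):
    """Run work() inside target_dir, always restoring the caller's cwd.

    work() receives a list it can append temp-file paths to; those are
    deleted on exit whether or not work() raises.
    """
    orig_cwd = os.getcwd()
    created = []  # temp artifacts to delete on exit
    try:
        os.chdir(target_dir)
        return work(created)
    finally:
        os.chdir(orig_cwd)  # restore cwd even if work() raised
        for path in created:
            with contextlib.suppress(FileNotFoundError):
                os.remove(path)
```

The key property is that the `finally` block runs on every exit path, so a `RuntimeError` raised mid-block can no longer leave the process sitting in the TinyTim directory.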
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/psfmodel/hst.py` around lines 497-505: Wrap the block that sets orig_cwd, os.chdir(TINY_TIM_DIR), creates filenames (psf_filename, full_psf_filename, temp_fits_filename, params_filename, temp_filename) and opens params_fp in a try/finally so the finally always restores cwd (os.chdir(orig_cwd)), closes params_fp if it was opened, and removes any created temp files (full_psf_filename, temp_fits_filename, params_filename, temp_filename) if they exist; ensure the cleanup runs even if the subsequent code (including the RuntimeError raised later) throws. This keeps TINY_TIM_DIR changes and temp artifact leaks from persisting across exceptions.
src/psfmodel/gaussian.py (1)
147-174: ⚠️ Potential issue | 🟡 Minor
Validate `sigma` consistently in `gaussian_1d()`.
`gaussian_integral_1d()` now rejects non-positive `sigma`, but `gaussian_1d()` still accepts it and treats negative values like positive ones because the implementation only uses `sigma**2`. That leaves `eval_point()` and the rotated `gaussian_integral_2d()` path accepting invalid sigmas that the axis-aligned path rejects.
Proposed fix

```diff
 def gaussian_1d(
     x: float | npt.NDArray[np.floating],
     *,
     sigma: float = 1.0,
@@
 ) -> float | npt.NDArray[np.floating]:
     """Return a 1-D Gaussian.
@@
-    ret = (
+    if sigma <= 0.0:
+        raise ValueError(f'sigma must be positive, got {sigma}')
+
+    ret = (
         scale
         / np.sqrt(2 * np.pi * sigma**2)
         * np.exp(-((x - mean) ** 2) / (2 * sigma**2))
     ) + base
```

As per coding guidelines, "NEVER trust external input (function arguments from callers, file contents, environment variables, data from remote URLs). Validate inputs at the public API boundary of the library with clear `ValueError` or `TypeError` exceptions for invalid arguments."
Also applies to: 321-322
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/psfmodel/gaussian.py` around lines 147 - 174, The gaussian_1d() function must validate sigma the same way gaussian_integral_1d() does: check that sigma is a positive finite number and raise a ValueError for non-positive or non-finite values; update gaussian_1d() to perform this validation at the start (so eval_point() and the rotated gaussian_integral_2d() path cannot pass invalid sigma through), and mirror the same error message/exception semantics used in gaussian_integral_1d() for consistency across the public API.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.cursor/skills/run-all-checks/SKILL.md:
- Around line 61-69: Update the Options section to list the individual tool
flags and pytest worker override so users can run tools granularly: add entries
for --ruff-check, --mypy, --pytest, --pyroma, --bandit, --vulture, --sphinx,
--pymarkdown and -w/--pytest-workers; also clarify that the -c/--code option
includes ruff, mypy, pytest as well as pyroma, bandit and vulture (i.e.,
enumerate those tools under the -c/--code description) so the behavior of
run-all-checks and its flags is explicit.
In @.github/ISSUE_TEMPLATE/bug_report.md:
- Line 7: The markdown file is missing a top-level H1 which triggers
markdownlint MD041; insert a top-level heading "# Bug Report" immediately after
the YAML front matter (before the existing "## Environment" heading) so the
document begins with an H1; target the "## Environment" header in
.github/ISSUE_TEMPLATE/bug_report.md as the anchor point to add the new "# Bug
Report" line right after the front matter block.
- Around line 7-43: Update the bug report template to add explicit "Severity"
and "Evidence" sections and a side-by-side "Expected vs Actual" comparison;
modify the existing headings (e.g., "Description", "Steps to Reproduce",
"Expected Behavior", "Actual Behavior", "Additional Context") to include a new
"Severity" field (with guidance like P0–P3), an "Evidence" section for
logs/traces/attachments, and replace the separate "Expected Behavior"/"Actual
Behavior" blocks with a single "Expected vs Actual" subsection formatted for
side-by-side comparison so reporters provide consistent, actionable information.
In @.github/ISSUE_TEMPLATE/feature_request.md:
- Line 7: Add a top-level H1 heading "# Feature Request" immediately after the
front matter so the document no longer starts with a secondary heading; locate
the existing "## Problem or Motivation" heading and insert the single-line H1
above it to satisfy markdown-lint rule MD041.
In @.github/ISSUE_TEMPLATE/other.md:
- Line 7: Add a top-level H1 heading to fix markdownlint MD041 by inserting "#
Other" above the existing "## Category" heading; update the
.github/ISSUE_TEMPLATE/other.md file so the first visible heading is a
single-line "# Other" followed by the existing "## Category" to preserve current
structure and content.
In @.github/workflows/run-tests.yml:
- Around line 54-56: The GH Actions test matrix currently sets matrix.os to only
"ubuntu-latest", so add additional OS entries to matrix.os (e.g., "macos-latest"
and "windows-latest") alongside "ubuntu-latest" and keep the existing
python-version entries to ensure cross-platform test coverage; update the matrix
block (the keys "matrix", "os", and "python-version" in the workflow) to include
the extra OS strings so CI runs on macOS and Windows as well as Linux.
In `@CODEBASE_ANALYSIS.md`:
- Around line 5-13: Update the historical snapshot in CODEBASE_ANALYSIS.md by
either regenerating the identified findings to reflect the current PR state or
clearly tagging them as an “historical baseline” with a date/commit reference;
specifically edit the paragraph beginning with "rms-psfmodel" and the separate
"priority" section so they no longer claim “no logging”, “~45% coverage”, or
unpatched bugs as current facts—either remove those assertions or prepend a
clear historical label (e.g., "Historical snapshot (commit XYZ, YYYY-MM-DD): …")
and do the same for the other referenced blocks around the same sections so the
document no longer contains stale/conflicting information.
In `@CONTRIBUTING.md`:
- Around line 134-145: The "Reporting Issues" checklist is missing required
fields; update that section to mandate numbered minimal reproduction steps, a
side-by-side "Expected vs. Actual behavior" block, explicit severity/priority,
exact version numbers and full tracebacks, and a "one issue per report" rule;
edit the "Reporting Issues" heading and its bullet list to add these required
fields and brief examples or templates so reporters cannot omit them.
- Around line 42-44: Update the example command in CONTRIBUTING.md to call the
repository-relative executable with an explicit relative path: replace the bare
"scripts/run-all-checks.sh" entry with a "./scripts/run-all-checks.sh"
invocation so the shell finds the script when PATH doesn't include the repo
root.
- Around line 46-50: Replace the example commit message "Add feature:
description of your changes" in CONTRIBUTING.md with guidance to use the
Conventional Commits format; specifically state the required pattern as
"type(optional scope): imperative summary (50 chars max)", update the sample to
follow that pattern (e.g., use "feat(scope): short imperative summary") and
ensure the surrounding text instructs contributors to use type, optional scope,
and an imperative 50-char max summary when committing.
In `@docs/conf.py`:
- Around line 25-28: The release lookup in docs/conf.py calls
importlib.metadata.version('rms-psfmodel') which doesn't match the actual
distribution name in pyproject.toml; update the argument to
importlib.metadata.version to 'psfmodel' (and keep the existing
PackageNotFoundError fallback) so the release variable is set from the real
package metadata instead of always falling back to '1.0.0'.
In `@docs/make.bat`:
- Around line 1-35: The batch script make.bat currently uses LF-only line
endings; convert make.bat to use CRLF line endings to match Sphinx .bat
conventions (ensure the file is saved with CRLF in your editor or convert with a
tool before committing), then commit the updated make.bat; optionally add a
.gitattributes entry (e.g., for *.bat text eol=crlf) if you want to enforce CRLF
for future commits—the changes affect the existing make.bat which references
SPHINXBUILD, SOURCEDIR and BUILDDIR.
In `@docs/performance_report/conf.py`:
- Around line 1-12: The docs conf is missing common Sphinx extensions used by
the main docs; update the extensions list (the extensions variable) to include
at minimum "myst_parser" if you use MyST markdown and "sphinx.ext.autodoc" (and
optionally "sphinx.ext.napoleon" and "sphinx.ext.intersphinx") so the
performance report can parse markdown and resolve cross-references; also add any
minimal MyST config (e.g., enabling myst headings) and intersphinx mapping if
you rely on cross-project links—make these changes in the same conf.py where the
extensions list is declared.
In `@docs/performance_report/index.rst`:
- Around line 1087-1094: The example command uses a repo-internal config path,
making it non-reproducible for installed users; update the reproducibility
section to either (A) replace the hardcoded path with the CLI's supported option
by showing use of the characterize_gauss_fit CLI with the --copy-hires-config-to
flag so the config is exported for users, or (B) explicitly state “run from the
repository root” and keep the existing command. Modify the text around the
example and the note about summary.json accordingly so readers know which
workflow (export via --copy-hires-config-to or run from repo root) to follow.
In `@docs/psf_gui.md`:
- Around line 28-32: Replace the private helper invocation tkinter._test() with
the documented public verifier by changing the verification command from python
-c "import tkinter; tkinter._test()" to python -m tkinter; update any references
to tkinter._test() in the docs to recommend using the module invocation (python
-m tkinter) as the supported, public way to verify Tkinter installation.
In `@pyproject.toml`:
- Around line 74-76: The dev dependency list in pyproject.toml incorrectly
includes the package itself ("psfmodel"), causing a circular dependency; remove
"psfmodel" from the dev = [...] array so that the package is not listed as its
own dev dependency, leaving only true development packages like "coverage>=7.0"
in that list.
- Around line 52-53: The pytest addopts currently hard-codes parallelism with
"-n 4" which may not match available CPUs; update the addopts entry in
pyproject.toml (the addopts array containing "-n 4", "--cov=psfmodel",
"--strict-markers", "--strict-config") to use "-n auto" instead of "-n 4" so
pytest-xdist picks an appropriate worker count, or alternatively document the
override flag (e.g., using -w) in project README; ensure the modified addopts
preserves the other options unchanged.
In `@README.md`:
- Line 28: Replace the Zenodo badge URL in README.md that currently uses the
repository name "rms-psfmodel" with the numeric GitHub repository ID: call the
GitHub API for the repo (GET /repos/:owner/:repo) and read the "id" field, then
update the badge markdown so the image src is
https://zenodo.org/badge/{GITHUB_REPO_ID}.svg and the link is
https://zenodo.org/badge/latestdoi/{GITHUB_REPO_ID}; modify the existing badge
line (the markdown containing the current
https://zenodo.org/badge/rms-psfmodel.svg URL) to use these numeric-ID URLs.
In `@requirements.txt`:
- Around line 1-4: Replace the current multi-line guidance in requirements.txt
with a single line containing "-e ." to comply with repo policy; move the
installation and documentation guidance text into the README or docs (not in
requirements.txt) so requirements.txt remains only the editable-install
directive; ensure no other lines or comments remain in requirements.txt.
In `@scripts/run-all-checks.sh`:
- Line 306: Update the header string passed to print_header to reflect the
renamed project: change the literal "rms-psfmodel - Running All Checks" to
"psfmodel - Running All Checks" in the print_header call (look for the
print_header invocation that currently contains "rms-psfmodel - Running All
Checks").
- Around line 396-419: In the pytest runner block (checking RUN_PYTEST,
ENABLE_PYTEST, PYTEST_WORKERS_SET and using PYTEST_WORKERS) replace the direct
"-n \"$PYTEST_WORKERS\"" invocation with the pytest config override flag "-o
numprocesses=\"$PYTEST_WORKERS\"" so pytest-xdist doesn't receive duplicate -n
options from pyproject.toml; update both branches where "-n \"$PYTEST_WORKERS\""
is used to use "-o numprocesses=\"$PYTEST_WORKERS\"" while keeping the
surrounding logging (print_info, print_success, print_error) and failure
handling intact.
In `@src/characterize_gauss_fit/__init__.py`:
- Around line 1-3: Delete the redundant banner comment block at the top of
characterize_gauss_fit/__init__.py (remove the lines that only restate the
filename), leaving the module with no such file-banner comments; ensure the
module still exports whatever it should (e.g., keep or add any necessary __all__
or import statements if required elsewhere) but remove the noisy header.
In `@src/characterize_gauss_fit/_study_utils.py`:
- Around line 129-153: The progress_callback currently writes directly to stderr
via print() (in _callback), which violates library logging policy; change
progress_callback to emit structured log records instead: add an optional
logger: Optional[logging.Logger] = None parameter (or obtain one via
logging.getLogger(__name__) when None) and replace the print() calls in
_callback with logger.info/debug calls that include structured fields
(study_name, completed, total, pct) as part of the message or extra dict; remove
direct use of sys.stderr/print and ensure the completion branch logs a final
record rather than printing a newline.
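A minimal sketch of the logger-based callback this prompt describes; the factory name and signature are hypothetical, not the actual _study_utils.py API:

```python
import logging


def make_progress_callback(study_name, total, logger=None):
    """Return a progress callback that logs instead of writing to stderr.

    Structured fields travel on the log record via `extra`, so handlers
    and formatters can consume them without parsing the message text.
    """
    log = logger if logger is not None else logging.getLogger(__name__)

    def _callback(completed):
        pct = 100.0 * completed / total if total else 100.0
        log.info(
            'study %s: %d/%d trials (%.1f%%)',
            study_name, completed, total, pct,
            extra={'study_name': study_name, 'completed': completed,
                   'total': total, 'pct': pct},
        )

    return _callback
```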
In `@src/characterize_gauss_fit/config.py`:
- Around line 524-593: The _validate_config function must be extended to catch
the remaining invalid numeric knobs at load time: add explicit ValueError checks
inside _validate_config (called by load_config) to ensure
cfg.studies.constraint_modes.psf_shapes is non-empty and that each psf_shape has
positive sigma (e.g. check shape.sigma > 0) and valid angle (you already check
angle but enforce non-empty list), enforce
cfg.studies.constraint_modes.max_bad_frac is within [0,1], require
cfg.studies.constraint_modes.num_sigma and
cfg.studies.constraint_modes.bkgnd_num_sigma are either None or > 0, and ensure
cfg.studies.constraint_modes.search_limit and
cfg.studies.constraint_modes.scale_limit are >= 0; raise clear ValueError
messages naming the offending field (use the existing _validate_config function
and the cfg.studies.* attribute names to locate where to add these checks).
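The extra load-time checks could look roughly like this standalone sketch; the function and its flat argument list are illustrative, whereas the real checks would live inside _validate_config() and read the cfg.studies.constraint_modes attributes:

```python
def validate_constraint_modes(psf_shapes, max_bad_frac, num_sigma,
                              bkgnd_num_sigma, search_limit, scale_limit):
    """Reject invalid constraint_modes knobs with clear ValueErrors.

    psf_shapes is modeled here as a list of dicts with a 'sigma' key.
    """
    if not psf_shapes:
        raise ValueError('constraint_modes.psf_shapes must be non-empty')
    for i, shape in enumerate(psf_shapes):
        if shape['sigma'] <= 0:
            raise ValueError(f'psf_shapes[{i}].sigma must be positive')
    if not 0.0 <= max_bad_frac <= 1.0:
        raise ValueError('constraint_modes.max_bad_frac must be in [0, 1]')
    for name, val in (('num_sigma', num_sigma),
                      ('bkgnd_num_sigma', bkgnd_num_sigma)):
        if val is not None and val <= 0:
            raise ValueError(f'constraint_modes.{name} must be None or > 0')
    for name, val in (('search_limit', search_limit),
                      ('scale_limit', scale_limit)):
        if val < 0:
            raise ValueError(f'constraint_modes.{name} must be >= 0')
```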
In `@src/characterize_gauss_fit/executor.py`:
- Around line 171-174: The post-check that filters None from ordered is
redundant after the RuntimeError check; replace the list comprehension with a
type-narrowing approach so type-checkers know no None remains: add "cast" to the
typing imports and set results = cast(list, ordered) (or use an assert like
"assert all(r is not None for r in ordered)" and then "results = ordered") to
preserve type information; refer to the ordered variable and results assignment
in executor.py and update imports accordingly.
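The type-narrowing suggestion amounts to this pattern; `collect_ordered` is a hypothetical stand-in for the executor code:

```python
from typing import cast


def collect_ordered(ordered):
    """Narrow list[T | None] to list[T] after an explicit None check."""
    if any(r is None for r in ordered):
        raise RuntimeError('a worker produced no result')
    # The guard above proves no None remains, but a type checker cannot
    # see that through a comprehension; cast() narrows without copying.
    return cast('list', ordered)
```

At runtime `cast()` simply returns its argument, so this avoids the redundant list-comprehension copy while keeping type checkers satisfied.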
In `@src/characterize_gauss_fit/hires_config.yaml`:
- Around line 8-14: Update the example CLI invocations in the hires_config.yaml
comments to use the supported module invocation shown in the docs (e.g. replace
standalone "characterize_gauss_fit" with the module form such as "python -m
characterize_gauss_fit" or the project-recommended equivalent) so the two
example lines that currently show "characterize_gauss_fit --config ..." and
"characterize_gauss_fit --config src/characterize_gauss_fit/hires_config.yaml"
will run for a fresh checkout; edit the comment text around those examples (look
for the examples in the header comments of hires_config.yaml) to use the exact
module invocation string used elsewhere in the docs.
In `@src/characterize_gauss_fit/main.py`:
- Around line 107-137: The parser currently allows combining early-exit modes
(--list-studies, --copy-default-config-to, --copy-test-config-to,
--copy-hires-config-to) and main() only runs the first matching branch; change
argument parsing to treat these options as mutually exclusive by adding a
mutually exclusive argument group to the ArgumentParser and placing all four
flags into that group (so argparse will error on conflicting combinations), then
update any references in main() that assume multiple may be set to rely on the
single chosen option (e.g., checks that inspect args.list_studies or
args.copy_default_config_to etc.); ensure error messages remain user-friendly
when argparse reports conflicts.
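The mutually exclusive group this prompt asks for can be sketched like this (flag names taken from the comment; everything else is illustrative):

```python
import argparse


def build_parser():
    """Parser where the four early-exit modes are mutually exclusive."""
    parser = argparse.ArgumentParser(prog='characterize_gauss_fit')
    group = parser.add_mutually_exclusive_group()
    group.add_argument('--list-studies', action='store_true')
    group.add_argument('--copy-default-config-to', metavar='PATH')
    group.add_argument('--copy-test-config-to', metavar='PATH')
    group.add_argument('--copy-hires-config-to', metavar='PATH')
    return parser
```

With the group in place, argparse itself rejects conflicting combinations with a standard "not allowed with argument" error, so main() can assume at most one mode is set.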
In `@src/characterize_gauss_fit/output.py`:
- Around line 28-72: The CSV schema in _CSV_COLUMNS is missing several TrialSpec
fields so trials.csv drops important per-trial inputs; update the _CSV_COLUMNS
list to include the missing keys (at minimum: 'rng_seed', 'base',
'bkgnd_num_sigma', 'max_bad_frac', 'allow_nonzero_base', 'use_angular_params',
'tolerance', 'search_limit', and 'scale_limit') and ensure the same set is added
where the schema is duplicated (the similar block around the later occurrence),
so all TrialSpec attributes are emitted for replay and downstream filtering.
In `@src/characterize_gauss_fit/plotting.py`:
- Around line 287-301: plot_grouped_bars currently replaces NaNs with 0.0
(bar_vals/display_vals) and then unconditionally calls ax.set_yscale('log'),
which fails for zero/negative values; update the end of plot_grouped_bars to
first detect positive finite values (e.g. positive_mask =
np.isfinite(display_vals) & (display_vals > 0)) and only call
ax.set_yscale('log') if positive_mask.any(), or alternatively replace
non-positive/NaN display_vals with a small positive floor before plotting (use
the bar_vals/display_vals and nan_mask variables and then call ax.set_yscale
only when safe) to avoid log-axis errors.
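The log-axis guard can be reduced to this pure-NumPy sketch (the helper name is hypothetical); the caller would call ax.set_yscale('log') only when use_log is True:

```python
import numpy as np


def safe_log_scale_decision(display_vals):
    """Return (bar_vals, use_log).

    NaNs become 0-height bars, and a log axis is requested only when at
    least one value is positive and finite.
    """
    display_vals = np.asarray(display_vals, dtype=float)
    bar_vals = np.where(np.isfinite(display_vals), display_vals, 0.0)
    positive_mask = np.isfinite(display_vals) & (display_vals > 0)
    return bar_vals, bool(positive_mask.any())
```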
In `@src/characterize_gauss_fit/study_background.py`:
- Around line 137-268: The _write_outputs function builds a lookup from specs
and results but lacks a defensive count check; add a validation in
_write_outputs that len(specs) == len(results) and raise a RuntimeError with a
clear message if not, before constructing the lookup (i.e., before the for spec,
result in zip(specs, results, strict=True) loop) so mismatched ordering/count
problems are caught early.
- Line 124: The call in run() unpacks build_specs(cfg) into specs, _fit_degrees
but never uses _fit_degrees; either change build_specs to return only specs or,
if other callers need the tuple, stop discarding the value here and use or
document it. Update the build_specs function signature and its return to return
a single specs object (or alternatively remove the underscore and store
fit_degrees in a named variable if you intend to use it), and then adjust all
callers (including the call in run) so they match the new return shape; ensure
references to build_specs, run, specs, and fit_degrees are updated accordingly.
In `@src/characterize_gauss_fit/study_box_sigma.py`:
- Around line 55-64: The fit is incorrectly constrained by hardcoding
fit_angle=0.0; change the code that builds the PSF/fitting call (the parameters
including fit_angle and study.angle) to set fit_angle to study.angle (or the
configured box_vs_sigma.angle override if present) so the fitting honors the
study's configured angle instead of forcing 0.0; also update the related
duplicate block (the similar code around the 156-164 region) the same way, and
correct the formatted note that prints the angle to convert radians to degrees
when presenting the value (i.e., format degrees, not raw radians).
In `@src/characterize_gauss_fit/study_constraints.py`:
- Around line 95-106: The current use of max(study.sigma_error_fractions) can
pick 0.0 instead of the strongest non-zero sigma; change the logic to first
filter out zero fractions (e.g., non_zero = [f for f in
study.sigma_error_fractions if f != 0]) and only create the
ConstraintMode('all_fixed_errors', ...) when non_zero is not empty, selecting
max_frac = max(non_zero, key=abs) so the chosen fraction is the strongest
non-zero error (use that max_frac when constructing ConstraintMode with
sigma_is_fraction=True).
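The selection logic boils down to a small helper (hypothetical name):

```python
def strongest_nonzero_fraction(sigma_error_fractions):
    """Largest-magnitude non-zero fraction, or None when all are zero."""
    non_zero = [f for f in sigma_error_fractions if f != 0]
    if not non_zero:
        return None  # caller skips creating the 'all_fixed_errors' mode
    return max(non_zero, key=abs)
```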
In `@src/characterize_gauss_fit/study_offset.py`:
- Around line 122-145: The code is rounding configurable parameters into
filenames and axis labels (see offset_labels, sigmas and the
f'pos_err_sigma{sigma:.1f}.png' usages), which can cause distinct sigma values
to collide and mislabel the heatmap axes; update formatting to preserve
sufficient precision (or use the raw string form) when building offset_labels
and when formatting sigma into filenames and labels (replace f'{v:.2f}' and
f'{sigma:.1f}' with a higher-precision format such as full str() or a fixed
high-precision format like '{:.6f}'), and apply the same change to the other
occurrences mentioned (the other sigma filename/label uses around the other
blocks at the referenced locations) so files and axis labels remain unique and
accurate.
- Around line 98-107: The call to _write_outputs (specifically the call from
study_dir = utils.ensure_study_dir(...); _write_outputs(cfg, specs, results,
study_dir)) uses four positional args; make the final parameter keyword-only by
changing _write_outputs signature to accept study_dir after a bare * (e.g. def
_write_outputs(cfg, specs, results, *, study_dir: pathlib.Path) -> None) and
update the call site to pass study_dir=study_dir; keep the function name
_write_outputs and the variable study_dir unchanged so other references are
unaffected.
In `@src/characterize_gauss_fit/study_shape.py`:
- Line 49: The angle sweep currently builds `angles = np.linspace(0.0, math.pi,
study.angle_steps)` which includes both 0 and pi (duplicate orientations);
change the sampling to exclude the endpoint so orientations are in [0, pi)
(e.g., use np.linspace with endpoint=False or drop the last element) so `angles`
does not contain both 0 and pi, keeping `study.angle_steps` distinct samples per
ratio and avoiding duplicated axis-aligned entries.
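The endpoint fix is a one-liner with np.linspace, sketched here as a standalone helper (hypothetical name):

```python
import numpy as np


def orientation_angles(angle_steps):
    """angle_steps distinct PSF orientations in [0, pi).

    endpoint=False keeps 0 but drops pi, which describes the same
    orientation as 0 for an elliptical Gaussian.
    """
    return np.linspace(0.0, np.pi, angle_steps, endpoint=False)
```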
In `@src/characterize_gauss_fit/trial.py`:
- Around line 98-105: The failure branch is returning None for numeric error
fields contrary to the docstring; update the code that constructs TrialResult
(look for usages/constructors creating TrialResult in the failure path around
the block with sigma_y_err, sigma_x_err, angle_err) to set numeric error fields
to float('nan') instead of None (keep *_fit fields None where documented and
ensure scale_fit stays float('nan') as noted); apply the same change to the
other construction site referenced (the similar block around lines 335-355) so
failed fits are recorded as NaN-filled error metrics rather than empty cells.
In `@src/psf_gui/main.py`:
- Around line 151-153: The debug log prints motion_y and motion_x but those
values are not used when instantiating GaussianPSF (the movement is applied
later in refresh_psf via kwargs['movement']), so either remove the unused debug
log or change it to reflect where motion is applied; update the branch that
handles self._psf_type == 'gaussian' (the instantiation of GaussianPSF and the
surrounding debug) to either remove the logger.debug('motion=(%s, %s)',
motion_y, motion_x) line or modify it to log that movement will be applied in
refresh_psf (referencing motion_y, motion_x, GaussianPSF, refresh_psf,
self._psfobj, and self._psf_type to locate the code).
- Around line 138-154: Add a clear docstring to the _ensure_psf_object method
explaining its purpose: that it ensures a cached PSF instance exists and when it
triggers a rebuild (when self._psfobj is None, or when self._psf_type is one of
'acshrc','wfpc2pc1','wfc3uvis' and the object's subsample differs from
expected_sub computed from var_subsample, or when _psf_type == 'gaussian' which
constructs a new GaussianPSF); document the roles of var_subsample, var_motiony,
var_motionx, expected_sub, need_rebuild, and that the method updates
self._psfobj and returns early when no rebuild is needed.
- Around line 231-232: Clamp the computed pixel value to the 0–255 range to
avoid overflow: when computing raw_px from psf, min_val and denom, replace the
current one-sided clamp (max(raw_px, 0.0)) with a two-sided clamp so raw_px
cannot exceed 255; e.g. apply min/max or numpy.clip to raw_px before converting
to int so val = int(clamped_raw_px). Update the code around the raw_px and val
assignments (references: raw_px, psf[y, x], min_val, denom, val) to perform this
two-sided clamping.
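The two-sided clamp amounts to the following (helper name is illustrative):

```python
def to_display_byte(raw_px):
    """Clamp a scaled pixel value into the displayable 0-255 range.

    A one-sided max(raw_px, 0.0) would let values above 255 overflow
    the display byte; clamping both ends avoids that.
    """
    return int(min(max(raw_px, 0.0), 255.0))
```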
- Around line 67-136: Add a PEP 257 docstring to the _build_controls method
describing its purpose (builds the Tkinter control panel for PSF parameters),
summarizing key behaviors (creates left/right scale rows via nested helpers
add_scale_row_left and add_scale_row_right, binds scales to variables like
self.var_x, self.var_y, self.var_sigmax, self.var_psf_xsize, etc., and calls
self.refresh_psf on change), and noting any side effects (grids the control
Frame into the toplevel). Keep it a short one- or two-sentence summary followed
by a brief description of parameters/side effects if needed.
- Around line 1-4: Add a PEP 257-compliant module-level docstring to the top of
the psf_gui/main.py module describing its purpose (interactive PSF visualization
using Tkinter), any important behaviors or public classes/functions it exposes,
and optional usage notes; place the triple-quoted string immediately after the
initial shebang/imports (or as the very first statement) so it becomes the
module docstring and update any existing top-of-file header comment by replacing
it with that docstring.
- Line 30: The return statement returning HSTPSF(*args, **kwargs) currently uses
a broad type ignore; replace that with minimal, specific type-checker error
codes instead of a broad multi-category ignore: update the comment on the return
line (the one with HSTPSF and return) to only ignore the exact error code(s)
your type checker emits (e.g., use "# type: ignore[abstract]" or the precise
mypy code for untyped calls) or split the ignores to the specific locations that
trigger them, removing the broad/combined ignore entry.
In `@src/psfmodel/gaussian.py`:
- Around line 64-68: The constructor currently only appends "angle" to
_additional_params when angle_subsample > 1, which causes angle=None to be
ignored for angle_subsample == 1; modify the PSF class constructor to always add
"angle" to _additional_params when angle is None (regardless of angle_subsample)
so find_position can fit it, and adjust the eval_rect/angle handling to respect
a fitted angle instead of defaulting to 0.0; update the constructor docstring
text around "angle", "angle_subsample", and the note about passing None so it
matches the new behavior.
- Around line 401-457: The rotated-rectangle branch currently returns the mean
sampled PSF value over each pixel but the API expects the integral (value *
pixel area), so after computing res from vals you must multiply per-pixel by the
rectangle area (x_hi - x_lo) * (y_hi - y_lo) before reshaping/returning; compute
areas = (x_hi - x_lo) * (y_hi - y_lo), reshape to orig_shape (or multiply
res.ravel() by areas then reshape) so the additive base and scale are properly
included in the integral; implement this change right after res = ... and before
the final cast/return in the GaussianPSF.gaussian_2d call path.
In `@src/psfmodel/hst.py`:
- Around line 308-309: Update each parity-only guard to also validate that the
size is a positive integer: for the checks that currently read like "if
self.subsample % 2 != 1:" (and the similar guards at the other occurrences
noted), first verify isinstance(self.subsample, int) and self.subsample > 0 (or
the equivalent attribute name in that scope) and then check oddness; raise
ValueError with a message that mentions both positivity and oddness if the
combined condition fails. Apply the same pattern to the other referenced blocks
(around the checks at the other two locations).
- Around line 461-469: The loop that selects min_fov from RETURNED_SIZES skips
exact matches because it uses a strict ">" comparison; update the condition in
that loop to use ">=" so entries where pixel_size == psf_size_pixels are
accepted (i.e., change "if pixel_size > psf_size_pixels:" to "if pixel_size >=
psf_size_pixels:") ensuring min_fov is set for exact-size TinyTim outputs and
the ValueError for instrument/detector/psf_size_pixels no longer fires
incorrectly.
In `@src/psfmodel/psf.py`:
- Around line 753-754: The details dictionary returned from find_position()
currently holds subimage-relative offsets in details['x'] and details['y'],
which becomes inconsistent after you shift pos_x/pos_y to full-image
coordinates; update find_position() (and the call sites that unpack ret in
find_position() and where _find_position() is used) to add the crop origin
offsets to details['x'] and details['y'] whenever you convert res_x/res_y to
absolute coordinates so details['x']/details['y'] match pos_x/pos_y; if you need
to preserve subimage-relative values for internal use, store them under new keys
like details['_local_x'] / details['_local_y'] before overwriting
details['x']/details['y'] with the absolute values.
- Around line 860-864: The base parameter is being read directly from params[3]
when allow_nonzero_base is true, but in angular mode (use_angular_params=True)
the optimizer writes an angular-mapped value into that slot; update the code
that reads and sets base (the block using allow_nonzero_base and the related
bounds code around use_angular_params) to decode the angular parameter using the
same angle-to-range mapping used for the other angular parameters instead of
treating it as a raw physical baseline, and ensure the corresponding bounds are
set to the physical baseline range (not [0, pi]); reference the symbols
allow_nonzero_base, use_angular_params, base, and params[3] when making the
change.
- Around line 247-275: Reject non-positive movement_granularity early: in the
PSF evaluation method before computing num_steps (i.e., before the int(max(...))
line that uses movement_granularity) add a guard that raises ValueError when
movement_granularity <= 0 with a clear message; keep the existing fast-path that
returns self._eval_rect(...) when movement is None or zero, but if smearing is
enabled ensure movement_granularity is validated first to avoid division by zero
or negative step counts.
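The guard described in this prompt can be sketched as a tiny standalone function; `smear_steps` and its integer return value are illustrative stand-ins for the library's `_eval_rect`-based smearing path, not its actual API:

```python
def smear_steps(movement, movement_granularity):
    """Number of smear sub-steps for a motion-blurred PSF evaluation.

    Keeps the fast path (no movement means no sub-steps) but fails fast
    on a non-positive granularity instead of dividing by zero or
    producing a negative step count.
    """
    # Fast path: no smearing requested.
    if movement is None or (movement[0] == 0 and movement[1] == 0):
        return 0
    # Validate before the granularity reaches the division below.
    if movement_granularity <= 0:
        raise ValueError(
            f'movement_granularity must be positive, got {movement_granularity}'
        )
    return int(max(abs(movement[0]) / movement_granularity,
                   abs(movement[1]) / movement_granularity))
```

Placing the check after the fast path preserves the existing behavior for `movement=None` while rejecting invalid granularities on the smearing branch only.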
- Around line 532-537: Validate bkgnd_params before inferring order in
background_gradient(): ensure bkgnd_params is a 1-D array (or convertible to
one) and that its length L matches a 2-D polynomial triangular count L =
m*(m+1)/2 for some integer m>=1; compute m from L (solve m^2+m-2L=0) and if m is
not an integer or L<1 raise a ValueError with a clear message referencing
bkgnd_params and expected counts, then use m-1 (or the derived order logic
already in PSF._background_gradient_coeffs) to compute order and proceed; this
check should be placed just before computing order/int(np.sqrt(...)) and uses
rect_size, PSF._background_gradient_coeffs, and result as in the current flow.
- Around line 724-767: The loop in find_position uses abs(residual) when
computing resid_std and then clips against that same quantity, which tightens
thresholds and breaks for non-positive num_sigma; fix by first validating
num_sigma at the start of find_position (or before this loop) and raise a
ValueError for num_sigma <= 0 (and TypeError if not numeric), then compute
signed residuals with resid = details['subimg-gradient'] -
details['scaled_psf'], compute resid_std = np.std(resid) (not on the sqrt of
squared values), and perform clipping with sub_img[np.where(np.abs(resid) >
num_sigma * resid_std)] = ma.masked so the threshold uses the signed residual
standard deviation but still compares absolute deviations.
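A minimal standalone sketch of the clipping rule this prompt describes, using NumPy masked arrays; `clip_outliers` is a hypothetical helper for illustration, not the library's `find_position` itself:

```python
import numpy as np
import numpy.ma as ma


def clip_outliers(sub_img, model, num_sigma):
    """Mask pixels whose |residual| exceeds num_sigma * std of signed residuals.

    The standard deviation is computed on the signed residuals (not on
    their absolute values), while the comparison still uses the absolute
    deviation of each pixel.
    """
    if not isinstance(num_sigma, (int, float)):
        raise TypeError(f'num_sigma must be numeric, got {type(num_sigma)}')
    if num_sigma <= 0:
        raise ValueError(f'num_sigma must be positive, got {num_sigma}')
    resid = sub_img - model          # signed residuals
    resid_std = np.std(resid)        # std of signed values
    clipped = ma.masked_array(sub_img)
    clipped[np.abs(resid) > num_sigma * resid_std] = ma.masked
    return clipped
```

Computing the std on signed residuals avoids the tightening effect of folding all deviations to one side before measuring their spread.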
In `@TEST_SUITE_CRITIQUE.md`:
- Around line 24-37: The critique is stale: it reports coverage/ gaps that no
longer match the rewritten test suite; update the TEST_SUITE_CRITIQUE.md by
either regenerating the coverage/report against the current post-rewrite tree
(so entries referencing hst.py, psf.py, coverage percentages and missing tests
reflect current reality) or add a prominent “historical baseline” disclaimer at
the top clarifying the report describes the pre-rewrite state and the
date/commit it reflects, then remove or annotate entries about hst.py, psf.py,
find_position, _eval_rect_smeared, background_gradient, and logging assertions
to avoid contradictions with the PR. Ensure the header/disclaimer explicitly
states it is historical and update any sections that still claim gaps (hst.py,
psf.py, test coverage) to point to the regenerated report or marked historical
status.
In `@tests/conftest.py`:
- Around line 13-30: Update the docstrings for the fixtures default_psf,
symmetric_psf, and asymmetric_psf to follow Google-style with a brief
description and a Returns: section describing the returned type/value (e.g.,
"Returns: GaussianPSF: a GaussianPSF instance with ..."), ensuring you use
"Parameters:" only when a fixture accepts inputs (none here) and include
explicit Returns lines for each function that returns a GaussianPSF instance.
- Around line 1-3: Remove the top-of-file banner comment block that contains the
repeated
"################################################################################"
lines and the literal "tests/conftest.py" header; delete that comment section so
the module starts directly with actual imports or fixtures (i.e., remove the
redundant file-identity banner at the very top of tests/conftest.py).
In `@tests/test_find_position.py`:
- Around line 157-174: In test_find_position_num_sigma_rejects_outlier_pixel,
the injected hot pixel at [10, 10] is too close to the PSF centroid and doesn't
exercise the num_sigma masking path; change the contamination location to a
clearly off-center pixel (for example set contaminated[3, 3] += 200.0 or
contaminated[16, 16] += 200.0) so default_psf.find_position(contaminated, ...)
must reject that outlier and still return the expected centroid; update only the
index used when adding the outlier (the variable contaminated and the test
function test_find_position_num_sigma_rejects_outlier_pixel) leaving the rest of
the test and assertions unchanged.
In `@tests/test_gaussian.py`:
- Around line 158-179: Add tests exercising gaussian_integral_1d with reversed
limits (x_min > x_max) for both scalar and array inputs to cover the
sign-sensitive path: call GaussianPSF.gaussian_integral_1d(1.0, 0.0) and compare
to -integrate.quad(GaussianPSF.gaussian_1d, 0.0, 1.0)[0] (and analogous
comparisons for mean, scale, and base), and add array cases like
GaussianPSF.gaussian_integral_1d(np.array([1.0, 0.0]), np.array([0.0, 1.0])) and
a 2x2 array input to assert elementwise negation/swapped results match expected
values; ensure assertions use pytest.approx or npt.assert_array_almost_equal for
numeric tolerance when referencing GaussianPSF.gaussian_integral_1d,
GaussianPSF.gaussian_1d, and integrate.quad.
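The limit-antisymmetry those tests would exercise can be illustrated with a self-contained erf-based Gaussian integral; this `gaussian_integral_1d` is a stand-in sketch assuming a unit-area Gaussian, not the `GaussianPSF` method itself:

```python
import math

import numpy as np


def gaussian_integral_1d(x_min, x_max, sigma=1.0):
    """Integral of a unit-area Gaussian between x_min and x_max.

    Antisymmetric in its limits: swapping x_min and x_max flips the
    sign. Works elementwise on arrays as well as scalars.
    """
    x_min = np.asarray(x_min, dtype=float)
    x_max = np.asarray(x_max, dtype=float)
    erf = np.vectorize(math.erf)
    scale = sigma * math.sqrt(2.0)
    return 0.5 * (erf(x_max / scale) - erf(x_min / scale))
```

Reversed-limit tests then reduce to checking that forward and reversed calls sum to zero, for scalars and elementwise for arrays.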
---
Outside diff comments:
In `@src/psfmodel/gaussian.py`:
- Around line 147-174: The gaussian_1d() function must validate sigma the same
way gaussian_integral_1d() does: check that sigma is a positive finite number
and raise a ValueError for non-positive or non-finite values; update
gaussian_1d() to perform this validation at the start (so eval_point() and the
rotated gaussian_integral_2d() path cannot pass invalid sigma through), and
mirror the same error message/exception semantics used in gaussian_integral_1d()
for consistency across the public API.
In `@src/psfmodel/hst.py`:
- Around line 497-505: Wrap the block that sets orig_cwd,
os.chdir(TINY_TIM_DIR), creates filenames (psf_filename, full_psf_filename,
temp_fits_filename, params_filename, temp_filename) and opens params_fp into a
try/finally so the finally always restores cwd (os.chdir(orig_cwd)), closes
params_fp if it was opened, and removes any created temp files
(full_psf_filename, temp_fits_filename, params_filename, temp_filename) if they
exist; ensure the cleanup runs even if the subsequent code (including the
RuntimeError raised later) throws. This keeps TINY_TIM_DIR changes and temp
artifact leaks from persisting across exceptions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0b147de8-56d3-4b09-b9f2-ebccc003b0cd
⛔ Files ignored due to path filters (10)
- docs/performance_report/images/background_pos_err_amp3_ic1_oy0p25_ox0p25.png is excluded by `!**/*.png`
- docs/performance_report/images/box_vs_sigma_pos_err_oy0p25_ox0p25.png is excluded by `!**/*.png`
- docs/performance_report/images/constraint_modes_summary.png is excluded by `!**/*.png`
- docs/performance_report/images/hot_pixel_rejection_pos_err_hotamp2.png is excluded by `!**/*.png`
- docs/performance_report/images/min_detectable_offset_pos_err_noiseless.png is excluded by `!**/*.png`
- docs/performance_report/images/min_detectable_offset_recovery_snr_100.png is excluded by `!**/*.png`
- docs/performance_report/images/noise_sensitivity_pos_err_vs_snr.png is excluded by `!**/*.png`
- docs/performance_report/images/sigma_asymmetry_angle_angle_err_sx1.0.png is excluded by `!**/*.png`
- docs/performance_report/images/sigma_asymmetry_angle_pos_err_sx1.0.png is excluded by `!**/*.png`
- docs/performance_report/images/subpixel_offset_pos_err_sigma1.0.png is excluded by `!**/*.png`
📒 Files selected for processing (83)
- .coveragerc
- .cursor/rules/bug_report.mdc
- .cursor/rules/dependency_management.mdc
- .cursor/rules/documentation.mdc
- .cursor/rules/environment_best_practices.mdc
- .cursor/rules/git_workflow.mdc
- .cursor/rules/how_to.mdc
- .cursor/rules/pull_request.mdc
- .cursor/rules/python_best_practices.mdc
- .cursor/rules/security.mdc
- .cursor/settings.json
- .cursor/skills/critique-test-suite/SKILL.md
- .cursor/skills/python-codebase-analysis/SKILL.md
- .cursor/skills/python-codebase-analysis/reference.md
- .cursor/skills/run-all-checks/SKILL.md
- .flake8
- .github/ISSUE_TEMPLATE/bug_report.md
- .github/ISSUE_TEMPLATE/config.yml
- .github/ISSUE_TEMPLATE/feature_request.md
- .github/ISSUE_TEMPLATE/other.md
- .github/pull_request_template.md
- .github/workflows/publish_to_pypi.yml
- .github/workflows/publish_to_test_pypi.yml
- .github/workflows/run-tests.yml
- .gitignore
- .mypy.ini
- .readthedocs.yaml
- .vscode/settings.json
- CODEBASE_ANALYSIS.md
- CONTRIBUTING.md
- README.md
- TEST_SUITE_CRITIQUE.md
- codecov.yml
- docs/characterize_gauss_fit.md
- docs/code_of_conduct.md
- docs/conf.py
- docs/contributing.rst
- docs/index.rst
- docs/make.bat
- docs/module.rst
- docs/performance_report/Makefile
- docs/performance_report/conf.py
- docs/performance_report/index.rst
- docs/psf_gui.md
- programs/psf_gui.py
- psfmodel/__init__.py
- psfmodel/psf.py
- pyproject.toml
- requirements.txt
- scripts/run-all-checks.sh
- setup.cfg
- src/characterize_gauss_fit/__init__.py
- src/characterize_gauss_fit/__main__.py
- src/characterize_gauss_fit/_study_utils.py
- src/characterize_gauss_fit/config.py
- src/characterize_gauss_fit/defaults.yaml
- src/characterize_gauss_fit/executor.py
- src/characterize_gauss_fit/hires_config.yaml
- src/characterize_gauss_fit/main.py
- src/characterize_gauss_fit/output.py
- src/characterize_gauss_fit/plotting.py
- src/characterize_gauss_fit/study_background.py
- src/characterize_gauss_fit/study_box_sigma.py
- src/characterize_gauss_fit/study_constraints.py
- src/characterize_gauss_fit/study_hot_pixels.py
- src/characterize_gauss_fit/study_min_offset.py
- src/characterize_gauss_fit/study_noise.py
- src/characterize_gauss_fit/study_offset.py
- src/characterize_gauss_fit/study_shape.py
- src/characterize_gauss_fit/test_config.yaml
- src/characterize_gauss_fit/trial.py
- src/psf_gui/__init__.py
- src/psf_gui/__main__.py
- src/psf_gui/main.py
- src/psfmodel/__init__.py
- src/psfmodel/gaussian.py
- src/psfmodel/hst.py
- src/psfmodel/psf.py
- src/psfmodel/py.typed
- tests/conftest.py
- tests/test_find_position.py
- tests/test_gaussian.py
- tests/test_psf.py
💤 Files with no reviewable changes (7)
- .flake8
- .mypy.ini
- setup.cfg
- psfmodel/__init__.py
- .coveragerc
- programs/psf_gui.py
- psfmodel/psf.py
Actionable comments posted: 10
♻️ Duplicate comments (2)
src/psfmodel/psf.py (2)
270-275: ⚠️ Potential issue | 🟠 Major

Reject non-positive `movement_granularity` before dividing.

When smearing is enabled, `movement_granularity <= 0` reaches Line 274 and either divides by zero or produces a negative step count. This should fail fast with a clear `ValueError`.

Proposed fix:

```diff
 if movement is None or (movement[0] == 0 and movement[1] == 0):
     return self._eval_rect(rect_size, offset=offset, scale=scale, base=base, **kwargs)

+if movement_granularity <= 0:
+    raise ValueError(
+        f'movement_granularity must be positive, got {movement_granularity}'
+    )
+
 num_steps = int(
     max(abs(movement[0]) / movement_granularity,
         abs(movement[1]) / movement_granularity)
 )
```

As per coding guidelines, "NEVER trust external input (function arguments from callers, file contents, environment variables, data from remote URLs). Validate inputs at the public API boundary of the library with clear `ValueError` or `TypeError` exceptions for invalid arguments."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/psfmodel/psf.py` around lines 270 - 275, Before computing num_steps in the smear/movement path (the block that checks movement and then computes num_steps using movement_granularity), validate that movement_granularity is a positive number; if movement_granularity <= 0 raise a ValueError with a clear message. Add the check just before the num_steps calculation in the same function (where movement, movement_granularity and _eval_rect are used) so you fail fast instead of allowing division by zero or negative step counts.
532-537: ⚠️ Potential issue | 🟠 Major

Validate `bkgnd_params` before inferring the polynomial order.

`background_gradient()` assumes a 1-D coefficient vector whose length is a valid 2-D polynomial coefficient count (1, 3, 6, 10, ...). Any other shape/count currently falls through to a broadcast error from NumPy instead of a clear library error.

Proposed fix:

```diff
-bkgnd_params = np.array(bkgnd_params)
-
-order = int(np.sqrt(len(bkgnd_params) * 2)) - 1
+bkgnd_params = np.atleast_1d(np.asarray(bkgnd_params, dtype=np.float64))
+if bkgnd_params.ndim != 1:
+    raise ValueError(
+        f'bkgnd_params must be a 1-D coefficient vector, got shape {bkgnd_params.shape}'
+    )
+
+order = int(np.sqrt(bkgnd_params.size * 2)) - 1
+if (order + 1) * (order + 2) // 2 != bkgnd_params.size:
+    raise ValueError(
+        'bkgnd_params must contain a valid 2-D polynomial coefficient count, '
+        f'got {bkgnd_params.size}'
+    )
 a3d = PSF._background_gradient_coeffs(rect_size, order)
```

As per coding guidelines, "NEVER trust external input (function arguments from callers, file contents, environment variables, data from remote URLs). Validate inputs at the public API boundary of the library with clear `ValueError` or `TypeError` exceptions for invalid arguments."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/psfmodel/psf.py` around lines 532 - 537, Validate bkgnd_params before computing order: ensure bkgnd_params is a 1-D sequence/array and its length matches a triangular number of 2D polynomial coefficients ((order+1)*(order+2)/2). In the function that contains the shown lines (where bkgnd_params is converted to np.array and order is inferred), check ndim==1 and compute order_candidate = int(np.sqrt(len(bkgnd_params)*2)) - 1 then verify (order_candidate+1)*(order_candidate+2)//2 == len(bkgnd_params); if not, raise ValueError with a clear message naming bkgnd_params and the expected coefficient counts. Also keep using PSF._background_gradient_coeffs(rect_size, order_candidate) after validation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/characterize_gauss_fit/config.py`:
- Around line 254-258: The YAML loader assigns raw = yaml.safe_load(fh) but
doesn't validate the result; ensure that after yaml.safe_load(fh) you check that
raw is a dict (mapping) and raise a clear ValueError if it's None or not a dict
so downstream code doesn't get confusing type errors; reference the
yaml.safe_load call and the raw variable (and the defaults.yaml context) and
update the error handling to raise ValueError(f"defaults.yaml must contain a
mapping, got {type(raw)!r}") (or similar) when the parsed value isn't a dict.
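A minimal sketch of the suggested check, assuming the value returned by `yaml.safe_load` is passed in; `ensure_mapping` is an illustrative helper name:

```python
def ensure_mapping(raw, source='defaults.yaml'):
    """Validate that a parsed YAML document is a mapping (dict).

    yaml.safe_load returns None for an empty file and may return a list
    or a scalar for non-mapping documents; fail fast with a clear error
    instead of letting downstream code hit confusing type errors.
    """
    if not isinstance(raw, dict):
        raise ValueError(f'{source} must contain a mapping, got {type(raw)!r}')
    return raw
```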
- Around line 590-611: The current validation logic only checks
constraint_modes.fitting (cm.fitting) and misses other FittingConfig instances
(e.g., cfg.fitting, box_vs_sigma.fitting), so extract the repeated checks into a
helper function (e.g., validate_fitting_config) that accepts a FittingConfig
instance and performs the checks for max_bad_frac, num_sigma, bkgnd_num_sigma,
search_limit, and scale_limit; then call that helper for cm.fitting and for each
other place that holds a FittingConfig (at minimum cfg.fitting and any
study-specific configs like box_vs_sigma.fitting) so all FittingConfig instances
are validated consistently before being used.
In `@src/characterize_gauss_fit/output.py`:
- Around line 194-196: The code constructs study_dir via "study_dir = output_dir
/ study" which allows absolute paths or '..' to escape output_dir; fix by
resolving the combined path and verifying it is inside the resolved output_dir
(e.g., resolved_output = (output_dir / study).resolve(); if not
str(resolved_output).startswith(str(output_dir.resolve()) + os.sep): raise
ValueError) before mkdir and creating csv_path; apply the same
resolve-and-verify pattern wherever similar joins occur (e.g., the other
study/output path construction around the second occurrence).
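The resolve-and-verify pattern described above can be sketched with `pathlib`; `safe_study_dir` is a hypothetical helper, not the module's actual function:

```python
import pathlib


def safe_study_dir(output_dir, study):
    """Join `study` onto `output_dir`, rejecting paths that escape the root.

    Resolves the combined path and verifies it is still inside the
    resolved root, so absolute paths and '..' segments raise a clear
    ValueError instead of silently writing elsewhere.
    """
    root = pathlib.Path(output_dir).resolve()
    target = (root / study).resolve()
    try:
        target.relative_to(root)
    except ValueError:
        raise ValueError(
            f'study path {study!r} escapes output dir {root}'
        ) from None
    return target
```

Using `relative_to` in a try/except keeps the check portable across Python versions that predate `Path.is_relative_to`.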
- Around line 165-170: The public writer write_csv currently takes four
positional parameters; change its signature so any parameters beyond the first
two are keyword-only (e.g., def write_csv(output_dir: pathlib.Path, study: str,
*, specs: list[TrialSpec], results: list[TrialResult]) -> pathlib.Path) and
update all call sites to pass specs=... and results=...; apply the same change
to the other public writer referenced around lines 307-315 (make extra params
keyword-only and update callers accordingly).
- Around line 29-82: _CSV_COLUMNS is a module-level constant currently typed as
a mutable list; change it to an immutable tuple (e.g., annotate and assign as
tuple[str, ...] or use tuple(...) instead of list) so the schema cannot be
mutated at runtime, and update any code that expects list-specific methods to
either convert _CSV_COLUMNS to list(...) locally or avoid mutations; keep the
symbol name _CSV_COLUMNS unchanged.
- Around line 350-353: Validate the incoming groups structure and index values
before using them: assert each group is a mapping with a list[int] under
'indices' and for each index ensure it's an int, non-negative and <
len(results); raise a TypeError if 'indices' is not a list of ints and a
ValueError for out-of-range/empty indices. In the loop that currently does "for
group in groups: indices: list[int] = group['indices']; group_results =
[results[i] for i in indices]; agg = _aggregate_results(group_results)" add
these checks and fail fast with clear error messages that reference the
offending group/index, then call _aggregate_results only with the validated
group_results.
In `@src/characterize_gauss_fit/plotting.py`:
- Around line 396-400: The function sets fig.suptitle but doesn't call
fig.tight_layout(), which can let the suptitle overlap or clip labels; add a
call to fig.tight_layout() (e.g., immediately after fig.suptitle(...)) before
returning the figure so layout is adjusted—mirror the behavior used in
plot_heatmap and plot_recovery_fraction_heatmap and update the block that
references fig, fig.suptitle and img_ref to call fig.tight_layout() prior to
return.
- Around line 85-99: In save_figure validate that the provided output_dir exists
and is a directory before calling fig.savefig: check pathlib.Path.exists() and
pathlib.Path.is_dir() on the output_dir parameter in the save_figure function
and raise a clear ValueError (or FileNotFoundError with a descriptive message)
mentioning output_dir if it doesn't exist or isn't a directory, so callers get
an explicit, informative error instead of the underlying savefig traceback.
- Around line 149-150: Add a shape-validation check before applying the mask to
the displayed data: ensure that the `mask` array has the same shape as the
`data`/`display` array and raise a clear ValueError if it does not, to avoid
silent broadcasting or confusing NumPy errors; locate the mask application (the
`if mask is not None: display = np.where(mask, np.nan, display)` block) and
mirror the validation approach used in `plot_multi_panel_heatmaps` (validate
length/shape) so the error message references the expected and actual shapes
(e.g., "mask.shape vs data.shape").
In `@src/psfmodel/psf.py`:
- Around line 258-259: The docstring for the parameter "offset" in
src/psfmodel/psf.py contains an unresolved "XXX" marker; either remove the "XXX"
or replace it with a clear explanation of the sign convention and effect (e.g.,
that a positive offset moves the PSF down and left in image coordinates),
updating the docstring for the function/class that defines the "offset"
parameter so it contains a definitive, unambiguous description rather than the
"XXX" placeholder.
---
Duplicate comments:
In `@src/psfmodel/psf.py`:
- Around line 270-275: Before computing num_steps in the smear/movement path
(the block that checks movement and then computes num_steps using
movement_granularity), validate that movement_granularity is a positive number;
if movement_granularity <= 0 raise a ValueError with a clear message. Add the
check just before the num_steps calculation in the same function (where
movement, movement_granularity and _eval_rect are used) so you fail fast instead
of allowing division by zero or negative step counts.
- Around line 532-537: Validate bkgnd_params before computing order: ensure
bkgnd_params is a 1-D sequence/array and its length matches a triangular number
of 2D polynomial coefficients ((order+1)*(order+2)/2). In the function that
contains the shown lines (where bkgnd_params is converted to np.array and order
is inferred), check ndim==1 and compute order_candidate =
int(np.sqrt(len(bkgnd_params)*2)) - 1 then verify
(order_candidate+1)*(order_candidate+2)//2 == len(bkgnd_params); if not, raise
ValueError with a clear message naming bkgnd_params and the expected coefficient
counts. Also keep using PSF._background_gradient_coeffs(rect_size,
order_candidate) after validation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 686d9ec0-6991-4467-af3d-6e10cd306e8c
📒 Files selected for processing (6)
- src/characterize_gauss_fit/config.py
- src/characterize_gauss_fit/output.py
- src/characterize_gauss_fit/plotting.py
- src/psfmodel/psf.py
- tests/test_find_position.py
- tests/test_gaussian.py
Codecov Report

❌ Patch coverage is

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main       #2      +/-   ##
===========================================
+ Coverage   73.20%   93.10%   +19.90%
===========================================
  Files           3        3
  Lines         459      551      +92
  Branches       93      110      +17
===========================================
+ Hits          336      513     +177
+ Misses        102       19      -83
+ Partials       21       19       -2
```

☔ View full report in Codecov by Sentry.
Actionable comments posted: 13
♻️ Duplicate comments (10)
src/psfmodel/psf.py (2)
555-560: ⚠️ Potential issue | 🟡 Minor

Validate `bkgnd_params` before inferring polynomial order.

`background_gradient()` assumes a 1-D coefficient vector whose length matches a valid 2-D polynomial coefficient count (1, 3, 6, 10, ...). Invalid shapes or counts will produce confusing NumPy broadcast errors instead of clear library exceptions.

Proposed fix:

```diff
-bkgnd_params = np.array(bkgnd_params)
-
-order = int(np.sqrt(len(bkgnd_params) * 2)) - 1
+bkgnd_params = np.atleast_1d(np.asarray(bkgnd_params, dtype=np.float64))
+if bkgnd_params.ndim != 1:
+    raise ValueError(
+        f'bkgnd_params must be a 1-D coefficient vector, got shape {bkgnd_params.shape}'
+    )
+
+n_coeffs = bkgnd_params.size
+# Valid counts are triangular: (order+1)*(order+2)//2 for order >= 0
+order = int(np.sqrt(n_coeffs * 2)) - 1
+if (order + 1) * (order + 2) // 2 != n_coeffs:
+    raise ValueError(
+        f'bkgnd_params length must be a triangular number (1, 3, 6, 10, ...), '
+        f'got {n_coeffs}'
+    )
```

As per coding guidelines: "NEVER trust external input... Validate inputs at the public API boundary."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/psfmodel/psf.py` around lines 555 - 560, background_gradient() currently converts bkgnd_params to an array and infers a polynomial order from its length, which can produce confusing NumPy errors for invalid lengths; before computing order, validate that bkgnd_params is 1-D and that its length n satisfies n == (m+1)(m+2)/2 for some integer m (i.e., triangular numbers 1,3,6,10,...); if not, raise a clear ValueError. Locate the block using bkgnd_params, the inferred variable order, and the call to PSF._background_gradient_coeffs(rect_size, order) and add checks: ensure bkgnd_params.ndim == 1 and compute m from the quadratic equation or by looping until the triangular number matches, then set order = m; otherwise raise an explanatory exception before calling PSF._background_gradient_coeffs or doing the sum into result.
276-281: ⚠️ Potential issue | 🟠 Major

Reject non-positive `movement_granularity` before dividing.

When smearing is enabled (movement is not None/zero), `movement_granularity <= 0` will cause division by zero or produce nonsensical step counts at Line 280. Add validation after the early-return check.

Proposed fix:

```diff
 if movement is None or (movement[0] == 0 and movement[1] == 0):
     return self._eval_rect(rect_size, offset=offset, scale=scale, base=base, **kwargs)

+if movement_granularity <= 0:
+    raise ValueError(
+        f'movement_granularity must be positive, got {movement_granularity}'
+    )
+
 num_steps = int(
     max(abs(movement[0]) / movement_granularity,
         abs(movement[1]) / movement_granularity)
 )
```

As per coding guidelines: "NEVER trust external input... Validate inputs at the public API boundary of the library with clear `ValueError` or `TypeError` exceptions for invalid arguments."

🤖 Prompt for AI Agents
ValueErrororTypeErrorexceptions for invalid arguments."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/psfmodel/psf.py` around lines 276 - 281, The code computes num_steps using movement_granularity without validating it, which can cause division by zero or nonsensical values when smearing is enabled; after the early-return that checks movement, add a guard that ensures movement_granularity is a positive number (e.g., if not isinstance(movement_granularity, (int, float)) or movement_granularity <= 0: raise ValueError("movement_granularity must be > 0")), so the subsequent calculation of num_steps (and use of movement, movement_granularity) is safe before calling _eval_rect or performing the smear loop.src/characterize_gauss_fit/output.py (1)
195-197: ⚠️ Potential issue | 🟠 Major

Keep `study` inside `output_dir`.

Both writers still build `study_dir = output_dir / study` directly. An absolute `study` or a `..` segment lets callers write outside the configured root. Resolve the combined path and verify it remains under `output_dir` before `mkdir()` and file creation.

As per coding guidelines "For file paths, resolve to absolute paths and verify they remain within the expected directory to prevent path traversal attacks".
Also applies to: 347-349
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/characterize_gauss_fit/output.py` around lines 195 - 197, The code currently constructs study_dir = output_dir / study and immediately mkdirs and writes files (csv_path), which allows absolute or ../ segments in study to escape output_dir; change this to resolve the combined path (e.g., resolved = (output_dir / study).resolve()) and verify it is a descendant of output_dir.resolve() (by comparing parents or using Path.relative_to in a try/except) before calling resolved.mkdir(...) and creating csv_path = resolved / 'trials.csv'; apply the same change to the other writer block that builds study_dir/csv_path.src/characterize_gauss_fit/study_shape.py (1)
49-49: ⚠️ Potential issue | 🟡 Minor

Sample `[0, pi)` instead of duplicating the axis-aligned case.

`np.linspace(0.0, math.pi, study.angle_steps)` includes both `0` and `pi`, which are the same physical orientation here. That duplicates one column per ratio and makes the output labels disagree with the intended unique-angle sweep.

Also applies to: 116-121
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/characterize_gauss_fit/study_shape.py` at line 49, Change the angle sampling to exclude the endpoint so 0 and pi aren’t both produced; specifically update the `angles = np.linspace(0.0, math.pi, study.angle_steps)` call to use endpoint=False (and do the same for the other identical linspace usage around the 116-121 block) so the sweep samples [0, pi) with `study.angle_steps` unique orientations.src/characterize_gauss_fit/study_box_sigma.py (1)
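The duplicated-endpoint behavior at issue can be demonstrated directly with NumPy; `angle_steps = 4` is just an example value:

```python
import math

import numpy as np

# With the endpoint included, the first and last samples are 0 and pi,
# which describe the same physical orientation of an elliptical PSF.
angle_steps = 4
with_endpoint = np.linspace(0.0, math.pi, angle_steps)

# endpoint=False samples [0, pi), giving angle_steps unique orientations.
unique_angles = np.linspace(0.0, math.pi, angle_steps, endpoint=False)
```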
55-63: ⚠️ Potential issue | 🟠 Major

Honor `study.angle` when fixing the fit.

The generated PSF uses `study.angle`, but the fit is always constrained to `0.0`. Any non-zero `box_vs_sigma.angle` turns this into a wrong-angle-constrained benchmark instead of a box-size study. The note below also formats the radian value as though it were degrees.

Also applies to: 156-164
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/characterize_gauss_fit/study_box_sigma.py` around lines 55 - 63, The fit is being forced to 0.0 via the fit_angle argument even though the generated PSF uses study.angle; change the fit_angle passed into the PSF/fitting call from 0.0 to study.angle (and do the same for the second occurrence around the other call in the file) so the fitter honors the box orientation, and update the adjacent comment/formatting that prints the angle to treat it as radians (not degrees) when formatting the value; locate the arguments list containing angle=study.angle, fit_angle=0.0 and replace the literal with fit_angle=study.angle in both places.src/characterize_gauss_fit/config.py (1)
542-639: ⚠️ Potential issue | 🟠 Major

Several sigma fields still bypass config validation.

`_validate_config()` now checks `constraint_modes.psf_shapes`, but zero or negative values in other sigma inputs still get through, including `box_vs_sigma.sigmas`, `subpixel_offset.sigmas`, `min_detectable_offset.sigmas`, `sigma_asymmetry_angle.sigma_x_values`, `noise_sensitivity.sigmas`, `background.sigma`, and `hot_pixel_rejection.sigma`. Those invalid configs fail much later in spec generation or fitting instead of at `load_config()`.
ValueErrororTypeErrorexceptions for invalid arguments".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/characterize_gauss_fit/config.py` around lines 542 - 639, Add explicit validation for all sigma-related config fields so invalid sigma values fail fast in _validate_config: check that cfg.studies.box_vs_sigma.sigmas, cfg.studies.subpixel_offset.sigmas, cfg.studies.min_detectable_offset.sigmas, cfg.studies.noise_sensitivity.sigmas, and cfg.studies.sigma_asymmetry_angle.sigma_x_values are non-empty sequences and that every element > 0; also validate scalar sigma fields like cfg.studies.background.sigma and cfg.studies.hot_pixel_rejection.sigma are > 0. Implement these checks alongside the existing validators (near _check_box_size and _validate_fitting_config) and raise ValueError with a clear message naming the offending field (use the exact symbol names above to locate the fields).src/characterize_gauss_fit/plotting.py (1)
96-99: ⚠️ Potential issue | 🟠 Major

Validate the save target before calling `savefig()`.

`filename` is joined directly, so an absolute path or `../` escapes `output_dir`, and this helper still gives no clear error when `output_dir` does not exist. Resolve the final target, ensure it stays under an existing output directory, then save.

As per coding guidelines "For file paths, resolve to absolute paths and verify they remain within the expected directory to prevent path traversal attacks".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/characterize_gauss_fit/plotting.py` around lines 96 - 99, Resolve and validate the save target before calling fig.savefig: ensure output_dir is an existing directory (check output_dir.resolve().is_dir() and raise an error if not), compute target = (output_dir / filename).resolve(), verify the resolved target is inside the resolved output_dir (use Path.is_relative_to or compare path parts / try Path.relative_to and catch ValueError), create the target parent directories if needed, and only then call fig.savefig(target, dpi=_SAVE_DPI); this prevents path traversal from filename and gives a clear error when output_dir doesn't exist (references: variables output_dir, filename, path and the fig.savefig call).src/characterize_gauss_fit/study_offset.py (2)
99-99: 🛠️ Refactor suggestion | 🟠 Major

Make `study_dir` keyword-only in `_write_outputs`.

This helper currently uses four positional parameters; convert `study_dir` to keyword-only to stabilize call sites. As per coding guidelines, "Limit new functions to at most three positional parameters. Additional parameters MUST be keyword-only (after `*`)."

♻️ Proposed change:

```diff
-    _write_outputs(cfg, specs, results, study_dir)
+    _write_outputs(cfg, specs, results, study_dir=study_dir)

 def _write_outputs(
     cfg: Config,
     specs: list[TrialSpec],
     results: list[TrialResult],
+    *,
     study_dir: pathlib.Path,
 ) -> None:
```

Also applies to: 102-107
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/characterize_gauss_fit/study_offset.py` at line 99, The helper _write_outputs currently accepts four positional args; change its signature to make study_dir keyword-only (e.g., def _write_outputs(cfg, specs, results, *, study_dir)) and update every call site (including the call where _write_outputs(cfg, specs, results, study_dir)) to pass study_dir=study_dir; ensure any other occurrences mentioned (lines 102–107 in this diff) are updated similarly so all calls use the keyword form.
122-123: ⚠️ Potential issue | 🟠 Major

Avoid lossy sigma/offset formatting in artifact names and labels.

Current `:.2f`/`:.1f` formatting can collapse distinct configured values into identical filenames/labels (silent overwrite/mislabel risk).

🧪 Proposed change

```diff
-    offset_labels = [f'{v:.2f}' for v in offsets]
+    offset_labels = [f'{v:.6g}' for v in offsets]
@@
     for s_idx, sigma in enumerate(sigmas):
+        sigma_str = f'{sigma:.6g}'
+        sigma_tag = sigma_str.replace('-', 'm').replace('.', 'p')
@@
         for hmap, metric_label, fname in [
-            (grid, 'Position error (Euclidean)', f'pos_err_sigma{sigma:.1f}.png'),
-            (grid_y, '|pos_err_y|', f'pos_err_y_sigma{sigma:.1f}.png'),
-            (grid_x, '|pos_err_x|', f'pos_err_x_sigma{sigma:.1f}.png'),
+            (grid, 'Position error (Euclidean)', f'pos_err_sigma{sigma_tag}.png'),
+            (grid_y, '|pos_err_y|', f'pos_err_y_sigma{sigma_tag}.png'),
+            (grid_x, '|pos_err_x|', f'pos_err_x_sigma{sigma_tag}.png'),
         ]:
@@
-            title=f'{metric_label} vs. offset (sigma={sigma:.1f})',
+            title=f'{metric_label} vs. offset (sigma={sigma_str})',
@@
-        line_labels.append(f'sigma={sigma:.1f}')
+        line_labels.append(f'sigma={sigma_str}')
```

Also applies to: 142-152, 202-202

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/characterize_gauss_fit/study_offset.py` around lines 122-123: the creation of offset_labels (and the other occurrences using '{:.2f}'/'{:.1f}') strips precision and can collapse distinct offsets into identical names; update the formatting for offset_labels and the similar uses to a non-lossy representation (e.g., str(v) or a higher-precision format such as '{:.6g}') so filenames/labels preserve uniqueness, and replace the other '{:.2f}'/'{:.1f}' uses in this module accordingly.

src/characterize_gauss_fit/study_constraints.py (1)
96-98: ⚠️ Potential issue | 🟠 Major

Pick the strongest non-zero sigma error for `all_fixed_errors`.

Using `max(study.sigma_error_fractions)` can select `0.0` or a weaker value instead of the strongest configured error magnitude.

🧪 Proposed change

```diff
-    if study.sigma_error_fractions:
-        max_frac = max(study.sigma_error_fractions)
+    nonzero_sigma_fracs = [frac for frac in study.sigma_error_fractions if frac != 0.0]
+    if nonzero_sigma_fracs:
+        max_frac = max(nonzero_sigma_fracs, key=abs)
         modes.append(
             ConstraintMode(
                 'all_fixed_errors',
```

Also applies to: 100-103

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/characterize_gauss_fit/study_constraints.py` around lines 96-98: the code currently uses max(study.sigma_error_fractions), which can return 0.0; change the logic so the value used for all_fixed_errors is the largest non-zero fraction by magnitude (e.g., max over the non-zero fractions with key=abs, with a safe fallback when none exist) and update the same pattern around modes.append and the all_fixed_errors assignment so the strongest configured (non-zero) sigma error is chosen.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/characterize_gauss_fit/config.py`:
- Around line 267-272: After loading user_raw via yaml.safe_load(path), validate
that user_raw is a mapping before merging: if user_raw is None treat as {}, but
if it's not an instance of dict raise a ValueError with a clear message
referencing the config file path; do this check right after user_raw is assigned
(near the yaml.safe_load call) and before calling _deep_merge(raw, user_raw) so
that _deep_merge and its use of override.items() are never invoked with a
non-mapping value.
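The guard this bullet describes can be sketched independently of the real `_deep_merge`; the helper name `ensure_mapping` below is hypothetical, not the module's API:

```python
def ensure_mapping(user_raw: object, path: str = '<config>') -> dict:
    """Normalize a loaded YAML document: None becomes {}, non-dicts fail fast."""
    if user_raw is None:  # an empty YAML file loads as None
        return {}
    if not isinstance(user_raw, dict):
        raise ValueError(
            f'{path}: top-level YAML must be a mapping, '
            f'got {type(user_raw).__name__}')
    return user_raw
```

With this check in place, a deep merge that iterates `override.items()` can never be handed a list or scalar.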
In `@src/characterize_gauss_fit/plotting.py`:
- Around line 367-378: Validate each mask panel's shape before applying it in
plot_multi_panel_heatmaps: for each p_idx in the loop over zip(axes,
all_display, panel_titles, strict=True), check that mask_panels[p_idx] has the
same shape as data_panels[p_idx] (or panel_display) and raise a clear ValueError
if not; then only use np.where to apply the mask (panel_display =
np.where(mask_panels[p_idx], np.nan, panel_display)). Reference symbols:
plot_multi_panel_heatmaps, mask_panels, data_panels, panel_display, np.where,
plot_heatmap.
In `@src/characterize_gauss_fit/study_background.py`:
- Around line 79-88: The loop generates duplicate no-background trials because
when background type is BACKGROUND_TYPE_NONE you still iterate every amplitude,
set inj_amplitude = 0.0, and produce identical runs that _write_outputs()
deduplicates; fix this by short-circuiting the amplitude loop for the
no-background case: in the iteration over study.background_types (or before
entering the amplitude loop) detect BACKGROUND_TYPE_NONE and either set a single
amplitude value of 0.0 or skip iterating study.background_amplitudes entirely so
inj_amplitude is only produced once; update the code around
study.background_types / study.background_amplitudes and the assignment to
inj_amplitude (and analogous blocks at the other locations mentioned) to avoid
emitting duplicate BACKGROUND_TYPE_NONE trials that later get collapsed in
_write_outputs()/lookup.
In `@src/characterize_gauss_fit/study_constraints.py`:
- Line 201: The call to _write_outputs is using more than three positional
arguments (cfg, specs, results, study_dir, modes); change the _write_outputs
function signature so only up to three positional parameters remain and all
additional parameters are keyword-only (add a '*' in the parameter list before
the 4th parameter), then update every call site (including the calls around the
current block and lines noted) to pass the trailing parameters by name (e.g.,
study_dir=..., modes=...) to match the new signature; ensure any internal uses
of those parameters inside _write_outputs still reference the same names.
- Around line 78-79: The label generation for sigma-error modes uses int(frac *
100) which truncates toward zero and can produce misleading duplicate labels;
update the label creation (the line assigning label and the place that appends
ConstraintMode) to format the percentage using rounding instead (for example
f'sigma_fixed_err{frac*100:.0f}pct' or int(round(frac*100))) so small fractions
produce the correct nearest-percent label while keeping the rest of the
ConstraintMode(...) call and sigma_is_fraction=True unchanged.
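The truncation pitfall is a plain floating-point effect that is easy to reproduce; the fraction used here is illustrative, not taken from the config:

```python
frac = 0.29

# 0.29 * 100 evaluates to 28.999999999999996 in binary floating point,
# so int() truncates to 28 while round() gives the intended 29.
truncated = int(frac * 100)
rounded = int(round(frac * 100))

label_bad = f'sigma_fixed_err{truncated}pct'
label_good = f'sigma_fixed_err{frac * 100:.0f}pct'  # ':.0f' rounds, not truncates
```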
- Line 127: The hardcoded numeric seed "5000" assigned to variable seed should
be extracted to a named module-level constant (e.g., RNG_SEED_BASE or
DEFAULT_SEED_BASE) so the seed's meaning and ownership are discoverable; add the
constant near the top of the module, replace the inline literal in the
assignment "seed = 5000" with that constant, and update any other places that
derive or reference this base seed to use the new constant name (search for
"seed" and the assignment in study_constraints.py to locate usages).
In `@src/characterize_gauss_fit/study_min_offset.py`:
- Around line 139-146: The _write_outputs function's docstring is missing a
Raises section for the RuntimeError thrown on a results-vs-specs bucketing
mismatch; update the _write_outputs docstring to include a "Raises:" entry that
documents the RuntimeError, describing the condition (when the number of result
buckets does not match the number of specs/trials) and any message content
callers can expect, so callers know this contract for error handling.
- Line 130: The call to _write_outputs passes study_dir as a positional fourth
argument making call sites brittle; change _write_outputs signature to make
study_dir keyword-only (e.g., def _write_outputs(cfg, specs, results, *,
study_dir):) and update all call sites in this file (including the call at
_write_outputs(cfg, specs, results, study_dir) and the nearby calls around the
same block) to pass study_dir by name (study_dir=study_dir) so the function now
only accepts three positional parameters and study_dir is explicit.
- Line 76: The hardcoded seed value (seed = 3000) in study_min_offset.py should
be promoted to a module-level constant: create a descriptive constant name like
DEFAULT_RNG_SEED or RNG_SEED at the top of the module and replace the inline
literal used by the local variable seed with that constant; optionally read from
an environment/config if runtime configurability is desired. This keeps the
unique symbol seed but points it to the new module constant and centralizes the
“3000” magic number for easier updates and policy changes.
In `@src/characterize_gauss_fit/study_noise.py`:
- Around line 222-223: The 'snr' grouping lambda in study_noise.py currently
rounds scale / s.noise_rms to 1 decimal which collapses distinct bins; change
the ('snr', lambda s: ...) entry to use the exact computed value (e.g., scale /
s.noise_rms) or round to much higher precision (e.g., 6+ decimals) instead of
round(..., 1) so grouping keys remain distinct; keep the existing guard for
s.noise_rms <= 0 (math.inf) and leave ('sigma', lambda s: s.sigma_y) unchanged.
In `@src/characterize_gauss_fit/study_offset.py`:
- Line 54: The literal 2000 used to initialize seed should be extracted to a
module-level constant (e.g., RNG_SEED_BASE or SEED_BASE) so the seed-allocation
policy is explicit and maintainable; add the new constant at the top of the
module and replace the inline assignment (the local variable seed in
study_offset.py) with seed = RNG_SEED_BASE (or seed = SEED_BASE) so other code
can reference the single source of truth.
In `@src/psfmodel/psf.py`:
- Around line 1-1309: The module is too large (~1309 lines); split
responsibilities into smaller files: create psf_base.py containing the PSF class
skeleton (PSF.__init__, abstract methods eval_point, eval_rect, _eval_rect, and
_eval_rect_smeared plus related constants like _FIT_PSF_BASE_BOUND_* and JACO
eps), move background-related functions _background_gradient_coeffs,
background_gradient_fit, and background_gradient into background.py, and keep
find_position/_find_position/_fit_psf_func in either psf_base.py or a new mixin
(e.g. psf_fit.py) that imports PSF and the background helpers; update all
internal references (e.g. PSF._background_gradient_coeffs → import from
background, background_gradient_fit, background_gradient) and adjust
module-level exports and imports so symbols like PSF, background_gradient_fit,
background_gradient, and _background_gradient_coeffs remain available to callers
and tests, preserving logging and __version__ handling. Ensure tests/imports
update to new module paths and maintain relative imports within the package.
- Around line 1-4: Replace the leading comment header in psfmodel/psf.py with a
proper PEP‑257 Google-style module docstring: add a top-of-file triple-quoted
string that briefly describes the module's purpose (what psfmodel.psf provides),
and optionally include short "Attributes" or "Functions" sections if the module
exposes module-level variables or functions; ensure the docstring is the first
statement in the file and use plain English summary followed by any usage notes.
---
Duplicate comments:
In `@src/characterize_gauss_fit/config.py`:
- Around line 542-639: Add explicit validation for all sigma-related config
fields so invalid sigma values fail fast in _validate_config: check that
cfg.studies.box_vs_sigma.sigmas, cfg.studies.subpixel_offset.sigmas,
cfg.studies.min_detectable_offset.sigmas, cfg.studies.noise_sensitivity.sigmas,
and cfg.studies.sigma_asymmetry_angle.sigma_x_values are non-empty sequences and
that every element > 0; also validate scalar sigma fields like
cfg.studies.background.sigma and cfg.studies.hot_pixel_rejection.sigma are > 0.
Implement these checks alongside the existing validators (near _check_box_size
and _validate_fitting_config) and raise ValueError with a clear message naming
the offending field (use the exact symbol names above to locate the fields).
In `@src/characterize_gauss_fit/output.py`:
- Around line 195-197: The code currently constructs study_dir = output_dir /
study and immediately mkdirs and writes files (csv_path), which allows absolute
or ../ segments in study to escape output_dir; change this to resolve the
combined path (e.g., resolved = (output_dir / study).resolve()) and verify it is
a descendant of output_dir.resolve() (by comparing parents or using
Path.relative_to in a try/except) before calling resolved.mkdir(...) and
creating csv_path = resolved / 'trials.csv'; apply the same change to the other
writer block that builds study_dir/csv_path.
In `@src/characterize_gauss_fit/plotting.py`:
- Around line 96-99: Resolve and validate the save target before calling
fig.savefig: ensure output_dir is an existing directory (check
output_dir.resolve().is_dir() and raise an error if not), compute target =
(output_dir / filename).resolve(), verify the resolved target is inside the
resolved output_dir (use Path.is_relative_to or compare path parts / try
Path.relative_to and catch ValueError), create the target parent directories if
needed, and only then call fig.savefig(target, dpi=_SAVE_DPI); this prevents
path traversal from filename and gives a clear error when output_dir doesn't
exist (references: variables output_dir, filename, path and the fig.savefig
call).
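A minimal sketch of that containment check; the helper name `safe_save_path` is hypothetical, not the module's API:

```python
import pathlib


def safe_save_path(output_dir: pathlib.Path, filename: str) -> pathlib.Path:
    """Resolve filename under output_dir, rejecting '..' and absolute-path escapes."""
    base = output_dir.resolve()
    if not base.is_dir():
        raise FileNotFoundError(f'output directory does not exist: {base}')
    target = (base / filename).resolve()
    # Path.is_relative_to (Python 3.9+) catches both '../' and absolute filenames.
    if not target.is_relative_to(base):
        raise ValueError(f'filename escapes output directory: {filename!r}')
    return target
```

A caller would then pass the returned path straight to `fig.savefig(...)`.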
In `@src/characterize_gauss_fit/study_box_sigma.py`:
- Around line 55-63: The fit is being forced to 0.0 via the fit_angle argument
even though the generated PSF uses study.angle; change the fit_angle passed into
the PSF/fitting call from 0.0 to study.angle (and do the same for the second
occurrence around the other call in the file) so the fitter honors the box
orientation, and update the adjacent comment/formatting that prints the angle to
treat it as radians (not degrees) when formatting the value; locate the
arguments list containing angle=study.angle, fit_angle=0.0 and replace the
literal with fit_angle=study.angle in both places.
In `@src/characterize_gauss_fit/study_constraints.py`:
- Around line 96-98: The code currently uses max(study.sigma_error_fractions)
which can return 0.0; change the logic that computes the value used for
all_fixed_errors to pick the largest non-zero fraction instead (e.g., compute
max_frac = max(f for f in study.sigma_error_fractions if f > 0) with a safe
fallback like 0.0 when no positive fractions exist) and update the same pattern
used around modes.append and the all_fixed_errors assignment so the strongest
configured (non-zero) sigma error is chosen.
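The selection rule can be sketched standalone; `strongest_nonzero` is a hypothetical helper, not the study module's code:

```python
def strongest_nonzero(fractions: list[float]):
    """Return the error fraction with the largest magnitude, ignoring zeros."""
    nonzero = [f for f in fractions if f != 0.0]
    if not nonzero:
        return None  # safe fallback when nothing non-zero is configured
    return max(nonzero, key=abs)
```

The pitfall it avoids: a plain `max([0.0, 0.05, -0.2])` returns `0.05`, even though `-0.2` is the strongest configured error.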
In `@src/characterize_gauss_fit/study_offset.py`:
- Line 99: The helper _write_outputs currently accepts four positional args;
change its signature to make study_dir keyword-only (e.g., def
_write_outputs(cfg, specs, results, *, study_dir)) and update every call site
(including the call where _write_outputs(cfg, specs, results, study_dir)) to
pass study_dir=study_dir; ensure any other occurrences mentioned (lines 102–107
in this diff) are updated similarly so all calls use the keyword form.
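The keyword-only rule in miniature, using a toy signature rather than the real `_write_outputs`:

```python
def write_outputs(cfg: dict, specs: list, results: list, *, study_dir: str) -> str:
    """Everything after '*' must be passed by keyword, stabilizing call sites."""
    return f'{len(specs)} specs, {len(results)} results -> {study_dir}'
```

Calling `write_outputs({}, [], [], 'out')` now raises `TypeError`; only `study_dir='out'` is accepted for the fourth argument.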
- Around line 122-123: The current creation of offset_labels (and other
occurrences using '{:.2f}' / '{:.1f}') strips precision and can collapse
distinct offsets into identical names; update the formatting for offset_labels
and the similar uses to avoid lossy rounding by using a non-lossy representation
(e.g., use str(v) or high-precision format like '{:.6f}' or repr(v)) so
filenames/labels preserve uniqueness; search for uses of the pattern '{:.2f}'
and '{:.1f}' (including the symbol offset_labels and the offsets variable in
this module) and replace them with the chosen non-lossy formatting.
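The collision is easy to demonstrate with illustrative values: a `:.6g`-style format keeps distinct inputs distinct where `:.2f` does not:

```python
offsets = [0.001, 0.002, 0.251]

lossy = [f'{v:.2f}' for v in offsets]   # 0.001 and 0.002 both become '0.00'
exact = [f'{v:.6g}' for v in offsets]   # '0.001', '0.002', '0.251'

collides = len(set(lossy)) < len(offsets)   # two labels collapsed into one
distinct = len(set(exact)) == len(offsets)  # every label stays unique
```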
In `@src/characterize_gauss_fit/study_shape.py`:
- Line 49: Change the angle sampling to exclude the endpoint so 0 and pi aren’t
both produced; specifically update the `angles = np.linspace(0.0, math.pi,
study.angle_steps)` call to use endpoint=False (and do the same for the other
identical linspace usage around the 116-121 block) so the sweep samples [0, pi)
with `study.angle_steps` unique orientations.
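Since 0 and pi describe the same orientation, `endpoint=False` is the standard fix; a quick comparison of the two samplings:

```python
import numpy as np

angle_steps = 4

with_endpoint = np.linspace(0.0, np.pi, angle_steps)                 # last sample is pi (= 0 mod pi)
half_open = np.linspace(0.0, np.pi, angle_steps, endpoint=False)     # samples [0, pi)

# endpoint=False yields angle_steps unique orientations, uniformly pi/angle_steps apart
spacing = np.diff(half_open)
```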
In `@src/psfmodel/psf.py`:
- Around line 555-560: background_gradient() currently converts bkgnd_params to
an array and infers a polynomial order from its length, which can produce
confusing NumPy errors for invalid lengths; before computing order, validate
that bkgnd_params is 1-D and that its length n satisfies n == (m+1)(m+2)/2 for
some integer m (i.e., triangular numbers 1,3,6,10,...); if not, raise a clear
ValueError. Locate the block using bkgnd_params, the inferred variable order,
and the call to PSF._background_gradient_coeffs(rect_size, order) and add
checks: ensure bkgnd_params.ndim == 1 and compute m from the quadratic equation
or by looping until the triangular number matches, then set order = m; otherwise
raise an explanatory exception before calling PSF._background_gradient_coeffs or
doing the sum into result.
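A sketch of the length check, inverting the triangular-number formula and then verifying exactly; the helper name `infer_poly_order` is hypothetical:

```python
import math


def infer_poly_order(n_params: int) -> int:
    """Return m such that n_params == (m+1)*(m+2)//2, else raise ValueError.

    A 2-D polynomial of total degree m has 1, 3, 6, 10, ... coefficients
    for m = 0, 1, 2, 3, ...
    """
    if n_params < 1:
        raise ValueError(f'bkgnd_params length must be >= 1, got {n_params}')
    # Invert (m+1)(m+2)/2 = n, i.e. m = (sqrt(8n + 1) - 3) / 2, then verify.
    m = (math.isqrt(8 * n_params + 1) - 3) // 2
    if (m + 1) * (m + 2) // 2 != n_params:
        raise ValueError(
            f'bkgnd_params length {n_params} is not (m+1)(m+2)/2 '
            f'for any integer m')
    return m
```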
- Around line 276-281: The code computes num_steps using movement_granularity
without validating it, which can cause division by zero or nonsensical values
when smearing is enabled; after the early-return that checks movement, add a
guard that ensures movement_granularity is a positive number (e.g., if not
isinstance(movement_granularity, (int, float)) or movement_granularity <= 0:
raise ValueError("movement_granularity must be > 0")), so the subsequent
calculation of num_steps (and use of movement, movement_granularity) is safe
before calling _eval_rect or performing the smear loop.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2da44e3a-2178-4f8d-b8d0-6c5d29ec38da
📒 Files selected for processing (13)
- src/characterize_gauss_fit/config.py
- src/characterize_gauss_fit/output.py
- src/characterize_gauss_fit/plotting.py
- src/characterize_gauss_fit/study_background.py
- src/characterize_gauss_fit/study_box_sigma.py
- src/characterize_gauss_fit/study_constraints.py
- src/characterize_gauss_fit/study_hot_pixels.py
- src/characterize_gauss_fit/study_min_offset.py
- src/characterize_gauss_fit/study_noise.py
- src/characterize_gauss_fit/study_offset.py
- src/characterize_gauss_fit/study_shape.py
- src/psfmodel/psf.py
- tests/test_find_position.py
Purpose
This PR is a comprehensive rewrite and reorganisation of the entire `rms-psfmodel` codebase, motivated by a formal code analysis that identified critical numerical bugs, absent logging, near-zero test coverage of many code paths, a crash-on-import legacy module, and a lack of project infrastructure (CI matrix, Cursor rules, dev scripts). The primary goals are:

- Fix the critical numerical bugs and harden `find_position`.
- Migrate to a `src/` layout with `pyproject.toml` as the single source of truth.
- Replace `print()` calls with structured `logging`.
- Add the `characterize_gauss_fit` fitting-accuracy harness and a rendered performance report.
Changes / Implementation Details
Numerical bug fixes (`src/psfmodel/`)

- `gaussian.py` `gaussian_integral_1d`: Removed the spurious `np.abs()` on the array path so the scalar and array paths are consistent when `x_min > x_max` or `scale < 0`.
- `psf.py` `background_gradient_fit`: Fixed the missing `f` prefix on the error-message f-string so the actual shape is reported on bad input.
- Replaced `print()`/`print(f"...")` diagnostic output with `self._logger.debug(...)`/`self._logger.warning(...)` calls throughout both modules.
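The consistency property behind the `gaussian_integral_1d` fix can be illustrated with a generic erf-based 1-D Gaussian integral (a sketch, not the library's implementation): swapping the limits must flip the sign identically on the scalar and array paths, with no stray `np.abs()`.

```python
import math

import numpy as np

# Vectorized error function so one formula serves scalars and arrays alike.
_erf = np.vectorize(math.erf)


def gauss_integral_1d(x_min, x_max, sigma=1.0):
    """Integral of a unit-area Gaussian over [x_min, x_max], signed if limits swap."""
    s = sigma * math.sqrt(2.0)
    return 0.5 * (_erf(np.asarray(x_max) / s) - _erf(np.asarray(x_min) / s))
```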
`PSF` base class enhancements (`src/psfmodel/psf.py`)

- Added `num_sigma`, `max_bad_frac`, and `bkgnd_num_sigma` keyword parameters to `find_position` to enable iterative sigma-clipping of hot pixels and background outliers before and during PSF fitting. Implemented with `numpy.ma` masked arrays so masked pixels are excluded from the Powell objective without altering the caller's array.
- Added a `compute_uncertainty` flag (default `True`) to `find_position`: when `False`, the finite-difference Jacobian step is skipped and uncertainty entries in the returned dict are `NaN`, roughly halving cost in batch loops.
- Added a `use_angular_params` flag (default `True`) to toggle the parameter reparametrisation used in the Powell search.
- Added an `_additional_params` protocol so `GaussianPSF` can register free sigma/angle bounds that `find_position` picks up automatically.
- Docstrings wrapped at 90 chars.
Repository structure

- Moved `psfmodel/` into `src/psfmodel/` (PEP 517 src layout). Moved the `psf_gui` standalone app to `src/psf_gui/`. Removed legacy `programs/psf_gui.py`, `.coveragerc`, `.mypy.ini`, `.flake8`, and `setup.cfg`; all configuration is consolidated into `pyproject.toml`.
- Reduced `requirements.txt` to a simple `-e .` redirect; all deps are in `pyproject.toml`.

New `characterize_gauss_fit` package (`src/characterize_gauss_fit/`)

Added a new `[characterize]` optional-dependency package that runs systematic parameter-space sweeps to measure fitting accuracy:
- `config.py` — typed `StudyConfig`/`TrialConfig` dataclasses loaded from YAML (`defaults.yaml`, `test_config.yaml`, `hires_config.yaml`).
- `trial.py` — individual fit trial runner with full result dataclass.
- `executor.py` — parallel multi-trial sweep runner (joblib).
- `study_*.py` (8 modules) — individual accuracy studies: noise sensitivity, subpixel offset recovery, box-size vs sigma, background contamination, shape asymmetry, hot pixel rejection, constraint modes, and minimum detectable offset.
- `output.py`/`plotting.py` — CSV and PNG report generation.
- `main.py`/`__main__.py` — CLI entry point (`python -m characterize_gauss_fit`).

Documentation (`docs/`)
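The executor's fan-out has a simple shape; the package uses joblib, but the same order-preserving pattern can be sketched with stdlib `concurrent.futures` (names here are hypothetical):

```python
from concurrent.futures import ThreadPoolExecutor


def run_sweep(trial_fn, specs, n_jobs=4):
    """Run independent trials in parallel, preserving spec order in the results."""
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # map() yields results in submission order, so results[i] matches specs[i].
        return list(pool.map(trial_fn, specs))
```

With joblib the equivalent is `Parallel(n_jobs=n_jobs)(delayed(trial_fn)(s) for s in specs)`, which adds process-based execution for CPU-bound fits.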
docs/)docs/performance_report/with a full Sphinx performance report(
index.rst) and nine embedded result images covering all eight study types.docs/characterize_gauss_fit.mdmodule reference anddocs/psf_gui.md.docs/index.rstanddocs/conf.pyfor the new source layout andReadTheDocs YAML.
Tests (`tests/`)

- Added `tests/test_find_position.py` (415 lines) with 40+ tests covering `find_position` across clean images, noisy images, hot-pixel rejection (`num_sigma`), background outlier rejection (`bkgnd_num_sigma`), `max_bad_frac` failure, `compute_uncertainty=False`, asymmetric search limits, all constraint modes, and edge cases.
- Grew `tests/test_gaussian.py` from 381 to 587 lines; added parametric tests for the array/scalar consistency fix in `gaussian_integral_1d`, rotation integration, and the full `GaussianPSF` `_additional_params` paths.
- Updated `tests/test_psf.py` and `tests/conftest.py` for the new API and src layout.
CI / tooling
- `.github/workflows/run-tests.yml`: added a Python 3.10/3.11/3.12 matrix, Codecov upload, and `bandit`/`vulture` steps.
- Added `codecov.yml` and `.github/` issue templates.
- Added a `scripts/run-all-checks.sh` umbrella script (ruff, mypy, pytest, pyroma, bandit, vulture, sphinx, pymarkdown) with parallel execution.
- Added `.cursor/rules/` (10 always-apply rules) and `.cursor/skills/` (3 skills) for consistent AI-assisted development.
Type of Change
Testing
The new `test_find_position.py` exercises all major code paths added in this PR. The existing `test_gaussian.py` is extended to cover the `gaussian_integral_1d` scalar/array consistency fix. The coverage gate is `fail_under = 90` in `pyproject.toml`.

Potential Impacts
Public API changes (breaking):
- `find_position` gains new keyword-only parameters (`num_sigma`, `max_bad_frac`, `bkgnd_num_sigma`, `compute_uncertainty`, `use_angular_params`). All have backward-compatible defaults, so existing callers are unaffected.
- The `gaussian_integral_1d` array-path bug fix changes return values when `x_min > x_max`: previously it returned `abs(result)`; it now returns the signed result. Any caller relying on the incorrect unsigned behavior will need updating.
Backward compatibility: The `src/` layout changes the installed package path for editable installs; reinstall with `pip install -e ".[dev,characterize]"`.
pip install -e ".[dev,characterize]".Performance:
`compute_uncertainty=True` (the default, unchanged) still performs the Jacobian step; set it to `False` in batch loops to halve cost.
Falsein batch loops to halve cost.Checklist
- Lint and format pass (`ruff check`, `ruff format`)
- `mypy` passes

Notes
- `src/psfmodel/hst.py` remains in the legacy-excluded state (no ruff/mypy/bandit/coverage) as agreed; it is not changed by this PR beyond the path move.
- `CHANGES.md` does not exist in this repo yet; the CHANGES.md checklist item above is left unchecked as a reminder to create it before the first tagged release.
- `CODEBASE_ANALYSIS.md` and `TEST_SUITE_CRITIQUE.md` are analysis artefacts committed to the branch for reviewer reference; they can be removed or moved to `docs/` after review.

Made with Cursor
Summary by CodeRabbit
New Features
Documentation
Bug Fixes
Tests