Skip to content

Add toon test#80

Merged
chengcli merged 9 commits intomainfrom
cli/test_toon
Jan 19, 2026
Merged

Add toon test#80
chengcli merged 9 commits intomainfrom
cli/test_toon

Conversation

@chengcli
Copy link
Owner

No description provided.

Copilot AI review requested due to automatic review settings January 19, 2026 20:15
@chengcli chengcli requested a review from luminoctum as a code owner January 19, 2026 20:15
@chengcli chengcli added invalid This doesn't seem right enhancement New feature or request and removed invalid This doesn't seem right labels Jan 19, 2026
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the Toon-McKay 1989 radiative transfer solver implementation, converting it from a Torch-based implementation to a lower-level C/C++ implementation with custom memory management and device dispatch. It adds comprehensive testing infrastructure and GPU support.

Changes:

  • Refactored Toon-McKay 89 shortwave and longwave solvers from Torch tensor operations to C/C++ implementations with manual memory management
  • Added new test infrastructure with device testing support (CPU/CUDA/MPS)
  • Implemented custom dispatch mechanism for CPU and CUDA backends with tridiagonal solver

Reviewed changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated 15 comments.

Show a summary per file
File Description
tests/test_toon.cpp New test file for Toon-McKay 89 solver with device parameterization
tests/device_testing.hpp New device testing infrastructure supporting CPU, CUDA, and MPS
tests/CMakeLists.txt Enables test_toon test (replaces test_composition)
src/rtsolver/toon_mckay89_shortwave_impl.h Refactored shortwave solver implementation with C-style memory management
src/rtsolver/toon_mckay89_longwave_impl.h Refactored longwave solver implementation with C-style memory management
src/rtsolver/toon_mckay89.hpp Updated header with new options structure and module holder pattern
src/rtsolver/toon_mckay89.cpp Main implementation using TensorIterator dispatch
src/rtsolver/dtridgl_impl.h New tridiagonal solver using Thomas algorithm
src/rtsolver/rtsolver_dispatch.hpp Dispatch declarations for CPU/CUDA backends
src/rtsolver/rtsolver_dispatch.cu CUDA kernel implementations
src/rtsolver/rtsolver_dispatch.cpp CPU implementations with parallel iteration
src/utils/alloc.h Memory allocation utilities with alignment and space calculation
src/loops.cuh GPU kernel infrastructure for chunked processing
src/radiation/bbflux.hpp Added tensor-based blackbody flux calculation
src/radiation/bbflux.cpp Implementation of tensor blackbody flux with power series
src/radiation/radiation_band.hpp Added Toon solver option to radiation band
src/radiation/radiation_band.cpp Integration of Toon solver in radiation band
python/csrc/pyrtsolver.cpp New Python bindings for Toon solver
python/csrc/pyradiation.cpp Updated radiation bindings with Toon options
python/csrc/pyharp.cpp Added rtsolver binding initialization
src/CMakeLists.txt Enabled rtsolver source files in build
cmake/macros/macro_setup_test.cmake Added CUDA library linking for tests

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 174 to 179
Af[l] = E1[nlay - 1] - w_surf_in * E3[nlay - 1];
Bf[l] = E2[nlay - 1] - w_surf_in * E4[nlay - 1];
Cf[l] = 0.0;
Df[l] = bsurf - Cp[nlay - 1] + w_surf_in * Cm[nlay - 1];

dtridgl(l, Af, Bf, Cf, Df, xk);
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Array index out of bounds. The arrays Af, Bf, Cf, Df, and xk are allocated with size 'l' (which equals 2nlay) on lines 54-58, making valid indices 0 through l-1. However, the code accesses these arrays at index 'l' on lines 174-177 and passes them to dtridgl with size 'l' on line 179. This means the code is accessing index 2nlay which is one past the end of arrays that have size 2*nlay. The arrays should either be allocated with size l+1, or the indexing scheme should be adjusted to be 0-based.

Suggested change
Af[l] = E1[nlay - 1] - w_surf_in * E3[nlay - 1];
Bf[l] = E2[nlay - 1] - w_surf_in * E4[nlay - 1];
Cf[l] = 0.0;
Df[l] = bsurf - Cp[nlay - 1] + w_surf_in * Cm[nlay - 1];
dtridgl(l, Af, Bf, Cf, Df, xk);
Af[lm1] = E1[nlay - 1] - w_surf_in * E3[nlay - 1];
Bf[lm1] = E2[nlay - 1] - w_surf_in * E4[nlay - 1];
Cf[lm1] = 0.0;
Df[lm1] = bsurf - Cp[nlay - 1] + w_surf_in * Cm[nlay - 1];
dtridgl(lm1, Af, Bf, Cf, Df, xk);

Copilot uses AI. Check for mistakes.
* Based on Elsie Lee's implementation in Exo-FMS_column_ck, which was
* based on CHIMERA code by Mike Line.
* Ported by Xi Zhang to Eigen
* Ported by Cheng Li to torch
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in comment. "Proted" should be "Ported".

Copilot uses AI. Check for mistakes.
* Based on Elsie Lee's implementation in Exo-FMS_column_ck, which was
* based on CHIMERA code by Mike Line.
* Ported by Xi Zhang to Eigen
* Ported by Cheng Li to torch
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in comment. "Proted" should be "Ported".

Copilot uses AI. Check for mistakes.
Comment on lines 182 to 184
xk1[n] = xk[2 * n + 1] + xk[2 * n + 2];
xk2[n] = xk[2 * n + 1] - xk[2 * n + 2];
if (fabs(xk2[n]) < 1e-30 * fabs(xk[2 * n + 1])) xk2[n] = 0.0;
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Array index out of bounds. After dtridgl returns, the code accesses xk[2 * n + 2] where n ranges from 0 to nlay-1. When n = nlay-1, this accesses xk[2nlay] which is out of bounds for an array of size l = 2nlay (valid indices are 0 to 2*nlay-1). This issue exists because the solver is using 1-based indexing but the xk array is allocated with 0-based sizing.

Suggested change
xk1[n] = xk[2 * n + 1] + xk[2 * n + 2];
xk2[n] = xk[2 * n + 1] - xk[2 * n + 2];
if (fabs(xk2[n]) < 1e-30 * fabs(xk[2 * n + 1])) xk2[n] = 0.0;
xk1[n] = xk[2 * n] + xk[2 * n + 1];
xk2[n] = xk[2 * n] - xk[2 * n + 1];
if (fabs(xk2[n]) < 1e-30 * fabs(xk[2 * n])) xk2[n] = 0.0;

Copilot uses AI. Check for mistakes.
Comment on lines 169 to 171
xk1[n] = xkk[2 * n + 1] + xkk[2 * n + 2];
xk2[n] = xkk[2 * n + 1] - xkk[2 * n + 2];
if (fabs(xk2[n]) < 1e-30 * fabs(xkk[2 * n + 1])) xk2[n] = 0.0;
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Array index out of bounds. After dtridgl returns, the code accesses xkk[2 * n + 2] where n ranges from 0 to nlay-1. When n = nlay-1, this accesses xkk[2nlay] which is out of bounds for an array of size l = 2nlay (valid indices are 0 to 2*nlay-1). This issue exists because the solver is using 1-based indexing but the xkk array is allocated with 0-based sizing.

Suggested change
xk1[n] = xkk[2 * n + 1] + xkk[2 * n + 2];
xk2[n] = xkk[2 * n + 1] - xkk[2 * n + 2];
if (fabs(xk2[n]) < 1e-30 * fabs(xkk[2 * n + 1])) xk2[n] = 0.0;
// Use 0-based indexing for xkk: valid indices are 0..(2 * nlay - 1)
xk1[n] = xkk[2 * n] + xkk[2 * n + 1];
xk2[n] = xkk[2 * n] - xkk[2 * n + 1];
T xkk_ref = fmax(fabs(xkk[2 * n]), fabs(xkk[2 * n + 1]));
if (xkk_ref > 0.0 && fabs(xk2[n]) < 1e-30 * xkk_ref) xk2[n] = 0.0;

Copilot uses AI. Check for mistakes.
Comment on lines +15 to +37
void call_toon89_sw_cpu(at::TensorIterator &iter) {
AT_DISPATCH_FLOATING_TYPES(iter.dtype(), "call_toon89_sw_cpu", [&] {
int nlay = at::native::ensure_nonempty_size(iter.input(0), -2);
int grain_size = iter.numel() / at::get_num_threads();
int mem_size = toon89_sw_space<scalar_t>(nlay);
char *work = new char[mem_size];

iter.for_each(
[&](char **data, const int64_t *strides, int64_t n) {
for (int i = 0; i < n; i++) {
auto out = reinterpret_cast<scalar_t *>(data[0] + i * strides[0]);
auto prop = reinterpret_cast<scalar_t *>(data[1] + i * strides[1]);
auto umu0 = reinterpret_cast<scalar_t *>(data[2] + i * strides[2]);
auto fbeam = reinterpret_cast<scalar_t *>(data[3] + i * strides[3]);
auto albedo =
reinterpret_cast<scalar_t *>(data[4] + i * strides[4]);
toon_mckay89_shortwave(nlay, *fbeam, umu0, prop, *albedo, out,
work);
}
},
grain_size);

delete[] work;
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thread safety issue with shared work buffer. The code allocates a single work buffer and then uses it across multiple threads in the parallel for_each loop. Each thread will overwrite the same work buffer, causing race conditions and incorrect results. Each thread needs its own work buffer. Consider using thread-local storage or allocating work buffers per-thread within the parallel region.

Copilot uses AI. Check for mistakes.
Comment on lines 55 to 77
if (bc->find(bname + "fbeam") != bc->end()) {
TORCH_CHECK(bc->at(bname + "fbeam").dim() == 2,
"DisortImpl::forward: bc->fbeam.dim() != 2");
TORCH_CHECK(bc->at(bname + "fbeam").size(0) == nwave,
"DisortImpl::forward: bc->fbeam.size(0) != nwave");
TORCH_CHECK(bc->at(bname + "fbeam").size(1) == ncol,
"DisortImpl::forward: bc->fbeam.size(1) != ncol");
(*bc)["fbeam"] = bc->at(bname + "fbeam");
} else {
(*bc)["fbeam"] = torch::zeros({nwave, ncol}, prop.options());
}

if (bc->find(bname + "albedo") != bc->end()) {
TORCH_CHECK(bc->at(bname + "albedo").dim() == 2,
"DisortImpl::forward: bc->albedo.dim() != 2");
TORCH_CHECK(bc->at(bname + "albedo").size(0) == nwave,
"DisortImpl::forward: bc->albedo.size(0) != nwave");
TORCH_CHECK(bc->at(bname + "albedo").size(1) == ncol,
"DisortImpl::forward: bc->albedo.size(1) != ncol");
(*bc)["albedo"] = bc->at(bname + "albedo");
} else {
(*bc)["albedo"] = torch::zeros({nwave, ncol}, prop.options());
}
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect error message. The check is for ToonMcKay89 forward method, but the error message says "DisortImpl::forward". All three error messages in this block should be updated to "ToonMcKay89Impl::forward" for consistency and clarity.

Copilot uses AI. Check for mistakes.
"DisortImpl::forward: bc->umu0.size(0) != ncol");
(*bc)["umu0"] = bc->at(bname + "umu0");
} else {
(*bc)["umu0"] = torch::ones({1, ncol}, prop.options());
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect default value for umu0. The code sets a default value of torch::ones({1, ncol}, ...) which creates a tensor with shape (1, ncol), but the dimension check on line 46-47 expects umu0 to have dim() == 1. This mismatch will cause the check to fail when using the default value. The default should be torch::ones({ncol}, prop.options()) instead.

Suggested change
(*bc)["umu0"] = torch::ones({1, ncol}, prop.options());
(*bc)["umu0"] = torch::ones({ncol}, prop.options());

Copilot uses AI. Check for mistakes.
auto offset_calc = ::make_offset_calculator<Arity>(iter);
int64_t numel = iter.numel();

// devide numel into Chunk parts to reduce memory usage
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in comment. "devide" should be "divide".

Suggested change
// devide numel into Chunk parts to reduce memory usage
// divide numel into Chunk parts to reduce memory usage

Copilot uses AI. Check for mistakes.
Comment on lines +14 to +28
// First row
c[0] = c[0] / b[0];
d[0] = d[0] / b[0];

// Forward sweep
for (int i = 1; i < n; ++i) {
T denom = b[i] - a[i] * c[i - 1];
if (denom == 0.0) denom = 1e-12; // Avoid division by zero
c[i] = (i < n - 1) ? c[i] / denom : 0.0;
d[i] = (d[i] - a[i] * d[i - 1]) / denom;
}

// Back substitution
x[n - 1] = d[n - 1];
for (int i = n - 2; i >= 0; --i) {
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential indexing mismatch with caller. The dtridgl function is called with parameter 'n' representing the size of the system, but the implementation uses 0-based indexing throughout (accessing c[0], d[0], etc.). Meanwhile, the caller code in toon_mckay89_shortwave_impl.h and toon_mckay89_longwave_impl.h uses 1-based indexing when setting up the matrix (accessing Af[1], Bf[1], ..., Af[l], Bf[l] where l = 2*nlay). This suggests either the arrays passed should have an offset applied, or dtridgl should account for 1-based indexing. The current implementation will process elements at indices 0 through n-1, but the caller expects processing of indices 1 through n.

Suggested change
// First row
c[0] = c[0] / b[0];
d[0] = d[0] / b[0];
// Forward sweep
for (int i = 1; i < n; ++i) {
T denom = b[i] - a[i] * c[i - 1];
if (denom == 0.0) denom = 1e-12; // Avoid division by zero
c[i] = (i < n - 1) ? c[i] / denom : 0.0;
d[i] = (d[i] - a[i] * d[i - 1]) / denom;
}
// Back substitution
x[n - 1] = d[n - 1];
for (int i = n - 2; i >= 0; --i) {
// First row (1-based indexing: use elements 1..n)
c[1] = c[1] / b[1];
d[1] = d[1] / b[1];
// Forward sweep
for (int i = 2; i <= n; ++i) {
T denom = b[i] - a[i] * c[i - 1];
if (denom == 0.0) denom = 1e-12; // Avoid division by zero
c[i] = (i < n) ? c[i] / denom : static_cast<T>(0);
d[i] = (d[i] - a[i] * d[i - 1]) / denom;
}
// Back substitution
x[n] = d[n];
for (int i = n - 1; i >= 1; --i) {

Copilot uses AI. Check for mistakes.
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 11 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

"ToonMcKay89::forward: bc->umu0.size(0) != ncol");
(*bc)["umu0"] = bc->at(bname + "umu0");
} else {
(*bc)["umu0"] = torch::ones({1, ncol}, prop.options());
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default value for umu0 when not provided is created with shape {1, ncol}, but according to the check on line 48, umu0 should have dim() == 1 with size(0) == ncol. The default value has dim() == 2, which is inconsistent with the validation logic.

Suggested change
(*bc)["umu0"] = torch::ones({1, ncol}, prop.options());
(*bc)["umu0"] = torch::ones({ncol}, prop.options());

Copilot uses AI. Check for mistakes.
// Forward sweep
for (int i = 1; i < n; ++i) {
T denom = b[i] - a[i] * c[i - 1];
if (denom == 0.0) denom = 1e-12; // Avoid division by zero
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comparison denom == 0.0 for floating-point numbers is problematic. Consider using a threshold comparison like fabs(denom) < epsilon instead to handle floating-point precision issues.

Copilot uses AI. Check for mistakes.
Df[i - 1] = (gam[n_idx + 1] - 1.0) * (Cpm1[n_idx + 1] - Cp[n_idx]) +
(1.0 - gam[n_idx + 1]) * (Cm[n_idx] - Cmm1[n_idx + 1]);
n_idx++;
}
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable n_idx is initialized but then immediately reset to 0 on line 165. The initialization on line 156 is redundant and could be confusing. Consider removing the first initialization or adding a comment explaining why the counter is reset.

Suggested change
}
}
// Reuse n_idx: restart index from 0 for the second matrix row pattern

Copilot uses AI. Check for mistakes.
Comment on lines +20 to +37
char *work = new char[mem_size];

iter.for_each(
[&](char **data, const int64_t *strides, int64_t n) {
for (int i = 0; i < n; i++) {
auto out = reinterpret_cast<scalar_t *>(data[0] + i * strides[0]);
auto prop = reinterpret_cast<scalar_t *>(data[1] + i * strides[1]);
auto umu0 = reinterpret_cast<scalar_t *>(data[2] + i * strides[2]);
auto fbeam = reinterpret_cast<scalar_t *>(data[3] + i * strides[3]);
auto albedo =
reinterpret_cast<scalar_t *>(data[4] + i * strides[4]);
toon_mckay89_shortwave(nlay, *fbeam, umu0, prop, *albedo, out,
work);
}
},
grain_size);

delete[] work;
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Memory allocated with new char[mem_size] on line 20 is deleted on line 37, but if an exception is thrown between these lines (e.g., in the lambda or for_each call), the memory will leak. Consider using RAII (e.g., std::unique_ptr) or wrapping the code in a try-catch block to ensure cleanup.

Copilot uses AI. Check for mistakes.
Comment on lines +50 to +52
(*bc)["umu0"] = bc->at(bname + "umu0");
} else {
(*bc)["umu0"] = torch::ones({1, ncol}, prop.options());
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The boundary condition checks on lines 50-52 incorrectly assign bc->at("umu0") to (*bc)["umu0"], but "umu0" doesn't have a bname prefix. This could cause issues if bname is non-empty. The assignment should either use the prefixed key (bname + "umu0") or the logic should be clarified.

Copilot uses AI. Check for mistakes.
for (int i = 0; i < nlay; i++) {
T g1 = sqrt3d2 * (2.0 - w0[i] * (1.0 + hg[i]));
T g2 = (sqrt3d2 * w0[i]) * (1.0 - hg[i]);
if (g2 == 0.0) g2 = 1.0e-10;
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comparison g2 == 0.0 for floating-point numbers is problematic as it may not catch values that are very close to zero due to floating-point precision issues. Consider using a threshold comparison like fabs(g2) < epsilon instead.

Suggested change
if (g2 == 0.0) g2 = 1.0e-10;
if (fabs(g2) < 1.0e-10) g2 = 1.0e-10;

Copilot uses AI. Check for mistakes.
T lam = sqrt(g1 * g1 - g2 * g2);
gam[i] = (g1 - lam) / g2;
T denom = (lam * lam) - 1.0 / (mu_zm[i] * mu_zm[i]);
if (denom == 0.0) denom = 1.0e-10;
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comparison denom == 0.0 for floating-point numbers is problematic as it may not catch values that are very close to zero due to floating-point precision issues. Consider using a threshold comparison like fabs(denom) < epsilon instead.

Suggested change
if (denom == 0.0) denom = 1.0e-10;
if (fabs(denom) < 1.0e-10) denom = (denom >= 0.0 ? 1.0e-10 : -1.0e-10);

Copilot uses AI. Check for mistakes.
Comment on lines +141 to +158
int n_idx = 0;
for (int i = 2; i <= lm2; i += 2) {
Af[i - 1] = (E1[n_idx] + E3[n_idx]) * (gam[n_idx + 1] - 1.0);
Bf[i - 1] = (E2[n_idx] + E4[n_idx]) * (gam[n_idx + 1] - 1.0);
Cf[i - 1] = 2.0 * (1.0 - gam[n_idx + 1] * gam[n_idx + 1]);
Df[i - 1] = (gam[n_idx + 1] - 1.0) * (Cpm1[n_idx + 1] - Cp[n_idx]) +
(1.0 - gam[n_idx + 1]) * (Cm[n_idx] - Cmm1[n_idx + 1]);
n_idx++;
}

n_idx = 0;
for (int i = 3; i <= lm1; i += 2) {
Af[i - 1] = 2.0 * (1.0 - gam[n_idx] * gam[n_idx]);
Bf[i - 1] = (E1[n_idx] - E3[n_idx]) * (1.0 + gam[n_idx + 1]);
Cf[i - 1] = (E1[n_idx] + E3[n_idx]) * (gam[n_idx + 1] - 1.0);
Df[i - 1] = E3[n_idx] * (Cpm1[n_idx + 1] - Cp[n_idx]) +
E1[n_idx] * (Cm[n_idx] - Cmm1[n_idx + 1]);
n_idx++;
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable n_idx is initialized but then immediately reset to 0 on line 151. The initialization on line 141 is redundant and could be confusing. Consider removing the first initialization or adding a comment explaining why the counter is reset.

Suggested change
int n_idx = 0;
for (int i = 2; i <= lm2; i += 2) {
Af[i - 1] = (E1[n_idx] + E3[n_idx]) * (gam[n_idx + 1] - 1.0);
Bf[i - 1] = (E2[n_idx] + E4[n_idx]) * (gam[n_idx + 1] - 1.0);
Cf[i - 1] = 2.0 * (1.0 - gam[n_idx + 1] * gam[n_idx + 1]);
Df[i - 1] = (gam[n_idx + 1] - 1.0) * (Cpm1[n_idx + 1] - Cp[n_idx]) +
(1.0 - gam[n_idx + 1]) * (Cm[n_idx] - Cmm1[n_idx + 1]);
n_idx++;
}
n_idx = 0;
for (int i = 3; i <= lm1; i += 2) {
Af[i - 1] = 2.0 * (1.0 - gam[n_idx] * gam[n_idx]);
Bf[i - 1] = (E1[n_idx] - E3[n_idx]) * (1.0 + gam[n_idx + 1]);
Cf[i - 1] = (E1[n_idx] + E3[n_idx]) * (gam[n_idx + 1] - 1.0);
Df[i - 1] = E3[n_idx] * (Cpm1[n_idx + 1] - Cp[n_idx]) +
E1[n_idx] * (Cm[n_idx] - Cmm1[n_idx + 1]);
n_idx++;
int n_idx_even = 0;
for (int i = 2; i <= lm2; i += 2) {
Af[i - 1] = (E1[n_idx_even] + E3[n_idx_even]) * (gam[n_idx_even + 1] - 1.0);
Bf[i - 1] = (E2[n_idx_even] + E4[n_idx_even]) * (gam[n_idx_even + 1] - 1.0);
Cf[i - 1] = 2.0 * (1.0 - gam[n_idx_even + 1] * gam[n_idx_even + 1]);
Df[i - 1] = (gam[n_idx_even + 1] - 1.0) * (Cpm1[n_idx_even + 1] - Cp[n_idx_even]) +
(1.0 - gam[n_idx_even + 1]) * (Cm[n_idx_even] - Cmm1[n_idx_even + 1]);
n_idx_even++;
}
int n_idx_odd = 0;
for (int i = 3; i <= lm1; i += 2) {
Af[i - 1] = 2.0 * (1.0 - gam[n_idx_odd] * gam[n_idx_odd]);
Bf[i - 1] = (E1[n_idx_odd] - E3[n_idx_odd]) * (1.0 + gam[n_idx_odd + 1]);
Cf[i - 1] = (E1[n_idx_odd] + E3[n_idx_odd]) * (gam[n_idx_odd + 1] - 1.0);
Df[i - 1] = E3[n_idx_odd] * (Cpm1[n_idx_odd + 1] - Cp[n_idx_odd]) +
E1[n_idx_odd] * (Cm[n_idx_odd] - Cmm1[n_idx_odd + 1]);
n_idx_odd++;

Copilot uses AI. Check for mistakes.
Comment on lines +15 to +17
* Ported by Xi Zhang to Eigen
* Ported by Cheng Li to torch
* Reference: Toon, O.B., 1989, JGR, 94, 16287-16301.
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment has a typo: "Proted" should be "Ported".

Suggested change
* Ported by Xi Zhang to Eigen
* Ported by Cheng Li to torch
* Reference: Toon, O.B., 1989, JGR, 94, 16287-16301.
* Ported by Cheng Li to torch
* Reference: Toon, O.B., 1989, JGR, 94, 16287-16301.
* Reference: Toon, O.B., 1989, JGR, 94, 16287-16301.

Copilot uses AI. Check for mistakes.
Comment on lines +46 to +61
char *work = new char[mem_size];

iter.for_each(
[&](char **data, const int64_t *strides, int64_t n) {
for (int i = 0; i < n; i++) {
auto out = reinterpret_cast<scalar_t *>(data[0] + i * strides[0]);
auto prop = reinterpret_cast<scalar_t *>(data[1] + i * strides[1]);
auto be = reinterpret_cast<scalar_t *>(data[2] + i * strides[2]);
auto albedo =
reinterpret_cast<scalar_t *>(data[3] + i * strides[3]);
toon_mckay89_longwave(nlay, be, prop, *albedo, out, work);
}
},
grain_size);

delete[] work;
Copy link

Copilot AI Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Memory allocated with new char[mem_size] on line 46 is deleted on line 61, but if an exception is thrown between these lines (e.g., in the lambda or for_each call), the memory will leak. Consider using RAII (e.g., std::unique_ptr) or wrapping the code in a try-catch block to ensure cleanup.

Copilot uses AI. Check for mistakes.
@chengcli chengcli merged commit 20f7aa3 into main Jan 19, 2026
9 checks passed
@chengcli chengcli deleted the cli/test_toon branch January 19, 2026 20:47
@github-actions
Copy link
Contributor

🎉 Released v1.8.7!

What's Changed

Full Changelog: v1.8.6...v1.8.7

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant