process_quantile differing number of rows output for rstan and cmdstanr #269

micahwiesner67 · 2025-05-14T20:11:40Z

Fitting a model with a 'cmdstanr' backend and with an 'rstan' backend on the same simple test data (test_data.parquet), then passing each fit through process_quantiles, yields a different number of output rows.

Intuition tells me that if we are processing data for 5 reference_dates, the results should contain 55 rows: each variable that has bounds gets 2 rows per reference_date (one for the 0.5 and one for the 0.95 quantile reporting), while processed_obs_cases has just 1 row per reference_date. The cmdstanr output has 57 rows, while the rstan output has the documented 55 rows, as expected.
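For reference, a minimal sketch of the row-count arithmetic behind that expectation. The function name is hypothetical, and the count of 5 bounded variables is only inferred from the 55-row total (it is not taken from the package code):

```r
# Hypothetical helper: expected number of rows out of process_quantiles.
# n_bounded_variables = 5 is an assumption inferred from the 55-row total.
expected_row_count <- function(n_reference_dates,
                               n_bounded_variables,
                               n_interval_widths = 2) {  # 0.5 and 0.95
  # Bounded variables: one row per reference_date per interval width.
  bounded_rows <- n_reference_dates * n_bounded_variables * n_interval_widths
  # processed_obs_cases: one row per reference_date.
  observed_rows <- n_reference_dates
  bounded_rows + observed_rows
}

expected_row_count(n_reference_dates = 5, n_bounded_variables = 5)
# 55
```

Under these assumptions, the 57 rows from cmdstanr are 2 more than expected.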

It is unclear at this time whether this is due to code in process_quantiles, in extract_draws_from_fit, or in the underlying EpiNow2::epinow() function.

cmdstanr result - bad

rstan result - good

My current hypothesis is that extract_draws_from_fit is not the issue and that the problem arises in the post_process_and_merge code.

@zsusswein @kgostic putting here for documentation


kgostic · 2025-05-14T21:03:13Z

Let me know if you want to jump on a call tomorrow. I'd be curious about the underlying values too.

1. Let's take pp_nowcast_cases as a case study. It looks like in both cases, we get the number of rows we expect, but do we get the same values? I think that's one of the variables you're trying to plot that isn't rendering.
2. Then focus on processed_obs_data. How do those values compare? What rows are present in the cmdstan result but not in the rstan?
3. Same Q as 2 for growth rate.
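One way to answer these questions is sketched below in base R. The two data frames are toy stand-ins for the real process_quantiles outputs, and the column names (`time`, `variable`, `value`) are assumptions for illustration, not the actual schema:

```r
# Toy stand-ins for the two outputs; columns and values are hypothetical.
rstan_out <- data.frame(
  time = c(1, 2, 3),
  variable = c("growth_rate", "growth_rate", "processed_obs_data"),
  value = c(0.10, 0.20, 12),
  stringsAsFactors = FALSE
)
# Simulate one duplicated row (row 3) and one missing row (row 2).
cmdstanr_out <- rstan_out[c(1, 3, 3), ]

key_cols <- c("time", "variable")
make_key <- function(df) do.call(paste, c(df[key_cols], sep = "|"))

# Rows that appear more than once in the cmdstanr output.
dup_rows <- cmdstanr_out[duplicated(cmdstanr_out), ]

# Rows present in one output but not in the other, matched on the key columns.
missing_in_cmdstanr <-
  rstan_out[!(make_key(rstan_out) %in% make_key(cmdstanr_out)), ]
extra_in_cmdstanr <-
  cmdstanr_out[!(make_key(cmdstanr_out) %in% make_key(rstan_out)), ]

# Compare values on the keys that both outputs share.
shared <- merge(unique(rstan_out), unique(cmdstanr_out),
                by = key_cols, suffixes = c("_rstan", "_cmdstanr"))
values_match <- isTRUE(all.equal(shared$value_rstan, shared$value_cmdstanr))

nrow(dup_rows)             # 1
nrow(missing_in_cmdstanr)  # 1
nrow(extra_in_cmdstanr)    # 0
values_match               # TRUE
```

On real fits, exact equality of values is unlikely across backends unless the draws are identical, so a tolerance in `all.equal()` may be needed.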

micahwiesner67 · 2025-05-22T14:12:39Z

1. We get the same values coming out of the summaries (excluding the duplicate rows for processed_obs_data and growth_rate).
2. The processed_obs_data rows are duplicated, and the growth_rate rows are also duplicated.
3. The values that come out of both are the same, but growth_rate is actually missing two rows: both values for the final date (time index = 5) are dropped.

After further probing, this discrepancy does arise in the draws_table.

Additionally, I tested whether the installed cmdstanr version could be related to this issue; this does not seem to be the case. I removed the commit hash in the DESCRIPTION file and installed cmdstanr version 0.9.0 (with the same version of EpiNow2, 1.6.1), and the duplicate rows are still an issue.

@PatrickTCorbett can you look into a band-aid fix in the post-processing?

This was referenced May 22, 2025

Refactor stan code to harmonise time variables epiforecasts/EpiNow2#1070

Open

EpiNow2 update changes indexing of little R #303

Closed
