GROUP BY CUBE / ROLLUP / GROUPING SETS #265

mlell · 2025-12-09T18:08:32Z

This implements the SQL GROUP BY modifiers CUBE, ROLLUP, and GROUPING SETS, which add subtotal rows to the output. I find this very useful for producing, for example, monthly reports of income vs. expenses, where the (Total) row tells me the net gain or loss for the month:

beanquery> select yearmonth(date) as month, account, sum(position) where account !~ "Assets"  group by cube (account,month) 
pivot by account,month
  account/month      2024-02-01    2024-03-01    2024-04-01      (Total)   
------------------  ------------  ------------  ------------  -------------
Expenses:Dining                     120.50 USD     45.25 USD     165.75 USD
Expenses:Groceries    250.00 USD    310.75 USD    275.60 USD     836.35 USD
Income:Salary       -5000.00 USD  -5000.00 USD  -5000.00 USD  -15000.00 USD
(Total)             -4750.00 USD  -4568.75 USD  -4679.15 USD  -13997.90 USD

What do you think of this?

mlell · 2025-12-09T20:37:41Z

Would fix #253

dnicolodi · 2025-12-26T17:21:21Z

Thanks for working on this. This would be a nice feature to have in beanquery. Although, there are a few issues with the PR.

First, to make the PR easier to review, the patches should be organized into logical commits, not commits that work and rework the same code. In particular, GROUP BY CUBE and GROUP BY ROLLUP are simply syntactic sugar for GROUP BY GROUPING SETS. Thus, if anything, the latter should be introduced first and the former added later, not the other way around.
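To illustrate the "syntactic sugar" point, here is a minimal Python sketch of how ROLLUP and CUBE expand into plain grouping sets. The function names `rollup` and `cube` are hypothetical and are not taken from the PR or from beanquery; they only show the standard SQL expansion.

```python
from itertools import combinations


def rollup(columns):
    """ROLLUP (a, b, c) expands to the prefixes (a, b, c), (a, b), (a), ()."""
    cols = tuple(columns)
    return [cols[:n] for n in range(len(cols), -1, -1)]


def cube(columns):
    """CUBE (a, b) expands to every subset of the columns, largest first."""
    cols = tuple(columns)
    return [
        subset
        for n in range(len(cols), -1, -1)
        for subset in combinations(cols, n)
    ]
```

With this expansion, `GROUP BY CUBE (account, month)` is equivalent to `GROUP BY GROUPING SETS ((account, month), (account), (month), ())`, where the empty set produces the grand total.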

Glancing at the code, EvalUnion is written specifically for GROUPING SETS support. However, it does not optimize the evaluation of the query across the different GROUP BY clauses; it just concatenates the queries, so it could be used to add generic UNION support.
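A rough sketch of the "evaluate each grouping set separately and concatenate" strategy described here, in plain Python rather than beanquery internals. The helpers `group_sum` and `grouping_sets` are hypothetical names for illustration only; grouping columns that are not part of the current set are filled with `None`, as standard SQL fills them with NULL.

```python
from collections import defaultdict


def group_sum(rows, columns, keys, value):
    """Sum `value` grouped by `keys`; columns outside `keys` become None."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[c] if c in keys else None for c in columns)
        totals[key] += row[value]
    return [key + (total,) for key, total in totals.items()]


def grouping_sets(rows, columns, sets, value):
    """Evaluate every grouping set independently and concatenate the results."""
    out = []
    for keys in sets:
        out.extend(group_sum(rows, columns, keys, value))
    return out
```

Each grouping set rescans the input, so nothing is shared between the passes. That is the missing optimization mentioned above, and also the reason the same machinery would serve for a generic UNION.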

I don't understand which problem the addition of the Sentinel class solves: it is unused in the commit that adds it, and two of the module-level constants added are never used, AFAICT.

Finally, the use of (Total) as a fill value for grouping columns is not standard and thus would require very good motivation, which is not provided. It also seems wrong to use (Total) for some columns when grouping over multiple columns is involved.

mlell · 2025-12-28T13:39:19Z

Thank you for considering this! I will follow your review:

- I will rebase the commits to follow the logical order you suggested: EvalUnion, GROUPING SETS, then the syntactic sugar.
- I see that the Sentinel object might lead to a dead end and is out of scope here. I will remove it. The motivation was the following:
  - Disambiguate values that are NULL because of the grouping pattern from those that are NULL because of missing data.
  - The modified version of beangulp that I use myself contains the function date_cap, which can collate all dates outside a given range into (earlier) or (later). I needed some way to make sure (earlier) sorts above the values and (later) sorts below.
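For context, the ordering requirement behind the sentinel can be sketched in a few lines of Python. This is not the Sentinel class from the PR: `Marker` and `sort_key` are hypothetical names, and the sketch only shows one way a label could sort before or after every ordinary value while remaining distinct from `None`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    """Hypothetical label; rank < 0 sorts before ordinary values, rank > 0 after."""
    label: str
    rank: int


EARLIER = Marker("(earlier)", -1)
LATER = Marker("(later)", 1)


def sort_key(value):
    """Ordinary values get rank 0, so they always sort between the markers."""
    if isinstance(value, Marker):
        return (value.rank, value.label)
    return (0, value)


months = ["2024-03", LATER, "2024-02", EARLIER]
ordered = sorted(months, key=sort_key)
```

Because marker ranks are never 0, the second tuple element is only compared between values of the same kind, so markers and ordinary values are never compared directly.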

Remaining questions

- You mentioned UNION support. It's true that EvalUnion allows this. Should I add this to BQL, or is it out of scope for now?
- Should I add ORDER BY ... NULLS (FIRST|LAST) to BQL to allow some way to sort the totals to the bottom?
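The intended NULLS FIRST / NULLS LAST semantics can be sketched in Python as sort keys. This is only an illustration of the ordering, not BQL syntax or beanquery code, and the names `nulls_last` and `nulls_first` are hypothetical.

```python
def nulls_last(value):
    """Sort key: None compares after every non-None value."""
    return (value is None, value)


def nulls_first(value):
    """Sort key: None compares before every non-None value."""
    return (value is not None, value)


accounts = ["Income:Salary", None, "Expenses:Dining"]
totals_at_bottom = sorted(accounts, key=nulls_last)
totals_at_top = sorted(accounts, key=nulls_first)
```

With subtotal rows carrying NULL in the grouping column, the `nulls_last` ordering would place the total row below the per-account rows.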

mlell added 2 commits December 9, 2025 19:12

Implement GROUP BY ROLLUP (col1, col2, ...)

b7fdaee

Add a "Sentinel" object to provide row headers for summarizations, li…

fa02a5c

…ke (Total), (earlier), (other)

mlell force-pushed the dev-group-rollup branch from 9da0cc6 to 4c1ce0a Compare December 9, 2025 18:12

mlell mentioned this pull request Dec 9, 2025

Totals summary? #253

Open

mlell added 8 commits December 10, 2025 11:58

Deal with multiple data types (e.g. Sentinel objects) when rendering …

1681d54

…columns

Support PIVOT BY after GROUP BY ROLLUP by using a sentinel instead of…

299c797

… NULL for subtotals

Rename Rollup Sentinel for better consistency

0b88056

Implement GROUP BY CUBE / GROUPING SETS

6813810

Do cartesian product when combining ROLLUP, CUBE, GROUPING SET. Also,…

352efcf

… refactor _select(), remove _compile_select_base() to move closer to the original logic before ROLLUP, CUBE, etc.

Replace separate rollup/cube/sets flags with unified 'type' field in AST

c90f264

Add tests for GROUP BY CUBE / ROLLUP

7acdab6

Refactor BQL grammar to standardize GROUP BY modifier handling

16f7cee

- In AST: Change 'sets' modifier to 'grouping sets' for consistency - Update AST structure to use GroupByElement for consistency with other language constructs - Update tests to reflect new structure

mlell force-pushed the dev-group-rollup branch from 4c1ce0a to 16f7cee Compare December 10, 2025 11:04
