Skip to content

RFC: User Defined Aggregate Functions#70

Open
wangrunji0408 wants to merge 6 commits intomainfrom
wrj/user-defined-aggregate-function
Open

RFC: User Defined Aggregate Functions#70
wangrunji0408 wants to merge 6 commits intomainfrom
wrj/user-defined-aggregate-function

Conversation

@wangrunji0408
Copy link
Copy Markdown

No description provided.

Signed-off-by: Runji Wang <wangrunji0408@163.com>
Signed-off-by: Runji Wang <wangrunji0408@163.com>
Signed-off-by: Runji Wang <wangrunji0408@163.com>
Signed-off-by: Runji Wang <wangrunji0408@163.com>
Signed-off-by: Runji Wang <wangrunji0408@163.com>
Signed-off-by: Runji Wang <wangrunji0408@163.com>
@wangrunji0408 wangrunji0408 force-pushed the wrj/user-defined-aggregate-function branch from 8c48501 to 7a90897 Compare August 3, 2023 07:44

User defined functions are running in a separate process called UDF server. The RisingWave kernel communicates with the UDF server through Arrow Flight RPC to exchange data.

To avoid trouble in fault tolerance, the UDF server should be **stateless** even if the aggregate function is stateful. This means the aggregate state should be maintained by the kernel. However, exchanging the state in each RPC call (batch aggregation) is not efficient, especially when the state is large. On the other hand, kernel doesn't need to know the state or the aggregate result before a barrier arrives. Therefore, we **create a streaming RPC `doExchange` to call the aggregate function, and sync the state every time a barrier arrives**. When the connection to the UDF server is broken, the executor raises an error and may retry or pause the stream.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found a fatal problem that the current design doesn't take group aggregation into consideration. We still have to send intermediate states for each group.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants