Description
When using querychat with a Databricks (Spark SQL) connection via dbplyr::tbl(), the execute_query() method in TblSqlSource returns a lazy tbl_sql object. When ellmer tries to serialize this tool result to JSON, it fails because jsonlite::toJSON() calls nrow() on the lazy table, which returns NA for remote connections.
Error Message
Error in tool_string() at ellmer/R/provider-aws.R:
! Could not convert tool result from a object to JSON.
ℹ If you are the tool author, update the tool to convert the result to a string or JSON.
Caused by error in if (!nrow(x)) ...:
! missing value where TRUE/FALSE needed
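The failure does not require a Databricks connection; any dbplyr backend behaves the same way. A minimal sketch using dbplyr's in-memory SQLite helper (memdb_frame) shows where the NA comes from:

```r
library(dplyr)
library(dbplyr)

# A lazy tbl backed by an in-memory SQLite table
lazy <- memdb_frame(x = 1:3)

# The row count of a remote table is unknown until collected
nrow(lazy)   # NA

# jsonlite's serializer guards on the row count, so the NA surfaces as:
#   if (!nrow(x)) ...  ->  Error: missing value where TRUE/FALSE needed
```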
Reproduction
library(querychat)
library(DBI)
library(odbc)
library(dbplyr)
con <- dbConnect(odbc(), dsn = "Databricks_DSN")
tbl_obj <- tbl(con, in_catalog("catalog", "schema", "table_name"))
qc <- QueryChat$new(
tbl_obj,
client = chat_openai() # or any ellmer chat client
)

When the LLM invokes the update_dashboard tool, the error above occurs.
Root Cause
In TblSqlSource.R, execute_query() returns a lazy tbl:
execute_query = function(query) {
sql_query <- self$prep_query(query)
dplyr::tbl(private$conn, dplyr::sql(sql_query)) # Returns lazy tbl
}

This works for the Shiny dashboard (which calls collect() before rendering), but fails when ellmer tries to serialize the tool result for the LLM.
Note: test_query() works correctly because it delegates to DBISource which returns collected data.
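For contrast, a collected result is an ordinary data frame with a real row count, so it serializes cleanly. A sketch with dbplyr's in-memory SQLite helper:

```r
library(dplyr)
library(dbplyr)

lazy <- memdb_frame(x = 1:3)   # lazy tbl_sql
df <- collect(lazy)            # materialize into a local data frame

nrow(df)              # 3 (a real count, so jsonlite's guard works)
jsonlite::toJSON(df)  # [{"x":1},{"x":2},{"x":3}]
```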
Attempted Workarounds
- Using DBISource directly - Passing the DBI connection + table name doesn't work with Databricks three-part naming (catalog.schema.table) due to table name validation.
- Custom DataSource subclass - Creating a subclass that overrides execute_query() to collect results hits the same table name validation issue.
- Registering S3 methods for jsonlite::toJSON - The method dispatch doesn't seem to pick up custom methods for tbl_sql classes.
Suggested Fix
One of:
- Collect in execute_query() for TblSqlSource:
execute_query = function(query) {
sql_query <- self$prep_query(query)
dplyr::collect(dplyr::tbl(private$conn, dplyr::sql(sql_query)))
}

- Add a parameter to control whether results are collected (for users who need lazy evaluation).
- Handle serialization in querychat_tool_result() by collecting lazy tbls before returning.
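A sketch of the second option, with a hypothetical collect argument; the parameter name and default are assumptions, not current querychat API:

```r
# Hypothetical TblSqlSource method: collect by default so tool results
# serialize, but let callers opt back into lazy evaluation.
execute_query = function(query, collect = TRUE) {
  sql_query <- self$prep_query(query)
  result <- dplyr::tbl(private$conn, dplyr::sql(sql_query))
  if (isTRUE(collect)) {
    dplyr::collect(result)  # eager: safe for ellmer's JSON serialization
  } else {
    result                  # lazy: for callers that manage collection themselves
  }
}
```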
Environment
- querychat: (version)
- ellmer: (version)
- R: (version)
- Database: Databricks (Spark SQL) via odbc
- OS: Linux