Fix DuckDB SQL queries hanging under concurrent requests (#418)#419
Fix DuckDB SQL queries hanging under concurrent requests (#418)#419Rish-it wants to merge 2 commits intogetnao:mainfrom
Conversation
Resolve relative DB paths to absolute and run blocking calls in asyncio.to_thread().
There was a problem hiding this comment.
Pull request overview
Fixes a concurrency bug in the FastAPI /execute_sql endpoint where process-global working directory changes could cause DuckDB (and other file-based DBs) to resolve relative database paths incorrectly under concurrent requests.
Changes:
- Removes request-time
os.chdir()usage by resolving the project folder to an absolute path. - Converts relative database file paths to absolute paths before executing SQL.
- Runs blocking
execute_sqlwork inasyncio.to_thread()to keep the event loop responsive.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| db_path = Path(db_config.path) | ||
| if not db_path.is_absolute(): | ||
| db_config.path = str(project_path / db_path) |
There was a problem hiding this comment.
Path(db_config.path) treats values like ~/data.duckdb as a relative path, so this code would rewrite it to <project_path>/~/data.duckdb and likely break connections. Consider expanding ~ (e.g., Path(...).expanduser()) and then re-checking is_absolute() / resolving before prefixing with project_path.
| db_path = Path(db_config.path) | |
| if not db_path.is_absolute(): | |
| db_config.path = str(project_path / db_path) | |
| db_path = Path(db_config.path).expanduser() | |
| if not db_path.is_absolute(): | |
| db_config.path = str(project_path / db_path) | |
| else: | |
| db_config.path = str(db_path) |
| # Resolve relative DB paths (e.g. DuckDB) to absolute before executing, | ||
| # so concurrent requests changing cwd via os.chdir() can't cause races. |
There was a problem hiding this comment.
The inline comment mentions "concurrent requests changing cwd via os.chdir()", but this handler no longer calls os.chdir(). Updating the comment to describe the actual risk (process-global cwd affecting relative DB paths, and why paths are made absolute) would avoid confusion for future readers.
| # Resolve relative DB paths (e.g. DuckDB) to absolute before executing, | |
| # so concurrent requests changing cwd via os.chdir() can't cause races. | |
| # Resolve relative DB paths (e.g. DuckDB) to absolute based on the project | |
| # root, so changes to the process-global working directory (or starting the | |
| # server from a different cwd) do not change which database file is used. |
|
How did you successfully reproduced the error? |
|
simulated it programmatically: two concurrent async tasks calling os.chdir(project_dir) # Request A — correct dir, data.duckdb exists
os.chdir(wrong_dir) # Request B stomps cwd
ibis.duckdb.connect("data.duckdb") # Request A — resolves against wrong dir, failsWorks in CLI because it runs one query at a time. In the app, FastAPI handles concurrent requests on one event loop — Edge cases I tested
|
Summary
os.chdir()in theexecute_sqlhandler is process-global, so concurrent requests stomp each other's working directory, causing DuckDB to resolve its relative file path against the wrong locationdata.duckdb) to absolute before connecting, and offload blocking DB calls toasyncio.to_thread()to keep the event loop responsiveChanges
apps/backend/fastapi/main.py— Removeos.chdir()from handler, resolve relative paths to absolute, wrapexecute_sqlinasyncio.to_thread()Test plan
os.chdir()+ relative DuckDB pathasyncio.to_thread()keeps event loop alive during blocking queries:memory:and absolute paths are not modifiedruff,ty) and TypeScript linter (tsc,eslint) pass