TableScanner is a microservice for providing filtered and paginated access to tabular data stored in KBase. It uses local SQLite caching and indexing to provide fast access to large datasets without loading them entirely into memory.
The service provides two methods for data access:
- Hierarchical REST: Path-based endpoints for navigating objects and tables using GET requests.
- Flat POST: A single endpoint (
/table-data) that accepts a JSON payload for all query parameters.
TableScanner operates as a bridge between KBase storage and client applications:
- Data Fetching: Retrieves SQLite databases from the KBase Blobstore.
- Local Caching: Stores databases locally to avoid repeated downloads.
- Indexing: Creates indices on-the-fly for all table columns to optimize query performance.
- API Layer: A FastAPI application that handles requests and executes SQL queries against the local cache.
Technical details on race conditions and concurrency handling are available in docs/ARCHITECTURE.md.
docker compose up --build -dThe service will be available at http://localhost:8000. API documentation is at /docs.
cp .env.example .env
bash scripts/dev.shList tables:
GET /object/{upa}/tables
Query table data:
GET /object/{upa}/tables/{table_name}/data?limit=100
Query table data:
POST /table-data
Payload example:
{
"berdl_table_id": "76990/7/2",
"table_name": "Genes",
"limit": 100
}app/: Application logic and routes.app/utils/: Utilities for caching, SQLite operations, and Workspace integration.docs/: Technical documentation.scripts/: Client examples and utility scripts.
MIT License.