3 changes: 3 additions & 0 deletions .env.example
@@ -14,4 +14,7 @@ LLM_MODEL=gpt-3.5-turbo
LLM_API_KEY=your-api-key-here
LLM_ENGINE_PORT=8001

# Query Router Configuration
QUERY_ROUTER_PORT=8002

RUST_LOG=debug
20 changes: 10 additions & 10 deletions .github/workflows/build.yml
@@ -46,7 +46,7 @@ jobs:
cache-from: type=gha
cache-to: type=gha,mode=max

build-query-runner:
build-query-router:
runs-on: ubuntu-latest
steps:
- name: Checkout repository
@@ -55,18 +55,18 @@ jobs:
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2

- name: Build Query Runner service
- name: Build Query Router service
uses: docker/build-push-action@v4
with:
context: ./query_runner
context: ./query_router
push: false
load: true
tags: lucidata-query-runner:latest
tags: lucidata-query-router:latest
cache-from: type=gha
cache-to: type=gha,mode=max

system-test:
needs: [build-api, build-llm-engine, build-query-runner]
needs: [build-api, build-llm-engine, build-query-router]
runs-on: ubuntu-latest
steps:
- name: Checkout repository
@@ -95,20 +95,20 @@ jobs:
cache-from: type=gha
outputs: type=docker,dest=/tmp/llm-engine-image.tar

- name: Download Query Runner image
- name: Download Query Router image
uses: docker/build-push-action@v4
with:
context: ./query_runner
context: ./query_router
load: true
tags: lucidata-query-runner:latest
tags: lucidata-query-router:latest
cache-from: type=gha
outputs: type=docker,dest=/tmp/query-runner-image.tar
outputs: type=docker,dest=/tmp/query-router-image.tar

- name: Load saved images
run: |
docker load < /tmp/api-image.tar
docker load < /tmp/llm-engine-image.tar
docker load < /tmp/query-runner-image.tar
docker load < /tmp/query-router-image.tar
docker images

- name: Start services with docker compose
8 changes: 4 additions & 4 deletions .github/workflows/ci.yml
@@ -62,8 +62,8 @@ jobs:
working-directory: ./llm_engine
run: cargo test

query_runner:
name: Query Runner Service
query_router:
name: Query Router Service
runs-on: ubuntu-latest

steps:
@@ -76,9 +76,9 @@ jobs:
override: true

- name: Run cargo check
working-directory: ./query_runner
working-directory: ./query_router
run: cargo check

- name: Run cargo test
working-directory: ./query_runner
working-directory: ./query_router
run: cargo test
60 changes: 47 additions & 13 deletions README.md
@@ -4,37 +4,62 @@

Lucidata is an LLM based query tool designed to democratize data access. It translates natural language questions into SQL/API queries over structured datasets, returning clear, traceable answers and exports.

## Features (WIP)
## Features

- Natural Language Interface: Ask questions in plain English
- Query Translation: Automatic conversion to SQL/API queries
- Result Visualization: Clear tables and charts
- Export Options: Download results in various formats (CSV, Excel, etc.)
- Query Transparency: Track and export generated queries
- Query Translation: Automatic conversion to SQL queries
- Query Transparency: Track and export generated queries, explanations, and model confidence

### Road-Map

- Support for Generic WebAPI queries
- Result Visualization

## Getting Started

### Prerequisites

- `docker` installed
- OpenAPI `API_KEY`
- An OpenAPI `API_KEY`

### Usage

1. Clone the repository
```bash
git clone https://github.com/jdhoffa/lucidata.git
gh repo clone jdhoffa/lucidata
cd lucidata
```

2. Start the application with Docker Compose
2. Build and start the application with `docker compose`:
```bash
docker compose build # it can take a while to compile, be patient :-)
docker compose up
```

3. Enter your natural language query in the input field and click "Submit"
3. Send your query to the query_router endpoint, and check out the results!
``` bash
curl -X POST "http://localhost:8002/translate-and-execute" \
-H "Content-Type: application/json" \
-d '{
"natural_query": "Show me the cars with the best power-to-weight ratio, sorted from highest to lowest"
}'
```

4. Review the results and use the export options as needed
4. (Optional) Pipe the output to the `jq` CLI:
``` bash
curl -X POST "http://localhost:8002/translate-and-execute" \
-H "Content-Type: application/json" \
-d '{
"natural_query": "Show me the cars with the best power-to-weight ratio, sorted from highest to lowest"
}' | jq

# you can also select a specific tag
curl -X POST "http://localhost:8002/translate-and-execute" \
-H "Content-Type: application/json" \
-d '{
"natural_query": "Show me the cars with the best power-to-weight ratio, sorted from highest to lowest"
}' | jq '.results'
```
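
For clients that would rather not shell out to `curl`, the same request can be sketched in Python. This is a minimal illustration and not part of the change itself: the helper names `build_request` and `run_query` are hypothetical, the endpoint, port 8002, and the `natural_query` field are taken from the curl example above, and the response is assumed to be JSON only because the README pipes it through `jq`.

```python
import json
import urllib.request


def build_request(natural_query: str, port: int = 8002) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    # The body mirrors the JSON passed to curl with -d.
    payload = json.dumps({"natural_query": natural_query}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://localhost:{port}/translate-and-execute",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(natural_query: str, port: int = 8002) -> dict:
    """Send the request; the services must already be up via docker compose."""
    with urllib.request.urlopen(build_request(natural_query, port)) as response:
        # Assumes the router answers with a JSON object, as the jq usage suggests.
        return json.load(response)
```

With the stack running, `run_query("Find the top 5 cars with the highest horsepower")` should return the decoded response; a `results` key is expected only insofar as the `jq '.results'` example implies one.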

## System Architecture

@@ -77,9 +102,18 @@ graph TD
## Example Queries

```
"What is the projected energy mix in 2030 according to IEA's Net Zero scenario?"
# Query #1 tests mathematical operations (division of hp/wt)
"Show me the cars with the best power-to-weight ratio, sorted from highest to lowest."

# Query #2 tests sorting and multi-column selection
"Compare fuel efficiency (MPG) and horsepower for all cars, sorted by MPG."

# Query #3 tests aggregation functions with grouping
"What's the average horsepower and MPG for automatic vs manual transmission cars?"

"How does natural gas production in the US compare to China over the next decade in WoodMac's base case?"
# Query #4 tests more complex aggregation and grouping
"Show me the relationship between number of cylinders and fuel efficiency with average MPG by cylinder count"

"Show me the top 5 countries by renewable energy growth in the next 5 years."
# Query #5 tests limiting results and specific column selection
"Find the top 5 cars with the highest horsepower and their quarter-mile time (qsec)"
```
5 changes: 0 additions & 5 deletions api/Dockerfile
@@ -34,11 +34,6 @@ RUN apt-get update && \
# Copy the binary from the builder stage
COPY --from=builder /app/target/release/lucidata-api /app/lucidata-api

# Create a .env file only if environment variables aren't provided
RUN touch .env && \
echo "DATABASE_URL=\${DATABASE_URL:-postgres://postgres:postgres@db:5432/pbtar}" > .env && \
echo "RUST_LOG=\${RUST_LOG:-info}" >> .env

EXPOSE 8000

CMD ["/app/lucidata-api"]
23 changes: 15 additions & 8 deletions docker-compose.yml
@@ -37,7 +37,7 @@ services:
db:
condition: service_healthy
healthcheck:
test: ["CMD", "curl", "-f", "http://api:${API_PORT}/api/health"]
test: ["CMD", "curl", "-f", "http://localhost:${API_PORT}/api/health"]
interval: 10s
timeout: 5s
retries: 5
@@ -48,7 +48,7 @@ services:
build:
context: ./llm_engine
ports:
- "8001:8001"
- "${LLM_ENGINE_PORT}:${LLM_ENGINE_PORT}"
dns:
- 8.8.8.8
- 1.1.1.1
@@ -66,26 +66,33 @@ services:
condition: service_healthy
api:
condition: service_healthy
tty: true
stdin_open: true
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8001/health"]
test: ["CMD", "curl", "-f", "http://localhost:${LLM_ENGINE_PORT}/health"]
interval: 10s
timeout: 5s
retries: 3
start_period: 5s

query_runner:
query_router:
build:
context: ./query_runner
context: ./query_router
ports:
- "${QUERY_ROUTER_PORT}:${QUERY_ROUTER_PORT}"
env_file:
- ./.env
environment:
DATABASE_URL: ${DATABASE_URL}
API_URL: ${API_URL}
QUERY_ROUTER_PORT: ${QUERY_ROUTER_PORT}
LLM_ENGINE_URL: ${LLM_ENGINE_URL}
RUST_LOG: ${RUST_LOG}
depends_on:
db:
condition: service_healthy
api:
condition: service_started
condition: service_healthy
llm_engine:
condition: service_healthy

volumes:
postgres_data:
3 changes: 0 additions & 3 deletions llm_engine/Dockerfile
@@ -35,9 +35,6 @@ RUN apt-get update && \
# Copy the binary from the builder stage
COPY --from=builder /app/target/release/llm_engine /app/llm_engine

# Set environment variables
ENV LLM_ENGINE_PORT=8001

# Expose the port
EXPOSE 8001

6 changes: 4 additions & 2 deletions query_runner/Cargo.toml → query_router/Cargo.toml
@@ -1,15 +1,16 @@
[package]
name = "query_runner"
name = "query_router"
version = "0.1.0"
edition = "2021"

[dependencies]
tokio = { version = "1.29", features = ["full"] }
axum = "0.6.20"
axum = "0.6"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }
tower = "0.4"
tower-http = { version = "0.4", features = ["cors"] }
reqwest = { version = "0.11", features = ["json"] }
dotenvy = "0.15"
@@ -18,3 +19,4 @@ thiserror = "1.0"
tokio-postgres = { version = "0.7", features = ["with-serde_json-1"] }
futures = "0.3"
chrono = { version = "0.4", features = ["serde"] }
dotenv = "0.15"
6 changes: 4 additions & 2 deletions query_runner/Dockerfile → query_router/Dockerfile
@@ -32,7 +32,9 @@ RUN apt-get update && \
rm -rf /var/lib/apt/lists/*

# Copy the binary from the builder stage
COPY --from=builder /app/target/release/query_runner /app/query_runner
COPY --from=builder /app/target/release/query_router /app/query_router

EXPOSE 8002

# Command to run the application
CMD ["/app/query_runner"]
CMD ["/app/query_router"]