chat_bot modified and logs integrated #4
Frustum01 wants to merge 9 commits into garvjain7:feature
- Employee 'Request Admin Access' workflow with approval polling
- Admin PermissionPage to approve/deny dataset modification requests
- Backend role escalation for approved modification queries
- Dataset persistence: write modified DataFrames back to disk
- Activity logs: track QUERY vs MODIFY events with intent detection
- Chatbot History tab in admin Logs page grouped by employee
- Fix sidebar flex-shrink for side-by-side layout stability
- Fix userId reference (req.user.id) for reliable activity logging
Reviewer's Guide
Integrates a new FastAPI-based dataset insight chatbot microservice and wires it into the existing Node/React app, adds role- and intent-aware logging for chat activity (including data modifications), enhances employee chat UX with model selection, dataset metadata, and access request flows, and extends the admin UI with chatbot history and dataset access approvals.
Sequence diagram for employee chat query through Node to FastAPI chatbot
sequenceDiagram
actor Employee
participant Browser as EmployeeChatPage
participant Api as askQuery_api
participant NodeRoute as Node_/query_route
participant ChatCtrl as ChatController_askQuestion
participant FastAPI as FastAPI_internal_query
participant ActCtrl as ActivityController_logQueryActivity
participant DB as Postgres_DB
Employee->>Browser: Type message and click Send
Browser->>Browser: handleSend(text)
Browser->>Api: askQuery(datasetId, text, selectedModel, 'employee', isApproved)
Api->>NodeRoute: POST /query {datasetId, question, model, portal, isApproved}
NodeRoute->>ChatCtrl: askQuestion(req, res)
ChatCtrl->>DB: Resolve dataset dir for datasetId
DB-->>ChatCtrl: datasetDir
ChatCtrl->>FastAPI: POST /internal/query {dataset_id, file_dir_path, question, model, role}
Note right of FastAPI: Extract schema
FastAPI->>FastAPI: call_llm() to generate pandas code
FastAPI->>FastAPI: safe_execute(code, df)
FastAPI->>FastAPI: call_llm() to summarize result
FastAPI-->>ChatCtrl: {answer, code, intent, confidence}
ChatCtrl-->>NodeRoute: JSON {success, answer, code, intent}
NodeRoute->>ActCtrl: logQueryActivity(userId, userName, userEmail, datasetId, datasetName, queryText, status, duration, intent)
ActCtrl->>DB: INSERT activity log (event_type QUERY or MODIFY)
DB-->>ActCtrl: ok
NodeRoute-->>Api: HTTP 200 {answer, code, intent}
Api-->>Browser: response
Browser->>Browser: buildResponseHTML({text: answer, code})
Browser-->>Employee: Render bot message and optional code
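The logging step in the diagram stores each chat as either a QUERY or a MODIFY event. A minimal sketch of that mapping follows, assuming the FastAPI service reports `intent` as the string `modify` or `insight` (the two values named in the class diagram). `eventTypeForIntent` is a hypothetical helper name, not a function taken from this PR.

```javascript
// Hypothetical helper: map the chatbot's reported intent to an activity-log event type.
// Assumption: intent is 'modify' or 'insight'; anything else is logged as a read-only QUERY.
function eventTypeForIntent(intent) {
  const normalized = String(intent ?? '').trim().toLowerCase();
  return normalized === 'modify' ? 'MODIFY' : 'QUERY';
}

console.log(eventTypeForIntent('modify'));
console.log(eventTypeForIntent('insight'));
```

Defaulting unknown or missing intents to QUERY keeps a malformed response from being recorded as a data modification; whether that default suits the audit requirements here is a judgment call for the authors.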
Class diagram for key chatbot-related components and models
classDiagram
class EmployeeChatPage {
+datasetMeta
+selectedModel
+accessRequests
+handleSend(text)
+handleRequestAccess(id)
}
class ApiService {
+askQuery(datasetId, question, model, portal, isApproved)
+getDatasetAnalysis(datasetId)
}
class ChatController {
+askQuestion(req, res)
}
class InternalQueryRequest {
+string dataset_id
+string file_dir_path
+string question
+string model
+string role
}
class FastAPIApp {
+query_dataset(req, current_user)
+internal_query(req)
+extract_schema(df)
+safe_execute(code, df)
+call_llm(model, system, user)
+role_can(role, action)
}
class ActivityController {
+logQueryActivity(userId, userName, userEmail, datasetId, datasetName, query, status, durationSeconds, intent)
}
class LogsPage {
+activeTab
+ChatbotHistoryView(logs)
+getMethodClass(event)
+getEventIcon(event)
}
class PermissionPage {
+datasetAccessRequests
+handleDatasetAccessAction(id, newStatus)
}
EmployeeChatPage --> ApiService : uses
EmployeeChatPage --> LogsPage : generates QUERY
EmployeeChatPage --> PermissionPage : shares datasetAccessRequests via localStorage
ApiService --> ChatController : calls /query
ChatController --> FastAPIApp : axios POST /internal/query
FastAPIApp --> InternalQueryRequest : accepts
FastAPIApp --> ActivityController : supplies intent modify/insight
ActivityController --> LogsPage : event_type QUERY or MODIFY
PermissionPage --> EmployeeChatPage : updates accessRequests state
Hey - I've found 4 issues, and left some high level feedback:
- The dataset access approval flow relies entirely on `localStorage` (`datasetAccessRequests`) shared between the employee and admin UIs, which won’t work across different browsers/devices and is trivial to tamper with on the client; consider moving this to a backend-backed permission request model instead of treating localStorage as an authority of record.
- The new `scratch-test.js` script in `backend-node` looks like a local debugging helper and is not referenced anywhere; consider removing it from the repo or moving it under a dedicated tooling/dev directory to avoid confusion.
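One possible shape for the backend-backed permission request model mentioned above is sketched here. It is an in-memory stand-in only: `createAccessRequestStore` and its method names are hypothetical, and a real version would presumably persist to the existing Postgres database behind authenticated routes.

```javascript
// Hypothetical in-memory stand-in for a persisted access-request table.
// Assumption: requests are keyed by (userId, datasetId); names are illustrative only.
function createAccessRequestStore() {
  const requests = new Map(); // `${userId}:${datasetId}` -> 'pending' | 'approved' | 'denied'
  const keyFor = (userId, datasetId) => `${userId}:${datasetId}`;

  return {
    // Employee side: record a pending request (no-op if one already exists).
    request(userId, datasetId) {
      const key = keyFor(userId, datasetId);
      if (!requests.has(key)) requests.set(key, 'pending');
      return requests.get(key);
    },
    // Admin side: only an actor whose server-side role is 'admin' may decide.
    decide(actor, userId, datasetId, status) {
      if (actor?.role !== 'admin') throw new Error('forbidden');
      if (status !== 'approved' && status !== 'denied') throw new Error('invalid status');
      const key = keyFor(userId, datasetId);
      if (!requests.has(key)) throw new Error('no such request');
      requests.set(key, status);
      return status;
    },
    // Server-side check, used instead of any flag sent by the browser.
    isApproved(userId, datasetId) {
      return requests.get(keyFor(userId, datasetId)) === 'approved';
    },
  };
}
```

With state held on the server, the employee and admin UIs can run in different browsers, and editing client storage no longer changes what the backend believes.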
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The dataset access approval flow relies entirely on `localStorage` (`datasetAccessRequests`) shared between the employee and admin UIs, which won’t work across different browsers/devices and is trivial to tamper with on the client; consider moving this to a backend-backed permission request model instead of treating localStorage as an authority of record.
- The new `scratch-test.js` script in `backend-node` looks like a local debugging helper and is not referenced anywhere; consider removing it from the repo or moving it under a dedicated tooling/dev directory to avoid confusion.
## Individual Comments
### Comment 1
<location path="frontend-react/src/pages/employee/EmployeeChatPage.jsx" line_range="150-159" />
<code_context>
+ const handleRequestAccess = (id) => {
</code_context>
<issue_to_address>
**issue (bug_risk):** Access request polling interval is never cleaned up, which can leak timers across the session.
Each `handleRequestAccess` call creates a `setInterval` that runs indefinitely unless the request is approved or removed from `localStorage`, so timers can keep firing after the component unmounts or the user navigates away. Please tie the polling lifecycle to React (e.g., store the interval ID and clear it in a `useEffect` cleanup) or move to a single shared interval keyed by dataset/user instead of per-request intervals.
</issue_to_address>
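A minimal sketch of the cleanup this comment asks for, written as a framework-agnostic helper so the lifecycle is easy to see. `createAccessPoller`, `readStatus`, and `onApproved` are hypothetical names, and the `timers` parameter exists only so the timer functions can be swapped out in tests.

```javascript
// Hypothetical helper: start one poller and return a function that always stops it.
// Assumption: readStatus() returns the current request status, e.g. 'requested' or 'approved'.
function createAccessPoller({ readStatus, onApproved, intervalMs = 3000, timers = globalThis }) {
  let timerId = null;

  const stop = () => {
    if (timerId !== null) {
      timers.clearInterval(timerId);
      timerId = null;
    }
  };

  const tick = () => {
    if (readStatus() === 'approved') {
      stop(); // stop first so onApproved cannot be called twice
      onApproved();
    }
  };

  timerId = timers.setInterval(tick, intervalMs);
  return stop;
}

// Possible use inside a React component (not executed here):
// useEffect(() => {
//   const stop = createAccessPoller({ readStatus, onApproved });
//   return stop; // React calls this on unmount, clearing the interval
// }, [datasetId]);
```

Returning `stop` from the effect ties the interval to the component's lifetime, so navigating away clears it even when the request is never approved.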
### Comment 2
<location path="backend-node/src/controllers/chatController.js" line_range="63-64" />
<code_context>
+ dataset_id: datasetId,
+ file_dir_path: datasetDir,
+ question: queryText,
+ model: model || "groq",
+ role: req.body.isApproved ? 'admin' : (req.body.portal === 'employee' ? 'employee' : (req.user?.role || "viewer"))
+ };
</code_context>
<issue_to_address>
**🚨 issue (security):** Trusting `isApproved` from the client to escalate role to admin is a privilege escalation risk.
Because `role` is computed directly from `req.body.isApproved`, a client can POST `{ isApproved: true }` to `/query` and obtain `admin` privileges in the Python service, bypassing any UI checks. Since `internal/query` relies solely on `role` to gate modification actions, this is a privilege‑escalation vulnerability. Please derive `role` and permissions only from server‑side state (e.g., `req.user` and persisted permissions) and pass a backend‑controlled `role` value that cannot be influenced by client JSON.
</issue_to_address>
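A sketch of deriving `role` from server-side state only, as suggested above. `resolveRole` and `buildInternalPayload` are hypothetical helpers, and `isApproved(userId, datasetId)` is assumed to read persisted approval state on the server; the payload field names follow the snippet quoted in this comment.

```javascript
// Hypothetical: decide the role from the authenticated user and persisted approvals.
// Assumption: isApproved(userId, datasetId) consults server-side storage, never the request body.
function resolveRole(user, datasetId, isApproved) {
  if (!user || user.id == null) return 'viewer';
  if (isApproved(user.id, datasetId)) return 'admin';
  return user.role || 'viewer';
}

// Build the payload for the Python service; req.body.isApproved and req.body.portal are ignored.
function buildInternalPayload(req, datasetDir, isApproved) {
  const { datasetId, question, model } = req.body;
  return {
    dataset_id: datasetId,
    file_dir_path: datasetDir,
    question,
    model: model || 'groq',
    role: resolveRole(req.user, datasetId, isApproved),
  };
}
```

Because nothing in `req.body` feeds into `role`, a client that posts `{ isApproved: true }` gets the same role it would have received without the flag.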
### Comment 3
<location path="dataset-insight-chatbot/backend/Dockerfile" line_range="12" />
<code_context>
+
+EXPOSE 8000
+
+CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--reload"]
</code_context>
<issue_to_address>
**suggestion (performance):** Running uvicorn with `--reload` inside the container is not ideal for non-dev deployments.
`--reload` adds a file watcher that increases overhead and can behave unpredictably with container filesystems. For non-dev images, drop `--reload` and either run a single-process uvicorn or use a process manager (e.g., gunicorn with uvicorn workers). If this is dev-only, consider documenting that or using a separate `Dockerfile.dev`.
Suggested implementation:
```
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```
1. If this image is intended for development, consider adding a separate `Dockerfile.dev` that keeps `--reload` and documenting in the README which Dockerfile to use for dev vs prod.
2. For production, you may also want to consider a process manager (e.g. `gunicorn -k uvicorn.workers.UvicornWorker`) if you need multiple workers or more advanced process supervision.
</issue_to_address>
### Comment 4
<location path="dataset-insight-chatbot/README.md" line_range="140-7" />
<code_context>
+
+### POST /upload
+
+```json
+// Response
+{
+ "dataset_id": "sales.csv",
+ "schema": {
+ "row_count": 5000,
+ "columns": {
+ "revenue": {
+ "type": "numeric",
+ "min": 100, "max": 95000, "mean": 4820.5, ...
+ },
+ "region": {
+ "type": "categorical",
+ "unique_count": 4,
+ "top_values": ["North", "South", "East", "West"]
+ }
+ }
+ }
+}
+```
+
+### POST /query
</code_context>
<issue_to_address>
**suggestion:** JSON examples include comments/ellipsis but are fenced as strict JSON
In the `/upload` and `/query` sections, the fenced `json` examples contain `//` comments and `...`, which aren’t valid JSON. To avoid confusing users who might paste them into tools, consider changing the fence (e.g., to `jsonc`), or making the examples valid JSON by removing comments/ellipsis or clearly marking them as illustrative only.
Suggested implementation:
````
### POST /upload

```jsonc
// Example response
{
  "dataset_id": "sales.csv",
  "schema": {
    "row_count": 5000,
    "columns": {
      "revenue": {
        "type": "numeric",
        "min": 100,
        "max": 95000,
        "mean": 4820.5
        // ...additional numeric stats
      },
      "region": {
        "type": "categorical",
        "unique_count": 4,
        "top_values": ["North", "South", "East", "West"]
        // ...additional categorical stats
      }
    }
  }
}
```
````
````
### POST /query

```jsonc
// Example request
{
  "dataset_id": "sales.csv",
  "question": "Which region has the highest average revenue?",
  "model": "ollama" // or "claude"
}
// Example response
```
````
</issue_to_address>
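To illustrate why the fence label matters, here is a small check showing that a strict JSON parser rejects the commented form. `parsesAsStrictJson` is a hypothetical helper; the sample strings are trimmed from the request example in the suggestion.

```javascript
// Hypothetical helper: report whether text is accepted by a strict JSON parser.
function parsesAsStrictJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (err) {
    return false;
  }
}

// Same object, with and without the trailing // comment.
const withComment = '{\n  "model": "ollama" // or "claude"\n}';
const withoutComment = '{\n  "model": "ollama"\n}';

console.log(parsesAsStrictJson(withComment));    // false
console.log(parsesAsStrictJson(withoutComment)); // true
```

A reader who pastes the commented example into a tool built on a strict parser would hit the same failure, which is the confusion the `jsonc` label is meant to prevent.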
Diff context for Comment 1 (frontend-react/src/pages/employee/EmployeeChatPage.jsx, lines 150-159):
const handleRequestAccess = (id) => {
  setAccessRequests(prev => ({ ...prev, [id]: 'requested' }));

  // Save to localStorage so admin can see it
  const reqs = JSON.parse(localStorage.getItem('datasetAccessRequests') || '[]');
  const user = localStorage.getItem('userName') || 'Employee User';
  const email = localStorage.getItem('email') || 'employee@datainsights.app';
  reqs.push({
    id,
    user,
…individual employees
…anual dataset cleaning
Summary by Sourcery
Integrate a new FastAPI-based Dataset Insight Chatbot backend and UI, wire it into the existing Node/React app for secure, logged data querying and modification, and enhance admin/employee flows with chatbot activity views, dataset metadata, access requests, and permission-aware model selection.