- Employee 'Request Admin Access' workflow with approval polling
- Admin PermissionPage to approve/deny dataset modification requests
- Backend role escalation for approved modification queries
- Dataset persistence: write modified DataFrames back to disk
- Activity logs: track QUERY vs MODIFY events with intent detection
- Chatbot History tab in admin Logs page grouped by employee
- Fix sidebar flex-shrink for side-by-side layout stability
- Fix userId reference (req.user.id) for reliable activity logging
Reviewer's Guide

Implements an internal FastAPI-based query engine with role-aware data modification, wires the existing Node/React app to it, enhances logging of query and cleaning activity (including a new MODIFY event type and employee log view), introduces dataset access request flows, and adds model selection and richer dataset context to the employee chat experience.

Sequence diagram for employee chat query via FastAPI engine

sequenceDiagram
actor Employee
participant ERChat as EmployeeChatPage
participant NodeAPI as Node_chatRoutes
participant ChatCtrl as chatController_askQuestion
participant FastAPI as FastAPI_internal_query
participant LLM as LLM_provider
Employee->>ERChat: Type question and click Send
ERChat->>ERChat: Read datasetId, selectedModel, userName
ERChat->>ERChat: Check localStorage datasetAccessRequests
ERChat->>NodeAPI: askQuery(datasetId, question, model, portal='employee', isApproved)
NodeAPI->>ChatCtrl: Forward POST /query body
ChatCtrl->>ChatCtrl: Resolve datasetDir for datasetId
ChatCtrl->>FastAPI: POST /internal/query {dataset_id, file_dir_path, question, model, role}
alt LLM allowed
FastAPI->>LLM: Generate pandas code from schema and question
LLM-->>FastAPI: pandas code
FastAPI->>FastAPI: Safe_execute code on DataFrame
FastAPI->>LLM: Summarize result to natural language
LLM-->>FastAPI: answer text
FastAPI-->>ChatCtrl: {answer, code, intent, confidence}
else Modify intent but role cannot_modify
FastAPI-->>ChatCtrl: {answer:'Access Denied', intent:'error'}
end
ChatCtrl-->>NodeAPI: JSON {success, answer, code, intent}
NodeAPI->>ERChat: Response propagated
ERChat->>ERChat: buildResponseHTML(text, code)
ERChat-->>Employee: Render AI message with optional code block and Access Denied CTA
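The Node-to-FastAPI hop in this diagram can be sketched as below. This is an illustrative sketch, not the PR's chatController: the engine URL, helper names, and use of Node 18+ global fetch are assumptions, while the payload keys mirror the POST /internal/query call shown above.

```javascript
// Hypothetical sketch of chatController forwarding a question to the
// FastAPI engine. FASTAPI_URL and the function names are assumptions;
// the payload keys mirror the POST /internal/query call in the diagram.
const FASTAPI_URL = process.env.FASTAPI_URL || 'http://localhost:8000';

// Build the body sent to POST /internal/query.
function buildInternalQueryPayload({ datasetId, fileDirPath, question, model, role }) {
  return {
    dataset_id: datasetId,
    file_dir_path: fileDirPath,
    question,
    model,
    role, // FastAPI uses this for the modify-permission check
  };
}

// Forward the query and return the engine's JSON:
// { answer, code, intent, confidence } on success, or
// { answer: 'Access Denied', intent: 'error' } when the role cannot modify.
async function forwardToQueryEngine(payload) {
  const res = await fetch(`${FASTAPI_URL}/internal/query`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload),
  });
  return res.json();
}
```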
Sequence diagram for dataset access request and approval flow

sequenceDiagram
actor Employee
participant ERChat as EmployeeChatPage
participant LocalStorage as browser_localStorage
actor Admin
participant AdminPerm as PermissionPage
Employee->>ERChat: Receive AI message with Access Denied
ERChat->>ERChat: Detect Access Denied text
Employee->>ERChat: Click Request Admin Access
ERChat->>LocalStorage: Read datasetAccessRequests
ERChat->>LocalStorage: Append {id, user, email, dataset, datasetId, time, status:'pending'}
ERChat->>ERChat: accessRequests[id]='requested'
loop Poll for approval
ERChat->>LocalStorage: Read datasetAccessRequests
LocalStorage-->>ERChat: Current requests array
ERChat->>ERChat: If my request status is approved then set accessRequests[id]='approved' and stop polling
end
Admin->>AdminPerm: Open PermissionPage
loop Poll localStorage
AdminPerm->>LocalStorage: Read datasetAccessRequests
LocalStorage-->>AdminPerm: Pending requests
AdminPerm->>AdminPerm: setDatasetAccessRequests(pending)
end
Admin->>AdminPerm: Click Grant Access for request id
AdminPerm->>LocalStorage: Update request.status='approved'
AdminPerm->>AdminPerm: Remove request from pending list
Note over ERChat,AdminPerm: Employee polling loop detects approved status
ERChat->>ERChat: accessRequests[id]='approved'
Employee->>ERChat: Retry query (now isApproved true)
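A minimal sketch of the request/poll helpers this diagram describes. The `datasetAccessRequests` key and record shape come from the diagram; the function names and the injectable `storage` parameter (a stand-in for `window.localStorage`, so the sketch runs outside a browser) are assumptions.

```javascript
// Minimal sketch of the localStorage-backed request flow in the diagram.
// `storage` stands in for window.localStorage (same getItem/setItem API).
const STORAGE_KEY = 'datasetAccessRequests';

// Read the current requests array, tolerating a missing or corrupt value.
function readRequests(storage) {
  try {
    return JSON.parse(storage.getItem(STORAGE_KEY)) || [];
  } catch {
    return [];
  }
}

// Employee side: append a new request with status 'pending'.
function appendRequest(storage, request) {
  const all = readRequests(storage);
  all.push({ ...request, status: 'pending' });
  storage.setItem(STORAGE_KEY, JSON.stringify(all));
}

// One polling tick: true once the given request has been approved.
function isApproved(storage, requestId) {
  return readRequests(storage).some(
    (r) => r.id === requestId && r.status === 'approved'
  );
}
```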
Class diagram for updated chat, logs, and permission components

classDiagram
class EmployeeChatPage {
+string selectedModel
+object selectedDataset
+object datasetMeta
+array messages
+object accessRequests
+function handleSend(text)
+function handleRequestAccess(id)
+function loadDatasets()
+function loadDatasetMeta(datasetId)
}
class EmployeeLogsView {
+array logs
+string searchTerm
+function formatDate(date)
+function formatDuration(seconds)
+function getMethodClass(eventType)
+function getEventIcon(eventType)
}
class LogsPage {
+array logs
+string activeTab
%% activeTab: system or employee_logs
+number totalLogins
+number totalCleans
+number totalModifies
+number totalQueries
+function fetchData()
+function clearFilters()
}
class PermissionPage {
+array users
+array pendingUsers
+array datasetAccessRequests
+function fetchData()
+function handleApprove(email)
+function handleReject(email)
+function handleDatasetAccessAction(id, newStatus)
}
class ApiService {
+function askQuery(datasetId, question, model, portal, isApproved)
+function getDatasetAnalysis(datasetId)
+function cleanDataset(datasetId, detail)
}
EmployeeChatPage --> ApiService : uses askQuery
EmployeeChatPage --> ApiService : uses getDatasetAnalysis
EmployeeChatPage --> ApiService : uses cleanDataset (via other pages)
LogsPage --> EmployeeLogsView : renders
PermissionPage --> PermissionPage : polls_localStorage_for_datasetAccessRequests
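As one plausible reading of the `ApiService` class above, `askQuery` might wrap a POST like the following. Only the parameter list comes from the class diagram; the base URL and the `/chat/query` path are assumptions.

```javascript
// Hypothetical ApiService.askQuery wrapper. API_BASE and the endpoint
// path are assumptions; the parameters match the class diagram.
const API_BASE = '/api';

// Split out request construction so it can be inspected/tested separately.
function buildAskQueryRequest(datasetId, question, model, portal = 'employee', isApproved = false) {
  return {
    url: `${API_BASE}/chat/query`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ datasetId, question, model, portal, isApproved }),
    },
  };
}

async function askQuery(...args) {
  const { url, options } = buildAskQueryRequest(...args);
  const res = await fetch(url, options); // browser or Node 18+ fetch
  return res.json(); // expected shape: { success, answer, code, intent }
}
```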
File-Level Changes
Hey - I've found 4 issues, and left some high level feedback:
- There are unresolved merge conflict markers in several files (e.g. `frontend-react/src/services/api.js`, `frontend-react/src/layout/EmployeeLayout.jsx`, `backend-node/src/controllers/uploadController.js`); these need to be resolved and the chosen storage/field names made consistent before merging.
- The dataset access request/approval flow is implemented entirely via `localStorage` on the client (`datasetAccessRequests` in EmployeeChatPage and PermissionPage), which is trivially spoofable and not multi-user safe; consider moving this workflow (storage, approvals, and polling) to backend APIs tied to authenticated users.
- The new FastAPI `dataset-insight-chatbot` service introduces a parallel auth and dataset handling stack; if it is intended to be the production path, consider aligning its models and permission checks more closely with the existing Node backend (e.g. dataset identifiers, roles, and logging) to avoid duplicated logic and inconsistent behavior between the two paths.
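As a starting point for the second point (moving the approval workflow server-side), a sketch like the following keeps request state in the backend, keyed to authenticated users. The store, function names, and status values are illustrative assumptions, not code from this PR; a Map stands in for a real database table.

```javascript
// Illustrative server-side replacement for the localStorage flow:
// requests live in backend state tied to authenticated user ids.
// A Map stands in for a real table; all names here are assumptions.
const requests = new Map();
let nextId = 1;

// Employee action: create a request tied to the authenticated user id
// (which a route handler would take from req.user.id, not the client).
function createRequest(userId, datasetId) {
  const request = { id: nextId++, userId, datasetId, status: 'pending' };
  requests.set(request.id, request);
  return request;
}

// Admin action: approve or deny a request by id.
function setRequestStatus(id, status) {
  const request = requests.get(id);
  if (request) request.status = status;
  return request;
}

// Admin view: what PermissionPage would fetch instead of polling localStorage.
function pendingRequests() {
  return [...requests.values()].filter((r) => r.status === 'pending');
}
```

An Express route layer would expose these behind the existing auth middleware, so the employee's identity comes from the session rather than client-supplied data.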
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- There are unresolved merge conflict markers in several files (e.g. `frontend-react/src/services/api.js`, `frontend-react/src/layout/EmployeeLayout.jsx`, `backend-node/src/controllers/uploadController.js`); these need to be resolved and the chosen storage/field names made consistent before merging.
- The dataset access request/approval flow is implemented entirely via `localStorage` on the client (`datasetAccessRequests` in EmployeeChatPage and PermissionPage), which is trivially spoofable and not multi-user safe; consider moving this workflow (storage, approvals, and polling) to backend APIs tied to authenticated users.
- The new FastAPI `dataset-insight-chatbot` service introduces a parallel auth and dataset handling stack; if it is intended to be the production path, consider aligning its models and permission checks more closely with the existing Node backend (e.g. dataset identifiers, roles, and logging) to avoid duplicated logic and inconsistent behavior between the two paths.
## Individual Comments
### Comment 1
<location path="frontend-react/src/services/api.js" line_range="33-39" />
<code_context>
<<<<<<< HEAD
      localStorage.setItem('role', user.role);
      localStorage.setItem('userName', user.name || user.full_name || 'User');
=======
      sessionStorage.setItem('role', user.role);
      sessionStorage.setItem('userName', user.name);
>>>>>>> upstream/feature
</code_context>
<issue_to_address>
**issue (bug_risk):** Resolve the merge conflict in getMe to use a single storage strategy for role/userName.
`getMe` still has merge conflict markers and two different storage strategies (localStorage vs sessionStorage), which will break the build and can desync auth state. Choose a single storage mechanism, remove the conflict markers, and verify that all callers (e.g., `EmployeeLayout` and access control) read/write from that same storage location.
</issue_to_address>
### Comment 2
<location path="frontend-react/src/layout/EmployeeLayout.jsx" line_range="20-29" />
<code_context>
<<<<<<< HEAD
  const [userName, setUserName] = useState(() => {
    const val = localStorage.getItem('userName');
    return (val && val !== 'undefined' && val !== 'null') ? val : 'Employee';
  });
  const safeName = userName || 'Employee';
  const userInitials = safeName.split(' ').map(n => n[0]).join('').toUpperCase().slice(0, 2);

  useEffect(() => {
    getMe().then(user => {
</code_context>
<issue_to_address>
**issue (bug_risk):** Clean up merge conflict and normalize how the employee name is sourced and stored.
This file still contains merge conflict markers and two different approaches to `userName` (localStorage + `getMe` vs sessionStorage), which will break the build and can desync identity/role handling from `getMe`. Resolve the conflict by choosing a single source of truth (e.g., always call `getMe`, then persist the result in one storage mechanism) and align usage with that convention across the app.
</issue_to_address>
### Comment 3
<location path="backend-node/src/controllers/uploadController.js" line_range="63-70" />
<code_context>
if (!companyId) throw new Error("No company in database environment");
+<<<<<<< HEAD
+ const fileHash = crypto.randomBytes(16).toString('hex');
+ const insertResult = await pool.query(
+ `INSERT INTO datasets (dataset_id, company_id, uploaded_by, dataset_name, name, hash, upload_status)
+=======
await pool.query(
`INSERT INTO datasets (dataset_id, company_id, uploaded_by, dataset_name, file_name, file_size, upload_status)
+>>>>>>> upstream/feature
VALUES ($1, $2, $3, $4, $5, $6, $7)`,
[
</code_context>
<issue_to_address>
**issue (bug_risk):** Fix the merge conflict in dataset insertion so column names and values line up correctly.
The `INSERT` statement still contains merge markers and two incompatible column lists: `(dataset_id, company_id, uploaded_by, dataset_name, name, hash, upload_status)` vs `(dataset_id, company_id, uploaded_by, dataset_name, file_name, file_size, upload_status)`. Resolve the conflict, decide whether you’re storing `file_hash` and/or `file_size`, choose the correct `name`/`file_name` column, and ensure the final column list and values match the actual `datasets` table schema.
</issue_to_address>
### Comment 4
<location path="backend-node/src/routes/chatRoutes.js" line_range="52-53" />
<code_context>
req.activityUserId,
- req.activityUserName || userEmail?.split('@')[0],
- userEmail,
+ req.activityUserName || req.activityUserEmail?.split('@')[0],
+ req.activityUserEmail,
req.activityDatasetId,
req.activityDatasetName || datasetId,
</code_context>
<issue_to_address>
**issue (bug_risk):** Use a defined email source when logging query activity instead of req.activityUserEmail.
`req.activityUserEmail` is never set in this route, while `userEmail` is derived from `req.user`. As a result, this change will log `undefined` for both email and name. Please keep `userEmail` as a fallback for both fields, e.g. `req.activityUserName || userEmail?.split('@')[0]` and `req.activityUserEmail || userEmail`.
</issue_to_address>
In `frontend-react/src/services/api.js` (getMe):

```js
<<<<<<< HEAD
localStorage.setItem('role', user.role);
localStorage.setItem('userName', user.name || user.full_name || 'User');
=======
sessionStorage.setItem('role', user.role);
sessionStorage.setItem('userName', user.name);
>>>>>>> upstream/feature
```
issue (bug_risk): Resolve the merge conflict in getMe to use a single storage strategy for role/userName.
getMe still has merge conflict markers and two different storage strategies (localStorage vs sessionStorage), which will break the build and can desync auth state. Choose a single storage mechanism, remove the conflict markers, and verify that all callers (e.g., EmployeeLayout and access control) read/write from that same storage location.
In `frontend-react/src/layout/EmployeeLayout.jsx`:

```jsx
<<<<<<< HEAD
const [userName, setUserName] = useState(() => {
  const val = localStorage.getItem('userName');
  return (val && val !== 'undefined' && val !== 'null') ? val : 'Employee';
});
const safeName = userName || 'Employee';
const userInitials = safeName.split(' ').map(n => n[0]).join('').toUpperCase().slice(0, 2);

useEffect(() => {
  getMe().then(user => {
```
issue (bug_risk): Clean up merge conflict and normalize how the employee name is sourced and stored.
This file still contains merge conflict markers and two different approaches to userName (localStorage + getMe vs sessionStorage), which will break the build and can desync identity/role handling from getMe. Resolve the conflict by choosing a single source of truth (e.g., always call getMe, then persist the result in one storage mechanism) and align usage with that convention across the app.
In `backend-node/src/controllers/uploadController.js`:

```js
<<<<<<< HEAD
const fileHash = crypto.randomBytes(16).toString('hex');
const insertResult = await pool.query(
  `INSERT INTO datasets (dataset_id, company_id, uploaded_by, dataset_name, name, hash, upload_status)
=======
await pool.query(
  `INSERT INTO datasets (dataset_id, company_id, uploaded_by, dataset_name, file_name, file_size, upload_status)
>>>>>>> upstream/feature
```
issue (bug_risk): Fix the merge conflict in dataset insertion so column names and values line up correctly.
The INSERT statement still contains merge markers and two incompatible column lists: (dataset_id, company_id, uploaded_by, dataset_name, name, hash, upload_status) vs (dataset_id, company_id, uploaded_by, dataset_name, file_name, file_size, upload_status). Resolve the conflict, decide whether you’re storing file_hash and/or file_size, choose the correct name/file_name column, and ensure the final column list and values match the actual datasets table schema.
In `backend-node/src/routes/chatRoutes.js`:

```js
req.activityUserName || req.activityUserEmail?.split('@')[0],
req.activityUserEmail,
```
issue (bug_risk): Use a defined email source when logging query activity instead of req.activityUserEmail.
req.activityUserEmail is never set in this route, while userEmail is derived from req.user. As a result, this change will log undefined for both email and name. Please keep userEmail as a fallback for both fields, e.g. req.activityUserName || userEmail?.split('@')[0] and req.activityUserEmail || userEmail.
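The suggested fallback can be captured as a small helper. The helper name is hypothetical, but the two expressions are exactly the ones the reviewer proposes.

```javascript
// Prefer the middleware-set activity fields, but fall back to the email
// derived from req.user so neither field logs undefined.
function activityIdentity(req, userEmail) {
  return {
    name: req.activityUserName || userEmail?.split('@')[0],
    email: req.activityUserEmail || userEmail,
  };
}
```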
Summary by Sourcery
Add internal dataset insight chatbot service, enhance query and logging pipeline, and improve admin/employee UX around logs, permissions, and dataset cleaning.
New Features:
Bug Fixes:
Enhancements:
Documentation: