Description
Refactor and improve the table reference resolution system to leverage workspace schema configuration and provide a more robust, maintainable solution for handling references between tables. This enhancement moves critical logic to the backend and improves multi-user scenarios.
Problem
- Reference resolution between tables is complex and error-prone
- Multi-user scenarios are not handled well in the current implementation
- Current implementation mixes concerns between frontend and backend
- No clear ownership of reference resolution logic
- Limited integration with workspace schema configuration
- References are resolved client-side, leading to performance and consistency issues
Proposed Solution
- Leverage Workspace Schema Configuration: Use schema definitions to understand reference relationships
- Move Resolution to Backend: Implement server-side reference resolution using SchemaService
- Create Document-Centric APIs: Use DocumentService for cross-dataset reference resolution
- Enhance Frontend Components: Simplify frontend by using backend resolution services
- Improve Multi-User Support: Handle reference consistency across users
Implementation Details
Dependencies
- Workspace schema configuration infrastructure must be implemented
- SchemaService should provide reference resolution capabilities
- DocumentService should handle cross-dataset reference resolution
- TableField and TableQuestion types should be available ([Feature] Implement TableField type in backend #90, [Feature] Implement TableQuestion type and improve table answer handling #91)
Backend Changes
- Enhance SchemaService with reference resolution:

    from typing import Dict, List, Optional
    from uuid import UUID

    class SchemaService:
        def resolve_table_references(
            self, table_data: dict, workspace_id: UUID, user_id: Optional[UUID] = None
        ) -> dict:
            """Resolve all references in table data using workspace schema"""
            schema_config = self.get_workspace_schema_config(workspace_id)
            resolved_data = table_data.copy()

            # Each reference column gets a sibling "<column>_resolved" entry.
            for ref_column in self._get_reference_columns(table_data):
                ref_values = self._resolve_reference_column(
                    ref_column, table_data[ref_column], workspace_id, user_id
                )
                resolved_data[f"{ref_column}_resolved"] = ref_values

            return resolved_data

        def get_reference_schema_mapping(self, workspace_id: UUID) -> Dict[str, str]:
            """Get mapping of reference columns to their target schemas"""
            pass

        def validate_reference_consistency(
            self, table_data: dict, workspace_id: UUID
        ) -> List[ValidationError]:
            """Validate that all references point to valid records"""
            pass
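To make the intended output shape concrete, here is a minimal, self-contained sketch of column-wise resolution. The `REFERENCE_COLUMNS` mapping, the in-memory `RECORDS` store, and the `resolve_columns` helper are hypothetical stand-ins for the workspace schema configuration and the database lookups. Only the `<column>_resolved` naming is taken from the proposal above.

```python
from typing import Any, Dict, List

# Hypothetical schema mapping: reference column -> target schema name.
REFERENCE_COLUMNS: Dict[str, str] = {"study_ref": "studies"}

# Hypothetical in-memory stand-in for the records stored under each schema.
RECORDS: Dict[str, Dict[str, Dict[str, Any]]] = {
    "studies": {
        "S1": {"title": "Trial A"},
        "S2": {"title": "Trial B"},
    }
}


def resolve_columns(table_data: Dict[str, List[Any]]) -> Dict[str, List[Any]]:
    """Return a copy of table_data with a `<column>_resolved` entry per reference column."""
    resolved = dict(table_data)
    for column, target_schema in REFERENCE_COLUMNS.items():
        if column not in table_data:
            continue
        targets = RECORDS.get(target_schema, {})
        # Unknown references resolve to None instead of raising.
        resolved[f"{column}_resolved"] = [targets.get(ref) for ref in table_data[column]]
    return resolved


table = {"study_ref": ["S1", "S9"], "dose": [10, 20]}
result = resolve_columns(table)
```

In this sketch the input table is left unmodified and a dangling reference such as `"S9"` maps to `None`. Whether the real service should return `None`, raise, or report a validation error is a design decision the proposal leaves open.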
- Create DocumentService for cross-dataset references:

    from typing import Any, Dict, Optional
    from uuid import UUID

    class DocumentService:
        def resolve_document_references(
            self, document_ref: str, workspace_id: UUID, user_id: Optional[UUID] = None
        ) -> Dict[str, Any]:
            """Resolve all references for a complete document across datasets"""
            all_records = self.get_document_records(document_ref, workspace_id)
            schema_service = SchemaService(workspace_id)

            # Resolved table data is keyed by the schema name of each record.
            resolved_document = {}
            for record in all_records:
                if self._has_table_data(record):
                    resolved_data = schema_service.resolve_table_references(
                        record.table_data, workspace_id, user_id
                    )
                    resolved_document[record.schema_name] = resolved_data

            return resolved_document
- Add reference resolution API endpoints:

    from typing import Optional
    from uuid import UUID

    @router.post("/workspaces/{workspace_id}/tables/resolve-references")
    async def resolve_table_references(
        workspace_id: UUID,
        table_data: dict,
        user_id: Optional[UUID] = None,
        db: AsyncSession = Depends(get_async_db),
    ):
        schema_service = SchemaService(workspace_id, db)
        return schema_service.resolve_table_references(table_data, workspace_id, user_id)

    @router.get("/workspaces/{workspace_id}/documents/{reference}/resolved")
    async def get_resolved_document(
        workspace_id: UUID,
        reference: str,
        user_id: Optional[UUID] = None,
        db: AsyncSession = Depends(get_async_db),
    ):
        document_service = DocumentService(workspace_id, db)
        return document_service.resolve_document_references(reference, workspace_id, user_id)
- Enhance record APIs with reference context:
  - Include resolved reference data in record responses
  - Add reference validation before saving records
  - Provide reference metadata for frontend components
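The validation step before saving could look roughly like the following sketch. It assumes the set of valid reference values per column is already known (for example, loaded from the target schema's records). `ReferenceIssue` and `find_dangling_references` are hypothetical names, not part of the existing codebase.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Set


@dataclass
class ReferenceIssue:
    """One cell whose reference value does not match any known record."""
    column: str
    row: int
    value: Any


def find_dangling_references(
    table_data: Dict[str, List[Any]],
    valid_refs: Dict[str, Set[Any]],
) -> List[ReferenceIssue]:
    """List every cell in a reference column whose value is not a known reference."""
    issues: List[ReferenceIssue] = []
    for column, allowed in valid_refs.items():
        for row, value in enumerate(table_data.get(column, [])):
            if value not in allowed:
                issues.append(ReferenceIssue(column, row, value))
    return issues


issues = find_dangling_references(
    {"study_ref": ["S1", "S9"]},
    {"study_ref": {"S1", "S2"}},
)
```

A record API could reject the save when the returned list is non-empty, or attach the issues to the response as reference metadata for the frontend.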
Frontend Changes
- Refactor useReferenceTablesViewModel to use backend APIs:

    export const useReferenceTablesViewModel = (props: { tableJSON: TableData }) => {
      const { state: workspace } = useWorkspace();

      const resolveReferences = async (tableData: TableData): Promise<TableData> => {
        // Without a workspace there is nothing to resolve against.
        if (!workspace?.id) return tableData;

        const response = await documentService.resolveTableReferences(
          workspace.id,
          tableData.toJSON()
        );

        return new TableData(
          response.data,
          response.schema,
          response.reference
        );
      };

      const getResolvedDocument = async (reference: string): Promise<ResolvedDocument | null> => {
        if (!workspace?.id) return null;

        return await documentService.getResolvedDocument(workspace.id, reference);
      };

      return {
        resolveReferences,
        getResolvedDocument,
        // ... other methods simplified using backend APIs
      };
    };
- Simplify table rendering components:
  - Remove complex client-side reference resolution logic
  - Use resolved data from backend APIs
  - Add error handling for reference resolution failures
  - Implement caching for resolved references
- Enhance multi-user reference handling:
  - Display reference conflicts between users
  - Show resolution history and user context
  - Provide UI for reference conflict resolution
  - Enable collaborative reference editing
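The issue does not define what counts as a reference conflict. One possible reading, sketched here in Python on the assumption that detection happens in the backend, is that two users have entered different reference values for the same cell. The cell-key format (`"row1.study_ref"`) and the `find_reference_conflicts` helper are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set


def find_reference_conflicts(
    values_by_user: Dict[str, Dict[str, str]],
) -> Dict[str, Dict[str, Set[str]]]:
    """Map each cell key to {value: users} for cells where users disagree."""
    by_cell: Dict[str, Dict[str, Set[str]]] = defaultdict(lambda: defaultdict(set))
    for user, cells in values_by_user.items():
        for cell, value in cells.items():
            by_cell[cell][value].add(user)
    # A cell is in conflict when more than one distinct value was entered.
    return {cell: dict(values) for cell, values in by_cell.items() if len(values) > 1}


conflicts = find_reference_conflicts({
    "alice": {"row0.study_ref": "S1", "row1.study_ref": "S2"},
    "bob": {"row0.study_ref": "S1", "row1.study_ref": "S3"},
})
```

The returned mapping keeps the user context for each competing value, which is the information a conflict-resolution UI would need to display.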
- Improve reference management UI:

    // New component: ReferenceResolver.vue
    export default {
      props: {
        tableData: Object,
        workspaceId: String,
      },
      data() {
        return {
          resolvedData: null,
          loading: false,
          errors: [],
        };
      },
      async mounted() {
        await this.resolveReferences();
      },
      methods: {
        async resolveReferences() {
          this.loading = true;
          try {
            this.resolvedData = await documentService.resolveTableReferences(
              this.workspaceId,
              this.tableData
            );
          } catch (error) {
            this.errors.push(error.message);
          } finally {
            this.loading = false;
          }
        },
      },
    };
Performance and Caching
- Implement reference resolution caching:
  - Cache resolved references at the workspace level
  - Invalidate cache when referenced records change
  - Use Redis or similar for distributed caching
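The caching behaviour described above can be illustrated with a small in-memory sketch. `WorkspaceReferenceCache` is a hypothetical class. A distributed store such as Redis would replace the nested dictionary, and the `invalidate` call is assumed to be triggered by whatever code path updates a referenced record.

```python
from typing import Any, Callable, Dict, List
from uuid import UUID, uuid4


class WorkspaceReferenceCache:
    """In-memory cache of resolved references, partitioned by workspace."""

    def __init__(self) -> None:
        self._entries: Dict[UUID, Dict[str, Any]] = {}

    def get_or_resolve(
        self, workspace_id: UUID, reference: str, resolver: Callable[[str], Any]
    ) -> Any:
        """Return the cached value, calling resolver only on a cache miss."""
        workspace_entries = self._entries.setdefault(workspace_id, {})
        if reference not in workspace_entries:
            workspace_entries[reference] = resolver(reference)
        return workspace_entries[reference]

    def invalidate(self, workspace_id: UUID, reference: str) -> None:
        """Drop one cached reference, e.g. after the referenced record changed."""
        self._entries.get(workspace_id, {}).pop(reference, None)


calls: List[str] = []


def slow_resolver(reference: str) -> dict:
    """Stand-in for an expensive lookup; records each call it receives."""
    calls.append(reference)
    return {"reference": reference, "version": len(calls)}


cache = WorkspaceReferenceCache()
ws = uuid4()
first = cache.get_or_resolve(ws, "S1", slow_resolver)
second = cache.get_or_resolve(ws, "S1", slow_resolver)  # cache hit, no resolver call
cache.invalidate(ws, "S1")
third = cache.get_or_resolve(ws, "S1", slow_resolver)  # resolved again after invalidation
```

Keying by workspace first keeps entries from different workspaces isolated and makes it cheap to drop everything for one workspace if its schema configuration changes.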
- Optimize reference queries:
  - Batch reference resolution requests
  - Use database joins for efficient reference lookup
  - Implement lazy loading for large reference datasets
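Batching can be sketched as follows: collect the distinct reference values, fetch them in a single call, and map the results back onto the original order. `resolve_in_batch` and `fake_fetch_many` are hypothetical. In the real service the fetch would presumably be one database query (for example, an `IN` filter or a join) rather than a dictionary lookup.

```python
from typing import Any, Callable, Dict, Iterable, List


def resolve_in_batch(
    references: Iterable[str],
    fetch_many: Callable[[List[str]], Dict[str, Any]],
) -> List[Any]:
    """Resolve references with one fetch for the unique values, preserving input order."""
    refs = list(references)
    unique = list(dict.fromkeys(refs))  # de-duplicate while keeping first-seen order
    found = fetch_many(unique)
    # References missing from the fetch result resolve to None.
    return [found.get(ref) for ref in refs]


fetch_calls: List[List[str]] = []


def fake_fetch_many(keys: List[str]) -> Dict[str, Any]:
    """Stand-in for a single bulk query; records the keys of each call."""
    fetch_calls.append(keys)
    store = {"S1": "Trial A", "S2": "Trial B"}
    return {key: store[key] for key in keys if key in store}


resolved = resolve_in_batch(["S1", "S2", "S1", "S9"], fake_fetch_many)
```

Here four references trigger a single fetch for three unique keys, instead of one lookup per cell.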
Related Files
- extralit/argilla-server/src/argilla_server/services/SchemaService.py - Enhanced reference resolution
- extralit/argilla-server/src/argilla_server/services/DocumentService.py - Cross-dataset reference handling
- extralit/argilla-server/src/argilla_server/api/handlers/v1/references/ - New reference endpoints
- extralit/argilla-frontend/components/base/base-render-table/useReferenceTablesViewModel.ts - Simplified frontend logic
- extralit/argilla-frontend/components/features/reference-resolution/ - New reference UI components
- extralit/argilla-frontend/v1/infrastructure/services/DocumentService.ts - Document API client
Acceptance Criteria
- Reference resolution has clear ownership in the backend using SchemaService
- Workspace schema configuration drives reference resolution logic
- Backend provides efficient APIs for reference resolution with proper caching
- Frontend reference handling is simplified and uses backend APIs
- Multi-user scenarios are properly supported with conflict resolution
- Reference validation ensures data integrity across tables
- Performance is optimized with appropriate caching strategies
- Cross-dataset reference resolution works correctly
- UI provides clear feedback for reference resolution status and errors
- The system maintains backward compatibility with existing data
- Integration tests verify reference resolution functionality
- Error handling provides meaningful feedback to users
Related Issues
This is part of the strategic workspace-level schema management enhancement:
- Depends on: Workspace Schema Configuration Infrastructure
- Depends on: [Feature] Implement TableField type in backend #90 (TableField), [Feature] Implement TableQuestion type and improve table answer handling #91 (TableQuestion) for proper table types
- Related to: [Feature] Enhance table suggestion handling #93 (Table suggestion handling)
- Part of: [Refactor] Phase 3: Refactor and enhance table reference resolution #97 (Phase 3: Table reference resolution)
- Enables: Document-schema-fields table component