Intuitive dashboard for managing your RAG system with real-time document processing and search capabilities
LangConnect-Client is a comprehensive RAG (Retrieval-Augmented Generation) client application built with Streamlit. It provides a user-friendly interface for interacting with the LangConnect API, enabling document management and vector search powered by PostgreSQL with the pgvector extension.
Manage your collections with instant deletion capabilities. Create and organize document collections with metadata support and view detailed statistics.
View and manage your documents with an intuitive interface. See document-level statistics including chunk counts and total characters, with multi-select deletion capabilities.
Upload multiple documents in various formats (PDF, TXT, MD, DOCX) with automatic metadata generation and customization options.
Perform advanced searches with multiple search types (semantic, keyword, hybrid). Filter by metadata and view results with relevance scores and source information.
Deep dive into document chunks with powerful filtering capabilities. View detailed chunk information including content previews, character counts, and metadata.
- Features
- Architecture
- Getting Started
- API Documentation
- Streamlit Application
- MCP (Model Context Protocol) Server
- Environment Variables
- Testing
- Security
- License
- FastAPI-based REST API with automatic documentation
- Supabase Authentication for secure user management
- PostgreSQL with pgvector for efficient vector storage and similarity search
- Multi-format Document Support (PDF, TXT, MD, DOCX)
- Advanced Search Capabilities:
  - Semantic search (vector similarity)
  - Keyword search (full-text)
  - Hybrid search (combination of both)
  - Metadata filtering
- Streamlit Web Interface for easy interaction
- MCP Server Integration for AI assistant tools
- Docker Support for easy deployment
```
┌──────────────────┐      ┌──────────────────┐      ┌──────────────────┐
│   Streamlit UI   │─────▶│  FastAPI Server  │─────▶│    PostgreSQL    │
│    (Frontend)    │      │  (Backend API)   │      │    + pgvector    │
└──────────────────┘      └──────────────────┘      └──────────────────┘
                                   │
                                   ▼
                          ┌──────────────────┐
                          │  Supabase Auth   │
                          └──────────────────┘
```
Get up and running in just 3 steps:
```bash
# 1. Clone the repository
git clone https://github.com/teddynote-lab/LangConnect-Client.git
cd LangConnect-Client

# 2. Copy environment variables and configure
cp .env.example .env
# Edit the .env file to add your SUPABASE_URL and SUPABASE_KEY

# 3. Start all services
docker compose up -d
```

Once started, you can access:
- Streamlit UI: http://localhost:8501
- API Documentation: http://localhost:8080/docs
- Health Check: http://localhost:8080/health
To stop all services:
```bash
docker compose down
```

Prerequisites:

- Docker and Docker Compose
- Python 3.11 or higher
- Supabase account (for authentication)
1. Clone the repository:

   ```bash
   git clone https://github.com/teddynote-lab/LangConnect-Client.git
   cd LangConnect-Client
   ```

2. Create a `.env` file with your configuration:

   ```bash
   # Supabase Configuration
   SUPABASE_URL=https://your-project.supabase.co
   SUPABASE_KEY=your-anon-key

   # Database Configuration
   POSTGRES_HOST=postgres
   POSTGRES_PORT=5432
   POSTGRES_USER=postgres
   POSTGRES_PASSWORD=postgres
   POSTGRES_DB=postgres

   # Authentication
   IS_TESTING=false  # Set to true to disable authentication
   ```
3. Start the services:

   ```bash
   docker-compose up -d
   ```

   This will start:
   - PostgreSQL database with the pgvector extension
   - LangConnect API service on http://localhost:8080
   - Streamlit UI on http://localhost:8501

4. Access the services:
   - API documentation: http://localhost:8080/docs
   - Streamlit UI: http://localhost:8501
   - Health check: http://localhost:8080/health
For local development without Docker:
1. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

2. Start PostgreSQL with pgvector (or use Docker for just the database):

   ```bash
   docker-compose up -d postgres
   ```

3. Run the API server:

   ```bash
   uvicorn langconnect.server:app --reload --host 0.0.0.0 --port 8080
   ```

4. Run the Streamlit app:

   ```bash
   streamlit run Main.py
   ```
The API provides comprehensive endpoints for managing collections and documents. Full interactive documentation is available at http://localhost:8080/docs when the service is running.
All API endpoints (except `/health` and `/auth/*`) require authentication when `IS_TESTING=false`.
| Method | Endpoint | Description | Request Body |
|---|---|---|---|
| POST | `/auth/signup` | Create a new user account | `{"email": "user@example.com", "password": "password123"}` |
| POST | `/auth/signin` | Sign in with existing account | `{"email": "user@example.com", "password": "password123"}` |
| POST | `/auth/signout` | Sign out (client-side cleanup) | - |
| POST | `/auth/refresh` | Refresh access token | Query param: `refresh_token` |
| GET | `/auth/me` | Get current user info | - (requires auth) |
Include the access token in requests:
```
Authorization: Bearer your-access-token
```
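As a minimal sketch, an authenticated call can be assembled with only the Python standard library; the base URL matches the defaults above, while the token value is a placeholder you would obtain from `/auth/signin`:

```python
import urllib.request

BASE_URL = "http://localhost:8080"
access_token = "your-access-token"  # placeholder; get a real token via POST /auth/signin

# Build a request to /auth/me carrying the bearer token.
req = urllib.request.Request(
    f"{BASE_URL}/auth/me",
    headers={"Authorization": f"Bearer {access_token}"},
)

# urllib.request.urlopen(req) would perform the call once the stack is running.
print(req.get_header("Authorization"))  # → Bearer your-access-token
```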
Collections are containers for organizing related documents.
| Method | Endpoint | Description | Request Body |
|---|---|---|---|
| GET | `/collections` | List all collections | - |
| POST | `/collections` | Create a new collection | `{"name": "collection-name", "metadata": {}}` |
| GET | `/collections/{collection_id}` | Get collection details | - |
| PATCH | `/collections/{collection_id}` | Update collection | `{"name": "new-name", "metadata": {}}` |
| DELETE | `/collections/{collection_id}` | Delete collection | - |
Documents are stored as chunks with embeddings for efficient search.
| Method | Endpoint | Description | Parameters |
|---|---|---|---|
| GET | `/collections/{collection_id}/documents` | List documents | `limit`, `offset` |
| POST | `/collections/{collection_id}/documents` | Upload documents | Form data: `files[]`, `metadatas_json` |
| DELETE | `/collections/{collection_id}/documents/{document_id}` | Delete document | Query param: `delete_by` (`document_id`/`file_id`) |
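The upload endpoint takes multipart form data: one or more `files[]` entries plus a `metadatas_json` field. A small sketch of how that field can be built (the metadata keys `source` and `type` here are illustrative, not mandated by the API):

```python
import json

# One metadata dict per uploaded file, serialized as a single JSON string
# for the metadatas_json form field.
file_paths = ["research_paper.pdf", "notes.md"]
metadatas = [{"source": path, "type": "upload"} for path in file_paths]
metadatas_json = json.dumps(metadatas)

print(metadatas_json)
```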
Advanced search capabilities within collections.
| Method | Endpoint | Description | Request Body |
|---|---|---|---|
| POST | `/collections/{collection_id}/documents/search` | Search documents | `{"query": "search text", "limit": 10, "search_type": "semantic", "filter": {}}` |
Search types:
- `semantic`: Vector similarity search using embeddings
- `keyword`: Traditional full-text search
- `hybrid`: Combination of semantic and keyword search
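A hybrid search request can be assembled from the documented body fields as follows; this is a standard-library sketch, and the base URL, token, and collection id are placeholders:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"
collection_id = "your-collection-id"  # placeholder
access_token = "your-access-token"    # placeholder

# Body fields per the search table: query, limit, search_type, filter.
payload = {
    "query": "machine learning",
    "limit": 5,
    "search_type": "hybrid",
    "filter": {"source": "research_paper.pdf", "type": "academic"},
}

req = urllib.request.Request(
    f"{BASE_URL}/collections/{collection_id}/documents/search",
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {access_token}",
    },
    method="POST",
)
# urllib.request.urlopen(req) would return the ranked results once the API is up.
```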
Filter example:
```json
{
  "query": "machine learning",
  "limit": 5,
  "search_type": "hybrid",
  "filter": {
    "source": "research_paper.pdf",
    "type": "academic"
  }
}
```

The Streamlit application provides a user-friendly web interface for interacting with the LangConnect API.
- User Authentication: Sign up, sign in, and persistent sessions
- Collections Management: Create, view, and delete collections
- Document Upload: Batch upload documents with metadata
- Document Management: View, search, and delete documents
- Advanced Search: Semantic, keyword, and hybrid search with filters
- API Testing: Built-in API tester for development
1. `Main.py` - Landing page with project overview and navigation
   - Project information and features
   - Quick links to all pages
   - User authentication status
2. Collections Page (`pages/1_Collections.py`)
   - List Tab: View all collections with document/chunk counts
   - Create Tab: Create new collections with metadata
   - Multi-select for bulk deletion
   - Automatic statistics calculation
3. Documents Page (`pages/2_Documents.py`)
   - Upload Tab:
     - Batch file upload (PDF, TXT, MD, DOCX)
     - Automatic metadata generation
     - Custom metadata support
   - List Tab:
     - Document-level view grouped by file_id
     - Total character and chunk counts per document
     - Multi-select deletion by file_id
   - Chunk Tab:
     - Individual chunk viewing with content preview
     - Multi-select source filtering
     - Character count display
     - Chunk-level deletion
4. Search Page (`pages/3_Search.py`)
   - Collection selection
   - Search type selection (semantic/keyword/hybrid)
   - Metadata filtering with JSON
   - Result display with relevance scores
   - Source preview for filtering
5. API Tester Page (`pages/4_API_Tester.py`)
   - Interactive API endpoint testing
   - Grouped by functionality
   - Request/response visualization
   - Authentication token management
The Streamlit app supports persistent authentication through:
1. Automatic File-Based Storage (Default)
   - Tokens saved to `~/.langconnect_auth_cache`
   - Valid for 7 days
   - Automatically loads on restart

2. Environment Variables (Optional)

   ```bash
   LANGCONNECT_TOKEN=your-access-token
   LANGCONNECT_EMAIL=your-email@example.com
   ```
LangConnect includes two MCP server implementations that allow AI assistants like Claude to interact with your document collections programmatically:
- Standard MCP Server - Uses stdio transport for direct integration
- SSE MCP Server - Uses Server-Sent Events for web-based integration (runs on port 8765)
The MCP server provides 9 tools for comprehensive document management:
- search_documents - Perform semantic, keyword, or hybrid search
- list_collections - List all available collections
- get_collection - Get details about a specific collection
- create_collection - Create a new collection
- delete_collection - Delete a collection and its documents
- list_documents - List documents in a collection
- add_documents - Add text documents with metadata
- delete_document - Delete specific documents
- get_health_status - Check API health
First, you need to authenticate and generate the MCP configuration:
```bash
# Generate MCP configuration with automatic authentication
uv run mcp/create_mcp_json.py
```

This command will:

- Prompt for your email and password
- Automatically obtain a Supabase access token
- Update the `.env` file with the new access token
- Generate `mcp/mcp_config.json` with all necessary settings
To run the MCP SSE server locally:
```bash
# Start the SSE server
uv run mcp/mcp_sse_server.py
```

The server will start on http://localhost:8765 and display:

- Server startup confirmation
- A note that it is for MCP clients only (not browser accessible)
You can test the MCP server using the MCP Inspector:
```bash
# Test with MCP Inspector
npx @modelcontextprotocol/inspector
```

In the Inspector:

- Select "SSE" as the transport type
- Enter http://localhost:8765 as the URL
- Connect and test the available tools
Simply copy the contents of the generated `mcp/mcp_config.json` file and paste it into your Claude Desktop MCP settings to start using LangConnect tools immediately.

Alternatively, you can manually configure the MCP server in `mcp/mcp_config.json`:
```json
{
  "mcpServers": {
    "langconnect-rag-mcp": {
      "command": "/path/to/python",
      "args": [
        "/path/to/langconnect/mcp/mcp_server.py"
      ],
      "env": {
        "API_BASE_URL": "http://localhost:8080",
        "SUPABASE_ACCESS_TOKEN": "your-jwt-token-here"
      }
    }
  }
}
```

For SSE transport (web-based integrations):
```json
{
  "mcpServers": {
    "langconnect-rag-sse": {
      "url": "http://localhost:8765",
      "transport": "sse"
    }
  }
}
```

The SSE server provides:
- Base URL: http://localhost:8765
- Transport: Server-Sent Events (SSE)
- CORS Support: Enabled for web integrations
Both MCP servers require Supabase JWT authentication. The easiest way to authenticate is to run the `mcp/create_mcp_json.py` script as described above, which will:
- Automatically obtain your access token
- Update your `.env` file
- Generate the MCP configuration
If you need to manually get your access token:
1. Sign in to the Streamlit UI at http://localhost:8501
2. Open browser Developer Tools (F12) → Application/Storage → Session Storage
3. Find and copy the `access_token` value
4. Set it as `SUPABASE_ACCESS_TOKEN` in your configuration

If the token expires, run `uv run mcp/create_mcp_json.py` again to get a fresh one.
- Get your Supabase access token (see Authentication above)
- Update the configuration file with your paths and token
- Add the configuration to Claude Desktop's MCP settings
- Claude will have access to all LangConnect tools for document management and search
Example usage in Claude:
- "Search for documents about machine learning in my research collection"
- "Create a new collection called 'Project Documentation'"
- "List all documents in the technical-specs collection"
| Variable | Description | Default | Required |
|---|---|---|---|
| Authentication | |||
| SUPABASE_URL | Supabase project URL | - | Yes |
| SUPABASE_KEY | Supabase anon key | - | Yes |
| IS_TESTING | Disable authentication for testing | false | No |
| LANGCONNECT_TOKEN | Persistent auth token | - | No |
| LANGCONNECT_EMAIL | Persistent auth email | - | No |
| Database | |||
| POSTGRES_HOST | PostgreSQL host | postgres | No |
| POSTGRES_PORT | PostgreSQL port | 5432 | No |
| POSTGRES_USER | PostgreSQL username | postgres | No |
| POSTGRES_PASSWORD | PostgreSQL password | postgres | No |
| POSTGRES_DB | PostgreSQL database name | postgres | No |
| API | |||
| API_BASE_URL | API base URL | http://localhost:8080 | No |
| MCP SSE Server | |||
| SSE_PORT | Port for MCP SSE server | 8765 | No |
| SUPABASE_ACCESS_TOKEN | JWT token from Supabase auth | - | Yes (for MCP) |
Run the test suite:
```bash
# Test authentication
python test_supabase_auth.py

# Test API endpoints
python test_auth.py
python test_retrieval_endpoint.py

# Test MCP functionality
python test_mcp_metadata.py
```

- Authentication: All API endpoints require valid JWT tokens (except health check)
- Token Management: Tokens expire and must be refreshed
- Environment Security: Never commit `.env` files or expose keys
- CORS: Configure allowed origins in production
- Database: Use strong passwords and restrict access
This project is licensed under the terms included in the repository.
Made with ❤️ by TeddyNote LAB