LangConnect-Client

LangConnect Client Main Interface

Intuitive dashboard for managing your RAG system with real-time document processing and search capabilities

LangConnect-Client is a comprehensive RAG (Retrieval-Augmented Generation) client application built with Streamlit. It provides a user-friendly interface for interacting with the LangConnect API, enabling document management and vector search powered by PostgreSQL with the pgvector extension.

🎯 Key Features in Action

πŸ“š Collections Management

Collections Management Interface

Manage your collections with instant deletion capabilities. Create and organize document collections with metadata support and view detailed statistics.

πŸ“„ Document Management

Document List Interface

View and manage your documents with an intuitive interface. See document-level statistics including chunk counts and total characters, with multi-select deletion capabilities.

Document Upload Interface

Upload multiple documents in various formats (PDF, TXT, MD, DOCX) with automatic metadata generation and customization options.

πŸ” Vector Search

Vector Search Interface

Perform advanced searches with multiple search types (semantic, keyword, hybrid). Filter by metadata and view results with relevance scores and source information.

πŸ”¬ Chunk Investigation

Chunk Investigation Interface

Deep dive into document chunks with powerful filtering capabilities. View detailed chunk information including content previews, character counts, and metadata.

πŸ“‹ Table of Contents

  • Features
  • Architecture
  • Quick Start
  • Getting Started
  • API Documentation
  • Streamlit Application
  • MCP (Model Context Protocol) Server
  • Environment Variables
  • Testing
  • Security
  • License

Features

  • πŸš€ FastAPI-based REST API with automatic documentation
  • πŸ” Supabase Authentication for secure user management
  • 🐘 PostgreSQL with pgvector for efficient vector storage and similarity search
  • πŸ“„ Multi-format Document Support (PDF, TXT, MD, DOCX)
  • πŸ” Advanced Search Capabilities:
    • Semantic search (vector similarity)
    • Keyword search (full-text)
    • Hybrid search (combination of both)
    • Metadata filtering
  • 🎨 Streamlit Web Interface for easy interaction
  • πŸ€– MCP Server Integration for AI assistant tools
  • 🐳 Docker Support for easy deployment

Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Streamlit UI  │────▢│  FastAPI Server  │────▢│   PostgreSQL    β”‚
β”‚   (Frontend)    β”‚     β”‚  (Backend API)   β”‚     β”‚   + pgvector    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                               β”‚
                               β–Ό
                        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                        β”‚   Supabase Auth  β”‚
                        β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

πŸš€ Quick Start

Get up and running in just 3 steps:

# 1. Clone the repository
git clone https://github.com/teddynote-lab/LangConnect-Client.git
cd LangConnect-Client

# 2. Copy environment variables and configure
cp .env.example .env
# Edit .env file to add your SUPABASE_URL and SUPABASE_KEY

# 3. Start all services
docker compose up -d

Once started, you can access:

  • Streamlit UI: http://localhost:8501
  • API documentation: http://localhost:8080/docs

To stop all services:

docker compose down

Getting Started

Prerequisites

  • Docker and Docker Compose
  • Python 3.11 or higher
  • Supabase account (for authentication)

Running with Docker

  1. Clone the repository:

    git clone https://github.com/teddynote-lab/LangConnect-Client.git
    cd LangConnect-Client
  2. Create a .env file with your configuration:

    # Supabase Configuration
    SUPABASE_URL=https://your-project.supabase.co
    SUPABASE_KEY=your-anon-key
    
    # Database Configuration
    POSTGRES_HOST=postgres
    POSTGRES_PORT=5432
    POSTGRES_USER=postgres
    POSTGRES_PASSWORD=postgres
    POSTGRES_DB=postgres
    
    # Authentication
    IS_TESTING=false  # Set to true to disable authentication
  3. Start the services:

    docker-compose up -d

    This will start:

      • PostgreSQL with the pgvector extension
      • The FastAPI backend server (port 8080)
      • The Streamlit web interface (port 8501)

  4. Access the services:

      • Streamlit UI: http://localhost:8501
      • API documentation: http://localhost:8080/docs

Development Setup

For local development without Docker:

  1. Install dependencies:

    pip install -r requirements.txt
  2. Start PostgreSQL with pgvector (or use Docker for just the database):

    docker-compose up -d postgres
  3. Run the API server:

    uvicorn langconnect.server:app --reload --host 0.0.0.0 --port 8080
  4. Run the Streamlit app:

    streamlit run Main.py
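
To verify that the API server is up, you can query the unauthenticated /health endpoint (a minimal sanity check with curl; the exact response body may vary):

# Check API health (no token required)
curl -s http://localhost:8080/health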

API Documentation

The API provides comprehensive endpoints for managing collections and documents. Full interactive documentation is available at http://localhost:8080/docs when the service is running.

Authentication

All API endpoints (except /health and /auth/*) require authentication when IS_TESTING=false.

Authentication Endpoints

| Method | Endpoint | Description | Request Body |
|--------|----------|-------------|--------------|
| POST | /auth/signup | Create a new user account | {"email": "user@example.com", "password": "password123"} |
| POST | /auth/signin | Sign in with existing account | {"email": "user@example.com", "password": "password123"} |
| POST | /auth/signout | Sign out (client-side cleanup) | - |
| POST | /auth/refresh | Refresh access token | Query param: refresh_token |
| GET | /auth/me | Get current user info | - (requires auth) |

Include the access token in requests:

Authorization: Bearer your-access-token
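
As a sketch, a complete sign-in flow with curl might look like this (the token field name in the /auth/signin response is assumed to be access_token; jq is used only to extract it):

# Sign in and capture the token (field name assumed; requires jq)
TOKEN=$(curl -s -X POST http://localhost:8080/auth/signin \
  -H "Content-Type: application/json" \
  -d '{"email": "user@example.com", "password": "password123"}' | jq -r '.access_token')

# Call a protected endpoint with the Bearer token
curl -s http://localhost:8080/collections -H "Authorization: Bearer $TOKEN"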

Collections

Collections are containers for organizing related documents.

| Method | Endpoint | Description | Request Body |
|--------|----------|-------------|--------------|
| GET | /collections | List all collections | - |
| POST | /collections | Create a new collection | {"name": "collection-name", "metadata": {}} |
| GET | /collections/{collection_id} | Get collection details | - |
| PATCH | /collections/{collection_id} | Update collection | {"name": "new-name", "metadata": {}} |
| DELETE | /collections/{collection_id} | Delete collection | - |
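
For example, creating a collection with curl might look like the following sketch (reusing the $TOKEN variable from the authentication example above; the collection name is illustrative):

# Create a new collection named "research" with empty metadata
curl -s -X POST http://localhost:8080/collections \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name": "research", "metadata": {}}'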

Documents

Documents are stored as chunks with embeddings for efficient search.

| Method | Endpoint | Description | Parameters |
|--------|----------|-------------|------------|
| GET | /collections/{collection_id}/documents | List documents | limit, offset |
| POST | /collections/{collection_id}/documents | Upload documents | Form data: files[], metadatas_json |
| DELETE | /collections/{collection_id}/documents/{document_id} | Delete document | Query param: delete_by (document_id/file_id) |
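
A sketch of a multipart upload with curl (the form field names files and metadatas_json come from the table above; the metadata shape is an assumption):

# Upload a PDF with illustrative metadata (collection ID is a placeholder)
curl -s -X POST http://localhost:8080/collections/your-collection-id/documents \
  -H "Authorization: Bearer $TOKEN" \
  -F "files=@research_paper.pdf" \
  -F 'metadatas_json=[{"source": "research_paper.pdf", "type": "academic"}]'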

Search

Advanced search capabilities within collections.

| Method | Endpoint | Description | Request Body |
|--------|----------|-------------|--------------|
| POST | /collections/{collection_id}/documents/search | Search documents | {"query": "search text", "limit": 10, "search_type": "semantic", "filter": {}} |

Search types:

  • semantic: Vector similarity search using embeddings
  • keyword: Traditional full-text search
  • hybrid: Combination of semantic and keyword search

Filter example:

{
  "query": "machine learning",
  "limit": 5,
  "search_type": "hybrid",
  "filter": {
    "source": "research_paper.pdf",
    "type": "academic"
  }
}
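
The same request sent with curl, assuming the collection ID and token placeholders are filled in:

# Hybrid search with a metadata filter
curl -s -X POST http://localhost:8080/collections/your-collection-id/documents/search \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "machine learning", "limit": 5, "search_type": "hybrid", "filter": {"source": "research_paper.pdf", "type": "academic"}}'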

Streamlit Application

The Streamlit application provides a user-friendly web interface for interacting with the LangConnect API.

Main Features

  • User Authentication: Sign up, sign in, and persistent sessions
  • Collections Management: Create, view, and delete collections
  • Document Upload: Batch upload documents with metadata
  • Document Management: View, search, and delete documents
  • Advanced Search: Semantic, keyword, and hybrid search with filters
  • API Testing: Built-in API tester for development

Pages Overview

  1. Main.py - Landing page with project overview and navigation

    • Project information and features
    • Quick links to all pages
    • User authentication status
  2. Collections Page (pages/1_Collections.py)

    • List Tab: View all collections with document/chunk counts
    • Create Tab: Create new collections with metadata
    • Multi-select for bulk deletion
    • Automatic statistics calculation
  3. Documents Page (pages/2_Documents.py)

    • Upload Tab:
      • Batch file upload (PDF, TXT, MD, DOCX)
      • Automatic metadata generation
      • Custom metadata support
    • List Tab:
      • Document-level view grouped by file_id
      • Total character and chunk counts per document
      • Multi-select deletion by file_id
    • Chunk Tab:
      • Individual chunk viewing with content preview
      • Multi-select source filtering
      • Character count display
      • Chunk-level deletion
  4. Search Page (pages/3_Search.py)

    • Collection selection
    • Search type selection (semantic/keyword/hybrid)
    • Metadata filtering with JSON
    • Result display with relevance scores
    • Source preview for filtering
  5. API Tester Page (pages/4_API_Tester.py)

    • Interactive API endpoint testing
    • Grouped by functionality
    • Request/response visualization
    • Authentication token management

Authentication Persistence

The Streamlit app supports persistent authentication through:

  1. Automatic File-Based Storage (Default)

    • Tokens saved to ~/.langconnect_auth_cache
    • Valid for 7 days
    • Automatically loads on restart
  2. Environment Variables (Optional)

    LANGCONNECT_TOKEN=your-access-token
    LANGCONNECT_EMAIL=your-email@example.com

MCP (Model Context Protocol) Server

LangConnect includes two MCP server implementations that allow AI assistants like Claude to interact with your document collections programmatically:

  1. Standard MCP Server - Uses stdio transport for direct integration
  2. SSE MCP Server - Uses Server-Sent Events for web-based integration (runs on port 8765)

Available Tools

The MCP server provides 9 tools for comprehensive document management:

  1. search_documents - Perform semantic, keyword, or hybrid search
  2. list_collections - List all available collections
  3. get_collection - Get details about a specific collection
  4. create_collection - Create a new collection
  5. delete_collection - Delete a collection and its documents
  6. list_documents - List documents in a collection
  7. add_documents - Add text documents with metadata
  8. delete_document - Delete specific documents
  9. get_health_status - Check API health

Configuration

Step 1: Authentication & Configuration

First, you need to authenticate and generate the MCP configuration:

# Generate MCP configuration with automatic authentication
uv run mcp/create_mcp_json.py

This command will:

  1. Prompt for your email and password
  2. Automatically obtain a Supabase access token
  3. Update the .env file with the new access token
  4. Generate mcp/mcp_config.json with all necessary settings

⚠️ Important: Supabase access tokens expire after approximately 1 hour. You'll need to run this command again when your token expires.

Step 2: Running the SSE Server

To run the MCP SSE server locally:

# Start the SSE server
uv run mcp/mcp_sse_server.py

The server will start on http://localhost:8765 and display:

  • Server startup confirmation
  • Note that it's for MCP clients only (not browser accessible)

Step 3: Testing with MCP Inspector

You can test the MCP server using the MCP Inspector:

# Test with MCP Inspector
npx @modelcontextprotocol/inspector

In the Inspector:

  1. Select "SSE" as the transport type
  2. Enter http://localhost:8765 as the URL
  3. Connect and test the available tools

Using with Claude Desktop

Simply copy the contents of the generated mcp/mcp_config.json file and paste it into your Claude Desktop MCP settings to start using LangConnect tools immediately.

Manual Configuration (Standard MCP Server)

Alternatively, you can manually configure the MCP server in mcp/mcp_config.json:

{
  "mcpServers": {
    "langconnect-rag-mcp": {
      "command": "/path/to/python",
      "args": [
        "/path/to/langconnect/mcp/mcp_server.py"
      ],
      "env": {
        "API_BASE_URL": "http://localhost:8080",
        "SUPABASE_ACCESS_TOKEN": "your-jwt-token-here"
      }
    }
  }
}

SSE MCP Server Configuration

For SSE transport (web-based integrations):

{
  "mcpServers": {
    "langconnect-rag-sse": {
      "url": "http://localhost:8765",
      "transport": "sse"
    }
  }
}

The SSE server provides:

  • Base URL: http://localhost:8765
  • Transport: Server-Sent Events (SSE)
  • CORS Support: Enabled for web integrations

Authentication

Both MCP servers require Supabase JWT authentication. The easiest way to authenticate is using the create_mcp_json.py script as described above, which will:

  • Automatically obtain your access token
  • Update your .env file
  • Generate the MCP configuration

Manual Token Retrieval (Alternative)

If you need to manually get your access token:

  1. Sign in to the Streamlit UI at http://localhost:8501
  2. Open browser Developer Tools (F12) β†’ Application/Storage β†’ Session Storage
  3. Find and copy the access_token value
  4. Set it as SUPABASE_ACCESS_TOKEN in your configuration

⚠️ Token Expiration: Supabase access tokens expire after approximately 1 hour. When your token expires, simply run uv run mcp/create_mcp_json.py again to get a fresh token.

Usage with Claude Desktop

  1. Get your Supabase access token (see Authentication above)
  2. Update the configuration file with your paths and token
  3. Add the configuration to Claude Desktop's MCP settings
  4. Claude will have access to all LangConnect tools for document management and search

Example usage in Claude:

  • "Search for documents about machine learning in my research collection"
  • "Create a new collection called 'Project Documentation'"
  • "List all documents in the technical-specs collection"

Environment Variables

| Variable | Description | Default | Required |
|----------|-------------|---------|----------|
| Authentication | | | |
| SUPABASE_URL | Supabase project URL | - | Yes |
| SUPABASE_KEY | Supabase anon key | - | Yes |
| IS_TESTING | Disable authentication for testing | false | No |
| LANGCONNECT_TOKEN | Persistent auth token | - | No |
| LANGCONNECT_EMAIL | Persistent auth email | - | No |
| Database | | | |
| POSTGRES_HOST | PostgreSQL host | postgres | No |
| POSTGRES_PORT | PostgreSQL port | 5432 | No |
| POSTGRES_USER | PostgreSQL username | postgres | No |
| POSTGRES_PASSWORD | PostgreSQL password | postgres | No |
| POSTGRES_DB | PostgreSQL database name | postgres | No |
| API | | | |
| API_BASE_URL | API base URL | http://localhost:8080 | No |
| MCP SSE Server | | | |
| SSE_PORT | Port for MCP SSE server | 8765 | No |
| SUPABASE_ACCESS_TOKEN | JWT token from Supabase auth | - | Yes (for MCP) |
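
Putting these together, a minimal .env might look like the following sketch (all values are placeholders; the Supabase entries are required, and SUPABASE_ACCESS_TOKEN is only needed for the MCP servers):

# Supabase
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_KEY=your-anon-key
IS_TESTING=false

# Database (defaults shown)
POSTGRES_HOST=postgres
POSTGRES_PORT=5432
POSTGRES_USER=postgres
POSTGRES_PASSWORD=postgres
POSTGRES_DB=postgres

# API and MCP
API_BASE_URL=http://localhost:8080
SSE_PORT=8765
SUPABASE_ACCESS_TOKEN=your-jwt-token-here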

Testing

Run the test suite:

# Test authentication
python test_supabase_auth.py

# Test API endpoints
python test_auth.py
python test_retrieval_endpoint.py

# Test MCP functionality
python test_mcp_metadata.py

Security

  1. Authentication: All API endpoints require valid JWT tokens (except the health check and auth endpoints)
  2. Token Management: Tokens expire and must be refreshed
  3. Environment Security: Never commit .env files or expose keys
  4. CORS: Configure allowed origins in production
  5. Database: Use strong passwords and restrict access

License

This project is licensed under the terms included in the repository.


Made with ❀️ by TeddyNote LAB
