CLAMS Agent Prototype

Overview

The CLAMS Agent Prototype uses Large Language Models (LLMs) to generate pipelines of CLAMS tools from task descriptions and available tool metadata. Users can analyze video content through natural language queries and explore the results in an interactive visualization interface.

Project Goals

Automate the construction of CLAMS tool pipelines using LLMs
Enable natural language interaction for video content analysis tasks
Provide intuitive visualization of computational analysis results
Streamline multimedia processing workflows

Key Features

Chat Interface

Natural language interaction for requesting information about video content
LLM-powered interpretation of user requests
Automatic generation of appropriate CLAMS tool pipelines
Parameter optimization for efficient video processing

Visualization Tool

Interactive exploration of pipeline outputs in MMIF format
Integrated video player for synchronized content viewing
Dynamic presentation of computational analysis results
User-friendly interface for exploring video annotations
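
Synchronized viewing comes down to finding which annotations cover the current playback position. The following is a minimal sketch of that lookup, not the prototype's actual code: it assumes each annotation carries start and end times in milliseconds, and the function name annotations_at and the sample segments are hypothetical.

```python
from typing import Dict, List


def annotations_at(annotations: List[Dict], time_ms: int) -> List[Dict]:
    """Return the annotations whose [start, end) interval contains time_ms."""
    return [a for a in annotations if a["start"] <= time_ms < a["end"]]


# Hypothetical annotations, with times in milliseconds.
segments = [
    {"label": "speech", "start": 0, "end": 4000},
    {"label": "slate", "start": 3000, "end": 5000},
    {"label": "speech", "start": 9000, "end": 12000},
]

# Labels to highlight while the player is at 3.5 seconds.
active = annotations_at(segments, 3500)
print([a["label"] for a in active])
```

A linear scan is enough for a few hundred annotations; a sorted index or interval tree would be the usual next step for longer videos.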

Pipeline Generation

Intelligent selection of appropriate CLAMS tools based on user queries
Automatic configuration of tool parameters
Optimization of processing workflows for efficiency
Support for diverse multimedia analysis tasks
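
One way to sanity-check a generated pipeline is to confirm that each tool's required inputs are produced by an earlier step. The sketch below illustrates that idea under stated assumptions: the tool names, the "requires"/"produces" fields, and the type labels are all hypothetical and are not taken from the CLAMS tool metadata or from this repository.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical tool metadata; real CLAMS metadata has a different shape.
TOOLS: Dict[str, Dict[str, Set[str]]] = {
    "scene-detector": {"requires": {"VideoDocument"}, "produces": {"TimeFrame"}},
    "text-recognizer": {
        "requires": {"VideoDocument", "TimeFrame"},
        "produces": {"TextDocument"},
    },
}


def validate_pipeline(
    pipeline: List[str],
    available: Iterable[str],
    tools: Dict[str, Dict[str, Set[str]]],
) -> List[str]:
    """Return a list of problems; an empty list means the steps chain correctly."""
    problems: List[str] = []
    have = set(available)
    for name in pipeline:
        if name not in tools:
            problems.append(f"unknown tool: {name}")
            continue
        missing = tools[name]["requires"] - have
        if missing:
            problems.append(f"{name} is missing inputs: {sorted(missing)}")
        # Later steps may consume whatever this step produces.
        have |= tools[name]["produces"]
    return problems


print(validate_pipeline(["scene-detector", "text-recognizer"], {"VideoDocument"}, TOOLS))
print(validate_pipeline(["text-recognizer", "scene-detector"], {"VideoDocument"}, TOOLS))
```

A check like this could run on the LLM's proposed pipeline before execution, so that an ordering mistake is reported back to the model instead of failing midway through processing.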

Technical Overview

MMIF (Multimedia Interchange Format)

The system uses MMIF as its core data format, enabling standardized exchange of multimedia annotations between different components of the processing pipeline.
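
For orientation, here is a minimal sketch of what an MMIF document looks like when handled as plain JSON in Python. It assumes the top-level layout of metadata, documents, and views described in the MMIF specification; the identifiers, file location, app URL, and version strings are illustrative placeholders, and the helper annotations_of_type is hypothetical.

```python
import json
from typing import Dict, List

# Illustrative type URIs; check the MMIF vocabulary for the exact versions.
VIDEO = "http://mmif.clams.ai/vocabulary/VideoDocument/v1"
TIMEFRAME = "http://mmif.clams.ai/vocabulary/TimeFrame/v1"

mmif = {
    "metadata": {"mmif": "http://mmif.clams.ai/1.0.0"},
    # Source media are listed once and referenced by id.
    "documents": [
        {
            "@type": VIDEO,
            "properties": {
                "id": "m1",
                "mime": "video/mp4",
                "location": "file:///data/example.mp4",
            },
        }
    ],
    # Each tool in the pipeline appends a view holding its annotations.
    "views": [
        {
            "id": "v1",
            "metadata": {
                "app": "http://apps.clams.ai/example-app/v1",
                "contains": {TIMEFRAME: {"timeUnit": "milliseconds", "document": "m1"}},
            },
            "annotations": [
                {
                    "@type": TIMEFRAME,
                    "properties": {"id": "tf1", "start": 0, "end": 4000, "label": "speech"},
                }
            ],
        }
    ],
}


def annotations_of_type(doc: Dict, at_type: str) -> List[Dict]:
    """Collect annotations of one type across all views."""
    return [a for v in doc["views"] for a in v["annotations"] if a["@type"] == at_type]


# Round-trip through JSON text, as happens when MMIF passes between components.
round_tripped = json.loads(json.dumps(mmif))
print(len(annotations_of_type(round_tripped, TIMEFRAME)))
```

The CLAMS project also publishes a Python SDK for MMIF, which is likely a better fit than raw dictionaries for anything beyond inspection.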

CLAMS Platform Integration

This prototype integrates with the CLAMS (Computational Linguistics Applications for Multimedia Services) platform, leveraging its ecosystem of multimedia analysis tools.

Architecture

The system consists of:

A chat interface for query input and results display
An LLM-powered pipeline generation system
A visualization interface for exploring MMIF data
A video player component for content viewing

Installation and Setup

Prerequisites

Python 3.8 or higher
Node.js 16.x or higher (for frontend visualization)
npm or yarn package manager

1. Environment Setup

# Enter the repository directory (clone the repository first if you have not already)
cd clams-agent-prototype

# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install Python dependencies
pip install -r requirements.txt

2. Frontend Build (Required for Web Interface)

# Navigate to visualization directory
cd visualization

# Install Node.js dependencies
npm install

# Build the frontend
npm run build

# Return to project root
cd ..

3. Running the Application

Option A: Web Interface (Recommended)

# Ensure you're in the project root with virtual environment activated
python app.py

The web interface will be available at http://localhost:5000

Option B: Command Line Interface

# Ensure you're in the project root with virtual environment activated
python pipeline_chat.py

This provides an interactive command-line chat interface.

4. Configuration (Optional)

The application uses default configuration settings, which work for most use cases. To customize:

LLM parameters can be configured in utils/config.py
Configuration is automatically saved to config.json when modified
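
The defaults-plus-saved-file behavior described above can be sketched as follows. This is an illustration only and does not reproduce utils/config.py: the keys "model" and "temperature", their values, and the function names load_config and save_config are assumptions.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict

# Hypothetical defaults; the real keys live in utils/config.py.
DEFAULTS = {"model": "example-model", "temperature": 0.2}


def load_config(path: Path) -> Dict:
    """Start from the defaults and overlay any values saved on disk."""
    config = dict(DEFAULTS)
    if path.exists():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    return config


def save_config(config: Dict, path: Path) -> None:
    """Persist the current settings as JSON."""
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")


# Demonstrate in a temporary directory so no real config.json is touched.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.json"
    first = load_config(path)   # file absent, so defaults apply
    first["temperature"] = 0.7
    save_config(first, path)    # modification is written to disk
    second = load_config(path)  # saved value overrides the default

print(second)
```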

5. Troubleshooting

Port 5000 in use: Change the port passed to app.run in app.py (line 228), for example app.run(debug=True, port=5001); the web interface is then served at http://localhost:5001
Frontend build errors: Ensure Node.js 16+ is installed, delete node_modules and package-lock.json in the visualization directory, then run npm install again
Python dependency errors: Ensure the .venv virtual environment is activated, then run pip install -r requirements.txt again
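
As an alternative to editing the port by hand, a port can be chosen at startup. The sketch below is a hypothetical helper, not part of app.py: it tries the preferred port and otherwise asks the operating system for a free one. Another process could still claim the port between the check and the server start, so treat it as a convenience only.

```python
import socket


def pick_port(preferred: int = 5000, host: str = "127.0.0.1") -> int:
    """Return preferred if it can be bound, otherwise a free port chosen by the OS."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, preferred))
            return preferred
        except OSError:
            pass  # preferred port is busy; fall through
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind((host, 0))  # port 0 lets the OS pick any free port
        return probe.getsockname()[1]


port = pick_port()
print(f"http://localhost:{port}")
# The result could then be passed along, e.g. app.run(debug=True, port=port)
```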

Usage Examples

Sample Queries

"Identify all speaking segments in this news broadcast"
"Find all scenes containing cars in this movie"
"Detect and transcribe all text visible in this documentary"

Workflow

Load a video or select a collection of videos
Enter a natural language query about the content
The system generates and executes an appropriate CLAMS tool pipeline
Results are displayed in the visualization interface
Explore the results interactively alongside the video
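
The execution step of this workflow can be pictured as passing one MMIF document through each tool in turn. The sketch below assumes each tool is reachable as an HTTP service that accepts MMIF in a POST body and returns MMIF; the URLs, the function names, and the stand-in fake_post are hypothetical, and the prototype's own executor may work differently.

```python
import json
import urllib.request
from typing import Callable, List


def http_post(url: str, body: str) -> str:
    """Send MMIF text to a tool endpoint and return the MMIF it responds with."""
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def run_pipeline(
    mmif_text: str,
    app_urls: List[str],
    post: Callable[[str, str], str] = http_post,
) -> str:
    """Feed the output of each tool into the next one, in order."""
    for url in app_urls:
        mmif_text = post(url, mmif_text)
    return mmif_text


def fake_post(url: str, body: str) -> str:
    """Offline stand-in for a tool: appends an empty view naming the tool."""
    doc = json.loads(body)
    view_id = f"v{len(doc['views']) + 1}"
    doc["views"].append({"id": view_id, "metadata": {"app": url}, "annotations": []})
    return json.dumps(doc)


start = json.dumps({"metadata": {}, "documents": [], "views": []})
urls = ["http://localhost:5001", "http://localhost:5002"]  # placeholder endpoints
result = json.loads(run_pipeline(start, urls, post=fake_post))
print([v["id"] for v in result["views"]])
```

Injecting the post function keeps the chaining logic testable without running any tool servers.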
