Fix: Apply HTTP client timeouts to prevent infinite hangs #23

TheFermiSea · 2025-10-17T22:37:38Z

Fix: Apply HTTP client timeouts to prevent infinite hangs

Summary

Fixes a critical bug where octocode's GraphRAG indexing process hangs indefinitely during LLM API calls. The root cause is that the reqwest::Client instances are created with Client::new(), which uses reqwest's default settings (no request timeout), so the configured batch_timeout_seconds is loaded but never applied.

Problem

When using LLM-powered GraphRAG features (use_llm = true), the indexing process hangs indefinitely at two points:

File description generation (with ai_batch_size > 1)
Architectural relationship extraction (always processes all files in one batch)

The configuration parameter graphrag.llm.batch_timeout_seconds is loaded but never applied to the HTTP client, causing requests to wait indefinitely when the LLM provider is slow to respond.

Root Cause Analysis

Primary Bug (`src/indexer/graphrag/builder.rs:74`)

// BEFORE (buggy):
let client = Client::new();

// AFTER (fixed):
let client = Client::builder()
    .timeout(std::time::Duration::from_secs(
        config.graphrag.llm.batch_timeout_seconds,
    ))
    .build()?;

Secondary Bug (`src/embedding/provider/huggingface.rs:460`)

// BEFORE (buggy):
let client = reqwest::Client::new();

// AFTER (fixed):
let client = reqwest::Client::builder()
    .timeout(std::time::Duration::from_secs(30))
    .build()?;

Reproduction

1. Configure octocode with use_llm = true and ai_batch_size > 1
2. Run octocode index on any codebase
3. Process hangs at "AI analyzing X files for architectural relationships" with infinite spinner
4. No timeout occurs even after configured batch_timeout_seconds expires

Testing

Before Fix

Both GPT-4.1-mini (default) and GPT-5-mini exhibited infinite hangs
Workaround with ai_batch_size = 1 only helped description phase
Relationship extraction always hung (processes 72 files in single batch)

After Fix

Timeouts properly trigger after configured duration
Failed requests return error messages instead of hanging forever
Large batch processing completes or fails gracefully
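
The difference between the two behaviors above can be illustrated without any HTTP library. The following is a minimal sketch using only the Rust standard library: a thread stands in for a slow LLM provider, and the caller either waits with a bound or not at all. The function name `call_with_timeout` and the simulated provider are hypothetical and are not part of octocode or reqwest.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Simulates a provider that replies after `latency` and a caller that is
/// willing to wait at most `timeout` for the reply.
pub fn call_with_timeout(latency: Duration, timeout: Duration) -> Result<String, String> {
    let (tx, rx) = mpsc::channel();

    // Hypothetical stand-in for a slow LLM provider.
    thread::spawn(move || {
        thread::sleep(latency);
        // The receiver may already be gone if the caller timed out; ignore that.
        let _ = tx.send(String::from("response"));
    });

    // rx.recv() would block until the reply arrives, however long that takes.
    // rx.recv_timeout() gives up after `timeout` and reports an error instead.
    match rx.recv_timeout(timeout) {
        Ok(reply) => Ok(reply),
        Err(RecvTimeoutError::Timeout) => Err(format!("timed out after {:?}", timeout)),
        Err(RecvTimeoutError::Disconnected) => Err(String::from("provider disconnected")),
    }
}
```

Calling `call_with_timeout` with a latency well below the timeout returns the reply; with a latency above the timeout it returns an error once the timeout elapses, which is the shape of the "After Fix" behavior.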

Impact

This bug made LLM-powered GraphRAG features unusable for any non-trivial codebase. The fix enables:

Reliable timeout behavior for all LLM API calls
Proper error handling and recovery
Predictable indexing duration

Code Quality Improvements

In addition to the core timeout fixes, this PR includes:

Enhanced Error Handling

Added .context() error messages to both HTTP client build failures
GraphRAG client: "Failed to create HTTP client for LLM API calls"
HuggingFace client: "Failed to create HTTP client for HuggingFace downloads"
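
To show what attaching a context message to an error provides, here is a dependency-free sketch. It is a hand-rolled equivalent written for illustration, not anyhow's implementation and not the code in this PR; the type `ContextError`, the helper `parse_timeout_seconds`, and its message are all hypothetical.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error wrapper: a fixed context message plus the original cause.
#[derive(Debug)]
pub struct ContextError {
    context: &'static str,
    cause: Box<dyn Error + Send + Sync + 'static>,
}

impl fmt::Display for ContextError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Print the context first, then the underlying error.
        write!(f, "{}: {}", self.context, self.cause)
    }
}

impl Error for ContextError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        // Expose the original error so callers can walk the chain.
        Some(&*self.cause)
    }
}

/// Hypothetical helper: parses a timeout value and labels any failure.
pub fn parse_timeout_seconds(raw: &str) -> Result<u64, ContextError> {
    raw.trim().parse::<u64>().map_err(|e| ContextError {
        context: "Failed to parse batch_timeout_seconds",
        cause: Box::new(e),
    })
}
```

The point of the pattern is that the message seen by the user names the operation that failed, while the low-level cause stays attached for debugging.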

Documentation Comments

// IMPORTANT: Must use builder pattern with timeout to prevent infinite hangs
// when LLM API calls take too long. Client::new() does not apply timeouts.

These comments at both fix locations help prevent future regressions.

Code Verification

Zero clippy warnings
All remaining Client::new() uses verified (3 instances in src/commands/ apply request-level timeouts correctly)
No unwrap() issues in production code
Consistent error handling patterns

Technical Details

Why Client::new() Doesn't Apply Timeouts

The reqwest::Client::new() convenience method creates a client with default settings that do not include any timeout. To apply a timeout, you must use the builder pattern:

// ❌ WRONG - no timeout applied
let client = Client::new();

// ✅ CORRECT - timeout is applied
let client = Client::builder()
    .timeout(Duration::from_secs(120))
    .build()?;

Request-Level vs Client-Level Timeouts

This PR uses client-level timeouts. An alternative approach is request-level timeouts:

// Also valid, but less convenient for repeated requests
let client = Client::new();
let response = client.get(url)
    .timeout(Duration::from_secs(120))
    .send()
    .await?;

The codebase uses both patterns appropriately:

Client-level (this PR): For GraphRAG batch operations and HuggingFace downloads
Request-level: In src/commands/ where each request may need different timeouts
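
The two levels interact: reqwest documents that a timeout set on a request overrides the one configured on the client. The sketch below models that precedence with plain standard-library types. It is an illustrative model only, not reqwest's internals, and `effective_timeout` is a hypothetical name.

```rust
use std::time::Duration;

/// Hypothetical model of timeout precedence: a per-request value, when
/// present, takes priority over the client-wide default. `None` means
/// "no timeout", i.e. the call may wait indefinitely.
pub fn effective_timeout(
    client_default: Option<Duration>,
    request_override: Option<Duration>,
) -> Option<Duration> {
    request_override.or(client_default)
}
```

In this model, a client created without a timeout and a request without an override yields `None`, which corresponds to the hang this PR fixes. Either level alone is enough to bound the wait.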

Test Results

Unit Tests

cargo test --all-features

Results: 93 passed, 3 failed

The 3 failures are unrelated FastEmbed lock acquisition issues:

test_fastembed_provider_creation
test_fastembed_model_validation
test_fastembed_embedding_generation

These failures are environmental (file lock contention in local cache directory), not caused by the timeout fix changes.

Integration Testing

Real-world test on rust-daq codebase (113 files):

File indexing: ✅ 113/113 files processed
GraphRAG blocks: ✅ 896 blocks created
Relationship extraction: ✅ Timeout triggered after 120s (expected behavior)
Before fix: Infinite hang
After fix: Graceful timeout with error message

Configuration Tested

[graphrag.llm]
batch_timeout_seconds = 120
ai_batch_size = 10
description_model = "openai/gpt-4.1-mini"
relationship_model = "google/gemini-2.0-flash-001"
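
As a rough worked example of how these two settings bound indexing time, the sketch below converts them into a batch count and a worst-case wait. The struct `LlmConfig` is a hypothetical mirror of the two numeric fields above, not octocode's actual config type. The bound assumes batches run sequentially, one request per batch, with no retries; none of these is confirmed by this PR.

```rust
use std::time::Duration;

/// Hypothetical mirror of the numeric `[graphrag.llm]` fields shown above.
pub struct LlmConfig {
    pub batch_timeout_seconds: u64,
    pub ai_batch_size: usize,
}

impl LlmConfig {
    /// The per-request timeout as a `Duration`.
    pub fn batch_timeout(&self) -> Duration {
        Duration::from_secs(self.batch_timeout_seconds)
    }

    /// Number of batches needed for `file_count` files, rounded up.
    /// A batch size of 0 is treated as 1 to avoid dividing by zero.
    pub fn batch_count(&self, file_count: usize) -> usize {
        let size = self.ai_batch_size.max(1);
        (file_count + size - 1) / size
    }

    /// Upper bound on total waiting time, ASSUMING sequential batches,
    /// one request per batch, and no retries.
    pub fn worst_case_wait(&self, file_count: usize) -> Duration {
        self.batch_timeout() * self.batch_count(file_count) as u32
    }
}
```

With the configuration above and 113 files, this gives 12 batches and a worst-case wait of 1440 seconds (24 minutes) for a batched phase under those assumptions. Without a timeout there is no such bound.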

Migration Guide for Developers

If you're creating new HTTP clients in octocode:

For LLM/API Calls

use reqwest::Client;
use anyhow::Context;

let client = Client::builder()
    .timeout(std::time::Duration::from_secs(
        config.graphrag.llm.batch_timeout_seconds,
    ))
    .build()
    .context("Failed to create HTTP client")?;

For File Downloads

let client = Client::builder()
    .timeout(std::time::Duration::from_secs(30))
    .build()
    .context("Failed to create HTTP client")?;

Don't Use Client::new()

Unless you have a specific reason to avoid a client-level timeout (rare), for example because every request sets its own timeout as in src/commands/, always use the builder pattern.

Related Issues

This fix resolves the core issue documented in:

octocode_timeout_analysis.md (comprehensive technical analysis)
User reports of infinite hangs during GraphRAG indexing

Checklist

Code changes tested locally
Both timeout bugs fixed (GraphRAG + HuggingFace)
No breaking changes to API
Enhanced error messages added
Documentation comments added
Code quality verified (clippy clean)
Integration tested on real codebase
Unit tests passing (93/96 passed, 3 unrelated FastEmbed lock failures)
Ready for upstream PR

- Fix builder.rs:74: Use Client::builder() with batch_timeout_seconds
- Fix huggingface.rs:460: Use Client::builder() with 30s timeout
- Prevents infinite hangs during LLM API calls and model downloads

Resolves timeout bug documented in octocode_timeout_analysis.md

…ixes

Enhanced the timeout bug fixes with:

- Better error messages using .context() for HTTP client build failures
- Documentation comments explaining why builder pattern is required
- Clear warnings that Client::new() does not apply timeouts

These improvements help prevent future regressions and make the code more maintainable by documenting the critical timeout requirement.

Related to the fix for infinite hangs during LLM GraphRAG operations.

Added two documentation files to support PR submission:

1. PR_DESCRIPTION.md
   - Detailed technical explanation of the bug and fix
   - Code quality improvements section
   - Integration test results from rust-daq codebase
   - Migration guide for developers
   - Comprehensive checklist
2. TESTING.md
   - Complete testing procedures (unit, integration, regression)
   - Manual testing checklist
   - Real-world test results
   - Troubleshooting guide
   - CI/CD recommendations

These documents provide all information needed for PR review and merge. Ready for upstream submission.

Copilot

Pull request overview

This PR fixes a critical bug where octocode's GraphRAG indexing process hangs indefinitely during LLM API calls by properly applying HTTP client timeouts. The root cause was using reqwest::Client::new() instead of the builder pattern, preventing configured timeouts from being applied to HTTP requests.

Key Changes

Applied client-level timeout configuration to GraphRAG LLM API calls using Client::builder() pattern
Applied 30-second timeout to HuggingFace model downloads to prevent infinite hangs
Added inline documentation explaining the importance of using the builder pattern for timeouts

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `src/indexer/graphrag/builder.rs` | Fixes HTTP client creation for LLM API calls by using builder pattern with configured `batch_timeout_seconds` timeout |
| `src/embedding/provider/huggingface.rs` | Fixes HTTP client creation for HuggingFace downloads by using builder pattern with 30-second timeout |
| `TESTING.md` | Adds comprehensive testing documentation for the timeout fixes, including unit tests, integration tests, and real-world test results |
| `PR_DESCRIPTION.md` | Provides detailed explanation of the bug, root cause analysis, testing results, and migration guide for developers |


Copilot · 2025-11-23T06:51:22Z

TESTING.md

+Run the complete test suite:
+
+```bash
+cd /Users/briansquires/octocode


This hardcoded path (/Users/briansquires/octocode) is environment-specific and should be replaced with a generic placeholder such as /path/to/octocode, matching the placeholder style already used elsewhere in the document (e.g., line 49 uses /path/to/test/codebase).

Suggested change

- cd /Users/briansquires/octocode
+ cd /path/to/octocode

Copilot · 2025-11-23T06:51:22Z

TESTING.md

+Three files use `Client::new()` but apply request-level timeouts correctly:
+
+```bash
+cd /Users/briansquires/octocode


This hardcoded path (/Users/briansquires/octocode) is environment-specific and should be replaced with a generic placeholder like /path/to/octocode to make the documentation universally applicable.

TheFermiSea added 4 commits October 16, 2025 22:13

docs: Update PR description with test results

1257555

donhardman requested a review from Copilot November 23, 2025 06:47

Copilot started reviewing on behalf of donhardman November 23, 2025 06:48 View session

Copilot finished reviewing on behalf of donhardman November 23, 2025 06:50

Copilot AI reviewed Nov 23, 2025
