Issue Analysis: Document Processing Failures on main
Overview
The main branch fails to process documents submitted via the A2A protocol (`message/send`). Tasks immediately transition to the failed state without executing the agent, and the `tasks/get` response lacks the `artifacts` field. The feature/document-analyzer branch does not have these issues.
Commit Origin
Reference commit: 6aa857f — docs(example): update document-analyzer README with file text field
All issues arose after commit 6aa857f. That commit (on the feature/document-analyzer branch) only touched a README. The breaking changes were introduced by two subsequent commits merged into main from separate branches that forked from the same parent (700d111):
| Commit | Date | Description | Issues Introduced |
|---|---|---|---|
| `1cc2a61` | Mar 6, 2026 | fix(scheduler): resolve anio buffer deadlock, cpu burn loop, and trace serialization | Issue 1 (trace context mismatch) and Issue 4 (unbounded buffer) |
| `a6f2206` | Mar 7, 2026 | refactor(storage): harden memory layer, fix OOM risks, and optimize database indexes | Storage API changes (additional `offset` param, interface changes) |
| `16f1353` | Mar 8, 2026 | style: apply consistent formatting and add comprehensive docstrings | Docstring-only follow-up to `1cc2a61` |
The critical breaking commit is 1cc2a61. It changed the scheduler's _TaskOperation TypedDict from _current_span: Span to trace_id: str | None / span_id: str | None, but did not update the worker (bindu/server/workers/base.py) which still expects _current_span. This half-completed refactor causes every task to crash.
```
            700d111 (common ancestor)
           /       \
   6aa857f          1cc2a61  ← scheduler trace refactor (BROKE worker contract)
  (feature/         a6f2206  ← storage refactor
   document-        16f1353  ← formatting follow-up
   analyzer)           |
                    6d189cb  (HEAD of main)
```
Issues
1. Trace Context Mismatch — Worker Crashes on Every Task (CRITICAL)
Introduced by: 1cc2a61 (fix(scheduler): resolve anio buffer deadlock, cpu burn loop, and trace serialization)
Impact: All tasks fail immediately. No documents are ever processed.
The scheduler and worker have an incompatible interface for passing OpenTelemetry trace context:
| Component | File | Sends/Expects |
|---|---|---|
| Scheduler base type | `bindu/server/scheduler/base.py` (L67–76) | `trace_id: str`, `span_id: str` |
| InMemoryScheduler | `bindu/server/scheduler/memory_scheduler.py` (L68–72) | Sends `trace_id`/`span_id` strings |
| Worker | `bindu/server/workers/base.py` (L130) | Expects `task_operation["_current_span"]` (Span object) |
Commit 1cc2a61 changed _TaskOperation and InMemoryScheduler to use primitive trace_id/span_id strings, but did not update the worker (bindu/server/workers/base.py was not in the commit's changeset). The worker still calls use_span(task_operation["_current_span"]) which raises a KeyError on every task, caught by the broad except clause which marks the task as failed.
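The failure mode can be reproduced in isolation with a minimal stdlib-only sketch (the dict shape and handler below are illustrative, not the actual bindu code):

```python
# Shape sent by the post-1cc2a61 scheduler: primitive trace fields only.
task_operation = {
    "operation": "run",
    "params": {"task_id": "task-001"},
    "trace_id": "0af7651916cd43dd8448eb211c80319c",
    "span_id": "b7ad6b7169203331",
}


def handle_task_operation(op: dict) -> str:
    """Mimics the worker's broad try/except around task handling."""
    try:
        span = op["_current_span"]  # old key: raises KeyError on every task
        # ... agent execution would happen here ...
        return "completed"
    except Exception:
        # The broad except swallows the KeyError and marks the task failed.
        return "failed"


print(handle_task_operation(task_operation))  # prints "failed"
```

Because the KeyError fires before the agent is ever invoked, the observed symptom is identical for every payload: immediate transition to "failed".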
Fix required: Either:
- (A) Update the worker (`bindu/server/workers/base.py`) to reconstruct a span from the `trace_id`/`span_id` strings, or
- (B) Revert the scheduler to pass the live `_current_span` Span object (matching feature/document-analyzer). This involves:
  - `bindu/server/scheduler/base.py`: change the `_TaskOperation` fields from `trace_id`/`span_id` back to `_current_span: Span`
  - `bindu/server/scheduler/memory_scheduler.py`: remove the `_get_trace_context()` helper; pass `get_current_span()` directly
  - The same change in `bindu/server/scheduler/redis_scheduler.py`
2. Response Missing artifacts Field (CONSEQUENCE OF 1)
Impact: tasks/get returns a task with no artifacts.
This is not a separate bug — it is a direct consequence of Issue 1. The processing flow is:
```
message/send → task created (state: "submitted", no artifacts)
             → scheduled to worker
             → worker crashes on _current_span KeyError
             → task marked "failed" (no artifacts generated)
tasks/get    → returns failed task without artifacts
```
Artifacts are only generated in ManifestWorker._handle_terminal_state() when state is "completed". Since the worker never reaches agent execution, no artifacts are ever created.
Fix required: Resolving Issue 1 will fix this — once tasks execute successfully, build_artifacts() will produce artifacts and update_task() will persist them.
3. Frontend Does Not Pass File Parts to Agent Messages (MODERATE)
Impact: Frontend file uploads are constructed but never reach the agent due to Issue 1. If Issue 1 is fixed, this path works correctly on main.
On main, frontend/src/lib/utils/agentMessageHandler.ts accepts a files parameter, builds FilePart objects with the A2A-required text field, and includes them in the message payload. The frontend/src/lib/server/endpoints/bindu/types.ts FilePart interface also requires text: string.
On feature/document-analyzer, the frontend file upload code is entirely removed — the files parameter is dropped and messages only contain TextPart. The FilePart type also drops the text field.
No fix required for backend processing — the curl-based API path works correctly for file uploads. The frontend code on main is structurally correct but untestable while Issue 1 exists.
4. InMemoryScheduler Uses Unbounded Buffer (MINOR)
Introduced by: 1cc2a61 (fix(scheduler): resolve anio buffer deadlock, cpu burn loop, and trace serialization)
File: bindu/server/scheduler/memory_scheduler.py (L53–55)
On main, the anyio memory object stream is created with a `math.inf` buffer:

```python
anyio.create_memory_object_stream[TaskOperation](math.inf)
```

On feature/document-analyzer, it uses the default (unbuffered):

```python
anyio.create_memory_object_stream[TaskOperation]()
```

The `math.inf` buffer was added to prevent a deadlock where the API server hangs if no worker is immediately ready to receive. However, an unbounded buffer can silently accumulate tasks during failures, with no backpressure.
Fix required: Evaluate whether a bounded buffer (e.g., 100) is more appropriate than `math.inf`, or keep the default (unbuffered) if worker startup is guaranteed before task submission.
Root Cause Chain
```
curl message/send (with file parts)
        ↓
Task submitted to storage (state: "submitted") ✅
        ↓
Task scheduled via InMemoryScheduler.run_task() ✅
    sends: {operation: "run", params: ..., trace_id: "...", span_id: "..."}
        ↓
Worker._handle_task_operation() receives TaskOperation ✅
        ↓
Worker accesses task_operation["_current_span"] ❌ KeyError
        ↓
Exception caught → storage.update_task(state="failed") ← task never runs
        ↓
tasks/get returns: {state: "failed", NO artifacts}
```
Files Requiring Changes
| File | Change | Priority |
|---|---|---|
| `bindu/server/scheduler/base.py` | Fix `_TaskOperation` type to match worker expectations | Critical |
| `bindu/server/scheduler/memory_scheduler.py` | Fix trace context passing to match the `_TaskOperation` type | Critical |
| `bindu/server/scheduler/redis_scheduler.py` | Fix trace context passing to match the `_TaskOperation` type | Critical |
| `bindu/server/workers/base.py` | Ensure `_handle_task_operation` matches the scheduler's task operation format | Critical |
Verification
After fixing, the following should work:
```shell
# 1. Send document
curl -X POST http://localhost:3773/ \
  -H 'Content-Type: application/json' \
  -d '{
    "jsonrpc": "2.0",
    "id": "test-001",
    "method": "message/send",
    "params": {
      "message": {
        "messageId": "msg-001",
        "contextId": "ctx-001",
        "taskId": "task-001",
        "kind": "message",
        "role": "user",
        "parts": [
          {"kind": "text", "text": "Analyze this document"},
          {"kind": "file", "text": "paper.pdf", "file": {"name": "paper.pdf", "mimeType": "application/pdf", "bytes": "<base64>"}}
        ]
      }
    }
  }'
# Expected: task in "submitted" state

# 2. Check task status (after processing)
curl -X POST http://localhost:3773/ \
  -H 'Content-Type: application/json' \
  -d '{
    "jsonrpc": "2.0",
    "id": "test-002",
    "method": "tasks/get",
    "params": {"taskId": "task-001"}
  }'
# Expected: task in "completed" state WITH artifacts array
```