102 changes: 102 additions & 0 deletions docs/api/batch.md
@@ -0,0 +1,102 @@
# Batch Endpoint

Batch video generation via HTTP. Unlike the WebRTC streaming path (real-time, interactive), the batch endpoint produces a complete video in one request, processing it chunk by chunk and reporting progress as Server-Sent Events (SSE).

Primary consumer: ComfyUI custom nodes (`comfyui-scope`).

## Endpoints

| Endpoint | Method | Purpose |
|---|---|---|
| `/api/v1/batch` | POST | Generate video (SSE stream) |
| `/api/v1/batch/cancel` | POST | Cancel after current chunk |
| `/api/v1/batch/upload` | POST | Upload input video for v2v |
| `/api/v1/batch/upload-data` | POST | Upload binary data blob (VACE, per-chunk video) |
| `/api/v1/batch/download` | GET | Download output video |

Only one generation can run at a time; a request made while the server is busy is rejected with HTTP 409.

## Flow

```
1. [optional] POST /batch/upload → input_path
2. [optional] POST /batch/upload-data → data_blob_path
3. POST /batch (JSON body, references paths from steps 1-2)
   ← SSE: event: progress {chunk, total_chunks, frames, latency, fps}
   ← SSE: event: complete {output_path, video_shape, num_frames, ...}
4. GET /batch/download?path=<output_path>
   ← binary video data
```
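The event stream in step 3 can be consumed with a small parser. The sketch below assumes the server uses standard Server-Sent Events framing (`event:` and `data:` lines, with a blank line between events) and puts a JSON object in `data:`; this document does not spell out the exact wire format. `parse_sse` is a hypothetical helper, and the sample values are made up for illustration.

```python
import json
from typing import Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, payload) pairs from decoded SSE lines.

    Assumes standard SSE framing with a JSON object in the data field.
    """
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:  # a blank line terminates one event
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            continue  # SSE comment / keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
    if data:  # stream ended without a trailing blank line
        yield event, json.loads("\n".join(data))


# Illustrative input only; real payloads carry the fields listed in the flow.
sample = [
    "event: progress",
    'data: {"chunk": 0, "total_chunks": 4, "frames": 12}',
    "",
    "event: complete",
    'data: {"output_path": "out.bin", "num_frames": 48}',
    "",
]
events = list(parse_sse(sample))
```

A client would typically stop reading once it sees the `complete` event and pass its `output_path` to the download endpoint.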

## Binary Protocol

### Video Upload (`/batch/upload`)

**Request**: Raw uint8 bytes in THWC order (frames × height × width × channels).

**Headers** (required unless a default is noted):
- `X-Video-Frames`: T
- `X-Video-Height`: H
- `X-Video-Width`: W
- `X-Video-Channels`: C (default 3)
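As an illustration of the headers above, the following sketch builds them for a raw THWC body and checks that the body length matches T × H × W × C before sending. `build_upload_request` is a hypothetical helper, and the client-side length check is an added precaution rather than documented server behavior.

```python
from math import prod


def build_upload_request(body: bytes, frames: int, height: int, width: int,
                         channels: int = 3) -> dict:
    """Return the X-Video-* headers for a raw uint8 THWC body.

    Raises ValueError if the body length does not match T*H*W*C.
    """
    expected = prod((frames, height, width, channels))
    if len(body) != expected:
        raise ValueError(f"body has {len(body)} bytes, expected {expected}")
    return {
        "X-Video-Frames": str(frames),
        "X-Video-Height": str(height),
        "X-Video-Width": str(width),
        "X-Video-Channels": str(channels),
    }


# Two 4x6 RGB frames of zeros: 2 * 4 * 6 * 3 = 144 bytes.
body = bytes(2 * 4 * 6 * 3)
headers = build_upload_request(body, frames=2, height=4, width=6)
```

The returned dictionary can be passed as the headers of the POST to `/api/v1/batch/upload` with `body` as the request payload.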

**Stored format**: header + raw data. For 4-D THWC video the header is 20 bytes (4 for `ndim` plus 4 × 4 for the shape).
```
[4 bytes: ndim (little-endian u32)]
[4 bytes × ndim: shape dimensions (little-endian u32 each)]
[raw uint8 video bytes]
```
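A minimal sketch of writing this layout with Python's `struct` module follows. `encode_stored_video` is a hypothetical name; the byte layout follows the description above.

```python
import struct


def encode_stored_video(shape: tuple, pixels: bytes) -> bytes:
    """Prefix raw uint8 pixels with the ndim + shape header (little-endian u32)."""
    header = struct.pack(f"<I{len(shape)}I", len(shape), *shape)
    return header + pixels


shape = (2, 4, 6, 3)  # T, H, W, C
pixel_count = 2 * 4 * 6 * 3
blob = encode_stored_video(shape, bytes(pixel_count))
header_len = len(blob) - pixel_count  # 4 + 4 * 4 for a 4-D shape
```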

### Data Blob Upload (`/batch/upload-data`)

**Request**: Raw binary blob containing packed arrays. Max size: 2 GB.

The blob is an opaque byte buffer. `ChunkSpec` entries in the batch request reference regions of this blob by offset:

```json
{
  "chunk": 0,
  "vace_frames_offset": 0,
  "vace_frames_shape": [1, 3, 12, 320, 576],
  "vace_masks_offset": 26542080,
  "vace_masks_shape": [1, 1, 12, 320, 576]
}
```

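Arrays are packed as contiguous float32 (VACE frames/masks) or uint8 (input video). The client is responsible for computing offsets when packing the blob. Offsets are in bytes: in the example above, the masks offset 26542080 equals 1 × 3 × 12 × 320 × 576 float32 values × 4 bytes.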
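The offset arithmetic can be sketched as follows, assuming arrays are packed back to back with no padding or alignment between them. `pack_offsets` and `DTYPE_SIZES` are hypothetical names used only for illustration.

```python
from math import prod

# Bytes per element for the two dtypes mentioned in this document.
DTYPE_SIZES = {"float32": 4, "uint8": 1}


def pack_offsets(arrays):
    """Compute byte offsets for arrays packed back to back.

    `arrays` is a list of (name, shape, dtype) tuples in packing order.
    Returns ({name: offset}, total_size_in_bytes).
    """
    offsets, cursor = {}, 0
    for name, shape, dtype in arrays:
        offsets[name] = cursor
        cursor += prod(shape) * DTYPE_SIZES[dtype]
    return offsets, cursor


offsets, total = pack_offsets([
    ("vace_frames", (1, 3, 12, 320, 576), "float32"),
    ("vace_masks", (1, 1, 12, 320, 576), "float32"),
])
```

With these shapes, the masks offset matches the `vace_masks_offset` value in the `ChunkSpec` example, and `total` is the blob size to compare against the 2 GB limit.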

### Video Download (`/batch/download`)

**Response**: Same binary format as the stored upload (20-byte header + raw uint8 THWC data).

**Response headers**:
- `X-Video-Frames`, `X-Video-Height`, `X-Video-Width`, `X-Video-Channels`
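Reading a downloaded payload might look like the sketch below, which parses the header and slices out one frame by its THWC position. `decode_video` and `frame_bytes` are hypothetical helpers, and the payload here is synthetic.

```python
import struct


def decode_video(payload: bytes):
    """Split a downloaded payload into (shape, pixels) using the header."""
    (ndim,) = struct.unpack_from("<I", payload, 0)
    shape = struct.unpack_from(f"<{ndim}I", payload, 4)
    pixels = payload[4 + 4 * ndim:]
    return shape, pixels


def frame_bytes(shape, pixels: bytes, t: int) -> bytes:
    """Return the H*W*C bytes of frame `t` from THWC-ordered pixels."""
    _, h, w, c = shape
    size = h * w * c
    return pixels[t * size:(t + 1) * size]


# Synthetic payload: 2 frames of 1x2 RGB, first frame all 0, second all 255.
payload = struct.pack("<5I", 4, 2, 1, 2, 3) + bytes(6) + bytes([255] * 6)
shape, pixels = decode_video(payload)
second = frame_bytes(shape, pixels, 1)
```

The decoded `shape` can be cross-checked against the `X-Video-*` response headers before the pixel data is used.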

## BatchRequest

```json
{
  "pipeline_id": "longlive",
  "prompt": "a cat walking",
  "num_frames": 48,
  "seed": 42,
  "noise_scale": 0.7,
  "input_path": "<from /batch/upload>",
  "data_blob_path": "<from /batch/upload-data>",
  "chunk_specs": [
    {
      "chunk": 0,
      "text": "override prompt for chunk 0",
      "lora_scales": {"path/to/lora.safetensors": 0.5},
      "vace_frames_offset": 0,
      "vace_frames_shape": [1, 3, 12, 320, 576]
    }
  ],
  "pre_processor_id": null,
  "post_processor_id": null
}
```

Request-level fields are global defaults. Each `chunk_specs` entry overrides fields for one chunk index, and only the fields that change need to be specified. Prompts are sticky: the last prompt set persists for subsequent chunks.
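One way to read these override rules is sketched below. It is an interpretation, not the server's implementation: it assumes a chunk's `text` replaces the prompt from that chunk onward, while other overrides (shown here with `noise_scale`) apply to that chunk only. `resolve_chunks` is a hypothetical helper, and the chunk count is passed in because this document does not state how it is derived.

```python
def resolve_chunks(request: dict, total_chunks: int) -> list:
    """Return the effective prompt and noise_scale for each chunk.

    Assumption: `text` is sticky (persists to later chunks); other
    overrides apply only to the chunk that specifies them.
    """
    specs = {s["chunk"]: s for s in request.get("chunk_specs", [])}
    prompt = request["prompt"]
    resolved = []
    for i in range(total_chunks):
        spec = specs.get(i, {})
        prompt = spec.get("text", prompt)  # sticky: carried forward
        resolved.append({
            "chunk": i,
            "prompt": prompt,
            "noise_scale": spec.get("noise_scale", request.get("noise_scale")),
        })
    return resolved


request = {
    "prompt": "a cat walking",
    "noise_scale": 0.7,
    "chunk_specs": [
        {"chunk": 1, "text": "a cat running"},
        {"chunk": 2, "noise_scale": 0.5},
    ],
}
plan = resolve_chunks(request, total_chunks=4)
```

Under this reading, chunk 0 uses the request prompt, chunks 1 to 3 use "a cat running", and only chunk 2 runs with a `noise_scale` of 0.5.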

See `schema.py` for the full `GenerateRequest` and `ChunkSpec` field definitions.