Long-lasting workflows & streams on Vercel #1106

kylekz · 2026-02-18T12:20:38Z

kylekz
Feb 18, 2026

as per #980, the recommended approach for durable streams is to use the provided agent, but what if we're not making chat requests?

my use case is using LLMs to generate/summarize content, rather than chat. the workflow looks more or less like this:

prepare data, fetch from database, split data into n sections
emit progress update
dispatch first section to prime model provider cache
dispatch the rest of the sections in parallel
emit progress update
update database row
emit completion progress update

depending on the model, step 4 can take minutes. currently in testing we're observing anywhere from 4-8 minutes. for this, we've needed to increase the maxDuration of /.well-known/workflow/v1/step in our vercel.ts, which applies to all steps and not just this one workflow. it'd be cool if we were able to define myStep.maxDuration = 600 similar to how retries are defined.

this solved the vercel functions timing out in the step functions, but then the streams were timing out. to accommodate HTTP/2 stream timeouts, i had to inject a heatbeat into the stream:

export function createNDJSONStream<T>(): {
  stream: TransformStream<T, Uint8Array>;
  encoder: TextEncoder;
} {
  const encoder = new TextEncoder();
  const stream = new TransformStream<T, Uint8Array>({
    transform(chunk, controller) {
      const line = JSON.stringify(chunk) + "\n";
      controller.enqueue(encoder.encode(line));
    },
  });

  return { stream, encoder };
}

export function createHeartbeatStream<T>(
  source: ReadableStream<T>,
  intervalMs = 15_000,
): ReadableStream<Uint8Array> {
  const { stream: ndjsonStream, encoder } = createNDJSONStream<T>();
  const { readable, writable } = new TransformStream<Uint8Array>();
  const writer = writable.getWriter();
  const reader = source.pipeThrough(ndjsonStream).getReader();

  let heartbeatTimeout: NodeJS.Timeout | null = null;
  let closed = false;

  const scheduleHeartbeat = () => {
    if (heartbeatTimeout) clearTimeout(heartbeatTimeout);
    if (closed) return;
    heartbeatTimeout = setTimeout(() => {
      void sendHeartbeat();
    }, intervalMs);
  };

  const sendHeartbeat = async () => {
    if (closed) return;
    try {
      const msg =
        JSON.stringify({ type: "heartbeat", timestamp: Date.now() }) + "\n";
      await writer.write(encoder.encode(msg));
    } catch {
      return;
    }
    scheduleHeartbeat();
  };

  const pump = async () => {
    try {
      scheduleHeartbeat();
      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        scheduleHeartbeat();
        await writer.write(value);
      }
    } finally {
      closed = true;
      reader.releaseLock();
      if (heartbeatTimeout) clearTimeout(heartbeatTimeout);
      try {
        await writer.close();
      } catch {
        // Stream closed, ignore
      }
    }
  };

  void pump();

  return readable;
}

export async function POST(request: Request, { params }: RouteParams) {
  const run = await start(myWorkflow);
  const stream = createHeartbeatStream(run.readable);
  return new Response(stream, {
    headers: {
      "Content-Type": "application/x-ndjson",
      "X-Workflow-Run-Id": run.runId,
    },
  });
}

this all works fine. what's catching me out is two things:

for a long workflow, in my vercel logs i have 18 log entries for /.well-known/workflow/v1/step , 7 of these are vercel function timeouts of 600s, despite the longest step only taking 6 minutes and the stream continuing to work just fine. the entire workflow fit within the 10 minute window, so there shouldn't be any function timeouts
what's the solution for a workflow step taking longer than the 13 minutes vercel allows a function to last? opus at 50 tps takes 10 minutes to output 30k tokens, and i could easily see some of our steps taking this long or even longer. self-host a separate API that uses a postgres world and have it run as long as we want? refactor the logic to further parallelize the generation?

for issue 1, my current assumption is that wrapping the entire workflow in try-catch-finally is what's causing the issue:

try {
  await step1()
  await step2()
  await step3()
} catch {
  await emitErrorProgress()
} finally {
  await closeStream()
}

for issue 2, this is a bit more complex since we're essentially giving the LLM full control over what it generates, so we don't have a reliable method of determining if a section is going to be large or not

has anyone else run into this kind of thing?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Long-lasting workflows & streams on Vercel #1106

Uh oh!

{{title}}

Uh oh!

Replies: 0 comments

Select a reply

Uh oh!

Long-lasting workflows & streams on Vercel #1106

Uh oh!

kylekz Feb 18, 2026

Replies: 0 comments

kylekz
Feb 18, 2026