as per #980, the recommended approach for durable streams is to use the provided agent, but what if we're not making chat requests?

my use case is using LLMs to generate/summarize content, rather than chat. the workflow looks more or less like this:
1. dispatch first section to prime model provider cache
2. dispatch the rest of the sections in parallel
3. emit progress update
4. update database row
5. emit completion progress update
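a minimal sketch of that fan-out (all names here are hypothetical, not our actual code; `generateSection` stands in for the real LLM call):

```typescript
// hypothetical sketch of the workflow above; generateSection is a
// placeholder for the real model call
async function generateSection(section: string): Promise<string> {
  return `generated:${section}`;
}

async function runGeneration(
  sections: string[],
  emitProgress: (pct: number) => void,
  saveRow: (results: string[]) => Promise<void>,
): Promise<string[]> {
  // 1. dispatch first section alone to prime the provider's prompt cache
  const first = await generateSection(sections[0]);
  // 2. dispatch the rest of the sections in parallel
  const rest = await Promise.all(sections.slice(1).map(generateSection));
  const results = [first, ...rest];
  // 3. emit progress update
  emitProgress(0.9);
  // 4. update database row with the generated content
  await saveRow(results);
  // 5. emit completion progress update
  emitProgress(1);
  return results;
}
```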
depending on the model, step 4 can take minutes. in testing we're currently observing anywhere from 4 to 8 minutes. to accommodate this, we've had to increase the `maxDuration` of `/.well-known/workflow/v1/step` in our `vercel.ts`, which applies to all steps, not just this one workflow. it'd be nice to be able to define `myStep.maxDuration = 600`, similar to how retries are defined.
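for context, the blanket bump is just the standard per-route timeout setting. a sketch, assuming the step endpoint resolves to a route handler file you can configure (the exact path, and whether the generated step route honors segment config, are assumptions):

```typescript
// route file serving /.well-known/workflow/v1/step (path assumed)
// standard next.js/vercel route segment config: raise this function's
// timeout to 600s. it's per-route, but one route serves every step,
// so it still applies to all of them.
export const maxDuration = 600;
```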
this solved the vercel functions timing out in the step functions, but then the streams started timing out. to accommodate HTTP/2 stream timeouts, i had to inject a heartbeat into the stream:
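the heartbeat injection looks roughly like this (a sketch, not our exact code; `withHeartbeat` and the ping payload are made-up names):

```typescript
// sketch: wrap a text stream and enqueue an SSE comment (": ping") every
// intervalMs so idle HTTP/2 connections aren't reaped while a slow step
// produces no output
function withHeartbeat(
  source: ReadableStream<string>,
  intervalMs = 15_000,
  ping = ": ping\n\n",
): ReadableStream<string> {
  return new ReadableStream<string>({
    async start(controller) {
      const timer = setInterval(() => controller.enqueue(ping), intervalMs);
      const reader = source.getReader();
      try {
        for (;;) {
          const { done, value } = await reader.read();
          if (done) break;
          controller.enqueue(value);
        }
        clearInterval(timer);
        controller.close();
      } catch (err) {
        clearInterval(timer);
        controller.error(err);
      }
    },
  });
}
```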
this all works fine. what's catching me out is two things:
1. for a long workflow, my vercel logs show 18 entries for `/.well-known/workflow/v1/step`, and 7 of them are vercel function timeouts at 600s, despite the longest step only taking 6 minutes and the stream continuing to work just fine. the entire workflow fit within the 10 minute window, so there shouldn't be any function timeouts.
2. what's the solution for a workflow step that takes longer than the ~13 minutes vercel allows a function to run? opus at 50 tps takes 10 minutes to output 30k tokens, and i could easily see some of our steps taking this long or even longer. do we self-host a separate API that uses a postgres world and let it run as long as we want? refactor the logic to further parallelize the generation?
for issue 1, my current assumption is that wrapping the entire workflow in try-catch-finally is what's causing the issue:
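roughly this shape (hypothetical names; the real workflow body is elided):

```typescript
// hypothetical shape of the wrapper: the whole workflow body lives in one
// try/catch/finally so the db row always ends in a terminal status, even
// when a step throws
type Status = "running" | "complete" | "failed";

async function runWorkflow(
  steps: Array<() => Promise<void>>,
  setStatus: (s: Status) => void,
): Promise<void> {
  setStatus("running");
  try {
    for (const step of steps) {
      await step(); // each step may itself run for minutes
    }
    setStatus("complete");
  } catch (err) {
    setStatus("failed");
    throw err; // rethrow so the workflow runtime still sees the failure
  } finally {
    // always runs; my suspicion is that holding the whole run inside one
    // try block like this is what produces the phantom 600s timeout logs
  }
}
```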
for issue 2, this is a bit more complex: we're essentially giving the LLM full control over what it generates, so we don't have a reliable way of determining in advance whether a section is going to be large or not.
has anyone else run into this kind of thing?