The user will write their own async inference function, something like:
async def get_answer(input):
    output = ...  # placeholder: call the model here
    return output
They will then pass it to inference_function(), something like:
from llmsql import inference_function
result = inference_function(inference_function=get_answer, requests_per_minute=60)
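How inference_function() would enforce requests_per_minute is not specified here. As a minimal sketch of one possible approach, the hypothetical helper below (run_rate_limited, not part of llmsql) starts one call to the user's async function every 60 / requests_per_minute seconds and collects the results in input order. The get_answer stub and its return value are placeholders for a real model call.

```python
import asyncio


async def run_rate_limited(fn, inputs, requests_per_minute=60):
    """Hypothetical helper: call async `fn` on each input, spacing out
    the request starts so that at most `requests_per_minute` begin per minute."""
    interval = 60.0 / requests_per_minute  # seconds between request starts
    tasks = []
    for i, item in enumerate(inputs):
        if i > 0:
            await asyncio.sleep(interval)  # wait before starting the next request
        tasks.append(asyncio.create_task(fn(item)))
    # gather returns results in the same order as the tasks were passed in
    return await asyncio.gather(*tasks)


async def get_answer(prompt):
    # Placeholder for a real model call (assumption for illustration only)
    await asyncio.sleep(0)
    return f"answer to {prompt}"


# A high limit keeps this demo fast; 6000 per minute is one start every 0.01 s.
results = asyncio.run(
    run_rate_limited(get_answer, ["q1", "q2"], requests_per_minute=6000)
)
```

Spacing out request starts is the simplest policy; it does not cap how many requests are in flight at once, so a real implementation might combine it with a semaphore.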