Come up with a cost estimate based on the number of tokens for using the OpenAI API

**OpenAI API** information:
* GPT-4: **0.03$/1k** tokens for prompt input and **0.06$/1k** tokens for GPT-4's output.
* GPT-3.5 Turbo: **0.0015$/1k** tokens for input sent to the LLM and **0.002$/1k** tokens for the LLM's output.
* OpenAI embeddings: **0.0001$/1k** tokens 
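
As a starting point, the per-call cost is just the input and output token counts scaled by the per-1k prices. The snippet below is a minimal sketch of that arithmetic; the function name `api_cost` and the token counts in the example are made up for illustration, and the prices are passed in as parameters so they can be updated if OpenAI changes them.

```python
def api_cost(input_tokens: int, output_tokens: int,
             input_price_per_1k: float, output_price_per_1k: float) -> float:
    """Return the cost in dollars of one API call."""
    input_cost = (input_tokens / 1000) * input_price_per_1k
    output_cost = (output_tokens / 1000) * output_price_per_1k
    return input_cost + output_cost


# GPT-4 prices from the list above; the token counts are hypothetical.
gpt4_cost = api_cost(1200, 300, input_price_per_1k=0.03, output_price_per_1k=0.06)
print(f"${gpt4_cost:.4f}")  # prints $0.0540
```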
___
We need to estimate how many tokens we feed into the model and how many it returns, broken down by pipeline phase: how many tokens the schema linking takes, how many the query classification (easy, medium, hard) takes, and how many the SQL query classification and the self-correction phase take.
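
One way to organise the estimate is to record input and output tokens per phase for a single question and then price the totals. The sketch below assumes that layout; `PhaseUsage`, `question_cost`, and every token count are placeholders for illustration, not measurements.

```python
from dataclasses import dataclass


@dataclass
class PhaseUsage:
    """Token usage of one pipeline phase for a single question."""
    name: str
    input_tokens: int
    output_tokens: int


def question_cost(phases: list[PhaseUsage],
                  input_price_per_1k: float,
                  output_price_per_1k: float) -> float:
    """Sum tokens over all phases and convert them to dollars."""
    total_in = sum(p.input_tokens for p in phases)
    total_out = sum(p.output_tokens for p in phases)
    return (total_in / 1000) * input_price_per_1k \
        + (total_out / 1000) * output_price_per_1k


# Placeholder numbers only; replace with counts measured on real prompts.
example_phases = [
    PhaseUsage("schema linking", 900, 120),
    PhaseUsage("query classification", 700, 40),
    PhaseUsage("SQL query classification", 1100, 90),
    PhaseUsage("self-correction", 500, 80),
]

# GPT-4 prices: 0.03$/1k input, 0.06$/1k output.
cost_per_question = question_cost(example_phases, 0.03, 0.06)
print(f"Estimated cost per question: ${cost_per_question:.4f}")
```

Keeping the phases separate makes it easy to see which phase dominates the bill before averaging over a sample of questions.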

Take 100 examples from the **Spider** dataset and come up with the estimate. 
As a follow-up, you can also take 100 examples each from the **BIRD** and **WikiSQL** datasets. 

Check out this article by OpenAI: [How to count tokens with **tiktoken**](https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb)
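
A minimal sketch of counting tokens with `tiktoken` is shown below. The helper name `count_tokens` and the sample prompt are hypothetical. The fallback of roughly four characters per token is only a rule of thumb for English text, used here so the snippet still runs when `tiktoken` or its encoding files are unavailable.

```python
def count_tokens(text: str, model: str = "gpt-4") -> int:
    """Count tokens in `text`, preferring tiktoken when it is usable."""
    try:
        import tiktoken

        try:
            encoding = tiktoken.encoding_for_model(model)
        except KeyError:
            # Unknown model name: fall back to the encoding used by
            # GPT-4 and GPT-3.5 Turbo.
            encoding = tiktoken.get_encoding("cl100k_base")
        return len(encoding.encode(text))
    except Exception:
        # Assumption: tiktoken is missing or could not load its encoding.
        # Approximate with ~4 characters per token (rounded up).
        return (len(text) + 3) // 4


# Hypothetical prompt fragment for illustration.
prompt = "Q: How many singers are older than 30?\nSchema: singer(name, age)"
print(count_tokens(prompt))
```

Note that chat-style requests add a few tokens of formatting overhead per message, so counts obtained this way slightly underestimate the billed prompt size.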


