How to Calculate GPT-3 API Call Costs with Python?
Find the full code here
Introduction
GPT-3 is a language generation model created by OpenAI... but if you're reading this, you probably already know that, so let's skip the details and get straight to the point.
I'm currently using the GPT-3 API for a project, and since the API response doesn't include the cost of the call, I had to work it out myself.
How is cost calculated in GPT-3?

You can see the pricing here. All GPT-3 models are priced by the number of tokens used, counting both the prompt text you provide and the text the model generates, at a per-1,000-token rate. In other words, the cost of a call is the total number of tokens in the prompt and the completion multiplied by the model's price per token. But first, let's understand what a token is.
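For example, suppose a call to Davinci (which, at the time of writing, costs $0.0200 per 1,000 tokens) uses a 40-token prompt and returns a 60-token completion; the token counts here are made up for illustration:
prompt_tokens = 40        # tokens in the prompt (illustrative)
completion_tokens = 60    # tokens in the generated completion (illustrative)
price_per_1k = 0.0200     # Davinci's rate per 1,000 tokens
cost = (prompt_tokens + completion_tokens) / 1000 * price_per_1k
print(cost)  # 0.002, i.e. a fifth of a cent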
What is a token?
The token count is the basic metric we need to calculate the price; you might have seen it in the GPT-3 Playground. The GPT family of models processes text using tokens, which are common sequences of characters found in the text. A helpful rule of thumb is that one token generally corresponds to ~4 characters of common English text, or roughly ¾ of a word (so 100 tokens ~= 75 words).
OpenAI does provide a way to find the number of tokens in a given text: if you go to this link, you can paste in text and it will show you the token count.

However, since we’re working with APIs, we require a programmatic interface to calculate tokens. It is said that approximately a token is 4 characters but it won’t be always the same, so it is not ideal to calculate the number of tokens that way.
Tokenizer Model
A tokenizer model is a machine-learning model designed to break text into individual tokens. Using such a model, we can split our text into tokens and count them, just as in the example above.
I found two suitable tokenizers in the Hugging Face transformers library: GPT2Tokenizer and GPT2TokenizerFast. Don't be confused by the GPT-2 in the names; GPT-3 uses a similar tokenization process to GPT-2.
As the names suggest, GPT2TokenizerFast is a faster implementation (backed by Hugging Face's Rust tokenizers library), while GPT2Tokenizer is the original pure-Python implementation of the GPT-2 tokenizer. Both will give you the same results.
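If you want to convince yourself, a quick check like this (sketched here, assuming both tokenizers load from the same "gpt2" checkpoint) should show matching token counts:
from transformers import GPT2Tokenizer, GPT2TokenizerFast
slow = GPT2Tokenizer.from_pretrained("gpt2")
fast = GPT2TokenizerFast.from_pretrained("gpt2")
text = "provide the text for tokenization here"
print(len(slow.tokenize(text)), len(fast.tokenize(text)))  # both should print 8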
How to implement them?
Find the full code here
First, install the transformers package by running:
pip install transformers
Then import the tokenizer class from transformers. You can import either GPT2TokenizerFast or GPT2Tokenizer:
from transformers import GPT2TokenizerFast
Next, load the pre-trained tokenizer and save it locally so that it doesn't have to be downloaded every time:
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")  # download the pre-trained GPT-2 tokenizer
tokenizer.save_pretrained('/tmp/gpt2-tokenizer')       # save a local copy
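Since we saved a copy, later runs can load the tokenizer straight from that directory instead of downloading it again (assuming the same /tmp/gpt2-tokenizer path):
tokenizer = GPT2TokenizerFast.from_pretrained('/tmp/gpt2-tokenizer')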
With the tokenizer ready, we can call its tokenize method on our text:
tokens = tokenizer.tokenize("provide the text for tokenization here")
This tokenizes the provided text into a list of tokens that looks like this:
['prov', 'ide', 'Ġthe', 'Ġtext', 'Ġfor', 'Ġtoken', 'ization', 'Ġhere']
The "Ġ" character that you see before some tokens is a special symbol the tokenizer uses to mark the start of a new word (it stands for the space that preceded it).
From this, the length of the tokens list gives you the number of tokens in the provided text, which is 8 in this case:
len(tokens)
You can also use the token count to limit a user's input, because GPT-3 has a limit of 2048 tokens (prompt and generated completion combined).
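For example, a minimal guard using the tokenizer from above might look like this (user_input, max_completion, and the 256-token completion budget are made up for illustration):
user_input = "text the user typed"   # hypothetical user input
MAX_TOTAL_TOKENS = 2048              # prompt + completion combined
max_completion = 256                 # the max_tokens you plan to request
if len(tokenizer.tokenize(user_input)) + max_completion > MAX_TOTAL_TOKENS:
    raise ValueError("Input is too long for the model's token limit")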
Now that we can count tokens, let's work out the price of a call. There are four main models (Ada, Babbage, Curie, and Davinci). Take Davinci, for example: it costs $0.0200 per 1,000 tokens, so the mathematics here is simple.
davinci_price = 0.0200                      # price per 1,000 tokens
price_per_token = davinci_price / 1000      # price for a single token
total_cost = len(tokens) * price_per_token  # cost of the tokens we counted
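Putting it all together, here's a minimal sketch of a helper that estimates the cost of a call from its prompt and completion text. The per-1,000-token rates below are the published ones at the time of writing (double-check the pricing page), and estimate_cost is just an illustrative name:
from transformers import GPT2TokenizerFast

# Per-1,000-token prices at the time of writing -- verify against
# OpenAI's pricing page before relying on them.
PRICES_PER_1K = {
    "ada": 0.0004,
    "babbage": 0.0005,
    "curie": 0.0020,
    "davinci": 0.0200,
}

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

def estimate_cost(prompt: str, completion: str, model: str = "davinci") -> float:
    """Estimate the cost of one GPT-3 call from its prompt and completion."""
    n_tokens = len(tokenizer.tokenize(prompt)) + len(tokenizer.tokenize(completion))
    return n_tokens * PRICES_PER_1K[model] / 1000

print(estimate_cost("What is a token?", "A token is a chunk of text."))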
Simple enough. Prices can change as new models come out, so check the pricing page regularly.
In conclusion, understanding token pricing is crucial for the effective use of the GPT-3 API. By being aware of the cost of each call, developers can optimize their usage and prevent any unexpected expenses.
Find the full code here