I'm using the GetSemanticFunctionUsedTokensAsync function to calculate tokens for a prompt. The issue I'm encountering is that while it accurately counts the tokens generated by the prompt itself, it doesn't capture the total tokens sent when invoking a semantic function, which can lead to exceeding the maximum token limit.
In the version used by Enmarcha, Semantic Kernel (1.0.0-beta8), executing a semantic function and generating the call with the OpenAI SDK produces two messages (https://github.com/microsoft/semantic-kernel/blob/dotnet-1.0.0-beta8/dotnet/src/Connectors/Connectors.AI.OpenAI/AzureSdk/ClientBase.cs#L317):
assistant: Assistant is a large language model.
user: Prompt sent
Here the assistant's message is typically the default message, and the user's message is the prompt being sent. Counting only the prompt therefore underestimates the total number of tokens sent, potentially causing us to exceed the maximum token limit.
According to the OpenAI documentation (https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb), the calculation for models such as ["gpt-3.5-turbo-0613", "gpt-3.5-turbo-16k-0613", "gpt-4-0314", "gpt-4-32k-0314", "gpt-4-0613", "gpt-4-32k-0613"] would be:
3 extra tokens per message + 3 extra tokens for the output.
In the above scenario, the calculation would be:
assistant: Assistant is a large language model. => 3 tokens for the message + 7 tokens for the content
user: Hello => 3 tokens for the message + 1 token for the content
=> 3 extra output tokens
=> Total: 17 tokens
In tests conducted with a prompt containing the text 'Hello', the token usage returned by the request is 19, a 2-token difference from the calculation above.
With an empty prompt, the documentation's formula gives 16 tokens; in tests, the result is 18 tokens.
The 2-token gap in both cases appears to come from the role strings: the cookbook's reference implementation also encodes each message's role, and 'assistant' and 'user' are 1 token each in cl100k_base, which the hand calculation above omits (17 + 2 = 19, 16 + 2 = 18).
The token usage figures above are the values returned by the OpenAI client in the response.
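For reference, the cookbook's accounting, including the role strings, can be expressed as a small helper. This is a minimal sketch and not Enmarcha or Semantic Kernel code; `EstimateChatTokens` and its `countTokens` delegate are illustrative names, with `countTokens` standing in for a real cl100k_base tokenizer:

```csharp
using System;
using System.Collections.Generic;

public static class ChatTokenEstimator
{
    // Sketch of the OpenAI cookbook's accounting for the chat models listed
    // above; countTokens is a placeholder for a real cl100k_base tokenizer.
    public static int EstimateChatTokens(
        IReadOnlyList<(string Role, string Content)> messages,
        Func<string, int> countTokens)
    {
        const int TokensPerMessage = 3;   // framing tokens added per message
        const int ReplyPrimingTokens = 3; // every reply is primed with <|start|>assistant<|message|>

        int total = ReplyPrimingTokens;
        foreach (var (role, content) in messages)
        {
            total += TokensPerMessage;
            total += countTokens(role);    // the role string itself is encoded too
            total += countTokens(content);
        }

        return total;
    }
}
```

With the counts from the example above ("Assistant is a large language model." = 7 tokens, "Hello" = 1, and the roles "assistant" and "user" = 1 token each), this gives (3 + 1 + 7) + (3 + 1 + 1) + 3 = 19, matching the usage reported by the API.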
Proposed Solution
While it may not be the optimal solution, to make sure we never undercount, we could always add a fixed margin of 25 tokens (exposed as a parameter with a default value) to the result of GetSemanticFunctionUsedTokensAsync.
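A minimal sketch of that idea follows. The method signature and the `CountPromptTokensAsync` helper are assumptions for illustration rather than the real Enmarcha API; only the 25-token default comes from this proposal:

```csharp
// Hypothetical shape of the change: the existing prompt-token count plus a
// fixed safety margin covering the chat-message framing described above.
public async Task<int> GetSemanticFunctionUsedTokensAsync(
    string prompt,
    int extraTokens = 25,
    CancellationToken cancellationToken = default)
{
    // CountPromptTokensAsync is a placeholder for the current prompt-only
    // token counting; it is not a real Enmarcha or Semantic Kernel call.
    int promptTokens = await CountPromptTokensAsync(prompt, cancellationToken);
    return promptTokens + extraTokens;
}
```

Exposing the margin as a parameter lets callers that know their exact message layout pass a tighter value, while the default errs on the safe side for the two-message case above.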
Considerations