Hi there, I just started using the OpenAI API for my custom chatbot, and I am using LangChain together with a FAISS vectorstore. I'm a little concerned about my billing usage. When I use the chat.completions API, usage is usually below 1,000 tokens (gpt-3.5-turbo-0613), but when I use the vector search one, it can go above 10k tokens for very few requests (text-davinci).
Is there any reason for the large disparity?
Additionally, how do I know which model it is using? I can run the exact same code, hit the API, and still see a mixture of models in my usage section.
Thanks in advance!