By Christian Prokopp on 2023-11-23
Recently, at its DevDay, OpenAI released the GPT-4 Turbo preview with a 128k-token context window. That addresses a serious limitation for Retrieval Augmented Generation (RAG) applications, which I described in detail for Llamar.ai. 128k tokens (2¹⁷ = 131,072) amount to nearly 200 pages of text, assuming approximately 0.75 words per token and 500 words per page.
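The back-of-the-envelope arithmetic behind that figure can be sketched as follows (the 0.75 words-per-token and 500 words-per-page ratios are rough averages for English text, not exact values):

```python
# Rough conversion from the 128k context window to pages of English text.
tokens = 2 ** 17        # 128k tokens = 131,072
words = tokens * 0.75   # ~0.75 words per token (rough English average)
pages = words / 500     # ~500 words per typical page

print(f"{tokens:,} tokens ≈ {words:,.0f} words ≈ {pages:.0f} pages")
```

This yields roughly 98,000 words, or just under 200 pages.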
While writing some code using the gpt-4-1106-preview model via the API, I noticed that long responses never exceed 4,096 completion tokens. Responses are cut off mid-sentence or even mid-word, even when the total of input plus completion is well below 128k tokens. A quick search in the OpenAI forum revealed that others observe the same behaviour: the model does not return more than 4,096 completion tokens.
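One way to catch this in your own code is to check the finish_reason field that the Chat Completions API returns: a completion that stopped because it hit the token cap reports "length" rather than "stop". The helper below is my own sketch, not an official API feature; only the finish_reason values come from OpenAI's response schema.

```python
# Sketch: detect a truncated completion via the finish_reason field.
# The field and its "length"/"stop" values are part of the real Chat
# Completions response; the helper function itself is hypothetical.

def is_truncated(finish_reason: str) -> bool:
    """True if the completion stopped because it hit the token limit."""
    return finish_reason == "length"

# Usage against the OpenAI Python client (v1.x), not executed here:
#
#   from openai import OpenAI
#   client = OpenAI()
#   resp = client.chat.completions.create(
#       model="gpt-4-1106-preview",
#       messages=[{"role": "user", "content": "Write a very long essay."}],
#   )
#   if is_truncated(resp.choices[0].finish_reason):
#       print("Completion hit the 4,096-token cap")

print(is_truncated("length"))  # True: response was cut off
print(is_truncated("stop"))    # False: model finished naturally
```

Logging this flag alongside token counts makes the 4,096-token ceiling show up in metrics rather than as mysteriously clipped text.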
The larger context window greatly improves the model's ability to maintain context in lengthy conversations, and RAG applications benefit from more detailed in-context learning and a higher chance of having relevant text in context. However, the completion cap is a serious limitation for applications requiring extensive outputs, such as data generation or conversion. I wish OpenAI had been more proactive in listing the limitation in its DevDay announcement or the API description; it does mention it in the model description in its documentation.
Lastly, it is a reminder never to assume: always check, and use logs and metrics wherever possible. Biases and issues can creep in from unexpected vectors.
You might also be interested in the more recent post: How many words are 128k tokens?
Christian Prokopp, PhD, is an experienced data and AI advisor and founder who has worked with Cloud Computing, Data and AI for decades, from hands-on engineering in startups to senior executive positions in global corporations. You can contact him at christian@bolddata.biz for inquiries.