OpenAI introduces GPT-4 Turbo: Larger memory, lower cost, new knowledge

By | November 6, 2023
A stock illustration of a chatbot icon on a blue wavy background.

On Monday at the OpenAI DevDay event, company CEO Sam Altman announced a major update to its GPT-4 language model called GPT-4 Turbo, which can process a much larger amount of text than GPT-4 and features a knowledge cutoff of April 2023. He also introduced APIs for DALL-E 3, GPT-4 Vision, and text-to-speech—and launched an “Assistants API” that makes it easier for developers to build assistive AI apps.

OpenAI hosted its first-ever developer event on November 6 in San Francisco called DevDay. During the opening keynote delivered by Altman in front of a small audience, the CEO showcased the wider impacts of its AI technology in the world, including helping people with tech accessibility. Altman shared some stats, saying that over 2 million developers are building apps using its APIs, over 92 percent of Fortune 500 companies are building on their platform, and that ChatGPT has over 100 million active weekly users.

At one point, Microsoft CEO Satya Nadella made a surprise appearance on the stage, talking with Altman about the deepening partnership between Microsoft and OpenAI and sharing some general thoughts about the future of the technology, which he thinks will empower people.

The OpenAI DevDay 2023 keynote from Sam Altman.

GPT-4 gets an upgrade

During the keynote, Altman dropped several major announcements, including “GPTs,” which are custom, shareable, user-defined ChatGPT AI roles that we covered separately in another article. He also launched the aforementioned GPT-4 Turbo model, which is perhaps most notable for three properties: context length, more up-to-date knowledge, and price.

Large language models (LLM) like GPT-4 rely on a context length or “context window” that defines how much text they can process at once. That window is often measured in tokens, which are chunks of words. According to OpenAI, one token corresponds roughly to about four characters of English text, or about three-quarters of a word. That means GPT-4 Turbo can consider around 96,000 words in one go, which is longer than many novels. Also, a 128K context length can lead to much longer conversations without having the AI assistant lose its short-term memory of the topic at hand.

Previously, GPT-4 featured an 8,000-token context window, with a 32K model available through an API for some developers. Extended context windows aren’t completely new to GPT-4 Turbo: Anthropic announced a 100K token version of its Claude language model in May, and Claude 2 continued that tradition.

For most of the past year, ChatGPT and GPT-4 only officially incorporated knowledge of events up to September 2021 (although judging by reports, OpenAI has been silently testing models with more recent cutoffs at various times). GPT-4 Turbo has knowledge of events up to April 2023, making it OpenAI’s most up-to-date language model yet.

And regarding cost, running GPT-4 Turbo as an API reportedly costs one-third less than GPT-4 for input tokens (at $0.01 per 1,000 tokens) and one-half less than GPT-4 for output tokens (at $0.03 per 1,000 tokens). Relatedly, OpenAI also dropped prices for its GPT-3.5 Turbo API models. And OpenAI announced it is doubling the tokens-per-minute limit for all paying GPT-4 customers, allowing requests for increased rate limits as well.

More capabilities come to API

APIs, or application programming interfaces, are ways that programs can talk to each other. They let software developers integrate OpenAI’s models into their apps. Starting Monday, OpenAI now offers access to APIs for: GPT-4 Turbo with vision, which can analyze images and use them in conversations; DALL-E 3, which can generate images using AI image synthesis; and OpenAI’s text-to-speech model, which has made a splash in the ChatGPT app with its realistic voices.

OpenAI also debuted the “Assistants API,” which can help developers build “agent-like experiences” within their own apps. It’s similar to an API version of OpenAI’s new “GPTs” product that allows for custom instructions and external tool use.

The key to Assistants API, OpenAI says, is “persistent and infinitely long threads,” which allow developers to forego keeping track of an existing conversation history themselves and manually manage context window limitations. Instead, developers can add each new message in the conversation to an existing thread. In contrast to “stateless” AI, which means the AI model approaches each chat session as a blank slate with no knowledge of previous interactions, people often call this threaded approach “stateful” AI.

Odds and ends

Also on Monday, OpenAI introduced what it calls “Copyright Shield,” which is the company’s commitment to protect its enterprise and API customers from legal claims related to copyright infringement due to using its text or image generators. The shield does not apply to ChatGPT free or Plus users. And OpenAI announced the launch of version 3 of its open source Whisper model, which handles speech recognition.

While closing out his keynote address, Altman emphasized his company’s iterative approach toward introducing AI features with more agency (referring to GPTs) and expressed optimism that AI will create abundance. “As intelligence is integrated everywhere, we will all have superpowers on demand,” he said.

While inviting attendees to return to DevDay next year, Altman dropped a hint at what’s to come: “What we launched today is going to look very quaint compared to what we’re creating for you now.”

Source