On Wednesday, OpenAI announced ChatGPT, a dialogue-based AI chat interface for its GPT-3 family of large language models. It’s currently free to use with an OpenAI account during a testing phase. Unlike the GPT-3 model found in OpenAI’s Playground and API, ChatGPT provides a user-friendly conversational interface and is designed to strongly limit potentially harmful output.
“The dialogue format makes it possible for ChatGPT to answer followup questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests,” writes OpenAI on its announcement blog page.
So far, people have been putting ChatGPT through its paces, finding a wide variety of potential uses while also exploring its vulnerabilities. It can write poetry, correct coding mistakes with detailed examples, generate AI art prompts, write new code, expound on the philosophical classification of a hot dog as a sandwich, and explain the worst-case time complexity of the bubble sort algorithm… in the style of a “fast-talkin’ wise guy from a 1940’s gangster movie.”
OpenAI’s new ChatGPT explains the worst-case time complexity of the bubble sort algorithm, with Python code examples, in the style of a fast-talkin’ wise guy from a 1940’s gangster movie: pic.twitter.com/MjkQ5OAIlZ
— Riley Goodside (@goodside) December 1, 2022
ChatGPT also refuses to answer many potentially harmful questions (related to topics such as hate speech, violent content, or how to build a bomb) because the answers would go against its “programming and purpose.” OpenAI achieved this through both a special prompt it prepends to all input and by using a technique called Reinforcement Learning from Human Feedback (RLHF), which can fine-tune an AI model based on how humans rate its generated responses.
Reining in the offensive proclivities of large language models is one of the key problems that has limited their potential market usefulness, and OpenAI sees ChatGPT as a significant iterative step in the direction of providing a safe AI model for everyone.
And yet, unsurprisingly, people have already figured out how to circumvent some of ChatGPT’s built-in content filters using quasi-social engineering attacks, such as asking the AI to frame a restricted output as a pretend scenario (or even as a poem). ChatGPT also appears to be vulnerable to prompt-injection attacks, a story Ars broke in September.
Like GPT-3, its dialogue-based cousin is also very good at completely making stuff up in an authoritative-sounding way, such as a book that doesn’t exist, including details about its content. This represents another key problem with large language models as they exist today: If they can breathlessly make up convincing information whole cloth, how can you trust any of their output?
OpenAI’s new chatbot is amazing. It hallucinates some very interesting things. For instance, it told me about a (v interesting sounding!) book, which I then asked it about:
Unfortunately, neither Amazon nor G Scholar nor G Books thinks the book is real. Perhaps it should be! pic.twitter.com/QT0kGk4dGs
— Michael Nielsen (@michael_nielsen) December 1, 2022
Still, as people have noticed, ChatGPT’s output quality seems to represent a notable improvement over previous GPT-3 models, including the new text-davinci-003 model we wrote about on Tuesday. OpenAI itself says that ChatGPT is part of the “GPT 3.5” series of models that was trained on “a blend of text and code from before Q4 2021.”
Meanwhile, rumors of GPT-4 continue to swirl. If today’s ChatGPT model represents the culmination of OpenAI’s GPT-3 training work in 2021, it will be interesting to see what GPT-related innovations the firm has been working on over these past 12 months.