OpenAI unveils huge upgrade to ChatGPT that makes it more eerily human than ever
ChatGPT's latest upgrade means the voice assistant can now respond to audio, text and visual inputs in real time. The new model, called GPT-4o, will be rolled out to alpha testers in the coming weeks.
A new version of ChatGPT can read facial expressions, mimic human voice patterns and have near real-time conversations, its creators have revealed.
OpenAI demonstrated the upcoming version of the artificial intelligence (AI) chatbot, called GPT-4o, in an apparently real-time presentation on Monday (May 13). The chatbot, which spoke aloud to presenters through a phone, appeared to have an eerie command of human conversation and its subtle emotional cues — switching between robotic and singing voices on command, adapting to interruptions and visually processing the facial expressions and surroundings of its conversational partners.
During the demonstration, the AI voice assistant showcased its skills by completing tasks such as real-time language translation, solving a math equation written on a piece of paper and guiding a blind person around London's streets.
"her," Sam Altman, OpenAI's CEO, wrote in a one-word post on the social media platform X after the presentation had ended. The post is a reference to the 2013 film of the same name, in which a lonely man falls in love with an AI assistant.
To show off its ability to read visual cues, the chatbot used the phone's camera to analyze an OpenAI engineer's facial expressions and describe their emotions.
Related: MIT gives AI the power to 'reason like humans' by creating hybrid architecture
"Ahh, there we go, it looks like you're feeling pretty happy and cheerful with a big smile and a touch of excitement," said the bot, which answered to the name ChatGPT. "Whatever is going on, it looks like you're in a good mood. Care to share the source of those good vibes?"
If the demonstration is an accurate representation of the bot's abilities, the new capabilities are a massive improvement on the limited voice features in the company's previous models — which were incapable of handling interruptions or responding to visual information.
"We're looking at the future of interaction between ourselves and the machines," Mira Murati, OpenAI's chief technology officer, said at the news conference. "We think GPT-4o is really shifting that paradigm."
The new voice assistant is set to be released in a limited form to alpha testers in the coming weeks, followed by a wider rollout that will begin with paying ChatGPT Plus subscribers. The announcement also follows a Bloomberg report that the company is nearing a deal with Apple to integrate ChatGPT into the iPhone — raising the possibility that GPT-4o could be used to upgrade Siri, the iPhone's voice assistant.
But the new technology comes with significant safety concerns. The bot's ability to process real-time text, audio and visual input means that it could be used for spying. And its convincing emotional mimicry might also make it adept at conducting scam phone calls or presenting dangerous misinformation in a convincing manner.
In response to these issues, Murati said that OpenAI was working to build "mitigations against misuse" of the new technology.
Ben Turner is a U.K.-based staff writer at Live Science. He covers physics and astronomy, as well as topics such as tech and climate change. He graduated from University College London with a degree in particle physics before training as a journalist. When he's not writing, Ben enjoys reading literature, playing the guitar and embarrassing himself with chess.