Why ChatGPT's New Ability to Speak Could Change Everything

More ways to communicate

By Sascha Brodsky, Senior Tech Reporter
Published on September 27, 2023 10:45AM EDT
Fact checked by Jerri Ledford

OpenAI launched a revamped version of its chatbot that can converse with users. ChatGPT can now comprehend spoken words, respond with an artificial voice, and evaluate pictures. The chatbot’s new abilities could make technology more inclusive.

Using AI to Write Code. cofotoisme / Getty Images

Talking to your chatbot is now a thing, and it could revolutionize how we interact with artificial intelligence (AI).

OpenAI has released a new version of its chatbot that can talk to people. ChatGPT now has the ability to "see, hear, and speak." The bot can understand spoken language, reply using a synthetic voice, and analyze images.

"Interacting with AI chatbots using spoken words fosters a sense of natural communication, catering to our innate human preference for verbal exchange," Proto Inc.'s AI lead, Raffi Kryszek, told Lifewire via email. "This mode of interaction is not only often faster than typing but also heightens convenience, especially on devices or in settings where typing isn't feasible."
Chatting With Your Bot

The new update to the chatbot, the biggest one from OpenAI since GPT-4, lets users have voice chats on the ChatGPT mobile app. Users can pick from five different synthetic voices for the chatbot to use. They can also show pictures to ChatGPT, a feature called GPT-4-Vision, and point out specific areas to look at or discuss.

"Snap a picture of a landmark while traveling and have a live conversation about what's interesting about it," the company wrote on its website. "When you're home, snap pictures of your fridge and pantry to figure out what's for dinner (and ask follow-up questions for a step-by-step recipe). After dinner, help your child with a math problem by taking a photo, circling the problem set, and having it share hints with both of you."

ChatGPT's updated voice function can tell bedtime stories, help resolve dinner table discussions, and read users' typed text aloud. The voice technology behind it can produce realistic synthetic speech after hearing just a brief snippet of someone speaking. OpenAI has acknowledged the risk of this capability being used for impersonation or fraud. To limit that risk, the company said that ChatGPT will only use voices that are already in the system and have received prior approval from the company.

Newer chatbots like OpenAI's ChatGPT are much better at carrying out conversations and understanding users' instructions than the old generation of Alexa, Siri, and Google Assistant, Chris Callison-Burch, a professor of computer and information science at the University of Pennsylvania, said in an email. "I expect a rapid leap forward in smart assistants as they incorporate generative AI technology."

The upgraded version of ChatGPT will roll out to Plus and Enterprise users on mobile platforms in the next two weeks, with follow-on access for developers and other users "soon after."
ChatGPT's voice feature could be useful for children, Callison-Burch suggested. He said his children used Amazon Alexa to search the internet.

"My kids asked Alexa science questions like, 'How many teeth do snails have?' or 'Can turtles breathe through their butts?' and quizzed it about Pokémon," he added. "They used it to teach themselves interesting math facts (one of my kids can count by 1000s to novemtrigintillion because of Alexa)."

Callison-Burch said he has had early access to GPT-4-Vision and found it "incredibly impressive."

"I have used it to describe photographs, figures in scientific papers, and even fine art paintings," he added. "Its descriptions are exceptionally good, and you can have a conversation with it about the images, asking questions and having it answer them."

Artificial Intelligence. Mr.Cole_Photographer / Getty Images

The Future of AI?

The enhanced multimodal capabilities of ChatGPT follow closely after the release of DALL-E 3, OpenAI's latest and most sophisticated image generation system. OpenAI states that DALL-E 3 also incorporates natural language processing, enabling users to communicate with the model to refine outcomes and coordinate with ChatGPT to assist in generating image prompts.

In the not-so-distant future, voice-activated AI chatbots will be able to understand diverse accents and languages, making technology more inclusive and universal, Kryszek said.

"This evolution will be coupled with the capability to sense emotions from our voice's subtle cues, creating more empathetic digital assistants," he added. "These advancements are poised to permeate every facet of our lives, from wearables to vehicles, underpinned by robust voice biometrics ensuring utmost security. And as these systems mature, we'll witness a blend of voice, visuals, and tactile feedback, ushering in a new era of immersive and multi-dimensional digital interactions."