According to OpenAI, the popular AI chatbot ChatGPT can now "see, hear, and speak" as new voice and image capabilities roll out. Users will be able to hold spoken conversations with ChatGPT and show it pictures. OpenAI provides a few innocuous examples, like snapping a picture of a landmark while traveling and asking about its history.
And yes, that does mean ChatGPT will be able to speak. According to the update, ChatGPT's new synthetic voice technology can craft a realistic-sounding voice based on "just a few seconds of real speech."
OpenAI notes that it is aware that this technology could be used to impersonate public figures or commit fraud, so it is limiting the technology to voice chat. Even there, you'll only be able to choose from five different options for ChatGPT's voice.
However, OpenAI is licensing the technology to companies like Spotify, which is planning on using it as part of a Voice Translation feature for podcasters. Essentially, podcasters can translate their voices into additional languages to broaden their reach, which is mind-blowing stuff.
The new voice and image features will transform how people interact with ChatGPT. OpenAI is rolling out the update to subscribers over the next two weeks, and non-paying users will get access sometime "soon." Being able to take a photo and then feed that into ChatGPT to get a better understanding of what's being shown is a powerful tool.
Still, the real kicker comes with ChatGPT's back-and-forth capabilities, where you can ask for additional information or clarify your question. You'll even be able to draw on images to highlight points of interest for ChatGPT to focus on.
OpenAI is aware that using the tool to find out information about people opens the door to a whole new set of problems, so it has "taken technical measures to significantly limit ChatGPT's ability to analyze and make direct statements about people since ChatGPT is not always accurate and these systems should respect individuals' privacy."