Google impressed everyone at its I/O 2018 developer conference with Duplex, its advanced voice AI. Duplex can hold natural, human-sounding conversations and can even make calls on your behalf to book appointments. In this post, we look at the artificial intelligence and technology behind Google Duplex.
The AI Behind Google Duplex
Google’s artificial intelligence has taken things to the next level. Google Assistant will now be able to make phone calls on your behalf in the background, hold a proper conversation, and, for example, book a table at a restaurant. Isn’t that amazing? The feature is called Google Duplex. Let’s look at the AI behind it.
Google presented Duplex on May 8, 2018, during the keynote of its Google I/O conference in Mountain View, California, USA. Google I/O brings together developers from around the globe for an immersive experience focused on exploring the next generation of tech.
Google defines Duplex as “a new technology for conducting natural conversations to carry out ‘real world’ tasks over the phone.” In short, it is an AI system that can make calls for you. It combines the company’s latest breakthroughs in speech recognition and text-to-speech synthesis with a range of contextual details, including the purpose and history of the conversation.
Google Duplex uses a natural speech pattern, which includes hesitations and affirmations such as “uh-huh”, making it almost indistinguishable from a genuine human phone call. The brand new feature will be launched for the public later this year.
The Technology Behind Google Duplex
In his keynote, Google CEO Sundar Pichai played an example on stage in which someone asks Google Assistant to call a hairdresser and book a woman’s haircut at 12 p.m. Duplex carried out the conversation and successfully made the appointment.
Duplex uses Google’s automatic speech recognition technology to understand the person on the other end of the call. Its own voice sounds natural thanks in part to WaveNet, a generative model for audio developed by DeepMind, Google’s sister AI company.
At its core, Google Duplex is a recurrent neural network (RNN) built with the TensorFlow Extended (TFX) machine learning platform. Speech processing is handled by an automatic speech recognition (ASR) engine, which transcribes what the other speaker says, and a text-to-speech (TTS) engine, which controls intonation depending on the circumstances.
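The flow described above can be pictured as a three-stage pipeline: speech in, a decision about what to say, speech out. The sketch below is purely illustrative and is not Google’s implementation. The names `CallState`, `recognize_speech`, `choose_reply`, `synthesize_speech`, and `handle_turn` are hypothetical, and simple hand-written rules stand in for the real ASR engine, the RNN, and the TTS engine.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CallState:
    """Context carried across turns: the goal of the call and what was said."""
    goal: str
    history: List[str] = field(default_factory=list)


def recognize_speech(audio: str) -> str:
    # Stand-in for an ASR engine: here the "audio" is already a transcript.
    return audio.strip().lower()


def choose_reply(state: CallState, heard: str) -> str:
    # Stand-in for the RNN: the real model is learned, these rules are not.
    state.history.append(heard)
    if "what time" in heard:
        return f"I'd like {state.goal}."
    if "how can i help" in heard:
        return "Hi, I'm calling to book an appointment."
    return "Mm-hmm."


def synthesize_speech(text: str) -> str:
    # Stand-in for a TTS engine: tag the text instead of producing audio.
    return f"<speech>{text}</speech>"


def handle_turn(state: CallState, audio: str) -> str:
    """Run one turn of the call through ASR -> dialogue model -> TTS."""
    heard = recognize_speech(audio)
    reply = choose_reply(state, heard)
    state.history.append(reply)
    return synthesize_speech(reply)
```

Calling `handle_turn` repeatedly with the same `CallState` keeps the purpose and history of the conversation available to every later turn, which is the kind of context the article says Duplex relies on.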
To achieve the required quality of interaction, Google Duplex is trained on narrow domains, such as booking a hair appointment. Training takes place in real time and is supervised by a human operator, who monitors the interactions and intervenes when appropriate.
These experienced operators keep supervising until the system performs at the required quality level in that domain. At that point, Google Duplex is free to operate on its own.
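One way to picture this “supervise until good enough” rule is a rolling window over recent calls. The sketch below is an assumption made for illustration only: the article does not say how Google measures quality, and the class `DomainTraining`, the window size, and the threshold are all hypothetical.

```python
from collections import deque


class DomainTraining:
    """Tracks supervised calls in one narrow domain (hypothetical bookkeeping)."""

    def __init__(self, domain, window=5, required_clean_calls=4):
        self.domain = domain
        self.required_clean_calls = required_clean_calls
        # Keep only the most recent `window` calls.
        self.recent = deque(maxlen=window)

    def record_call(self, operator_intervened):
        # A call is "clean" when the human operator did not have to step in.
        self.recent.append(not operator_intervened)

    @property
    def autonomous(self):
        # Operate unsupervised only once the window is full and clean enough.
        window_full = len(self.recent) == self.recent.maxlen
        return window_full and sum(self.recent) >= self.required_clean_calls
```

With the default settings, a domain becomes autonomous once at least four of the last five supervised calls needed no intervention.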
Duplex is designed to respond in a natural manner and adapt to responses in real time while understanding the context of the conversation. However, if a conversation becomes too complex, the system can recognize that it is failing or cannot complete the task, and it then hands the call over to a human operator.
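The hand-off decision can be sketched as a simple check over the turns of a call. This is a minimal illustration under stated assumptions, not the actual Duplex logic: the `TurnResult` fields, the function `should_hand_off`, and both thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TurnResult:
    """Outcome of one conversational turn (all fields are hypothetical)."""
    understood: bool      # did the system make sense of what it heard?
    confidence: float     # the system's own confidence, from 0.0 to 1.0


def should_hand_off(turns, min_confidence=0.5, max_misses=2):
    """Return True once the call looks too complex for the system to finish.

    A turn counts as a miss if it was not understood or the confidence was low.
    The call is handed to a human after more than `max_misses` misses.
    """
    misses = sum(
        1 for t in turns
        if not t.understood or t.confidence < min_confidence
    )
    return misses > max_misses
```

Counting misses across the whole call, rather than reacting to a single bad turn, lets the system recover from one misheard sentence without giving up immediately.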
When time is tight, Assistant also checks the business’s working hours: if it is too early in the day, it lets you know and suggests booking later in the day. Once the booking has been made, you get a notification and a calendar entry, so you can check that the appointment was made correctly.
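The working-hours check and the calendar entry can be sketched in a few lines. The helper names `suggest_call_time` and `calendar_entry`, and the shape of the returned entry, are assumptions made for this example and do not describe any real Assistant or Calendar API.

```python
from datetime import date, datetime, time
from typing import Optional


def suggest_call_time(now: time, opens: time, closes: time) -> Optional[time]:
    """Return when to place the call today, or None if the business has closed."""
    if now < opens:
        return opens      # too early: suggest waiting until opening time
    if now >= closes:
        return None       # too late: nothing to suggest for today
    return now            # open right now: call immediately


def calendar_entry(title: str, day: date, start: time) -> dict:
    """Build a simple calendar entry the user can review after the booking."""
    return {"title": title, "start": datetime.combine(day, start).isoformat()}
```

For instance, a request made at 7:30 a.m. for a salon that opens at 9:00 a.m. would come back with a suggestion to call at 9:00 a.m. rather than failing outright.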
It’s easy to think of use cases for Duplex, but the technology could also be misused, and it will be interesting to see how Google plans to secure the feature in the future.