Artificial intelligence (AI) technology is helping us make our lives easier. Using its in-built smartness, we’re removing the hassles from our lives and making things more efficient.
But voice assistants and IVR are no longer at the cutting edge of AI development. We now have AI-powered solutions that you can interact with in human-like ways – allowing you and your brand to have real AI-driven conversations with your customers. You can even use Voice AI to make outgoing phone calls to your customers – imagine the possibilities of that!
For those of you new to the world of AI, we explain what ‘Conversational AI’ is and broadly categorise solutions that fall within this family of AI tools.
Conversational AI is a set of technologies that enable computers to understand and process natural language inputs so they can ‘talk’ with humans. Put simply, they allow us to interact with machines in the same way that we do with other humans.
Unlike computer language, which has defined rules and operates in zeros and ones, human language is highly nuanced. Think how long it takes a toddler to learn to speak in their native language. Learning how to talk naturally is complex, for a variety of reasons.
For example, as humans:
In comparison, computers are used to seeing things in black and white (figuratively of course!).
The differences in how we communicate add texture and bring colour to our conversations, but they also make it challenging for computers to understand us. The value of Conversational AI lies in bridging this communication gap between humans and computers to achieve more natural and useful interactions – whether it’s having a live in-app chat, or making a voice call.
But Conversational AI is a broad umbrella term and the solutions within it vary greatly.
Conversational AI solutions can be classified in a number of ways, but we’ve chosen to categorise them along four simple dimensions for ease of understanding.
The vast majority of Conversational AI tools are designed to fulfil inquiries or help us accomplish certain tasks. These tools react to human commands and are essentially conversational user interfaces to an information retrieval mechanism, or a form process.
Examples of tools at this end of the spectrum include FAQ bots and task-fulfilment chatbots.
FAQ bots simply return the best matches to user inquiries. They consider the user’s input to be complete information and don’t prompt for clarifications or ask follow-up questions. Similarly, task-fulfilment chatbots that implement an intent-detection and template slot-filling model ask questions to elicit the responses needed to fill a form. Every piece of information requested corresponds to a blank that needs to be filled. In essence, their questioning ability is limited by the scope of the form. They’re passive tools that wait to be activated by the user and ask predefined questions because they need to.
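To make the slot-filling idea concrete, here is a minimal sketch in Python. The ride-booking form, the slot names and the prompts are all hypothetical, chosen only for illustration; a real task-fulfilment chatbot would also need intent detection and answer validation, which this sketch leaves out.

```python
from typing import Dict, List, Optional

# Hypothetical form: every slot is a blank that the bot needs to fill.
SLOT_PROMPTS: Dict[str, str] = {
    "pickup": "Where should we pick you up?",
    "destination": "Where are you going?",
    "time": "What time do you need the ride?",
}


def next_prompt(filled: Dict[str, str]) -> Optional[str]:
    """Return the question for the first empty slot, or None once the form is complete."""
    for slot, prompt in SLOT_PROMPTS.items():
        if slot not in filled:
            return prompt
    return None


def run_form(answers: List[str]) -> Dict[str, str]:
    """Pair answers with slots in order; answers beyond the form's scope are ignored."""
    filled: Dict[str, str] = {}
    for slot, answer in zip(SLOT_PROMPTS, answers):
        filled[slot] = answer
    return filled
```

Notice that the bot can never ask anything that isn’t a key in `SLOT_PROMPTS`. That is what we mean when we say its questioning ability is limited by the scope of the form.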
In contrast, proactive Conversational AI solutions sit at the opposite end of this spectrum. They ask questions to discover information and gather knowledge from humans. In other words, they drive the conversation and ask questions because they want to.
For example, in an eCommerce setting, Curious Thing’s AI technology can be used to make a phone call to a consumer to find out why they abandoned their cart. The Voice AI leads the conversation by asking open and closed-ended questions around customer engagement, quality of service and the competitiveness of prices, etc.
The AI discovers the consumer’s reasoning for abandoning the cart and gathers incredibly useful feedback and information – data that the eCommerce business can use to improve its user interface, payment process or the quality of its goods, etc.
The ability to retain context in a conversation and conduct complex interactions reflects the level of intelligence of Conversational AI solutions.
On one end of this spectrum are AI solutions that can only hold one-off, single-turn conversations. And on the other end are more sophisticated tools that are capable of engaging in complex, multi-turn conversations.
Conversational turn-taking is the back-and-forth nature of orderly communication, similar to people taking turns when speaking – and it’s something that older technology struggled with.
A single-turn conversation involves one back-and-forth interaction. For example, if you ask a weather bot what tomorrow’s forecast for Melbourne is, it doesn’t need any additional information to execute your request. It provides the information in one response and the conversation is marked as complete.
But if you followed up with, ‘How about Sydney?’, the bot would most likely not understand. This is because an AI solution that can only hold single-turn interactions doesn’t retain context. So in this case, it won’t know that you’re still referring to the weather.
AI solutions like this work well for simple tasks, like playing a song or sending a text where the user usually provides the information in one go. But some questions can’t be answered in a single turn. The user might ask a query that needs to be refined or filtered before it can be actioned. This is where multi-turn conversations are essential.
A multi-turn conversation is a dialogue with a series of back-and-forth interactions. These kinds of conversations are complex. The AI needs to retain context over the course of the discussion and apply it to understand and fulfil the user’s intent.
Tools that achieve this are far more conversational and can handle more elaborate tasks. For example, in a healthcare setting, our Voice AI could listen for answers to questions about the patient’s health and then follow up with more questions to get a deeper understanding of the patient’s condition, or to find out whether there’s been a dip in their recovery after an illness.
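The difference between single-turn and multi-turn behaviour can be sketched with the weather example from earlier. This is a toy illustration under stated assumptions: the keyword parser, the two-city vocabulary and both bot classes are hypothetical, and real solutions rely on far richer language understanding.

```python
from typing import Dict

CITIES = ("Melbourne", "Sydney")  # assumed, deliberately tiny vocabulary


def parse(utterance: str) -> Dict[str, str]:
    """Rough keyword parser: pick out an intent and a city if they are mentioned."""
    text = utterance.lower()
    parsed: Dict[str, str] = {}
    if "weather" in text or "forecast" in text:
        parsed["intent"] = "weather"
    for city in CITIES:
        if city.lower() in text:
            parsed["city"] = city
    return parsed


def answer(state: Dict[str, str]) -> str:
    """Respond only when both the intent and the city are known."""
    if "intent" in state and "city" in state:
        return f"Looking up {state['intent']} for {state['city']}"
    return "Sorry, I didn't understand that."


class SingleTurnBot:
    """Treats every utterance in isolation; nothing carries over between turns."""

    def reply(self, utterance: str) -> str:
        return answer(parse(utterance))


class MultiTurnBot:
    """Keeps context between turns, so a follow-up can reuse the earlier intent."""

    def __init__(self) -> None:
        self.context: Dict[str, str] = {}

    def reply(self, utterance: str) -> str:
        self.context.update(parse(utterance))
        return answer(self.context)
```

Asked about tomorrow’s forecast for Melbourne and then ‘How about Sydney?’, the single-turn bot gives up on the second question, while the multi-turn bot still knows that the topic is the weather and only swaps the city.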
In conversations, people listen, process and respond in real-time. But not all AI solutions function this way. Conversational AI tools can be categorised based on their asynchronous and synchronous natures.
Asynchronous communication takes place when parties engage in conversation at different times. Likewise, asynchronous conversational AI tools participate in unidirectional conversations – i.e. conversations that only move in one direction. The AI tools process information once the interaction is complete. Tools like this usually require fewer conversational turns since all the information is packaged in a single input and provided to the AI.
For example, in the Financial Services sector, there are likely to be a lot of unidirectional conversations with the firm’s clients. If a client is applying for a business loan, they’ll have to answer a series of questions about their income, cashflow and credit score, etc. A Voice AI could ask the client these questions sequentially, record their answers and then start processing this information once the recording is complete. Only then will the client get an ‘approved’ or ‘not approved’ response to their loan application.
Synchronous communication happens in real-time. A synchronous conversational AI interaction is a two-way interaction where the user and AI simultaneously participate. Like humans, the AI listens, understands, responds, and potentially delivers results within minutes.
For example, Curious Thing’s Voice AI can interact in real-time, asking relevant questions based on the responses it receives. An eCommerce company looking to re-engage with lapsed customers may want to call their inactive shoppers to find out why they stepped away from the brand.
As the inactive customer speaks, the AI processes their input and attributes meaning. The Voice AI may ask why the customer abandoned their cart before paying, but will then listen for the answer and will ask a follow-up question that’s relevant to the context and direction of the ongoing conversation. It may ask ‘Did you find the user interface difficult to navigate when making your purchase?’. Then the customer can respond, and a detailed picture of their customer experience can be built up – all invaluable data for any eCommerce business.
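The contrast between the two modes can be sketched as two small control loops. Everything here is an assumption made for illustration: the function names, the keyword rules in `follow_up` and the scripted answers are hypothetical and are not a description of how Curious Thing’s platform is built.

```python
from typing import Callable, List, Optional, Tuple


def run_asynchronous(
    questions: List[str],
    get_answer: Callable[[str], str],
    process: Callable[[List[str]], str],
) -> str:
    """Ask every question first; processing starts only once all answers are recorded."""
    recorded = [get_answer(question) for question in questions]
    return process(recorded)


def run_synchronous(
    first_question: str,
    get_answer: Callable[[str], str],
    choose_follow_up: Callable[[str], Optional[str]],
    max_turns: int = 3,
) -> List[Tuple[str, str]]:
    """Process each reply as it arrives and use it to choose the next question."""
    transcript: List[Tuple[str, str]] = []
    question: Optional[str] = first_question
    for _ in range(max_turns):
        if question is None:
            break  # nothing relevant left to ask
        reply = get_answer(question)
        transcript.append((question, reply))
        question = choose_follow_up(reply)
    return transcript


def follow_up(reply: str) -> Optional[str]:
    """Hypothetical keyword rules for picking a relevant follow-up question."""
    text = reply.lower()
    if "checkout" in text or "confusing" in text:
        return "Did you find the user interface difficult to navigate when making your purchase?"
    if "price" in text or "expensive" in text:
        return "Did you find a better price elsewhere?"
    return None
```

In the asynchronous version the set of questions is fixed before the call starts. In the synchronous version the second question depends on what the customer said in reply to the first, which is what makes the exchange feel like a conversation.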
Conversational AI tools can also be classified based on the medium that’s being used for the interaction; i.e. whether you’re using voice or text to interact with the customer.
Text-based solutions exist on messaging platforms like SMS or web-based applications. We interact with them in the same way that we use text-based tools like WhatsApp – through a screen via chat. An example is Uber’s bot that lets you book a ride on Facebook Messenger.
With voice-based solutions, like Siri or Alexa, we converse verbally. But the way we communicate varies when we talk and text. For instance, we’re more likely to be chatty and long-winded in our speech when speaking to an AI, but will use shorter sentences while texting.
Although voice is a richer medium of communication, it’s more varied and difficult to comprehend in comparison to text. As a result, Conversational AI solutions that can process speech generally lie on the higher end of the intelligence spectrum.
Curious Thing’s Conversational Voice AI platform goes far beyond the limitations of a basic Voice AI assistant. Using Conversational Voice AI technology, we can make outgoing warm calls and have real-time conversations with your customers.
When using our Voice AI, you can:
If you’d like to explore the possibilities of Voice AI, we’d love to give you a demo.