Decoding complex languages with conversational AI
Thanks to advances in natural language understanding, conversational AI-powered virtual agents are now able to tackle difficult languages with ease.
Look up any list of the world’s hardest languages to learn and you will invariably find Finnish nestled somewhere near the top. While the written form of this unique Nordic language is (somewhat) similar to English, grammatically Finnish takes things to a whole new level. Due to a mixture of compound words and tricky conjugation, learning Finnish is a herculean task for any non-native speaker. For artificial intelligence, it is equally taxing to decode, requiring a range of complex processes for a virtual agent to arrive at a correct response.
Understanding is the foundation of all communication and neither human nor machine is able to help unless there is a true understanding of what is being said. Enabling machines to understand us is, therefore, the cornerstone from which to explore a brand new frontier of human-machine interaction. Conversational AI is at the forefront of the current communications revolution. Advances in deep learning and natural language technologies are reaching new heights of sophistication that make machines adept at understanding, processing and responding to numerous human languages at an unprecedented level.
The latest, best-of-breed conversational AI uses a variety of natural language understanding techniques (some common and others proprietary) to make sense of the many permutations that words can take in a language, even one as grammatically complex as Finnish. Below we outline some of the steps conversational AI algorithms go through in order to make languages that were once considered impossible to handle, possible.
Finnish words have a staggering number of possible conjugations. For something as simple (at least in English) as the words ‘car’, ‘insurance’ and ‘invoice’ there are any number of alternatives available in Finnish. Here are just a few examples:
In order to parse such a large number of possibilities, conversational AI goes through a process called ‘stemming’, which essentially reduces the conjugated word down to its root form, so the algorithm doesn’t have to be taught each variation. This is illustrated above by the words highlighted in black.
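To make the idea of stemming concrete, here is a minimal sketch in Python. It is a toy suffix-stripping stemmer, not the algorithm used by any particular product: the rule list, the `MIN_STEM_LENGTH` threshold and the `stem` function name are all assumptions made for illustration, and the rules cover only a handful of endings of ‘auto’ (car), ‘vakuutus’ (insurance) and ‘lasku’ (invoice). A production stemmer for Finnish needs far more rules than this.

```python
# Toy suffix rules as (suffix, replacement) pairs, ordered longest first.
# These are illustrative assumptions, not a complete Finnish rule set.
SUFFIX_RULES = [
    ("ksessa", "s"),  # vakuutuksessa -> vakuutus
    ("ksen", "s"),    # vakuutuksen   -> vakuutus
    ("ssa", ""),      # autossa       -> auto
    ("lla", ""),      # autolla       -> auto
    ("lle", ""),      # autolle       -> auto
    ("n", ""),        # auton         -> auto
    ("a", ""),        # autoa         -> auto
]

# Never strip a suffix if fewer than this many characters would remain.
MIN_STEM_LENGTH = 3


def stem(word: str) -> str:
    """Reduce an inflected word to a root form using the first matching rule."""
    word = word.lower()
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LENGTH:
            return word[: -len(suffix)] + replacement
    return word  # no rule matched: treat the word as already being a root


for form in ["auton", "autossa", "laskua", "vakuutuksen"]:
    print(form, "->", stem(form))
```

With rules like these, the algorithm only has to be taught ‘auto’, ‘lasku’ and ‘vakuutus’; the inflected forms are mapped back to them automatically.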
Another common difficulty with Finnish is the language’s high number of compound words. Rather than writing each word individually, it’s often the case that multiple words are joined together to form one longer word. Below we can see just some of the many variations of the compound words for ‘car insurance’ and ‘insurance invoice’:
Compound words are common in many European languages, such as German or Norwegian, but combining them with such a high number of conjugation possibilities makes Finnish especially difficult.
Practically speaking, you would first need to teach the algorithm each individual word with all its conjugations, and then move on to the corresponding compound words with all of their different conjugations. The result is an enormous amount of work just for the algorithm to learn every permutation.
To solve this, conversational AI is able to perform a process called compound splitting. This allows the algorithm to disassemble compound words into their composite parts so that it only needs the base words (in their stemmed form) in order to accurately interpret user input.
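Compound splitting can be sketched in a few lines of Python. The version below is a simple dictionary-based splitter, shown under the assumption that the base words are already known and appear in their stemmed form. The `BASE_WORDS` set and the `split_compound` function are hypothetical names for this example; real systems typically add scoring to choose between competing splits.

```python
# Base vocabulary (assumed for this example): car, insurance, invoice.
BASE_WORDS = {"auto", "vakuutus", "lasku"}


def split_compound(word, vocabulary=BASE_WORDS):
    """Split a compound into known base words.

    Returns the list of parts, or an empty list if the word cannot be
    fully covered by words from the vocabulary.
    """
    word = word.lower()
    if word in vocabulary:
        return [word]
    # Try every prefix that is a known word, then split the remainder.
    for i in range(1, len(word)):
        head = word[:i]
        if head in vocabulary:
            tail = split_compound(word[i:], vocabulary)
            if tail:
                return [head] + tail
    return []


for compound in ["autovakuutus", "vakuutuslasku", "autovakuutuslasku"]:
    print(compound, "->", split_compound(compound))
```

Because the splitter only needs the three base words, no compound has to be taught separately.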
Another important piece of the puzzle is conversational AI’s ability to perform advanced spelling correction. The best solutions can identify and repair mistakes in the spelling of complex compound words, greatly reducing the chances of error.
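As a rough illustration of spelling correction, the sketch below maps a misspelled word to the closest entry in a known vocabulary using `difflib.get_close_matches` from the Python standard library. This is a deliberately simple similarity lookup, not the ‘advanced’ correction described above. The `VOCABULARY` list, the `correct` function and the 0.8 cutoff are assumptions chosen for the example.

```python
import difflib

# Known words (assumed for this example): car, insurance, invoice.
VOCABULARY = ["auto", "vakuutus", "lasku"]


def correct(word, vocabulary=VOCABULARY, cutoff=0.8):
    """Return the closest known word, or the input unchanged if none is close."""
    word = word.lower()
    if word in vocabulary:
        return word
    # get_close_matches ranks candidates by a similarity ratio between 0 and 1
    # and drops anything scoring below the cutoff.
    matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


print(correct("vakutus"))  # a missing letter in 'vakuutus'
print(correct("autto"))    # an extra letter in 'auto'
```

The cutoff is a trade-off: too low and unrelated words get ‘corrected’ into vocabulary entries, too high and genuine typos slip through.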
This combination of stemming, compound splitting and spelling correction is the key to reducing the workload needed to crack Finnish and other similarly complex languages.
Thanks to these processes, we only need to feed the algorithm a total of eight words, instead of over 150, in order to teach it to understand the Finnish for ‘car’, ‘insurance’, ‘invoice’, ‘car insurance’, ‘insurance invoice’ and ‘car insurance invoice’.
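The sketch below shows one way the three steps could be chained so that a handful of base words covers inflected and misspelled compounds. It is an illustration under stated assumptions, not a description of any vendor’s pipeline: the order of the steps (stem, then split, then fall back to spelling correction), the toy suffix rules, and every name in the code (`BASE_WORDS`, `SUFFIX_RULES`, `CANDIDATES`, `stem`, `split_compound`, `normalise`) are hypothetical.

```python
import difflib
from itertools import permutations

BASE_WORDS = ("auto", "vakuutus", "lasku")  # car, insurance, invoice

# Toy (suffix, replacement) rules, longest first; not a full Finnish stemmer.
SUFFIX_RULES = (("ksen", "s"), ("ssa", ""), ("n", ""), ("a", ""))

# Spelling-correction candidates: every ordering of one to three base words.
CANDIDATES = [
    "".join(combo)
    for size in (1, 2, 3)
    for combo in permutations(BASE_WORDS, size)
]


def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)] + replacement
    return word


def split_compound(word):
    """Split a word into base words; return [] if it cannot be covered."""
    if word in BASE_WORDS:
        return [word]
    for i in range(1, len(word)):
        if word[:i] in BASE_WORDS:
            tail = split_compound(word[i:])
            if tail:
                return [word[:i]] + tail
    return []


def normalise(token):
    """Map a raw token to its base words: stem, split, then correct if needed."""
    stemmed = stem(token.lower())
    parts = split_compound(stemmed)
    if parts:
        return parts
    # Nothing matched: assume a typo and retry with the closest candidate.
    matches = difflib.get_close_matches(stemmed, CANDIDATES, n=1, cutoff=0.85)
    return split_compound(matches[0]) if matches else []


for token in ["autovakuutuksen", "vakuutuslaskun", "autovakutuksen"]:
    print(token, "->", normalise(token))
```

Only the three base words are taught explicitly; the inflected compound ‘autovakuutuksen’ and its misspelling ‘autovakutuksen’ both resolve to ‘auto’ and ‘vakuutus’.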
Automatic semantic understanding
Simplifying a language down to this level then allows an Automatic Semantic Understanding (ASU) technology layer to be added on top for an even deeper level of understanding. ASU can supercharge a virtual agent to the point where it can handle multiple intents in the same request, understand highly complex queries that would trip up lesser ‘bots’, and even reduce false positives by up to 90% in some cases.
This is especially useful in a market such as the USA where, although English is the official language, more than 350 other languages are spoken across the country. Conversational AI is inherently multilingual and allows a virtual agent to transition seamlessly between any number of languages within the same chat window. Similarly, conversational AI is especially adept at handling dialect, slang and other colloquialisms, thanks to advanced natural language technologies that accurately interpret human conversational habits.
For humans, learning a new language will always be difficult. But for conversational AI, once you have dealt with these core challenges, even the most complex language can be simplified, becoming no more difficult to decode than English or Norwegian.
Lars Selsås is the CEO and technical powerhouse of Boost.ai. His ground-breaking code is the basis on which the company’s conversational AI is founded. He has a deep understanding of high-performance code, Big Data, deep learning and natural language technology. Lars spent a number of years in Silicon Valley working for some of the world’s...