Speech recognition is famously the next big thing but its time might genuinely be about to come, argues Nick Applegarth, managing director, Nuance UK and EMEA
Soon you will be doing your banking from your car. When not cruising the highways and byways of the land you will be checking your stock options or finding directions to the nearest curry emporium by speaking to a wireless internet service that understands you, even if you have had more than your fair share of lager.
Heard it all before? Of course you have. Speech recognition has been the next big thing, well, four times now over the past eight years. But this time it seems to be about to fulfil its promise. And it is all thanks to the recession. With the cost of processing power on an ever-downward spiral, the brains needed to do proper, industrial-scale speech recognition are more cost-effective than ever before. But it is the need for businesses to offer ever-better customer service, 24 hours a day, seven days a week, in the face of stiff competition that has really made speech recognition less of an expense and more of a money maker.
While voice-activated, wireless, in-car banking portals may be one of the areas where speech recognition can grab some headlines, it is in the traditional call centre that it is going to have its biggest impact. More than 65 per cent of the costs of running a call centre are tied up in labour. To offer a wider, ‘open all hours’ service requires investing in more people and pushing these already high costs higher still – until they are on a par with the financial directors’ blood pressure.
Unfortunately, each of these expensive agents wastes a substantial amount of their valuable working time in answering non-revenue generating FAQs (frequently asked questions). They also get bored doing repetitive and routine tasks and, as a consequence, tend to leave after months of expensive training and investment to do something more interesting, like sit on the settee flicking through daytime TV’s wealth of home improvement programmes – literally, they would rather sit and watch paint dry than carry on with the mundanity of FAQs.
And it is here that speech recognition technology now looks like such a promising proposition to all those that want to improve that other much discussed subject, customer relationship management (CRM), while making life easier and more pleasant for its operators. On a simple level, speech recognition offers many things to most businesses that have extensive dealings with customers. For starters, it can be used to garner information from the caller as they call, aiding the forwarding of that call to the right place – and seeing an end to cumbersome IVR systems that reel off a list of prompts and buttons to press.
Cleverer still, speech recognition technology can be used to make much more effective use of the queues of callers that often form during busy times. A natural language system can be used to ask for and then log the customer’s ‘name, rank and serial number’ while they are waiting and, using computer telephony integration (CTI) – remember that? – this information can be popped to the agent when the call comes through to them.
This means that, one, the queuing customer has more to do than listen to Greensleeves repeatedly until their head implodes and, two, the agent appears to know them and have their details ready in front of them when they do get out of electronic music hell.
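For the technically curious, the queue-and-pop idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CallQueue` and `QueuedCall` names are invented here, the recognised answers are passed in as plain text rather than coming from a real speech engine, and the "screen pop" is just a dictionary rather than a call to any actual CTI product.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class QueuedCall:
    """A caller waiting in the queue (hypothetical structure)."""
    call_id: str
    details: dict = field(default_factory=dict)  # filled in while the caller waits


class CallQueue:
    """First-come, first-served queue that gathers caller details during the wait."""

    def __init__(self):
        self._waiting = deque()

    def join(self, call_id):
        """Add a new caller to the back of the queue."""
        call = QueuedCall(call_id)
        self._waiting.append(call)
        return call

    def record_answer(self, call, field_name, recognised_text):
        """Log one answer the (assumed) recogniser heard from the waiting caller."""
        call.details[field_name] = recognised_text.strip()

    def next_for_agent(self):
        """Hand the oldest call to an agent, with the details for the screen pop."""
        if not self._waiting:
            return None
        call = self._waiting.popleft()
        return {"call_id": call.call_id, **call.details}


queue = CallQueue()
caller = queue.join("call-001")
queue.record_answer(caller, "name", " Jane Smith ")
queue.record_answer(caller, "account", "1234")
screen_pop = queue.next_for_agent()
```

Here `screen_pop` carries the call identifier plus whatever the caller said while waiting, which is what the agent would see as the call arrives.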
Fine and dandy, you may think, but isn’t this all just a glorified IVR system that asks for words rather than the current “press one for…”? Well, in this instance it is. But what really earns speech recognition the investment it now seems to richly deserve from the contact centre community is that it offers these simple efficiency gains alongside whole new ways of generating new business from existing customers, and of offering the level of service that the ever-demanding customer base has, disproportionately to reality, come to expect from everyone from airlines to banks and even estate agents.
From an FAQ and routine transaction point of view, speech recognition is already being used by Lloyds TSB for its Phonebank Express service, which uses the technology to fully automate simple – but frequent – customer transactions. For example, the system allows the customer to make a balance enquiry, pay some bills – funds permitting – transfer some money from savings and so on, all without troubling an expensive agent.
This makes life easier for the customer, as they simply make a call and know they can do what they want. It also helps the agents. They no longer have to be troubled for hours on end handling these routine – and, let’s face it, boring and repetitive – tasks and can, instead, concentrate on selling mortgages and other high value financial products.
And here again, speech recognition can be used to help. Speech recognition systems recognise words. So set up a system that can recognise key words within an ongoing customer interaction and then assess whether there is potential to cross-sell or up-sell other products to that caller, a judgement that can even take the caller’s gender into account.
Take a bank, for instance. With a caller on the phone talking about taking out a mortgage, the system can recognise the word ‘mortgage’ and will go off and look within the company annals for whether or not that customer has a mortgage with the bank already and whether they have insurance and any of the other services that go with it. If not, then the system, from simply recognising and understanding the word ‘mortgage’, can send details of that customer’s status to the agent along with the right sales pitch script for whatever it is the caller doesn’t yet have.
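The mortgage example boils down to three steps: spot the key word, look up what the customer already holds, and prompt the agent with whatever is missing. A minimal Python sketch follows, on the assumption that the recogniser has already turned the caller's speech into text. The `COMPANION_PRODUCTS` table, the `Customer` record and the product names are all made up for illustration and do not describe any bank's actual system.

```python
import re
from dataclasses import dataclass, field

# Hypothetical table: a spoken key word and the products that usually go with it.
COMPANION_PRODUCTS = {
    "mortgage": ["buildings insurance", "life cover"],
}


@dataclass
class Customer:
    """Stand-in for the customer record held in the 'company annals'."""
    name: str
    products: set = field(default_factory=set)


def cross_sell_prompts(transcript, customer):
    """Return agent prompts for products tied to a spotted key word that the caller lacks."""
    # Split the transcript into lower-case words; filler such as "um" is simply ignored.
    words = set(re.findall(r"[a-z']+", transcript.lower()))
    prompts = []
    for keyword, companions in COMPANION_PRODUCTS.items():
        if keyword not in words:
            continue
        for product in [keyword] + companions:
            if product not in customer.products:
                prompts.append(f"Offer {product} to {customer.name}")
    return prompts


caller = Customer("Ms Patel", {"mortgage"})
prompts = cross_sell_prompts("Um, right, I'm calling about my mortgage", caller)
```

In this run the caller already has a mortgage, so `prompts` lists only the two companion products. A real deployment would need far more than word matching, for reasons the rest of this article goes into.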
This is essential within the ever more competitive world of personal banking, as the only way banks can grow when they offer identical products to their competitors is to gain a bigger share of the customer’s wallet – and that means offering the best service and making the customer feel loved and wanted by their new warm and fuzzy ‘personal banker’.
This is the heart of CRM and speech recognition has a key part to play in that. Beyond that, the marketeers have, as ever, been thinking big and are looking at the whole idea of voice branding.
Your company not only has to have a massively overpriced logo – usually circular and red these days – but also has to have a ‘voice’. So the ads on TV have the voice and, when you call up the company, the speech recognition system that talks to you does so using ‘the voice’. AOL sort of do it with Connie, the tall be-wigged woman on the TV ads, who welcomes you to AOL when you dial up on your computer – and this is only the beginning.
But it isn’t all as simple as that. Speech recognition – recognising key words or phrases buried within modern Britain’s erratic syntax and wealth of “ums”, “errs”, “rights” and “d’yaknowwhatImeans” – is one thing, but actually getting the meaning from what is said – called speech acquisition – is very hard.
To make the most of speech systems, the customer has to feel as if they are talking to a person. But achieving this means putting in place the levels of speech recognition that we as people have developed. In fact, the average human’s speech acquisition accuracy is lower than that of some machines; we have just developed, over a million or so years, much quicker and better error recovery. In order to deliver on the promise that speech recognition offers, this has to be allowed for.
Alternatively, the customers themselves have to be conditioned to know that they are talking to a machine. Many among speech and natural language recognition’s growing cadre of fans believe that the first step in getting the technology accepted is, in fact, to not ape the human: what users want is to know they are talking to a machine, but a machine that will respond to natural commands and requests. And that is a big challenge – d’yaknowwhatImean?