How to choose the best speech analytics solution for your business

17th Feb 2014

For many years, speech analytics was very much the domain of the early adopters and the mavericks. But recently it has become increasingly popular with the mainstream.

This is a double-edged sword. On the one hand it represents an opportunity for the majority of organisations to reap the benefits of this sophisticated technology and drive more value from their contact centres.

But on the other hand, because the mainstream is by its very nature less technically sophisticated and less adventurous than the early adopters, these organisations are more likely to find the path to speech analytics adoption a painful one.

“In Forrester’s Tech Radar, which is our maturity model, we have moved speech analytics from our niche category to not quite the broad mainstream, but certainly more of the middle of the road, regular adoption phase,” says Art Schoeller, principal analyst at Forrester. “In the last year and a half, more of the mainstream-type organisations have been adopting speech analytics. And they want more stability, and more assurances.”

This could prove to be a problem because, despite the uptick in adoption, technology selection in speech analytics remains complicated, with a variety of different technical approaches available to businesses. With no single unifying architecture, the most appropriate approach for your business depends on your specific environment and needs.

But to further complicate things, this is not necessarily the message that vendors will communicate to those in the market for a solution, with partisan providers keen to impress upon buyers that their approach is superior, regardless of the requirements.

In light of this, this article identifies the key factors you should bear in mind when selecting a speech analytics system, so that you can weigh up which technology is the most appropriate for your particular circumstances.

Phonetics vs transcription

Speech analytics solutions come in two flavours – phonetics and transcription. It is critical that you are clear about which will be most appropriate for your business before you begin the process of vendor selection.

“You really have to spend as much time as possible researching and reading as many white papers as you can to understand the differences between phonetics and transcription,” advises Jim Davies, research director at Gartner. “Once you understand that, your choice will depend on the repercussions for your business associated with differences in the speed at which you can process the audio versus the speed at which you can then search that processed audio, because there is a fundamental difference between phonetics and transcription.”

He continues: “Phonetics is pretty fast at converting the audio into the searchable format but not as fast at doing the search. Transcription, meanwhile, is much slower at converting the audio into the searchable format, but then very fast once you’ve done that.”

The choice between phonetics and transcription therefore has major implications, and the right answer depends on how much audio your organisation needs to process and how often it needs to search it.

“Normally the bigger the company, and thus the number of calls they need to convert, the more that points to phonetics, because you have got a conversion challenge if you are trying to do it by transcription,” says Davies. “But this balancing act between the speed of processing the audio versus the speed at which you then search the subsequent data you generated is a big area to look at and you should get the vendor to demonstrate this when they’re doing their on-site look and feel.”

Precision vs recall

Another choice has to be made between the precision and recall of various speech analytics solutions, because this too varies according to the particular technology.

Precision refers to the accuracy of the results that are returned. For instance, let’s say that over the course of 1,000 calls there were 50 calls that mentioned a competitor. If you search for mentions of the competitor and the tool returns 45 calls, all of which do indeed mention the competitor, then it has great precision: if it says there is a mention, there is one. Recall, meanwhile, refers to the detection rate: how many of the genuine instances the tool finds. So with the same example, some systems might find 45 out of 50 while others might only find 35 out of 50.

While you therefore would ideally choose a solution that has high recall and high precision, reality dictates that you’ll have to prioritise one over the other according to your own business needs.

“What you tend to find is you get a mix,” says Davies. “While some technologies are very good at precision/accuracy, they have a lower detection rate – so maybe only uncovers 20 of the 50 calls, but all 20 are where a competitor is mentioned. Whereas other systems might report that 80 calls had a competitor mentioned and while it has got nearly all of the 50, it has also wrongly classified another 35 as being linked to that competitor. So in that case you get more hits but also more negatives as well.”
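The figures Davies quotes can be turned into precision and recall scores. The following is a minimal Python sketch of that arithmetic; the function name and the labels "cautious" and "eager" are illustrative assumptions, and the count of 45 correct hits for the second system is inferred from 80 reported calls minus the 35 wrongly classified ones.

```python
def precision_recall(returned, correct, relevant):
    """Return (precision, recall) for one search.

    returned: calls the tool flagged
    correct:  flagged calls that really mention the competitor
    relevant: all calls that really mention the competitor
    """
    precision = correct / returned if returned else 0.0
    recall = correct / relevant if relevant else 0.0
    return precision, recall


RELEVANT = 50  # calls that genuinely mention the competitor

# High-precision system: 20 calls flagged, all 20 correct.
cautious = precision_recall(returned=20, correct=20, relevant=RELEVANT)

# High-recall system: 80 calls flagged, 35 of them wrong, so 45 correct.
eager = precision_recall(returned=80, correct=80 - 35, relevant=RELEVANT)

print(f"cautious: precision={cautious[0]:.2f}, recall={cautious[1]:.2f}")
print(f"eager:    precision={eager[0]:.2f}, recall={eager[1]:.2f}")
```

On these numbers the first system scores 1.00 for precision but only 0.40 for recall, while the second scores roughly 0.56 for precision and 0.90 for recall.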

That balance between precision and recall is critical for a lot of businesses.

Davies continues: “If you’re using it for compliance, for instance, you really need to ensure you find all of the instances where a non-compliance issue arose, and it doesn’t really matter if you get a few negatives or false positives into the mix, you can just filter those out manually. Therefore, systems with a high recall rate are better.

“Whereas if you’re using it to automate a process – such as generating leads in the sales department – in that situation you don’t really want to keep generating leads when it is inappropriate. It is much better to make sure that when you do generate a lead, it is an appropriate lead, and just be aware that you are missing a few leads that could have been generated. That’s where the different technologies, precision vs recall, aligns with the use case and some solutions are better than others for the use cases.”


The language that is being used by your business is another factor that will influence which type of solution you should choose, particularly in terms of vocabulary changes.

“If you’re in retail, for instance, and you’re using new product names all the time, it can be a little more difficult for transcription to take this into account,” notes Davies. “With phonetics, in terms of new product name, you can just type in a new product name and it finds it. With transcription you have to add that name to a library and then it can go away and find it once it’s in the library dictionary. But sometimes you can’t do that yourselves, you need to get professional services in. So the more that vocabulary is changing over time, the more that favours one technique over the other.”

Unexpected events

Another influential factor to consider is how likely it is that an unknown event could impact your contact centre. With phonetics you only find the things you have told the system to look for, whereas transcription reports back what it finds.

“Transcription will report back that, for instance, one of your products was the most mentioned term of last week, so that there is obviously something going on that you need to investigate,” says Davies. “It can do that analysis for you because everything is converted into text and it can mine. So if your contact centre thinks there is a high chance of unknown things happening that they are not aware of, then that favours the transcription approach over phonetic.”
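As a rough illustration of why text makes this kind of discovery possible, the sketch below counts the most frequent terms across a handful of transcripts. It is not how any particular vendor's product works; the transcripts, the product name "acme" and the stop-word list are all made up for the example.

```python
from collections import Counter
import re

# Hypothetical transcripts; a real system would produce these from call audio.
transcripts = [
    "my acme router keeps dropping the connection",
    "the acme router will not switch on",
    "i want to change my billing date",
]

# Words too common to be informative (illustrative list, not exhaustive).
STOP_WORDS = {"my", "the", "i", "to", "will", "not", "on", "want", "keeps"}


def top_terms(texts, n=3):
    """Return the n most frequent non-stop-word terms as (term, count) pairs."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)


print(top_terms(transcripts, n=2))
```

Here "acme" and "router" each appear twice and come out on top, even though nobody told the code to look for them. A phonetic search, by contrast, would only have found them if they had been entered as search terms.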

Also bear in mind…

Other factors that you should keep in mind when choosing a speech analytics solution, according to Omer Minkara, senior research analyst in the customer management technology practice at the Aberdeen Group, include:

  • Who is the owner of speech analytics activities? Do you want to manage speech analytics in house, or do you want a third party to manage it as a service for your business? 
  • If you are doing it in-house, what is the preferred deployment model? Do you want to do your speech analytics on-premise, hosted – that is private or public in the Cloud – or hybrid, which is a mix of on-premise and hosted speech analytics? 
  • What are the sources of speech analytics you’ll be using? Do you want to capture feedback or speech data just from phone conversations or IVR interactions, or do you want to tap into both sources?
  • Do you want to capture speech data on a real-time basis or post-call?
  • What are the benefits you’re looking to get out of speech analytics? Are you looking to use it to manage the customer experience? Or are you looking to use it for quality assurance, planning or managing agent performance? It could be a number of different factors that you want it for, but you need to understand what your ultimate objectives are going to be.

Do an RFP

Donna Fluss, president of DMG Consulting, adds that organisations should also ensure that they conduct a request for proposal (RFP).

“If you’ve never used speech analytics before, do an RFP and learn what’s out there. Learn from analysts and vendors about what the solutions are capable of,” she explains. “Put your heads together and think about all the various ways in which you want to use the application, then do an RFP, and go into detail – ask for what it is you’re looking for. During the selection process, ensure you speak to references. And during the selection process, find out what kind of resources you need to allocate to the implementation and then to the ongoing utilisation and application of the solution. Don’t make your decision solely on the price of the application.”
