The three secrets to successful sentiment analysis
Seth Grimes explores the ingredients of full-circle sentiment analysis - and isolates the three factors that are core to successful sentiment analysis.
Sentiment analysis is not new, just newly visible, generating excitement given the role solutions can play in making sense of social (and conventional) media and a range of other customer, consumer, and business information sources. Online, we’re inundated with survey and review requests; we’re encouraged to submit and select on subjective star ratings for hotels, restaurants, consumer goods; news searches are 'enhanced' with trend lines and with neat, simple, green-red-grey sentiment classifications.
It should go without saying - but unfortunately doesn’t - that 'your mileage may vary'. Eye-candy sentiment graphics often do little more than classify and count keywords. They pay no regard to context, meaning, or other complications. They embody H.L. Mencken’s observation, "For every complex problem there is an answer that is clear, simple, and wrong." Sentiment analysis done right certainly is a complex problem. So what are the ingredients of compleat, full-circle sentiment analysis?
A full set of sources and goals
The obvious first ingredient is a full set of relevant sources, appropriate to the job at hand. If your goal is brand/reputation management, for instance, you need to monitor online and social media where your company’s products and services are being discussed. But is monitoring enough? What about the many forms of enterprise feedback - survey responses, email, contact-center interactions - that provide the very earliest issue-detection opportunities, where you can catch and shift opinions before they explode onto social media? Yet so many of the solutions out there - software, platforms, and services - are one-dimensional, focusing solely on public attitudes rather than also on (still-)private interactions. One axiom of full-circle sentiment analysis is the ability to use all relevant sentiment sources.
You need a clear picture of your sentiment analysis goals. Full-circle sentiment analysis offers great ROI, measured in terms of customer satisfaction, issue resolution, quality improvement, better marketing, and a variety of other enterprise goals that go beyond profitability. Important choices - what information you collect, how you transform and analyse it, and what you do with findings - are tied to, and justified by, business goals.
People have opinions on every topic under the sun, often expressed in ways that are difficult for software to decode. Analysis techniques are obviously key. To centre on our target, let’s start with a definition of sentiment analysis that was put forward by Prof. Jan Wiebe of the University of Pittsburgh and colleagues in 2005: "Sentiment analysis is the task of identifying positive and negative opinions, emotions, and evaluations." (Wiebe and colleagues recorded over 8,000 subjectivity clues - words and usage that indicate the presence of sentiment in natural language - which underscores the complexity of the problem.)
I see three core ingredients in sentiment analysis done correctly:
- First is the ability to assess sentiment at the entity, concept, or topic level. 'Pulse' applications that count blogosphere mentions can be interesting, but they don’t report attitudes. When they do assign sentiment polarities - positive, negative, neutral - it’s typically at the document level: by web page, news story, or tweet. Such blunt instruments fall short when you’re trying, for instance, to separate out guests’ feelings about hotel-room cleanliness from their feedback about room-service speed.
- A second technical ingredient is the ability to see beyond keywords. You need linguistics to deal with context and meaning, to properly assess sentiment. It’s easy to show why. I run a simple test when I come across search tools that claim sentiment capabilities. A search on 'Ted Kennedy' (further back, I’d use 'Farrah Fawcett' or 'Michael Jackson') turns up items containing 'sad news', negatively classified, even though the poster is actually relating a positive reflection about Kennedy. Software with natural-language abilities can grasp context, for instance when the word 'sad' is used in connection with a death, and it can deal with other language complexities such as obvious ('not') and subtle ('still') elements in written text. Further, it can deal with matters such as anaphora, for instance, the pronoun 'he' in the news text "Mr. Geithner said there has been a 'dramatic improvement in confidence'… He said large businesses are now able to borrow again."
- And a third technical need is the ability to distinguish opinion holder and object, to link sentiment to deeper opinion-holder information, and to explain.
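The first two ingredients in the list above can be made concrete with a toy sketch. It is not how any particular product works: the word lists, the aspect cues, and the function names are all invented for illustration, and the "linguistics" is reduced to clause splitting plus a one-word negation check. It only shows why a document-level keyword count and an aspect-level, negation-aware score can disagree on the same hotel review.

```python
import re

# Tiny, hypothetical lexicons for illustration only; real systems rely on
# far larger resources and genuine linguistic analysis.
POSITIVE = {"clean", "spotless", "fast", "friendly"}
NEGATIVE = {"dirty", "slow", "rude"}
NEGATORS = {"not", "never", "no"}
ASPECTS = {
    "cleanliness": {"bathroom", "sheets", "housekeeping"},
    "room service": {"service", "breakfast"},
}


def tokenize(text):
    """Lower-case the text and return its word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def keyword_score(text):
    """Document-level keyword counting: ignores context and negation."""
    tokens = tokenize(text)
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def clause_score(tokens):
    """Score one clause, flipping polarity when a negator directly
    precedes a sentiment word."""
    score = 0
    for i, tok in enumerate(tokens):
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score


def aspect_scores(text):
    """Split the text into clauses and credit each clause's score to
    every aspect whose cue words appear in that clause."""
    scores = {}
    for clause in re.split(r"[.;,!?]|\bbut\b", text.lower()):
        tokens = tokenize(clause)
        for aspect, cues in ASPECTS.items():
            if cues & set(tokens):
                scores[aspect] = scores.get(aspect, 0) + clause_score(tokens)
    return scores


review = "The bathroom was spotless, but room service was not fast."
print(keyword_score(review))
print(aspect_scores(review))
```

On this review the keyword count comes out positive, because it sees 'spotless' and 'fast' and nothing else, while the aspect-level pass reports cleanliness as positive and room service as negative.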
Part of moving beyond aggregate, 'pulse' statistics is knowing Who said What. In many cases, for instance travel and restaurant review sites, the opinion holder is obviously the person who posted. By contrast, blogs and articles often cite others’ opinions, as do threaded conversations on forums and email lists.
The ability to understand the identity of the opinion holder is important if you hope to act on what you’ve learned. Consider scenarios from customer service and marketing: Can we link a complaint back to particular transactions in order to address the problems that were identified? Is a forum-poster a high-value customer or instead a chronic complainer who’s better ignored? How influential is a certain blogger or speaker? How does sentiment link to sales for different market segments?
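As a rough illustration of linking sentiment to opinion-holder information, the sketch below joins extracted opinions to a customer lookup. Everything in it is an assumption made for the example: the `Opinion` record, the `CUSTOMERS` table and its field names, and the follow-up rule are hypothetical, and the hard part (working out who the opinion holder actually is) is taken as already done.

```python
from dataclasses import dataclass


@dataclass
class Opinion:
    holder: str    # who holds the opinion (not necessarily the poster)
    target: str    # what the opinion is about
    polarity: int  # -1 negative, 0 neutral, +1 positive
    source: str    # where the opinion was found


# Hypothetical customer records keyed by opinion holder.
CUSTOMERS = {
    "alice": {"segment": "high-value"},
    "bob": {"segment": "occasional"},
}


def enrich(opinion, customers):
    """Attach customer context to an opinion when the holder is known."""
    profile = customers.get(opinion.holder)
    return {
        "holder": opinion.holder,
        "target": opinion.target,
        "polarity": opinion.polarity,
        "segment": profile["segment"] if profile else "unknown",
    }


def needs_follow_up(record):
    """Illustrative rule: flag negative opinions from high-value customers."""
    return record["polarity"] < 0 and record["segment"] == "high-value"


opinions = [
    Opinion("alice", "room service", -1, "forum"),
    Opinion("bob", "room service", -1, "review site"),
    Opinion("carol", "cleanliness", -1, "blog quote"),
]

records = [enrich(o, CUSTOMERS) for o in opinions]
follow_ups = [r["holder"] for r in records if needs_follow_up(r)]
print(follow_ups)
```

The same three negative opinions lead to different actions once the holder is known: only the one tied to a high-value customer is flagged here, and the holder with no matching record is kept as "unknown" rather than dropped.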
And sentiment analysis done right provides explanatory power - root-cause analysis - the possibility of reaching beyond sentiment expressions to the customer, reviewer, or influencer interactions, incidents, and experiences that prompted them.
Usability and usefulness together form the last sentiment-done-right ingredient I’ll offer. It’s an ingredient that’s implicit in customer-support and quality scenarios, the difference between knowing about problems and opportunities and acting on them. In practical terms, usability-usefulness is achieved via integration with line-of-business and larger analytical information systems. Usability and usefulness are further boosted when sentiment analysis is built into business solutions, typically into marketing automation, contact-center, research and investigation, and business intelligence applications.
This article has sketched out ingredients of sentiment analysis done right. You can get a lot of value - for customer satisfaction, quality, brand and reputation management, media analysis - from solutions that are less than comprehensive, but if you’re at least aware of what you’re missing, you’ll be in good shape to build out to a full-circle sentiment analysis solution.
Seth Grimes has worked for over 20 years in data-systems architecture and design, specialising in systems for analysis and management of social, demographic, economic, and marketing statistics and related use of the Internet. He is the founder, and a principal, of Alta Plana Corporation. For more on sentiment analysis, check out a new conference Seth is organising, the Sentiment Analysis Symposium, April 13 in New York.
Seth Grimes is the leading industry analyst covering natural language processing (NLP), text analytics, and sentiment analysis technologies and their business applications. He founded Washington DC based Alta Plana Corporation, an information technology strategy consultancy, in...