What is sentiment analysis?

Sentiment analysis defined

Sentiment analysis applies NLP, computational linguistics, and machine learning to identify the emotional tone of digital text. This allows organizations to identify positive, neutral, or negative sentiment towards their brand, products, services, or ideas. Ultimately, it gives businesses actionable insights by enabling them to better understand their customers.

As an example of sentiment analysis, a streaming platform can identify how popular a series is through the text analysis of social media posts. In this case, sentiment analysis enables the streamer to understand whether the public feels positive, neutral, or negative about the content. The results of a sentiment analysis provide the platform with actionable insight: cancel the series, renew the series, or make different cast and/or creative hiring decisions.

Sentiment analysis vs. natural language processing (NLP)
Sentiment analysis is a subcategory of natural language processing, meaning it is just one of the many tasks that NLP performs. Natural language processing gives computers the ability to understand human written or spoken language. NLP tasks include named entity recognition, question answering, text summarization, language identification, and natural language generation.

Sentiment analysis vs. machine learning (ML)
Sentiment analysis uses machine learning to perform the analysis of any given text. Machine learning uses algorithms that “learn” when they are fed training data. Because it relies on machine learning, sentiment analysis keeps getting better at interpreting the language it analyzes as its models are trained on more data.

Sentiment analysis vs. artificial intelligence (AI)
Sentiment analysis is not to be confused with artificial intelligence. AI refers more broadly to the capacity of a machine to mimic human learning and problem-solving abilities. Machine learning is a subset of AI, so machine learning sentiment analysis is also a subset of AI. While all three are connected, they are not the same.

Sentiment analysis vs. data mining
Sentiment analysis is a form of data mining that specifically mines text data for analysis. Data mining simply refers to the process of extracting and analyzing large datasets to discover various types of information and patterns.

Types of sentiment analysis

There are several different types of sentiment analysis, whether it is performed as a rule-based, machine learning, or hybrid analysis. These include:

  • Fine-grained analysis
  • Aspect-based analysis
  • Emotion detection
  • Intent-based analysis

Fine-grained sentiment analysis, or graded sentiment analysis, allows a business to study customer ratings in reviews. Fine-grained analysis also refines the polarities into very positive, positive, neutral, negative, and very negative categories. So, for example, a 1-star review will be considered very negative, a 3-star review neutral, and a 5-star review very positive.
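
The star-to-category mapping described above can be sketched in a few lines of Python. This is only an illustration: the labels for 2-star and 4-star reviews are an assumption (the text names only 1, 3, and 5 stars), and `star_to_sentiment` is a hypothetical helper name.

```python
# Map a 1-5 star rating to a fine-grained sentiment label.
# The 2-star and 4-star labels are assumed; the surrounding text
# only spells out 1, 3, and 5 stars.
STAR_LABELS = {
    1: "very negative",
    2: "negative",
    3: "neutral",
    4: "positive",
    5: "very positive",
}


def star_to_sentiment(stars):
    """Return the fine-grained label for a whole-star rating."""
    if stars not in STAR_LABELS:
        raise ValueError(f"expected a rating from 1 to 5, got {stars!r}")
    return STAR_LABELS[stars]


for stars in (1, 3, 5):
    print(stars, "->", star_to_sentiment(stars))
```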

Aspect-based sentiment analysis, or ABSA, focuses on the sentiment towards a single aspect of a service or product. For example, a tech company launches a new set of wireless headphones. Some aspects for consideration might be connectivity, aesthetic design, and quality of sound. By classifying text against a chosen set of aspects, aspect-based sentiment analysis allows a business to capture how customers feel about a specific part of their product or service. “These new ears are sexy” would indicate sentiment towards the headphones’ aesthetic design. “I like the look of these, but volume control is an issue” might alert a business to a practical design flaw.
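
As a rough sketch of the idea, the snippet below scores each clause of a review separately and attaches the result to whichever aspect the clause mentions. The keyword lists, the word banks, the extra "controls" aspect, and the function name `aspect_sentiments` are all assumptions made for illustration; a production system would typically use a trained model rather than keyword matching.

```python
import re

# Hypothetical keyword lists; a real system would learn or curate these.
ASPECT_KEYWORDS = {
    "aesthetic design": {"look", "looks", "design", "sexy"},
    "connectivity": {"bluetooth", "pairing", "connection"},
    "sound quality": {"sound", "bass", "audio"},
    "controls": {"volume", "button", "buttons"},
}
POSITIVE = {"like", "love", "great", "sexy"}
NEGATIVE = {"issue", "problem", "bad", "poor"}


def aspect_sentiments(review):
    """Label each aspect mentioned in a review, clause by clause."""
    results = {}
    # Split on punctuation and the word "but" so each clause is scored alone.
    clauses = re.split(r"[,.;]|\bbut\b", review.lower())
    for clause in clauses:
        words = set(re.findall(r"[a-z']+", clause))
        score = len(words & POSITIVE) - len(words & NEGATIVE)
        if score > 0:
            label = "positive"
        elif score < 0:
            label = "negative"
        else:
            label = "neutral"
        for aspect, keywords in ASPECT_KEYWORDS.items():
            if words & keywords:
                results[aspect] = label
    return results


print(aspect_sentiments("I like the look of these, but volume control is an issue"))
```

Splitting on "but" is what lets the two halves of the second example receive different labels instead of cancelling each other out.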

Emotion detection sentiment analysis goes beyond polarity detection to identify customer feelings such as happiness, sadness, or anger. This type of analysis can use lexicons to evaluate subjective language. Words like awful and disgraceful suggest anger. Miserable and devastating can signal sadness. Exciting or super can suggest happiness. Of course, lexicons don’t account for context, and people express their emotions in different ways. Consider this example:

Words like “stuck” and “frustrating” signify a negative emotion, whereas “generous” is positive. This sentiment is nuanced and the emotion difficult to classify.
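
A lexicon lookup of the kind described above might look like the following sketch. The lexicon holds only the six example words from the text, and `detect_emotions` is a hypothetical name; real emotion lexicons are far larger.

```python
import re
from collections import Counter

# Tiny lexicon built from the example words in the text.
EMOTION_LEXICON = {
    "awful": "anger",
    "disgraceful": "anger",
    "miserable": "sadness",
    "devastating": "sadness",
    "exciting": "happiness",
    "super": "happiness",
}


def detect_emotions(text):
    """Count how many lexicon words for each emotion appear in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(EMOTION_LEXICON[w] for w in words if w in EMOTION_LEXICON)


print(detect_emotions("The service was awful and the delay was devastating"))
```

When a text triggers more than one emotion, as this one does, the counts alone do not say which emotion dominates. That is the context problem a plain lexicon cannot solve.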

Intent-based sentiment analysis allows a business to identify customer intent and interest levels. Different types of intent include purchase, upgrade, downgrade, cancel, or unsubscribe. Intent-based analysis requires classification training with relevant text, such as customer emails or queries. For example, “I’ve run out of storage space, what do I do?” could be classified as an upgrade opportunity. The intent in “I don’t like the samples I’m receiving, I don’t need more eyeliners” could be classified as cancel, but also alerts the business to a service improvement opportunity. This type of analysis helps businesses manage and maintain their customer base and to maximize sales opportunities.

How to apply sentiment analysis

To complete sentiment analysis, you need to:

  1. Preprocess your text, including tokenizing sentences, lemmatizing to root form, and removing stop words.
  2. Extract features, which can include converting the lemmatized tokens to a numeric representation or generating embeddings.
  3. Apply the sentiment classifier to your data.
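
The first two steps can be sketched with the Python standard library alone. Everything here is simplified on purpose: the stop-word list is tiny, `normalize` strips suffixes (closer to stemming than to true lemmatization), and all function names are hypothetical. Step 3 is left out because it depends on the classifier you choose.

```python
import re
from collections import Counter

# Deliberately tiny stop-word list and suffix rules, for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "were",
              "and", "or", "of", "to", "it", "this"}
SUFFIXES = ("ing", "ed", "s")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def normalize(token):
    """Strip one common suffix; a crude stand-in for real lemmatization."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Step 1: tokenize, drop stop words, reduce tokens to a root form."""
    return [normalize(t) for t in tokenize(text) if t not in STOP_WORDS]


def bag_of_words(tokens, vocabulary):
    """Step 2: turn tokens into a count vector ordered by the vocabulary."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


tokens = preprocess("The screens are cracked and the battery is failing")
vocabulary = sorted(set(tokens))
print(tokens)
print(bag_of_words(tokens, vocabulary))
```

In practice, a library such as spaCy or NLTK would replace the hand-rolled tokenizer and suffix stripper, and embeddings could replace the count vector.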

Sentiment analysis can be approached in three ways:

  • Rule-based
  • Machine-learning
  • Hybrid

Rule-based sentiment analysis uses manually written algorithms, or rules, to evaluate language. These rules use computational linguistics methods like tokenization, lemmatization, stemming, and part-of-speech tagging. They might also use lexicons (word banks).

This type of analysis will parse out specific words in sentences and evaluate their polarity and subjectivity to determine sentiment and intent. Once a polarity (positive, negative) is assigned to a word, a rule-based approach will count how many positive or negative words appear in a given text to determine its overall sentiment.
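
The counting step can be illustrated with a minimal sketch. The two word banks and the name `rule_based_sentiment` are assumptions; real lexicons contain thousands of scored entries.

```python
import re

# Hypothetical word banks; real lexicons are far larger.
POSITIVE_WORDS = {"good", "great", "love", "excellent", "happy"}
NEGATIVE_WORDS = {"bad", "poor", "hate", "terrible", "slow"}


def rule_based_sentiment(text):
    """Compare the number of positive and negative words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(w in POSITIVE_WORDS for w in words)
    negatives = sum(w in NEGATIVE_WORDS for w in words)
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "neutral"


print(rule_based_sentiment("I love the great screen"))
print(rule_based_sentiment("Terrible battery and slow shipping"))
print(rule_based_sentiment("This is not good"))
```

The last call returns "positive" because the counter sees "good" and ignores "not". Handling negation would require an additional rule.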

The obvious disadvantage is that this type of system requires significant effort to create all the rules. Plus, these rules don’t take into consideration how words are used in a sentence (their context). New rules can be written to handle more complex language, but each one makes the overall system more complicated and harder to maintain. Keeping this approach accurate also requires regular evaluation and fine-tuning.

Machine learning sentiment analysis replaces hand-written rules with a model that learns from data. The ML sentiment analysis tool must be fed labeled training data so that it can learn which words correspond to which polarities. Common examples of training data are movie reviews, Amazon product reviews, or Yelp-rated places of business. Hugging Face, an AI community, provides open source libraries, datasets, and models that can help with building and training sentiment analysis tools.

Once the machine learning sentiment analysis training is complete, the process boils down to feature extraction and classification. To produce results, a machine learning sentiment analysis method will rely on different classification algorithms, such as deep learning, Naïve Bayes, logistic regression, or support vector machines.
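
To make the train-then-classify flow concrete, here is a minimal Naïve Bayes classifier written from scratch. It is a sketch under stated assumptions: the six training sentences are made up and far too few for real use, the class name `NaiveBayesSentiment` is hypothetical, and a real project would more likely use an existing library implementation.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Feature extraction: lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSentiment:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)      # label -> number of texts
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        # Words never seen in training carry no information, so skip them.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        total_texts = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, text_count in self.label_counts.items():
            # log P(label) + sum of log P(word | label)
            score = math.log(text_count / total_texts)
            total_words = sum(self.word_counts[label].values())
            denominator = total_words + len(self.vocabulary)
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training data, for illustration only.
train_texts = [
    "I love this movie",
    "What a great film",
    "Fantastic acting and a great story",
    "I hate this movie",
    "What a terrible film",
    "Boring story and awful acting",
]
train_labels = ["positive"] * 3 + ["negative"] * 3

model = NaiveBayesSentiment().fit(train_texts, train_labels)
print(model.predict("a great story"))
print(model.predict("awful and boring"))
```

Working in log space avoids multiplying many small probabilities together, and the add-one smoothing keeps a word that appears under only one label from forcing the other label's probability to zero.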

Hybrid sentiment analysis combines rule-based and machine-learning sentiment analysis methods. When tuned to a company or user’s specific needs, it can be the most accurate tool. It is especially useful when the sentiments are more subtle, such as business-to-business (B2B) communication where negative emotions are expressed in a more professional way.

Use cases for sentiment analysis

Sentiment analysis provides a business actionable insights by identifying:

  • the polarity of the language used (is it positive, neutral, or negative?)
  • the emotional tone of the consumer's response (are they angry, happy, or sad?)
  • whether the tone is urgent or not
  • what the consumer's intention or level of interest is

As automated opinion mining, sentiment analysis can serve multiple business purposes.

Reviews
Using a sentiment analysis tool, a business can collect and analyze comments, reviews, and mentions from social platforms, blog posts, and various discussion or review forums. This is invaluable information that allows a business to evaluate its brand’s perception.

Discovering positive sentiment can help direct what a company should continue doing, while negative sentiment can help identify what a company should stop and start doing. In this use case, sentiment analysis is a useful tool for marketing and branding teams. Based on analysis insights, they can adjust their strategy to maintain and improve brand perception and reputation.

Social media monitoring
Customer feedback on products or services can appear in a variety of places on the Internet. Manually and individually collecting and analyzing these comments is inefficient.

A sentiment analysis tool can detect mentions as they appear and alert customer service teams immediately. This allows companies to keep track of customer attitudes and, in turn, to manage their customer experience more effectively. As an extension of brand perception monitoring, sentiment analysis can also be an invaluable crisis-prevention tool: teams can closely monitor software upgrades and new launches for problems and reduce response time if anything goes wrong.

Market trends
Sentiment analysis is a useful tool when performing market research because it allows organizations to conduct a broad review of entire markets, niches, and specific products and services, drawing insights from attitudes to better evaluate customer needs and expectations.

Common challenges in sentiment analysis

Language is a complex, imperfect, and ever-evolving human communication tool. Because sentiment analysis relies on language interpretation, it is inherently challenging.

Business-to-business reviews
Understanding competitor reviews is a sentiment analysis challenge. If a company sets a rule to identify certain language describing sentiment towards its business as positive, the same language used to describe a competitor will also be considered positive. For example:

I love how quickly [your company] ships their product.
I love that I can set my shipping window with [your competitor].

Both of these statements are positive, but the sentiment analysis tool won't make the distinction between a company and its competitors unless it's trained to recognize anything positive concerning competitors as negative.
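
One simple way to approximate that behavior is a post-processing step that re-maps the classifier's label whenever a competitor is named. The sketch below assumes the text has already been labeled by some sentiment classifier; the competitor name "rivalco" and the function `business_polarity` are hypothetical.

```python
import re

# Hypothetical name standing in for [your competitor].
COMPETITORS = {"rivalco"}


def business_polarity(text, text_polarity):
    """Re-map a text's polarity to the business's point of view.

    `text_polarity` is the label an ordinary sentiment classifier
    produced for the text. Praise for a competitor is treated as
    negative for the business; every other label is left unchanged.
    """
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    if words & COMPETITORS and text_polarity == "positive":
        return "negative"
    return text_polarity


print(business_polarity("I love how quickly OurCo ships their product.", "positive"))
print(business_polarity("I love that I can set my shipping window with RivalCo.", "positive"))
```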

Irony, sarcasm, and context
The challenge of detecting and understanding in-person irony and sarcasm also extends to sentiment analysis. Sarcasm uses positive words to describe negative feelings, and the issue is that there are often no textual clues for a machine to distinguish earnestness from sarcasm or irony. For example, in response to "Do you like pulp in your orange juice?", "Omg, you bet" could be understood as either positive if the author were sincere, or negative if the author were being sarcastic.

Context can also skew sentiment. Consider these two responses:

"Only a little bit."
"A lot!"

If the comments are in response to a question like "How likely are you to recommend this product?", the first response is considered negative, while the second is positive. However, if the prompt is "How much did the price adjustment bother you?", the polarities are reversed.
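
The reversal can be shown with a toy lookup in which the prompt decides whether a strong response is good or bad news. The hand-assigned direction and intensity values, and the name `contextual_polarity`, are assumptions for illustration; a real system would have to infer both from the text.

```python
# Whether a strong answer ("A lot!") is good or bad news depends on the
# question. Both tables are filled in by hand for this illustration.
PROMPT_DIRECTION = {
    "How likely are you to recommend this product?": +1,
    "How much did the price adjustment bother you?": -1,
}
RESPONSE_INTENSITY = {
    "Only a little bit.": -1,
    "A lot!": +1,
}


def contextual_polarity(prompt, response):
    """Combine the prompt's direction with the response's intensity."""
    score = PROMPT_DIRECTION[prompt] * RESPONSE_INTENSITY[response]
    return "positive" if score > 0 else "negative"


for prompt in PROMPT_DIRECTION:
    for response in RESPONSE_INTENSITY:
        print(prompt, "|", response, "->", contextual_polarity(prompt, response))
```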

Cultural differences
Culturally specific language use is one of the main challenges of sentiment analysis. Consider how different humor is from one culture to the next. Even in the English language, dialectal differences make distinguishing meaning complex. For example:

"Pants" refers to trousers in US English. In the UK, "pants" means underwear.

Such differences affect analysis accuracy. Idioms also differ from culture to culture. Their analysis is a similar challenge.

Subjectivity
One of the main challenges of sentiment analysis is that language is subjective. This complicates classification into neat categories, aspects, or polarities. Consider this example:

"This phone is great" clearly denotes positive sentiment.
"This phone is small" is more difficult to classify. Depending on the author's feelings on size, it could be a positive, neutral, or negative statement.

A given word's meaning can be subjective due to context, the use of irony or sarcasm, and other speech particularities.

Benefits of sentiment analysis

Sentiment analysis benefits its users with actionable insights. As a tool, its advantages are multiple:

Make customer emotions actionable, in real time
A sentiment analysis tool can help prevent dissatisfaction and churn and even find the customers who will champion your product or service. The tool can analyze surveys or customer service interactions to identify which customers are promoters, or champions. Conversely, sentiment analysis can also help identify dissatisfied customers, whose product and service responses provide valuable insight on areas of improvement.

Mine text for customer emotions at scale
Sentiment analysis tools provide real-time analysis, which is indispensable to the prevention and management of crises. Receive alerts as soon as an issue arises, and get ahead of an impending crisis. As an opinion mining tool, sentiment analysis also provides a PR team with valuable insights to shape strategy and manage an ongoing crisis.

Improve customer service
Sentiment analysis tools pull a broad set of data from various sources simultaneously: emails, tweets, comments, surveys, polls, and reviews. A text-analysis tool can help manage customer service operations by prioritizing queries and automating the tracking of poor interactions. It also gives managers the information they need to train customer service advocates who deal with difficult customers.

Common approaches to sentiment analysis

There are several approaches to sentiment analysis. You can build one yourself, purchase a cloud-provider add-on, or invest in a ready-made sentiment analysis tool. A variety of software-as-a-service (SaaS) sentiment analysis tools are available, while open-source libraries for languages like Python or Java can be used to build your own tool. Alternatively, cloud providers offer their own AI suites.

Build your own sentiment model
You can build your own sentiment model using an NLP library such as spaCy or NLTK. If you are especially ambitious, you can even build one from scratch. Sentiment analysis with Python or JavaScript gives you more customization control. Although customization is a significant benefit, weigh the cost and time required to build your own tool when making the decision.

Use a turnkey sentiment analysis product
You could also purchase a solution, like a SaaS product offered by the standard cloud providers. This could include Amazon Comprehend, Google AI and machine learning products, or Azure’s Cognitive Services. The advantage of a SaaS sentiment analysis tool is that it can be deployed quickly and often at a fraction of the cost of a custom-built tool. The process of training the tool is streamlined and does not require an entire team of engineers and specialists for setup.

Integrate third-party sentiment analysis
With third-party solutions, like Elastic, you can upload your own or a publicly available sentiment model into the Elastic platform. You can then implement an application that analyzes the sentiment of the text data stored in Elastic.

Cloud-provider AI suites
Cloud providers also include sentiment analysis tools as part of their AI suites. Options include Google AI and machine learning products, or Azure’s Cognitive Services.

As AI technology learns and improves, approaches to sentiment analysis continue to evolve. A successful sentiment analysis approach requires consistent adjustments to training models, or frequent updates to purchased software.

Get started with sentiment analysis with Elastic

Launch your sentiment analysis tool with Elastic, so you can perform your own opinion mining and get the actionable insights you need.

Sentiment analysis glossary

Algorithm: a process or a set of rules that a computer follows.

Artificial intelligence: the simulation of human intelligence by machines and computer systems.

Computational linguistics: a branch of linguistics that uses computer science theories to analyze and synthesize language and speech.

Coreference resolution: the process of identifying all the expressions in a text that refer to the same entity.

Lemmatization: the process of grouping together different inflected forms of the same word.

Lexicon: the vocabulary, or inventory of words, of a language.

Machine learning: a subset of artificial intelligence that uses data and algorithms to allow a computer to learn without being explicitly programmed.

Named entity recognition: the process of recognizing words as proper names or entities.

Natural language processing: a branch of computer science that, as a subset of artificial intelligence, is concerned with helping computer systems understand human language.

Part-of-speech tagging: the process of marking a word in a text to categorize what part of speech it belongs to (for example, apple = noun; slowly = adverb; closed = adjective).

Stemming: the process of reducing words to their stem, or root, form.

Tokenization: the process of separating a piece of text into smaller units, called tokens.

Word sense disambiguation: the process of identifying the sense of a word given its use in context.