VADER Sentiment Analysis
VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon- and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media and also works well on texts from other domains. The current version is 3.3.2; releases are infrequent because it is a mature, rule-based system.
Warnings
- gotcha VADER's primary output, the 'compound' score, is normalized between -1 (most extreme negative) and +1 (most extreme positive). Common thresholds for interpretation are: positive sentiment if `compound >= 0.05`, negative sentiment if `compound <= -0.05`, and neutral sentiment otherwise.
- gotcha VADER is highly sensitive to capitalization, punctuation (e.g., exclamation marks), and degree modifiers (e.g., 'very', 'kind of'). This is by design, but heavy preprocessing such as lowercasing or stripping punctuation removes these cues and can lead to unexpected scores.
- gotcha VADER, being a rule-based system with a static lexicon, can struggle with sarcasm, irony, subtle or complex negations ('not bad' can still lean positive), and evolving slang or domain-specific terminology that isn't in its lexicon. It might also struggle with intricate sentence structures involving conjunctions like 'but' that shift sentiment.
- breaking In older versions (e.g., pre-3.x), users sometimes needed to manually ensure the `vader_lexicon.txt` file was accessible or manage NLTK downloads. A missing lexicon could lead to `FileNotFoundError` or import issues.
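The compound-score thresholds listed above can be sketched as a small helper. This is a minimal illustration, not part of the vaderSentiment API: the function name `label_from_compound` and its `threshold` parameter are hypothetical, and the sample values are made up rather than real VADER output.

```python
def label_from_compound(compound: float, threshold: float = 0.05) -> str:
    """Map a compound score in [-1, 1] to a coarse label.

    Uses the commonly cited cutoffs: >= 0.05 positive, <= -0.05 negative,
    anything in between neutral. The helper itself is hypothetical.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Illustrative values only; these are not scores produced by VADER.
for value in (0.64, 0.05, 0.0, -0.05, -0.71):
    print(value, label_from_compound(value))
```

In practice the input would be the `compound` entry of the dictionary returned by `polarity_scores`. The cutoff is kept as a parameter because 0.05 is a convention, and some datasets may call for a wider neutral band.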
Install
pip install vadersentiment
Imports
- SentimentIntensityAnalyzer
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
Quickstart
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyser = SentimentIntensityAnalyzer()

sentence = "VADER sentiment analysis is incredibly insightful and super fun!"
scores = analyser.polarity_scores(sentence)
print(scores)

sentence_negative = "This product is absolutely terrible and a complete waste of money."
scores_negative = analyser.polarity_scores(sentence_negative)
print(scores_negative)
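Because VADER reads capitalization and punctuation as intensity cues, any cleanup applied before calling `polarity_scores` is probably best kept light. The sketch below shows one possible approach under that assumption: it drops URLs and collapses whitespace but leaves case, exclamation marks, and emoticons untouched. The helper `light_clean` and the URL pattern are hypothetical and not part of vaderSentiment.

```python
import re

# Hypothetical pattern: matches http(s) URLs up to the next whitespace.
URL_PATTERN = re.compile(r"https?://\S+")


def light_clean(text: str) -> str:
    """Remove URLs and collapse whitespace; keep case and punctuation as-is."""
    without_urls = URL_PATTERN.sub(" ", text)
    # split() with no arguments drops leading, trailing, and repeated whitespace.
    return " ".join(without_urls.split())


raw = "This is  GREAT!!!   see https://example.com/review  "
print(light_clean(raw))
```

The cleaned string can then be passed to `analyser.polarity_scores(...)` as in the quickstart. Lowercasing or punctuation stripping is deliberately left out, since those steps would remove the signals the rules rely on.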