Identifying the semantic orientation or polarity of words is one of the most important topics in sentiment analysis tasks.
In this work, we propose a new lexicon-based approach for text polarity detection that uses sentiment triggers to add contextual semantics during the analysis. The SentiWords.SR lexicon is based on the English lexicon SentiWords, which contains roughly 155,000 English words, each associated with a sentiment score in the range [-1, 1]. SentiWords.SR contains ~15,000 words (i.e. lemma and PoS pairs) derived through extensive evaluation of the translated lexicon.
The existing Serbian word polarity dictionary has been extended and now contains approximately 15,000 words annotated with polarity strength.
The Serbian sentiment framework (SRPOL) relies on the new lexicon and the following sentiment triggers:
Intensity modifiers change the polarity intensity of an upcoming sentiment-laden word, but not its orientation.
SRPOL applies negation to the upcoming phrase, which may include adverb and negation modifiers in addition to the first upcoming standard sentiment-laden word. It identifies negation signals such as the Serbian words ne, ni or nije (eng. 'not'), which reverse the score.
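The interaction between modifiers and negation can be illustrated with a minimal sketch. The toy lexicon, the multiplier values, and the function name `score_tokens` are hypothetical and chosen only for illustration; they are not taken from SentiWords.SR or from the SRPOL implementation.

```python
# Hypothetical toy data: real scores would come from SentiWords.SR.
LEXICON = {"dobar": 0.6, "loš": -0.6}        # 'good', 'bad'
NEGATIONS = {"ne", "ni", "nije"}             # negation signals named in the text
INTENSIFIERS = {"veoma": 1.5, "malo": 0.5}   # assumed multipliers ('very', 'a little')


def score_tokens(tokens):
    """Sum word scores; pending modifiers and negation apply to the
    first upcoming sentiment-laden word and are then reset."""
    total = 0.0
    negate = False
    factor = 1.0
    for tok in tokens:
        word = tok.lower()
        if word in NEGATIONS:
            negate = True
        elif word in INTENSIFIERS:
            factor *= INTENSIFIERS[word]
        elif word in LEXICON:
            # The modifier changes intensity only; clamp to the lexicon range.
            score = max(-1.0, min(1.0, LEXICON[word] * factor))
            # Negation reverses the orientation of the modified score.
            if negate:
                score = -score
            total += score
            negate, factor = False, 1.0
    return total
```

Under these assumptions, `score_tokens(["nije", "veoma", "dobar"])` intensifies the score of "dobar" first and then reverses its sign.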
Exclamation marks increase the perceived sentiment by an average of 6% for a single mark, and by 18% for a sequence of more than two exclamation marks.
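A minimal sketch of such a boost follows. It assumes the percentages are applied as multiplicative factors on an already computed score and that exactly two marks are treated like one; neither detail is stated in the text, and the function name `exclamation_boost` is hypothetical.

```python
import re


def exclamation_boost(text, score):
    """Scale a sentiment score by the longest run of exclamation marks."""
    runs = re.findall(r"!+", text)
    longest = max((len(run) for run in runs), default=0)
    if longest > 2:
        factor = 1.18   # sequence of more than two marks
    elif longest >= 1:
        factor = 1.06   # one mark (two marks: assumed to behave the same)
    else:
        factor = 1.0    # no exclamation mark, score unchanged
    return score * factor
```

Because the factor multiplies the score, both positive and negative sentiment are amplified away from zero.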
Words in which a character or group of characters is repeated more than two times, emphasizing the word, are identified as sentiment triggers.
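One possible way to detect such elongated words is a regular expression with a backreference. The sketch below assumes a repeated group is at most three characters long and normalizes by collapsing the repetition to a single occurrence before lexicon lookup; both choices, and the names `is_elongated` and `normalize`, are assumptions rather than the SRPOL method.

```python
import re

# A group of 1-3 word characters followed by at least two more copies of
# itself, i.e. the group occurs more than two times in a row.
REPEAT = re.compile(r"(\w{1,3}?)\1{2,}")


def is_elongated(word):
    """True if the word contains a character (group) repeated more than twice."""
    return REPEAT.search(word) is not None


def normalize(word):
    """Collapse each repeated run to a single occurrence of the group."""
    return REPEAT.sub(r"\1", word)
```

Collapsing to a single occurrence can damage words that legitimately contain a doubled letter, so a real system would likely check the candidate forms against the lexicon.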
Emoji sentiment intensities are taken from the Emoji Sentiment Ranking v1.0 lexicon.
🙂
The primary goal of splitting text into segments is to improve polarity scoring for long texts whose segments carry mixed sentiments. SRPOL assesses the polarity score for each segment (sentence) of a text and, using a majority rule approach, predicts the sentiment score for the whole text.
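The segment-level voting can be sketched as follows. The sentence splitter is deliberately naive, the function names are hypothetical, and returning 0 on a tie is an assumption; the text does not specify how ties are resolved or how the final score magnitude is derived.

```python
import re


def split_segments(text):
    """Naively split text into sentences after '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def majority_polarity(segment_scores):
    """Return +1, -1 or 0 by majority vote over per-segment polarity scores."""
    positive = sum(1 for s in segment_scores if s > 0)
    negative = sum(1 for s in segment_scores if s < 0)
    if positive > negative:
        return 1
    if negative > positive:
        return -1
    return 0  # tie or no sentiment-bearing segment (assumed behaviour)
```

In use, each segment returned by `split_segments` would first be scored with the lexicon and triggers, and the resulting list of scores passed to `majority_polarity`.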
- Increase the number of words in the SentiWords.SR lexicon by using advanced machine learning methods
- Evaluate other possible sentiment triggers
- Evaluate other segmentation techniques and methods for the final score calculation

