Idiom Polarity Identification using Contextual Information

Belém Priego Sánchez, David Pinto


Identifying the polarity of a given text is a complex task that usually requires an analysis of the contextual information. This task becomes to be much more complex when, in such analysis, we consider smaller textual components than paragraphs, such as sentences, phraseological units or single words. In this paper, we consider the automatic identification of polarity for linguistic units known as idioms based on their contextual information. Idioms are a phraseological unit made up of more than two words in which one of those words plays the role of the predicate. We employ three lexicons for determining the polarity of those words surrounding the idiom, i.e., in its context and using this information we infer the possible polarity of the target idiom. The lexicons we are using are: ElhPolar dictionary, iSOL and ML-SentiCON Sentiment Spanish Lexicon, all of them containg the polarity of different words. One of the aims of this research work is to identify the lexicon that provides the best results for the task proposed, which is to count the number of positive and negative words in the idiom context, so that we can infer the polarity of the idiom itself. The experiments carried out show that the best combination obtain results close to 57.31% when the texts are lemmatized and 48.87% when they are not lemmatized.


Polarity, idiom, lexicon

Full Text: PDF