A 5W1H Based Annotation Scheme for Semantic Role Labeling of English Tweets

Kunal Chakma, Amitava Das


Semantic Role Labeling (SRL) is a well-researched area of Natural Language Processing. State-of-the-art lexical resources have been developed for SRL on formal texts, but they involve a tedious annotation scheme and require linguistic expertise. The difficulties increase manifold when such a complex annotation scheme is applied to tweets to identify predicates and role arguments. In this paper, we present a simplified approach for annotating English tweets to identify predicates and their corresponding semantic roles. For annotation, we adopt the 5W1H (Who, What, When, Where, Why and How) concept, which is widely used in journalism. The 5W1H task seeks to extract the semantic information in a natural language sentence by distilling it into the answers to these six questions. The 5W1H approach is comparatively simple and convenient relative to the PropBank Semantic Role Labeling task. We report on the performance of our annotation scheme for SRL on tweets and show that non-expert annotators can produce quality SRL data for tweets. This paper also reports the difficulties and challenges involved in semantic role labeling on Twitter data and proposes solutions to them.


5W1H, semantic role labeling, Twitter
