This paper aims to address the difficult task of sarcasm detection on Twitter by leveraging behavioral traits intrinsic to users expressing sarcasm. We identify such traits using the user's past tweets. We employ theories from behavioral and psychological studies to construct a behavioral modeling framework tuned for detecting sarcasm.
Current research on the topic has primarily focused on obtaining information from the text of the tweets. These techniques treat sarcasm as a linguistic phenomenon, with limited emphasis on the psychological aspects of sarcasm. Rockwell [32] identified a positive correlation between cognitive complexity and the ability to produce sarcasm.
To follow a systematic approach, we first theorize the core forms of sarcasm using existing psychological and behavioral studies. Next, we develop computational features to capture these forms of sarcasm using the user's current and past tweets. Finally, we combine these features to train a learning algorithm to detect sarcasm.
Sarcasm, while similar to irony, differs in that it is usually viewed as caustic and derisive. Some researchers even consider it a form of aggressive humor and verbal aggression. The Oxford dictionary defines it as a "way of using words that are the opposite of what you mean in order to be unpleasant to somebody or to make fun of them."
Definition. Sarcasm Detection on Twitter. Given an unlabeled tweet t from user U along with a set of U’s past tweets T, a solution to sarcasm detection aims to automatically detect if t is sarcastic or not.
By analyzing users' past tweets, SCUBA considers the user's likelihood of being a sarcastic person. We find that sarcasm generation can be characterized as one or a combination of the following:
- Sarcasm as a contrast of sentiments
- Sarcasm as a complex form of expression
- Sarcasm as a means of conveying emotion
- Sarcasm as a possible function of familiarity
- Sarcasm as a form of written expression
SCUBA incorporates a behavioral modeling approach to sarcasm detection that uses features capturing these different forms of sarcasm. The extracted features are then used, along with labeled data, in a supervised learning framework to determine whether a tweet is sarcastic or not.
A common way to express sarcasm is to use words with contrasting connotations within the same tweet. To model such occurrences, we construct features based on affect and sentiment scores.
We obtain affect scores for words from a dataset containing affect scores for 13.9k English lemmas on a 9-point scale.
The sentiment score is calculated using SentiStrength, a lexicon-based tool optimized for tweet sentiment detection.
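A minimal sketch of how such intra-tweet contrast features might be computed is shown below; the toy affect lexicon, the word-level sentiment scorer, and the exact feature definitions are illustrative assumptions standing in for the 13.9k-lemma affect dataset and SentiStrength, not the paper's exact implementation.

```python
# Illustrative sketch of intra-tweet sentiment/affect contrast features.
# affect_scores and word_sentiment are stand-ins for the affect lexicon
# and a SentiStrength-style word-level sentiment scorer.

def contrast_features(tweet_words, affect_scores, word_sentiment):
    """Return simple contrast features for one tweet (assumed feature set)."""
    affects = [affect_scores[w] for w in tweet_words if w in affect_scores]
    sentiments = [word_sentiment(w) for w in tweet_words]

    return {
        # Spread between the most and least pleasant words (9-point affect scale).
        "affect_range": max(affects) - min(affects) if affects else 0.0,
        # Does the tweet contain both positive and negative words?
        "has_sentiment_contrast": int(
            any(s > 0 for s in sentiments) and any(s < 0 for s in sentiments)
        ),
    }


# Toy usage with a hypothetical lexicon and a naive word-level sentiment scorer.
toy_affect = {"love": 8.0, "ignore": 2.5, "friends": 7.0}
toy_sentiment = lambda w: {"love": 3, "ignore": -2}.get(w, 0)
print(contrast_features("i love it when my friends ignore me".split(),
                        toy_affect, toy_sentiment))
```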
Sometimes, the user may set up a contrasting context in a previous tweet and then choose to use a sarcastic remark in her current tweet. We obtain the sentiment expressed by the user in the previous tweet and the current tweet using SentiStrength, and include the type of sentiment transition taking place from the past tweet to the current one as a feature.
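A small sketch of the past-to-current transition feature is given below; the categorical encoding of transitions is our assumption, and the tweet-level scores are placeholders for SentiStrength output.

```python
# Illustrative sketch of the past-to-current sentiment transition feature.
# The scores stand in for tweet-level SentiStrength output; the label
# encoding below is an assumption, not the paper's exact scheme.

def sentiment_transition(prev_tweet_score, curr_tweet_score):
    """Encode the type of sentiment transition as a categorical label."""
    def sign(x):
        return "pos" if x > 0 else "neg" if x < 0 else "neu"
    return f"{sign(prev_tweet_score)}->{sign(curr_tweet_score)}"

# e.g. a pleasant previous tweet followed by a negative current one.
print(sentiment_transition(3, -2))   # "pos->neg"
```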
We use the number of words, the number of syllables, and the number of syllables per word in the tweet. We also include the 6-tuple representation (described in the paper) as features in our framework.
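A rough sketch of these complexity features follows; the vowel-group syllable counter is a simple heuristic of our own, not the paper's exact method.

```python
# Illustrative sketch of simple complexity features. The syllable counter is a
# rough vowel-group heuristic; a proper syllabifier could be substituted.
import re

def count_syllables(word):
    """Approximate syllable count as the number of vowel groups."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def complexity_features(tweet):
    words = re.findall(r"[A-Za-z']+", tweet)
    syllables = [count_syllables(w) for w in words]
    return {
        "num_words": len(words),
        "num_syllables": sum(syllables),
        "syllables_per_word": sum(syllables) / len(words) if words else 0.0,
    }

print(complexity_features("I love it when my friends ignore me"))
```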
We gauge the user's mood using the sentiment expressed in her past tweets, though we cannot assume that the mood is fully captured by her last n tweets; we therefore employ a method that accounts for this (see the paper for details).
To examine how affect and sentiment are expressed in sarcastic tweets, we construct a sentiment score distribution (see the paper for details).
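One way such a distribution could be computed is sketched below, assuming SentiStrength-style integer word scores; the scorer and binning are illustrative assumptions.

```python
# Illustrative sketch of a word-level sentiment score distribution, assuming
# integer sentiment scores in [-5, 5]; the scorer is a toy stand-in.
from collections import Counter

def sentiment_distribution(words, word_sentiment):
    """Fraction of words at each sentiment score level."""
    counts = Counter(word_sentiment(w) for w in words)
    total = sum(counts.values()) or 1
    return {score: counts.get(score, 0) / total for score in range(-5, 6)}

toy_scorer = lambda w: {"love": 3, "hate": -4}.get(w, 0)
print(sentiment_distribution("i love love hate mondays".split(), toy_scorer))
```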
We hypothesize that as users encounter unpleasant situations in real life, they react spontaneously by posting tweets to vent their frustrations, thereby diverging from their regular tweeting patterns. We construct an expected tweet-posting-time probability distribution from the user's past tweets.
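Below is a minimal sketch of such a posting-time distribution, binned by hour of day; the particular "surprise" measure for the current tweet is our own simplification, not the paper's formulation.

```python
# Illustrative sketch of an expected posting-time distribution built from past
# tweets, binned by hour of day; the divergence measure is an assumption.
from collections import Counter
from datetime import datetime

def posting_time_distribution(past_timestamps):
    """Probability of posting in each hour of the day, from past tweets."""
    counts = Counter(ts.hour for ts in past_timestamps)
    total = sum(counts.values()) or 1
    return {hour: counts.get(hour, 0) / total for hour in range(24)}

def posting_time_surprise(current_ts, distribution):
    """How unlikely the current posting hour is under the user's usual pattern."""
    return 1.0 - distribution.get(current_ts.hour, 0.0)

past = [datetime(2014, 1, d, 9) for d in range(1, 11)]  # user usually posts at 9am
print(posting_time_surprise(datetime(2014, 1, 12, 23),
                            posting_time_distribution(past)))
```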
We measure the user's language skills with features inspired by standardized language proficiency cloze tests. We evaluate vocabulary skills, grammar skills, and familiarity with sarcasm.
Users express sarcasm better when they are well acquainted with their environment. Just as people are less likely to use sarcasm in a new, unfamiliar setting, users take time to get familiar with Twitter before posting sarcastic tweets.
We measure usage familiarity, Twitter parlance familiarity, and social familiarity.
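A possible feature sketch is shown below; which concrete signals map to usage, Twitter parlance, and social familiarity is our assumption for illustration, not the paper's exact feature definitions.

```python
# Illustrative sketch of familiarity features; the specific signals chosen here
# (account age, hashtag/mention usage, follower counts) are assumptions.
def familiarity_features(past_tweets, account_age_days, num_followers, num_friends):
    num_past = len(past_tweets)
    hashtag_mentions = sum(t.count("#") + t.count("@") for t in past_tweets)
    return {
        # Usage familiarity: how long and how actively the account has been used.
        "account_age_days": account_age_days,
        "num_past_tweets": num_past,
        # Twitter parlance familiarity: use of hashtags and mentions.
        "parlance_per_tweet": hashtag_mentions / num_past if num_past else 0.0,
        # Social familiarity: size of the user's social circle.
        "num_followers": num_followers,
        "num_friends": num_friends,
    }

print(familiarity_features(["good morning #monday", "@bob thanks!"], 400, 150, 120))
```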
Users often repeat letters in words to stress and overemphasize certain parts of the tweet (for example, sooooo, awesomeeee) to indicate that they mean the opposite of what is written. We capture such usage by including, as boolean features, the presence of repeated characters (3 or more) and the presence of repeated characters (3 or more) in sentiment-loaded words (such as loveeee) (2 features).
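A minimal sketch of these two boolean features is given below; the tiny sentiment word list is a stand-in for a real sentiment lexicon.

```python
# Illustrative sketch of the two repeated-character boolean features; the small
# SENTIMENT_WORDS set is a placeholder for a proper sentiment lexicon.
import re

SENTIMENT_WORDS = {"love", "hate", "awesome", "great", "terrible"}

def repeated_char_features(tweet):
    words = tweet.lower().split()
    has_repeat = any(re.search(r"(.)\1{2,}", w) for w in words)
    has_sentiment_repeat = any(
        re.search(r"(.)\1{2,}", w)
        and re.sub(r"(.)\1{2,}", r"\1", w) in SENTIMENT_WORDS
        for w in words
    )
    return {"repeated_chars": int(has_repeat),
            "repeated_chars_in_sentiment_word": int(has_sentiment_repeat)}

print(repeated_char_features("I loveeee waiting sooooo long"))
```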
We observe that sarcastic tweets sometimes have a certain structure wherein the user’s views are expressed in the first few words of the tweet, while in the later parts, a description of a particular scenario is put forth, e.g., I love it when my friends ignore me. To capture possible syntactic idiosyncrasies arising from such tweet construction, we use as features, the POS tags of the first three words and the last three words in the tweet (6 features).
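The sketch below illustrates these boundary POS-tag features using NLTK's default tagger as a stand-in; the padding scheme and the choice of tagger are our assumptions.

```python
# Illustrative sketch of the first/last-three-word POS-tag features, using
# NLTK's default tagger (requires the punkt and averaged_perceptron_tagger
# resources to be downloaded) as a stand-in for the paper's tagger.
import nltk

def boundary_pos_features(tweet):
    tokens = nltk.word_tokenize(tweet)
    tags = [tag for _, tag in nltk.pos_tag(tokens)]
    # Pad so the feature vector always has six entries.
    first = (tags[:3] + ["NONE"] * 3)[:3]
    last = (["NONE"] * 3 + tags[-3:])[-3:]
    return first + last

print(boundary_pos_features("I love it when my friends ignore me"))
```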
We obtain the dataset by collecting tweets containing the #sarcasm and #not hashtags and then filtering them. We obtained 9.1k sarcastic tweets which were self-described by users as sarcastic via these hashtags.
It must also be noted that, to avoid confusion and ambiguity when expressing sarcasm in writing, the users choose to explicitly mark the sarcastic tweets with appropriate hashtags. The expectation is that these tweets, if devoid of these hashtags, might be difficult to comprehend as sarcasm even for humans. Therefore, our dataset might be biased towards the hardest forms of sarcasm.
We evaluate SCUBA on different class distributions (1:1, 10:90, and 20:80). We use AUC (area under the ROC curve) in addition to accuracy as evaluation metrics. We compare SCUBA to the following baselines:
- Contrast approach
- Hybrid approach
- SCUBA - #sarcasm
- Random classifier
- Majority classifier
- n-gram classifier
We evaluate using a J48 decision tree, L1-regularized logistic regression, and L1-regularized L2-loss SVM, obtaining accuracies of 78%, 83.4%, and 83%, respectively.
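An end-to-end training sketch with scikit-learn is shown below; the classifiers mirror the ones reported above, but the random feature matrix is only a placeholder for the full SCUBA feature set, and the hyperparameters are assumptions.

```python
# Illustrative training sketch with scikit-learn; X would hold one row of
# behavioral features per tweet, y the sarcastic/non-sarcastic labels.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))      # placeholder feature matrix
y = rng.integers(0, 2, size=200)    # placeholder labels (1 = sarcastic)

models = {
    "decision_tree": DecisionTreeClassifier(),
    "l1_logreg": LogisticRegression(penalty="l1", solver="liblinear"),
    "l1_l2loss_svm": LinearSVC(penalty="l1", loss="squared_hinge", dual=False),
}
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    print(name, round(acc, 3))
```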
The top 10 features in decreasing order of importance for sarcasm detection are the following:
- Percentage of emoticons in the tweet.
- Percentage of adjectives in the tweet.
- Percentage of past words with sentiment score 3.
- Number of polysyllables per word in the tweet.
- Lexical density of the tweet.
- Percentage of past words with sentiment score 2.
- Percentage of past words with sentiment score -3.
- Number of past sarcastic tweets posted.
- Percentage of positive to negative sentiment transitions made by the user.
- Percentage of capitalized hashtags in the tweet.
In this paper, we introduce SCUBA, a behavioral modeling framework for sarcasm detection. We discuss different forms that sarcasm can take, namely: (1) as a contrast of sentiments, (2) as a complex form of expression, (3) as a means of conveying emotion, (4) as a possible function of familiarity and (5) as a form of written expression.
Importantly, we have demonstrated that even limited historical information can greatly improve sarcasm detection. This makes SCUBA a good fit for real-world, real-time applications with tight computational constraints.