Irony/Sarcasm Detection

This is an irony/sarcasm detection project working with the tweets I collected earlier this year. This will be my second qualifying exam submission.

Irony/Sarcasm Detection

Methods for building twitter-specific sentiment lexicon

Using Mutual Information

This page has quite a bit of information on how the mutual information should be computed. I don't think there is any difference between the multi-class and binary cases except that the entropy of the class labels (which bounds the attainable mutual information) changes.
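
For reference, a minimal sketch of the computation as I understand it: the mutual information between a binary term-occurrence indicator and the class labels, estimated directly from counts. The function name and the toy data are illustrative, not from the project code; the same double loop works unchanged for more than two classes.

import numpy as np

def mutual_information(term_present, labels):
    """MI (in bits) between a binary term indicator and class labels."""
    term_present = np.asarray(term_present, dtype=bool)
    labels = np.asarray(labels)
    mi = 0.0
    for t in (True, False):
        for c in np.unique(labels):
            p_joint = np.mean((term_present == t) & (labels == c))
            p_t = np.mean(term_present == t)
            p_c = np.mean(labels == c)
            if p_joint > 0:
                mi += p_joint * np.log2(p_joint / (p_t * p_c))
    return mi

# Toy example: a term that occurs mostly in positive tweets.
present = [1, 1, 1, 0, 0, 0, 1, 0]
classes = [1, 1, 1, 1, 0, 0, 0, 0]
print(mutual_information(present, classes))  # ~0.19 bits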

Using Counts

I could use a simple method based on counts of occurrences of words in both the negative and the positive tweets, with normalization for term frequency.
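
A minimal sketch of what I mean, assuming whitespace-tokenized tweets; the normalization (per-class relative frequency) and the function name are illustrative:

from collections import Counter

def count_based_scores(pos_tweets, neg_tweets):
    """Score each token by its normalized positive rate minus its normalized negative rate."""
    pos_counts = Counter(tok for tw in pos_tweets for tok in tw.split())
    neg_counts = Counter(tok for tw in neg_tweets for tok in tw.split())
    pos_total = sum(pos_counts.values()) or 1
    neg_total = sum(neg_counts.values()) or 1
    vocab = set(pos_counts) | set(neg_counts)
    return {tok: pos_counts[tok] / pos_total - neg_counts[tok] / neg_total
            for tok in vocab}

scores = count_based_scores(["love this so much", "great day"],
                            ["hate this", "terrible awful day"])
print(sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:3])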

Using Feature Selection Methods

Mutual Information

The issue with using mutual information for this is that, ideally, we would want a two-tailed statistical test, while mutual information is a one-tailed test. To address this, there are two different strategies that I am going to try.

Winner-Take-All Mutual Information

The winner-take-all mutual information takes, for each word, the larger of the mutual information calculated for the positive class and for the negative class (the negative-class value is multiplied by -1, producing a two-tailed distribution of scores).

Proportional Mutual Information

In proportional mutual information, the mutual information for the negative class is subtracted from the mutual information for the positive class.
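
A sketch of both scoring strategies, assuming the per-class mutual information has already been computed for every word; mi_pos and mi_neg are hypothetical arrays holding each word's MI with the positive and negative class respectively:

import numpy as np

# Hypothetical per-word MI with the positive / negative class.
mi_pos = np.array([0.30, 0.02, 0.10])
mi_neg = np.array([0.01, 0.25, 0.12])

# Winner-take-all: keep whichever class association is stronger,
# flipping the sign for negatively associated words.
winner_take_all = np.where(mi_pos >= mi_neg, mi_pos, -mi_neg)

# Proportional: subtract the negative-class MI from the positive-class MI.
proportional = mi_pos - mi_neg

print(winner_take_all)  # [ 0.3  -0.25 -0.12]
print(proportional)     # [ 0.29 -0.23 -0.02]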

Irony/Sarcasm Detection

Comparison of two-sided MI

As mentioned in Methods for building twitter-specific sentiment lexicon, there are two general ways that I tried to build a Twitter-specific sentiment lexicon. The first was to calculate the mutual information associated with the positive class and subtract from it the mutual information associated with the negative class. The other was to take whichever class had the higher value as the mutual information score, multiplying the negative-class value by -1.

However, upon inspection of the results, the winner-take-all method is producing a much more sensible list of vocabulary.

The raw files can be found here:

Binary-Based

Count-Based

Determining a cutoff

There is some imbalance in how many terms are given higher mutual information for the positive class and the negative class.

For example, the 0 value for the winner-take-all binary setup occurs about two thirds of the way through the ranked list. This imbalance would be problematic if all words were used to compute shifts in sentiment for the sarcasm detection part. The best solution seems to be to set the threshold some number of words from each end (i.e., use a ranking scheme to determine which words are associated strongly enough with each class to be representatives of that class).
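
A sketch of that ranking idea, assuming scores maps each word to its winner-take-all score; the value of k and the function name are illustrative:

def lexicon_from_ranks(scores, k=1000):
    """Take the k lowest-scoring words as negative and the k highest-scoring as positive."""
    ranked = sorted(scores, key=scores.get)
    negative_words = set(ranked[:k])
    positive_words = set(ranked[-k:])
    return positive_words, negative_words

pos_words, neg_words = lexicon_from_ranks({"great": 0.30, "ugh": -0.25, "day": 0.01}, k=1)
print(pos_words, neg_words)  # {'great'} {'ugh'}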

My next steps are to determine how much overlap with the content of the sarcasm dataset there is.

Adding a minimum count cutoff

Commit 27dd9e4300 adds a cutoff on how infrequent a token can be and still be considered in the mutual information calculations. The entry is still present in the results array in the program; the mutual information is just automatically set to 0 if there are fewer than x instances of a feature.

Currently the behavior is not specialized for counts. E.g., when a binary feature matrix has been computed, the minimum cutoff is effectively the number of tweets a token occurred in; the count features do not try to emulate this and instead just count the frequency of usage, including multiple usages within a single tweet.
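
A sketch of that behavior as I understand it from the commit; the array names and the default threshold are illustrative:

import numpy as np

def apply_min_count(mi_scores, feature_matrix, min_count=5, binary=True):
    """Zero out MI for rare features while keeping the results array the same shape."""
    X = np.asarray(feature_matrix)
    if binary:
        # Document frequency: number of tweets the feature occurs in.
        occurrences = (X > 0).sum(axis=0)
    else:
        # Raw frequency, counting repeated uses within a tweet.
        occurrences = X.sum(axis=0)
    mi_scores = np.asarray(mi_scores, dtype=float).copy()
    mi_scores[occurrences < min_count] = 0.0
    return mi_scores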

Irony/Sarcasm Detection

Results dump

Cross validation character n-grams tfidf

F1-score Task A 0.6440953412784399

              precision    recall  f1-score   support

           0  0.64831953 0.69214769 0.66951710      1923
           1  0.66760247 0.62218734 0.64409534      1911
           
   micro avg  0.65727700 0.65727700 0.65727700      3834
   macro avg  0.65796100 0.65716751 0.65680622      3834
weighted avg  0.65793082 0.65727700 0.65684601      3834

Embeddings with averages

F1-score Task A 0.5545722713864307
              precision    recall  f1-score   support

           0  0.70503597 0.62156448 0.66067416       473
           1  0.51226158 0.60450161 0.55457227       311

   micro avg  0.61479592 0.61479592 0.61479592       784
   macro avg  0.60864878 0.61303304 0.60762321       784
weighted avg  0.62856552 0.61479592 0.61858527       784

Embeddings with sums

F1-score Task A 0.3297644539614561
              precision    recall  f1-score   support

           0  0.62738854 0.83298097 0.71571299       473
           1  0.49358974 0.24758842 0.32976445       311

   micro avg  0.60076531 0.60076531 0.60076531       784
   macro avg  0.56048914 0.54028470 0.52273872       784
weighted avg  0.57431274 0.60076531 0.56261351       784
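
For reference, a minimal sketch of the two embedding feature variants above; the embedding lookup here is a stand-in, not the embedding model actually used:

import numpy as np

def tweet_vector(tokens, embeddings, dim=300, mode="average"):
    """Represent a tweet as the average (or sum) of its word vectors."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return np.zeros(dim)
    stacked = np.vstack(vectors)
    return stacked.mean(axis=0) if mode == "average" else stacked.sum(axis=0)

# Toy lookup table standing in for the real embedding model.
emb = {"so": np.full(300, 0.1), "ironic": np.full(300, 0.5)}
print(tweet_vector(["so", "ironic", "lol"], emb, mode="sum")[:3])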

Trial data character n-grams tfidf


{'classify__C': 100, 'classify__gamma': 'scale', 'reduce_dim__k': 10000}
F1-score Task A 0.6191198786039454
              precision    recall  f1-score   support

           0  0.75458716 0.69556025 0.72387239       473
           1  0.58620690 0.65594855 0.61911988       311

   micro avg  0.67984694 0.67984694 0.67984694       784
   macro avg  0.67039703 0.67575440 0.67149613       784
weighted avg  0.68779346 0.67984694 0.68231878       784
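
The parameter names above ('reduce_dim__k', 'classify__C', 'classify__gamma') suggest a Pipeline with a feature-selection step named 'reduce_dim' and an SVM step named 'classify'. A sketch of that kind of setup; the character n-gram range, the chi2 score function, the parameter grid, and the variable names for the data are assumptions, not the exact configuration used:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

pipeline = Pipeline([
    ("vectorize", TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 5))),
    ("reduce_dim", SelectKBest(chi2)),
    ("classify", SVC(gamma="scale")),
])

param_grid = {
    "reduce_dim__k": [1000, 5000, 10000],
    "classify__C": [0.1, 1, 10, 100],
}

search = GridSearchCV(pipeline, param_grid, scoring="f1", cv=5, n_jobs=-1, verbose=1)
# search.fit(train_texts, train_labels)  # train_texts / train_labels: hypothetical names for the tweet data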

Trial data character n-grams with cheating to match skew in test data

Best parameters: 
{'classify__C': 100, 'classify__class_weight': {0: 0.75, 1: 1.5}, 'classify__gamma': 'scale', 'reduce_dim__k': 10000}
F1-score Task A 0.6388206388206388
              precision    recall  f1-score   support

           0  0.81850534 0.48625793 0.61007958       473
           1  0.51689861 0.83601286 0.63882064       311

   micro avg  0.62500000 0.62500000 0.62500000       784
   macro avg  0.66770197 0.66113539 0.62445011       784
weighted avg  0.69886287 0.62500000 0.62148069       784

Trial data character n-grams MPQA skew in test data

MI

Fitting 5 folds for each of 5 candidates, totalling 25 fits
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 16 concurrent workers.
[Parallel(n_jobs=-1)]: Done  20 out of  25 | elapsed:   14.3s remaining:    3.6s
[Parallel(n_jobs=-1)]: Done  25 out of  25 | elapsed:   14.4s finished
Grid scores on training set:

  (sklearn UndefinedMetricWarning: precision and F-score are ill-defined and set to 0.0 for labels with no predicted samples)
              precision    recall  f1-score   support

           0  0.00000000 0.00000000 0.00000000       473
           1  0.39668367 1.00000000 0.56803653       311

   micro avg  0.39668367 0.39668367 0.39668367       784
   macro avg  0.19834184 0.50000000 0.28401826       784
weighted avg  0.15735794 0.39668367 0.22533082       784


Best parameters: 
{'C': 100, 'class_weight': {0: 0.7, 1: 1.5}, 'gamma': 'scale'}
F1-score Task A 0.6067415730337078
              precision    recall  f1-score   support

           0  0.87931034 0.21564482 0.34634975       473
           1  0.44461078 0.95498392 0.60674157       311

   micro avg  0.50892857 0.50892857 0.50892857       784
   macro avg  0.66196056 0.58531437 0.47654566       784
weighted avg  0.70687212 0.50892857 0.44964293       784


Best parameters: 
{'C': 100, 'class_weight': {0: 0.7, 1: 1.5}, 'gamma': 'scale'}
F1-score Task A 0.6211180124223602
              precision    recall  f1-score   support

           0  0.91472868 0.24947146 0.39202658       473
           1  0.45801527 0.96463023 0.62111801       311

   micro avg  0.53316327 0.53316327 0.53316327       784
   macro avg  0.68637197 0.60705084 0.50657230       784
weighted avg  0.73355793 0.53316327 0.48290341       784


Best parameters: 
{'C': 100, 'class_weight': {0: 0.7, 1: 1.5}, 'gamma': 'scale'}
F1-score Task A 0.6282722513089005
              precision    recall  f1-score   support

           0  0.92142857 0.27272727 0.42088091       473
           1  0.46583851 0.96463023 0.62827225       311

   micro avg  0.54719388 0.54719388 0.54719388       784
   macro avg  0.69363354 0.61867875 0.52457658       784
weighted avg  0.74070343 0.54719388 0.50314967       784


Best parameters: 
{'C': 100, 'class_weight': {0: 0.7, 1: 1.5}, 'gamma': 'scale'}
F1-score Task A 0.6251319957761351
              precision    recall  f1-score   support

           0  0.89864865 0.28118393 0.42834138       473
           1  0.46540881 0.95176849 0.62513200       311

   micro avg  0.54719388 0.54719388 0.54719388       784
   macro avg  0.68202873 0.61647621 0.52673669       784
weighted avg  0.72678948 0.54719388 0.50640501       784


Best parameters: 
{'C': 100, 'class_weight': {0: 0.7, 1: 1.5}, 'gamma': 'scale'}
F1-score Task A 0.6272439281942978
              precision    recall  f1-score   support

           0  0.90540541 0.28329810 0.43156200       473
           1  0.46698113 0.95498392 0.62724393       311

   micro avg  0.54974490 0.54974490 0.54974490       784
   macro avg  0.68619327 0.61914101 0.52940296       784
weighted avg  0.73148965 0.54974490 0.50918582       784


Best parameters: 
{'C': 100, 'class_weight': {0: 0.7, 1: 1.5}, 'gamma': 'scale'}
F1-score Task A 0.6314677930306231
              precision    recall  f1-score   support

           0  0.91891892 0.28752643 0.43800322       473
           1  0.47012579 0.96141479 0.63146779       311

   micro avg  0.55484694 0.55484694 0.55484694       784
   macro avg  0.69452235 0.62447061 0.53473551       784
weighted avg  0.74089001 0.55484694 0.51474746       784

Chi2

Grid scores on training set:

Best parameters: 
{'C': 100, 'class_weight': {0: 0.7, 1: 1.5}, 'gamma': 'scale'}
F1-score Task A 0.5821596244131456
              precision    recall  f1-score   support

           0  0.96666667 0.06131078 0.11530815       473
           1  0.41114058 0.99678457 0.58215962       311

   micro avg  0.43239796 0.43239796 0.43239796       784
   macro avg  0.68890363 0.52904767 0.34873389       784
weighted avg  0.74629854 0.43239796 0.30050051       784

Best parameters: 
{'C': 10, 'class_weight': {0: 0.7, 1: 1.5}, 'gamma': 'scale'}
F1-score Task A 0.5944881889763779
              precision    recall  f1-score   support

           0  0.88607595 0.14799154 0.25362319       473
           1  0.42836879 0.97106109 0.59448819       311

   micro avg  0.47448980 0.47448980 0.47448980       784
   macro avg  0.65722237 0.55952632 0.42405569       784
weighted avg  0.70451099 0.47448980 0.38883877       784

Best parameters: 
{'C': 10, 'class_weight': {0: 0.7, 1: 1.5}, 'gamma': 'scale'}
F1-score Task A 0.59765625
              precision    recall  f1-score   support

           0  0.92957746 0.13953488 0.24264706       473
           1  0.42917251 0.98392283 0.59765625       311

   micro avg  0.47448980 0.47448980 0.47448980       784
   macro avg  0.67937499 0.56172886 0.42015165       784
weighted avg  0.73107499 0.47448980 0.38347341       784

Best parameters: 
{'C': 10, 'class_weight': {0: 0.7, 1: 1.5}, 'gamma': 'scale'}
F1-score Task A 0.5996093750000001
              precision    recall  f1-score   support

           0  0.94366197 0.14164905 0.24632353       473
           1  0.43057504 0.98713826 0.59960938       311

   micro avg  0.47704082 0.47704082 0.47704082       784
   macro avg  0.68711850 0.56439366 0.42296645       784
weighted avg  0.74012876 0.47704082 0.38646626       784

Best parameters: 
{'C': 10, 'class_weight': {0: 0.7, 1: 1.5}, 'gamma': 'scale'}
F1-score Task A 0.5974781765276431
              precision    recall  f1-score   support

           0  0.95312500 0.12896406 0.22718808       473
           1  0.42777778 0.99035370 0.59747818       311

   micro avg  0.47066327 0.47066327 0.47066327       784
   macro avg  0.69045139 0.55965888 0.41233313       784
weighted avg  0.74472833 0.47066327 0.37407612       784

Best parameters: 
{'C': 10, 'class_weight': {0: 0.7, 1: 1.5}, 'gamma': 'scale'}
F1-score Task A 0.5926640926640927
              precision    recall  f1-score   support

           0  0.93220339 0.11627907 0.20676692       473
           1  0.42344828 0.98713826 0.59266409       311

   micro avg  0.46173469 0.46173469 0.46173469       784
   macro avg  0.67782583 0.55170867 0.39971550       784
weighted avg  0.73038854 0.46173469 0.35984603       784

Best parameters: 
{'C': 100, 'class_weight': {0: 0.7, 1: 1.5}, 'gamma': 'scale'}
F1-score Task A 0.6337854500616523
              precision    recall  f1-score   support

           0  0.80985915 0.48625793 0.60766182       473
           1  0.51400000 0.82636656 0.63378545       311

   micro avg  0.62117347 0.62117347 0.62117347       784
   macro avg  0.66192958 0.65631224 0.62072364       784
weighted avg  0.69249666 0.62117347 0.61802464       784

Sentiment feats alone

              precision    recall  f1-score   support

           0  0.63468635 0.36363636 0.46236559       473
           1  0.41325536 0.68167203 0.51456311       311

   micro avg  0.48979592 0.48979592 0.48979592       784
   macro avg  0.52397085 0.52265419 0.48846435       784
weighted avg  0.54684829 0.48979592 0.48307149       784

MPQA Sentiment feats + BoW

MI

Best parameters: 
{'C': 1, 'gamma': 'scale'}
F1-score Task A 0.5987421383647799
              precision    recall  f1-score   support

           0  0.75666667 0.47991543 0.58732212       473
           1  0.49173554 0.76527331 0.59874214       311

   micro avg  0.59311224 0.59311224 0.59311224       784
   macro avg  0.62420110 0.62259437 0.59303213       784
weighted avg  0.65157281 0.59311224 0.59185226       784

Best parameters: 
{'C': 1, 'gamma': 'scale'}
F1-score Task A 0.6122905027932961
              precision    recall  f1-score   support

           0  0.81500000 0.34460888 0.48439822       473
           1  0.46917808 0.88102894 0.61229050       311

   micro avg  0.55739796 0.55739796 0.55739796       784
   macro avg  0.64208904 0.61281891 0.54834436       784
weighted avg  0.67781809 0.55739796 0.53513100       784

Best parameters: 
{'C': 1, 'gamma': 'scale'}
F1-score Task A 0.6160714285714285
              precision    recall  f1-score   support

           0  0.82412060 0.34672304 0.48809524       473
           1  0.47179487 0.88745981 0.61607143       311

   micro avg  0.56122449 0.56122449 0.56122449       784
   macro avg  0.64795774 0.61709143 0.55208333       784
weighted avg  0.68435874 0.56122449 0.53886130       784

Best parameters: 
{'C': 1, 'gamma': 'scale'}
F1-score Task A 0.6155555555555555
              precision    recall  f1-score   support

           0  0.82564103 0.34038055 0.48203593       473
           1  0.47028862 0.89067524 0.61555556       311

   micro avg  0.55867347 0.55867347 0.55867347       784
   macro avg  0.64796483 0.61552790 0.54879574       784
weighted avg  0.68467853 0.55867347 0.53500098       784

Best parameters: 
{'C': 1, 'gamma': 'scale'}
F1-score Task A 0.6145374449339207
              precision    recall  f1-score   support

           0  0.82887701 0.32769556 0.46969697       473
           1  0.46733668 0.89710611 0.61453744       311

   micro avg  0.55357143 0.55357143 0.55357143       784
   macro avg  0.64810684 0.61240083 0.54211721       784
weighted avg  0.68545986 0.55357143 0.52715282       784

Best parameters: 
{'C': 1, 'gamma': 'scale'}
F1-score Task A 0.6106870229007634
              precision    recall  f1-score   support

           0  0.82584270 0.31078224 0.45161290       473
           1  0.46204620 0.90032154 0.61068702       311

   micro avg  0.54464286 0.54464286 0.54464286       784
   macro avg  0.64394445 0.60555189 0.53114996       784
weighted avg  0.68153057 0.54464286 0.51471501       784

Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.6352941176470589
              precision    recall  f1-score   support

           0  0.79393939 0.55391121 0.65255293       473
           1  0.53524229 0.78135048 0.63529412       311

   micro avg  0.64413265 0.64413265 0.64413265       784
   macro avg  0.66459084 0.66763084 0.64392352       784
weighted avg  0.69131848 0.64413265 0.64570664       784

Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.639386189258312
              precision    recall  f1-score   support

           0  0.80511182 0.53276956 0.64122137       473
           1  0.53078556 0.80385852 0.63938619       311

   micro avg  0.64030612 0.64030612 0.64030612       784
   macro avg  0.66794869 0.66831404 0.64030378       784
weighted avg  0.69629107 0.64030612 0.64049339       784


Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.645
              precision    recall  f1-score   support

           0  0.82033898 0.51162791 0.63020833       473
           1  0.52760736 0.82958199 0.64500000       311

   micro avg  0.63775510 0.63775510 0.63775510       784
   macro avg  0.67397317 0.67060495 0.63760417       784
weighted avg  0.70421713 0.63775510 0.63607595       784

Chi2

Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.6114352392065344
              precision    recall  f1-score   support

           0  0.79411765 0.39957717 0.53164557       473
           1  0.47985348 0.84244373 0.61143524       311

   micro avg  0.57525510 0.57525510 0.57525510       784
   macro avg  0.63698556 0.62101045 0.57154040       784
weighted avg  0.66945418 0.57525510 0.56329683       784


Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.6084425036390102
              precision    recall  f1-score   support

           0  0.75000000 0.64693446 0.69466515       473
           1  0.55585106 0.67202572 0.60844250       311

   micro avg  0.65688776 0.65688776 0.65688776       784
   macro avg  0.65292553 0.65948009 0.65155383       784
weighted avg  0.67298429 0.65688776 0.66046204       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.5892857142857142
              precision    recall  f1-score   support

           0  0.73286052 0.65539112 0.69196429       473
           1  0.54847645 0.63665595 0.58928571       311

   micro avg  0.64795918 0.64795918 0.64795918       784
   macro avg  0.64066849 0.64602353 0.64062500       784
weighted avg  0.65971837 0.64795918 0.65123337       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.5991058122205664
              precision    recall  f1-score   support

           0  0.74056604 0.66384778 0.70011148       473
           1  0.55833333 0.64630225 0.59910581       311

   micro avg  0.65688776 0.65688776 0.65688776       784
   macro avg  0.64944969 0.65507502 0.64960865       784
weighted avg  0.66827730 0.65688776 0.66004418       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.6240928882438316
              precision    recall  f1-score   support

           0  0.76354680 0.65539112 0.70534699       473
           1  0.56878307 0.69131833 0.62409289       311

   micro avg  0.66964286 0.66964286 0.66964286       784
   macro avg  0.66616493 0.67335472 0.66471994       784
weighted avg  0.68628721 0.66964286 0.67311481       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.6195965417867435
              precision    recall  f1-score   support

           0  0.76059850 0.64482030 0.69794050       473
           1  0.56135770 0.69131833 0.61959654       311

   micro avg  0.66326531 0.66326531 0.66326531       784
   macro avg  0.66097810 0.66806931 0.65876852       784
weighted avg  0.68156293 0.66326531 0.66686273       784


Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.6218978102189782
              precision    recall  f1-score   support

           0  0.76097561 0.65961945 0.70668177       473
           1  0.56951872 0.68488746 0.62189781       311

   micro avg  0.66964286 0.66964286 0.66964286       784
   macro avg  0.66524716 0.67225346 0.66428979       784
weighted avg  0.68502779 0.66964286 0.67304936       784

Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.6440677966101694
              precision    recall  f1-score   support

           0  0.80487805 0.55813953 0.65917603       473
           1  0.54166667 0.79421222 0.64406780       311

   micro avg  0.65178571 0.65178571 0.65178571       784
   macro avg  0.67327236 0.67617588 0.65162191       784
weighted avg  0.70046639 0.65178571 0.65318284       784

Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.644918444165621
              precision    recall  f1-score   support

           0  0.81879195 0.51585624 0.63294423       473
           1  0.52880658 0.82636656 0.64491844       311

   micro avg  0.63903061 0.63903061 0.63903061       784
   macro avg  0.67379927 0.67111140 0.63893134       784
weighted avg  0.70375949 0.63903061 0.63769420       784

CoreNLP Sentiment feats + BOW

MI

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.5388127853881278
              precision    recall  f1-score   support

           0  0.69406393 0.64270613 0.66739846       473
           1  0.51156069 0.56913183 0.53881279       311

   micro avg  0.61352041 0.61352041 0.61352041       784
   macro avg  0.60281231 0.60591898 0.60310562       784
weighted avg  0.62166787 0.61352041 0.61639062       784


Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.5680119581464873
              precision    recall  f1-score   support

           0  0.71596244 0.64482030 0.67853170       473
           1  0.53072626 0.61093248 0.56801196       311

   micro avg  0.63137755 0.63137755 0.63137755       784
   macro avg  0.62334435 0.62787639 0.62327183       784
weighted avg  0.64248227 0.63137755 0.63469032       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.5727272727272729
              precision    recall  f1-score   support

           0  0.71954023 0.66173362 0.68942731       473
           1  0.54154728 0.60771704 0.57272727       311

   micro avg  0.64030612 0.64030612 0.64030612       784
   macro avg  0.63054375 0.63472533 0.63107729       784
weighted avg  0.64893333 0.64030612 0.64313431       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.6156111929307806
              precision    recall  f1-score   support

           0  0.75480769 0.66384778 0.70641170       473
           1  0.56793478 0.67202572 0.61561119       311

   micro avg  0.66709184 0.66709184 0.66709184       784
   macro avg  0.66137124 0.66793675 0.66101145       784
weighted avg  0.68067826 0.66709184 0.67039262       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.6056971514242879
              precision    recall  f1-score   support

           0  0.74532710 0.67441860 0.70810211       473
           1  0.56741573 0.64951768 0.60569715       311

   micro avg  0.66454082 0.66454082 0.66454082       784
   macro avg  0.65637142 0.66196814 0.65689963       784
weighted avg  0.67475257 0.66454082 0.66747973       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.6217008797653958
              precision    recall  f1-score   support

           0  0.76029056 0.66384778 0.70880361       473
           1  0.57142857 0.68167203 0.62170088       311

   micro avg  0.67091837 0.67091837 0.67091837       784
   macro avg  0.66585956 0.67275990 0.66525225       784
weighted avg  0.68537209 0.67091837 0.67425138       784


Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.6133333333333333
              precision    recall  f1-score   support

           0  0.75238095 0.66807611 0.70772676       473
           1  0.56868132 0.66559486 0.61333333       311

   micro avg  0.66709184 0.66709184 0.66709184       784
   macro avg  0.66053114 0.66683548 0.66053005       784
weighted avg  0.67951031 0.66709184 0.67028243       784

F1-score Task A 0.6105263157894737
              precision    recall  f1-score   support

           0  0.74883721 0.68076110 0.71317829       473
           1  0.57344633 0.65273312 0.61052632       311

   micro avg  0.66964286 0.66964286 0.66964286       784
   macro avg  0.66114177 0.66674711 0.66185231       784
weighted avg  0.67926251 0.66964286 0.67245793       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.6240713224368498
              precision    recall  f1-score   support

           0  0.76066351 0.67864693 0.71731844       473
           1  0.58011050 0.67524116 0.62407132       311

   micro avg  0.67729592 0.67729592 0.67729592       784
   macro avg  0.67038700 0.67694405 0.67069488       784
weighted avg  0.68904108 0.67729592 0.68032883       784

Chi2

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.5799457994579946
              precision    recall  f1-score   support

           0  0.72829132 0.54968288 0.62650602       473
           1  0.50117096 0.68810289 0.57994580       311

   micro avg  0.60459184 0.60459184 0.60459184       784
   macro avg  0.61473114 0.61889288 0.60322591       784
weighted avg  0.63819638 0.60459184 0.60803634       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.5900151285930408
              precision    recall  f1-score   support

           0  0.73271889 0.67230444 0.70121279       473
           1  0.55714286 0.62700965 0.59001513       311

   micro avg  0.65433673 0.65433673 0.65433673       784
   macro avg  0.64493088 0.64965704 0.64561396       784
weighted avg  0.66307075 0.65433673 0.65710249       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.6184012066365008
              precision    recall  f1-score   support

           0  0.75462963 0.68921776 0.72044199       473
           1  0.58238636 0.65916399 0.61840121       311

   micro avg  0.67729592 0.67729592 0.67729592       784
   macro avg  0.66850800 0.67419087 0.66942160       784
weighted avg  0.68630354 0.67729592 0.67996408       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.5988023952095808
              precision    recall  f1-score   support

           0  0.74004684 0.66807611 0.70222222       473
           1  0.56022409 0.64308682 0.59880240       311

   micro avg  0.65816327 0.65816327 0.65816327       784
   macro avg  0.65013546 0.65558146 0.65051231       784
weighted avg  0.66871409 0.65816327 0.66119727       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.6133333333333333
              precision    recall  f1-score   support

           0  0.75238095 0.66807611 0.70772676       473
           1  0.56868132 0.66559486 0.61333333       311

   micro avg  0.66709184 0.66709184 0.66709184       784
   macro avg  0.66053114 0.66683548 0.66053005       784
weighted avg  0.67951031 0.66709184 0.67028243       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.6332842415316642
              precision    recall  f1-score   support

           0  0.76923077 0.67653277 0.71991001       473
           1  0.58423913 0.69131833 0.63328424       311

   micro avg  0.68239796 0.68239796 0.68239796       784
   macro avg  0.67673495 0.68392555 0.67659713       784
weighted avg  0.69584761 0.68239796 0.68554698       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.6117647058823529
              precision    recall  f1-score   support

           0  0.75180723 0.65961945 0.70270270       473
           1  0.56368564 0.66881029 0.61176471       311

   micro avg  0.66326531 0.66326531 0.66326531       784
   macro avg  0.65774643 0.66421487 0.65723370       784
weighted avg  0.67718246 0.66326531 0.66662908       784

F1-score Task A 0.6172106824925816
              precision    recall  f1-score   support

           0  0.75534442 0.67230444 0.71140940       473
           1  0.57300275 0.66881029 0.61721068       311

   micro avg  0.67091837 0.67091837 0.67091837       784
   macro avg  0.66417359 0.67055736 0.66431004       784
weighted avg  0.68301246 0.67091837 0.67404230       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.6140089418777943
              precision    recall  f1-score   support

           0  0.75235849 0.67441860 0.71125975       473
           1  0.57222222 0.66237942 0.61400894       311

   micro avg  0.66964286 0.66964286 0.66964286       784
   macro avg  0.66229036 0.66839901 0.66263435       784
weighted avg  0.68090137 0.66964286 0.67268195       784

Twitter Sentiment feats + BOW

Mutual Information

Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.5710186513629842
              precision    recall  f1-score   support

           0  0.71859296 0.60465116 0.65671642       473
           1  0.51554404 0.63987138 0.57101865       311

   micro avg  0.61862245 0.61862245 0.61862245       784
   macro avg  0.61706850 0.62226127 0.61386753       784
weighted avg  0.63804677 0.61862245 0.62272151       784

Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.5918653576437587
              precision    recall  f1-score   support

           0  0.73821990 0.59619450 0.65964912       473
           1  0.52487562 0.67845659 0.59186536       311

   micro avg  0.62882653 0.62882653 0.62882653       784
   macro avg  0.63154776 0.63732555 0.62575724       784
weighted avg  0.65358971 0.62882653 0.63276041       784

Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.5963431786216595
              precision    recall  f1-score   support

           0  0.74218750 0.60253700 0.66511085       473
           1  0.53000000 0.68167203 0.59634318       311

   micro avg  0.63392857 0.63392857 0.63392857       784
   macro avg  0.63609375 0.64210451 0.63072702       784
weighted avg  0.65801618 0.63392857 0.63783184       784

Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.6134800550206327
              precision    recall  f1-score   support

           0  0.76086957 0.59196617 0.66587396       473
           1  0.53605769 0.71704180 0.61348006       311

   micro avg  0.64158163 0.64158163 0.64158163       784
   macro avg  0.64846363 0.65450399 0.63967701       784
weighted avg  0.67169037 0.64158163 0.64509015       784


Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.6253443526170799
              precision    recall  f1-score   support

           0  0.77235772 0.60253700 0.67695962       473
           1  0.54698795 0.72990354 0.62534435       311

   micro avg  0.65306122 0.65306122 0.65306122       784
   macro avg  0.65967284 0.66622027 0.65115199       784
weighted avg  0.68295721 0.65306122 0.65648469       784

Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.6233062330623307
              precision    recall  f1-score   support

           0  0.77310924 0.58350951 0.66506024       473
           1  0.53864169 0.73954984 0.62330623       311

   micro avg  0.64540816 0.64540816 0.64540816       784
   macro avg  0.65587546 0.66152968 0.64418324       784
weighted avg  0.68009979 0.64540816 0.64849711       784

Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.6228187919463087
              precision    recall  f1-score   support

           0  0.77428571 0.57293869 0.65856622       473
           1  0.53456221 0.74598071 0.62281879       311

   micro avg  0.64158163 0.64158163 0.64158163       784
   macro avg  0.65442396 0.65945970 0.64069251       784
weighted avg  0.67919131 0.64158163 0.64438580       784

Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.6251655629139073
              precision    recall  f1-score   support

           0  0.77941176 0.56025370 0.65190652       473
           1  0.53153153 0.75884244 0.62516556       311

   micro avg  0.63903061 0.63903061 0.63903061       784
   macro avg  0.65547165 0.65954807 0.63853604       784
weighted avg  0.68108172 0.63903061 0.64129882       784


Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.6276041666666666
              precision    recall  f1-score   support

           0  0.78593272 0.54334038 0.64250000       473
           1  0.52735230 0.77491961 0.62760417       311

   micro avg  0.63520408 0.63520408 0.63520408       784
   macro avg  0.65664251 0.65913000 0.63505208       784
weighted avg  0.68335809 0.63520408 0.63659107       784

Chi-squared


Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.6
              precision    recall  f1-score   support

           0  0.76140351 0.45877378 0.57255937       473
           1  0.48697395 0.78135048 0.60000000       311

   micro avg  0.58673469 0.58673469 0.58673469       784
   macro avg  0.62418873 0.62006213 0.58627968       784
weighted avg  0.65254178 0.58673469 0.58344462       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.5842349304482226
              precision    recall  f1-score   support

           0  0.72767857 0.68921776 0.70792617       473
           1  0.56250000 0.60771704 0.58423493       311

   micro avg  0.65688776 0.65688776 0.65688776       784
   macro avg  0.64508929 0.64846740 0.64608055       784
weighted avg  0.66215493 0.65688776 0.65885987       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.6183431952662721
              precision    recall  f1-score   support

           0  0.75656325 0.67019027 0.71076233       473
           1  0.57260274 0.67202572 0.61834320       311

   micro avg  0.67091837 0.67091837 0.67091837       784
   macro avg  0.66458299 0.67110800 0.66455276       784
weighted avg  0.68358912 0.67091837 0.67410117       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.623688155922039
              precision    recall  f1-score   support

           0  0.75934579 0.68710359 0.72142064       473
           1  0.58426966 0.66881029 0.62368816       311

   micro avg  0.67984694 0.67984694 0.67984694       784
   macro avg  0.67180773 0.67795694 0.67255440       784
weighted avg  0.68989595 0.67984694 0.68265176       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.624813153961136
              precision    recall  f1-score   support

           0  0.76056338 0.68498943 0.72080089       473
           1  0.58379888 0.67202572 0.62481315       311

   micro avg  0.67984694 0.67984694 0.67984694       784
   macro avg  0.67218113 0.67850758 0.67280702       784
weighted avg  0.69044379 0.67984694 0.68272412       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.6340740740740741
              precision    recall  f1-score   support

           0  0.76904762 0.68287526 0.72340426       473
           1  0.58791209 0.68810289 0.63407407       311

   micro avg  0.68494898 0.68494898 0.68494898       784
   macro avg  0.67847985 0.68548908 0.67873916       784
weighted avg  0.69719411 0.68494898 0.68796843       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.609720176730486
              precision    recall  f1-score   support

           0  0.75000000 0.65961945 0.70191226       473
           1  0.56250000 0.66559486 0.60972018       311

   micro avg  0.66198980 0.66198980 0.66198980       784
   macro avg  0.65625000 0.66260715 0.65581622       784
weighted avg  0.67562181 0.66198980 0.66534117       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.6244477172312224
              precision    recall  f1-score   support

           0  0.76201923 0.67019027 0.71316085       473
           1  0.57608696 0.68167203 0.62444772       311

   micro avg  0.67474490 0.67474490 0.67474490       784
   macro avg  0.66905309 0.67593115 0.66880429       784
weighted avg  0.68826293 0.67474490 0.67796980       784


Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.6311688311688313
              precision    recall  f1-score   support

           0  0.79076923 0.54334038 0.64411028       473
           1  0.52941176 0.78135048 0.63116883       311

   micro avg  0.63775510 0.63775510 0.63775510       784
   macro avg  0.66009050 0.66234543 0.63763955       784
weighted avg  0.68709299 0.63775510 0.63897662       784

BoW alone

Mutual Information

Best parameters: 
{'C': 0.01, 'gamma': 'scale'}
F1-score Task A 0.6146788990825688
              precision    recall  f1-score   support

           0  0.80717489 0.38054968 0.51724138       473
           1  0.47771836 0.86173633 0.61467890       311

   micro avg  0.57142857 0.57142857 0.57142857       784
   macro avg  0.64244662 0.62114301 0.56596014       784
weighted avg  0.67648486 0.57142857 0.55589325       784


Best parameters: 
{'C': 0.1, 'gamma': 'scale'}
F1-score Task A 0.6074766355140188
              precision    recall  f1-score   support

           0  0.78661088 0.39746300 0.52808989       473
           1  0.47706422 0.83601286 0.60747664       311

   micro avg  0.57142857 0.57142857 0.57142857       784
   macro avg  0.63183755 0.61673793 0.56778326       784
weighted avg  0.66381877 0.57142857 0.55958131       784


Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.6084507042253521
              precision    recall  f1-score   support

           0  0.75324675 0.61310782 0.67599068       473
           1  0.54135338 0.69453376 0.60845070       311

   micro avg  0.64540816 0.64540816 0.64540816       784
   macro avg  0.64730007 0.65382079 0.64222069       784
weighted avg  0.66919211 0.64540816 0.64919867       784


Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.6022727272727273
              precision    recall  f1-score   support

           0  0.74680307 0.61733615 0.67592593       473
           1  0.53944020 0.68167203 0.60227273       311

   micro avg  0.64285714 0.64285714 0.64285714       784
   macro avg  0.64312164 0.64950409 0.63909933       784
weighted avg  0.66454561 0.64285714 0.64670890       784


Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.5991316931982634
              precision    recall  f1-score   support

           0  0.74257426 0.63424947 0.68415051       473
           1  0.54473684 0.66559486 0.59913169       311

   micro avg  0.64668367 0.64668367 0.64668367       784
   macro avg  0.64365555 0.64992216 0.64164110       784
weighted avg  0.66409538 0.64668367 0.65042494       784


Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.6080691642651297
              precision    recall  f1-score   support

           0  0.75062344 0.63636364 0.68878719       473
           1  0.55091384 0.67845659 0.60806916       311

   micro avg  0.65306122 0.65306122 0.65306122       784
   macro avg  0.65076864 0.65741011 0.64842817       784
weighted avg  0.67140190 0.65306122 0.65676766       784


Best parameters: 
{'C': 1, 'gamma': 'scale'}
F1-score Task A 0.6242038216560509
              precision    recall  f1-score   support

           0  0.78709677 0.51585624 0.62324393       473
           1  0.51687764 0.78778135 0.62420382       311

   micro avg  0.62372449 0.62372449 0.62372449       784
   macro avg  0.65198721 0.65181879 0.62372388       784
weighted avg  0.67990525 0.62372449 0.62362471       784


Best parameters: 
{'C': 1, 'gamma': 'scale'}
F1-score Task A 0.6221662468513853
              precision    recall  f1-score   support

           0  0.78737542 0.50105708 0.61240310       473
           1  0.51138716 0.79421222 0.62216625       311

   micro avg  0.61734694 0.61734694 0.61734694       784
   macro avg  0.64938129 0.64763465 0.61728467       784
weighted avg  0.67789538 0.61734694 0.61627598       784

Best parameters: 
{'C': 1, 'gamma': 'scale'}
F1-score Task A 0.6262376237623762
              precision    recall  f1-score   support

           0  0.79790941 0.48414376 0.60263158       473
           1  0.50905433 0.81350482 0.62623762       311

   micro avg  0.61479592 0.61479592 0.61479592       784
   macro avg  0.65348187 0.64882429 0.61443460       784
weighted avg  0.68332531 0.61479592 0.61199571       784

Chi2

Best parameters: 
{'C': 0.1, 'gamma': 'scale'}
F1-score Task A 0.616822429906542
              precision    recall  f1-score   support

           0  0.80334728 0.40591966 0.53932584       473
           1  0.48440367 0.84887460 0.61682243       311

   micro avg  0.58163265 0.58163265 0.58163265       784
   macro avg  0.64387548 0.62739713 0.57807414       784
weighted avg  0.67682756 0.58163265 0.57006747       784


Best parameters: 
{'C': 0.1, 'gamma': 'scale'}
F1-score Task A 0.6134259259259259
              precision    recall  f1-score   support

           0  0.80086580 0.39112051 0.52556818       473
           1  0.47920434 0.85209003 0.61342593       311

   micro avg  0.57397959 0.57397959 0.57397959       784
   macro avg  0.64003507 0.62160527 0.56949705       784
weighted avg  0.67326795 0.57397959 0.56041991       784


Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.6005830903790088
              precision    recall  f1-score   support

           0  0.74327628 0.64270613 0.68934240       473
           1  0.54933333 0.66237942 0.60058309       311

   micro avg  0.65051020 0.65051020 0.65051020       784
   macro avg  0.64630481 0.65254278 0.64496275       784
weighted avg  0.66634228 0.65051020 0.65413303       784


Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.6051873198847262
              precision    recall  f1-score   support

           0  0.74812968 0.63424947 0.68649886       473
           1  0.54830287 0.67524116 0.60518732       311

   micro avg  0.65051020 0.65051020 0.65051020       784
   macro avg  0.64821627 0.65474531 0.64584309       784
weighted avg  0.66886165 0.65051020 0.65424390       784


Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.5979971387696709
              precision    recall  f1-score   support

           0  0.74242424 0.62156448 0.67663982       473
           1  0.53865979 0.67202572 0.59799714       311

   micro avg  0.64158163 0.64158163 0.64158163       784
   macro avg  0.64054202 0.64679510 0.63731848       784
weighted avg  0.66159421 0.64158163 0.64544355       784


Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.6189111747851003
              precision    recall  f1-score   support

           0  0.76070529 0.63847780 0.69425287       473
           1  0.55813953 0.69453376 0.61891117       311

   micro avg  0.66071429 0.66071429 0.66071429       784
   macro avg  0.65942241 0.66650578 0.65658202       784
weighted avg  0.68035076 0.66071429 0.66436605       784


Best parameters: 
{'C': 10, 'gamma': 'scale'}
F1-score Task A 0.6221590909090909
              precision    recall  f1-score   support

           0  0.76470588 0.63213531 0.69212963       473
           1  0.55725191 0.70418006 0.62215909       311

   micro avg  0.66071429 0.66071429 0.66071429       784
   macro avg  0.66097890 0.66815769 0.65714436       784
weighted avg  0.68241228 0.66071429 0.66437346       784


Best parameters: 
{'C': 1, 'gamma': 'scale'}
F1-score Task A 0.628498727735369
              precision    recall  f1-score   support

           0  0.79288026 0.51797040 0.62659847       473
           1  0.52000000 0.79421222 0.62849873       311

   micro avg  0.62755102 0.62755102 0.62755102       784
   macro avg  0.65644013 0.65609131 0.62754860       784
weighted avg  0.68463312 0.62755102 0.62735227       784


Best parameters: 
{'C': 1, 'gamma': 'scale'}
F1-score Task A 0.6236024844720497
              precision    recall  f1-score   support

           0  0.79310345 0.48625793 0.60288336       473
           1  0.50809717 0.80707395 0.62360248       311

   micro avg  0.61352041 0.61352041 0.61352041       784
   macro avg  0.65060031 0.64666594 0.61324292       784
weighted avg  0.68004611 0.61352041 0.61110230       784

Combined

Chi2

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.5663716814159292
              precision    recall  f1-score   support

           0  0.71462830 0.63002114 0.66966292       473
           1  0.52316076 0.61736334 0.56637168       311

   micro avg  0.62500000 0.62500000 0.62500000       784
   macro avg  0.61889453 0.62369224 0.61801730       784
weighted avg  0.63867625 0.62500000 0.62868897       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.5753012048192772
              precision    recall  f1-score   support

           0  0.72157773 0.65750529 0.68805310       473
           1  0.54107649 0.61414791 0.57530120       311

   micro avg  0.64030612 0.64030612 0.64030612       784
   macro avg  0.63132711 0.63582660 0.63167715       784
weighted avg  0.64997583 0.64030612 0.64332626       784


Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.5954198473282442
              precision    recall  f1-score   support

           0  0.73636364 0.68498943 0.70974808       473
           1  0.56686047 0.62700965 0.59541985       311

   micro avg  0.66198980 0.66198980 0.66198980       784
   macro avg  0.65161205 0.65599954 0.65258397       784
weighted avg  0.66912450 0.66198980 0.66439594       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.600609756097561
              precision    recall  f1-score   support

           0  0.74031891 0.68710359 0.71271930       473
           1  0.57101449 0.63344051 0.60060976       311

   micro avg  0.66581633 0.66581633 0.66581633       784
   macro avg  0.65566670 0.66027205 0.65666453       784
weighted avg  0.67315861 0.66581633 0.66824727       784


Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.6255639097744361
              precision    recall  f1-score   support

           0  0.76046512 0.69133192 0.72425249       473
           1  0.58757062 0.66881029 0.62556391       311

   micro avg  0.68239796 0.68239796 0.68239796       784
   macro avg  0.67401787 0.68007111 0.67490820       784
weighted avg  0.69188069 0.68239796 0.68510434       784


Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.6216216216216217
              precision    recall  f1-score   support

           0  0.75757576 0.68710359 0.72062084       473
           1  0.58309859 0.66559486 0.62162162       311

   micro avg  0.67857143 0.67857143 0.67857143       784
   macro avg  0.67033717 0.67634922 0.67112123       784
weighted avg  0.68836351 0.67857143 0.68134947       784


Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.6077844311377245
              precision    recall  f1-score   support

           0  0.74707260 0.67441860 0.70888889       473
           1  0.56862745 0.65273312 0.60778443       311

   micro avg  0.66581633 0.66581633 0.66581633       784
   macro avg  0.65785003 0.66357586 0.65833666       784
weighted avg  0.67628632 0.66581633 0.66878240       784

Mutual Information

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.5601217656012177
              precision    recall  f1-score   support

           0  0.71004566 0.65750529 0.68276619       473
           1  0.53179191 0.59163987 0.56012177       311

   micro avg  0.63137755 0.63137755 0.63137755       784
   macro avg  0.62091878 0.62457258 0.62144398       784
weighted avg  0.63933531 0.63137755 0.63411515       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.5753424657534247
              precision    recall  f1-score   support

           0  0.72146119 0.66807611 0.69374314       473
           1  0.54624277 0.60771704 0.57534247       311

   micro avg  0.64413265 0.64413265 0.64413265       784
   macro avg  0.63385198 0.63789658 0.63454280       784
weighted avg  0.65195490 0.64413265 0.64677553       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.581039755351682
              precision    recall  f1-score   support

           0  0.72562358 0.67653277 0.70021882       473
           1  0.55393586 0.61093248 0.58103976       311

   micro avg  0.65051020 0.65051020 0.65051020       784
   macro avg  0.63977972 0.64373262 0.64062929       784
weighted avg  0.65751787 0.65051020 0.65294243       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.6066066066066067
              precision    recall  f1-score   support

           0  0.74592075 0.67653277 0.70953437       473
           1  0.56901408 0.64951768 0.60660661       311

   micro avg  0.66581633 0.66581633 0.66581633       784
   macro avg  0.65746742 0.66302523 0.65807049       784
weighted avg  0.67574476 0.66581633 0.66870461       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.6059701492537314
              precision    recall  f1-score   support

           0  0.74588235 0.67019027 0.70601336       473
           1  0.56545961 0.65273312 0.60597015       311

   micro avg  0.66326531 0.66326531 0.66326531       784
   macro avg  0.65567098 0.66146170 0.65599176       784
weighted avg  0.67431160 0.66326531 0.66632785       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.608955223880597
              precision    recall  f1-score   support

           0  0.74823529 0.67230444 0.70824053       473
           1  0.56824513 0.65594855 0.60895522       311

   micro avg  0.66581633 0.66581633 0.66581633       784
   macro avg  0.65824021 0.66412650 0.65859788       784
weighted avg  0.67683613 0.66581633 0.66885567       784

Best parameters: 
{'C': 100, 'gamma': 'scale'}
F1-score Task A 0.6063348416289592
              precision    recall  f1-score   support

           0  0.74537037 0.68076110 0.71160221       473
           1  0.57102273 0.64630225 0.60633484       311

   micro avg  0.66709184 0.66709184 0.66709184       784
   macro avg  0.65819655 0.66353168 0.65896853       784
weighted avg  0.67620951 0.66709184 0.66984436       784

Irony/Sarcasm Detection

Todo: Coding

  1. Implement sentiment value creator with fit() and transform() (see the sketch after this list).
  2. Integrate corenlpSentimentAnalyzer code into fit() and transform() framework
  3. Add FeatureUnion component to pipeline
  4. Dig up MPQA feature creation in git repo
  5. Implement MPQA estimator
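
A rough sketch of the shape items 1 and 3 could take; the class name is a placeholder and the scoring inside transform() is a stub, not the CoreNLP or MPQA logic:

import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import SVC

class SentimentValueCreator(BaseEstimator, TransformerMixin):
    """Placeholder transformer producing one sentiment value per tweet."""

    def fit(self, X, y=None):
        return self  # nothing to learn in this stub

    def transform(self, X):
        # Stub scoring: replace with the CoreNLP / MPQA / lexicon-based values.
        return np.array([[len(text)] for text in X], dtype=float)

features = FeatureUnion([
    ("bow", TfidfVectorizer()),
    ("sentiment", SentimentValueCreator()),
])

pipeline = Pipeline([("features", features), ("classify", SVC(gamma="scale"))])
# pipeline.fit(train_texts, train_labels)  # hypothetical variable names
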
Irony/Sarcasm Detection

Log

Fixed an issue where the server would not work with the sentiment analysis annotator enabled; upgrading to CoreNLP v3.9.2 resolved it.

Irony/Sarcasm Detection

F-score for positive class analysis

Bag of Words alone

Feature Count   Mutual Information   Chi-squared
100             0.61468              0.61682
500             0.60748              0.61343
1000            0.60845              0.60058
2000            0.60227              0.60519
3000            0.59913              0.59800
5000            0.60807              0.61891
10000           0.62420              0.62216
12000           0.62217              0.62850
15000           0.62624              0.62360

MPQA + BOW

Feature Count   Mutual Information   Chi-squared
100             0.598742             0.611435
500             0.612291             0.608443
1000            0.616071             0.5892857
2000            0.615556             0.5991058
3000            0.614537             0.6240929
5000            0.610687             0.6195965
10000           0.635294             0.6218978
12000           0.639386             0.6440678
15000           0.645                0.6449184

CoreNLP + BOW

Feature Count   Mutual Information   Chi-squared
100             0.538813             0.5799457
500             0.568012             0.5900151
1000            0.572727             0.6184012
2000            0.615611             0.5988024
3000            0.605697             0.6133333
5000            0.621701             0.6332842
10000           0.613333             0.6117647
12000           0.610526             0.6172106
15000           0.624071             0.6140089

Twitter Sentiment Features + BOW

Feature Count   Mutual Information   Chi-squared
100             0.571019             0.5799457
500             0.591865             0.5900151
1000            0.596343             0.6184012
2000            0.613480             0.5988024
3000            0.625344             0.6133333
5000            0.6                  0.6332842
10000           0.613333             0.6117647
12000           0.625165             0.6244477
15000           0.6276041            0.6311688

Irony/Sarcasm Detection

Macro f-score analysis

General sentiment features (MPQA)

Feature Count   Mutual Information   Chi-squared
100             0.59303              0.57154
500             0.54834              0.65155
1000            0.55208              0.64063
2000            0.54880              0.64961
3000            0.54212              0.66472
5000            0.53115              0.65877
10000           0.64392              0.66429
12000           0.64031              0.65162
15000           0.63760              0.63893

CoreNLP Sentiment features

Feature Count   Mutual Information   Chi-squared
100             0.60311              0.60323
500             0.62327              0.64561
1000            0.63108              0.66942
2000            0.66101              0.65051
3000            0.65690              0.66053
5000            0.66525              0.67660
10000           0.66053              0.65723
12000           0.66185              0.66431
15000           0.67069              0.66263

Twitter Sentiment Features

Feature Count   Mutual Information   Chi-squared
100             0.61387              0.58628
500             0.62576              0.64608
1000            0.63073              0.66455
2000            0.63968              0.67255
3000            0.65115              0.67281
5000            0.64418              0.67874
10000           0.64069              0.65582
12000           0.63854              0.66880
15000           0.63505              0.63764

BoW alone

Feature Count   Mutual Information   Chi-squared
100             0.56596              0.57807
500             0.56778              0.56950
1000            0.64222              0.64496
2000            0.63910              0.64584
3000            0.64164              0.63732
5000            0.64843              0.65658
10000           0.62372              0.65714
12000           0.61729              0.62755
15000           0.61443              0.61324

Irony/Sarcasm Detection

Precision and Recall for top performers (macro f-score)

MI features

Data Source                          N-features   Negative Precision   Negative Recall   Positive Precision   Positive Recall
BOW only                             5000         0.75062              0.63636           0.550914             0.678457
Generic Sentiment Features + BoW     10000        0.79394              0.553911          0.53524              0.78135
Syntactic Sentiment Features + BoW   5000         0.75238              0.66808           0.56868              0.66559
Twitter Sentiment Features + BoW     5000         0.77311              0.58351           0.538642             0.73955

Chi squared features

Data Source                          N-features   Negative Precision   Negative Recall   Positive Precision   Positive Recall
BOW only                             5000         0.76071              0.638478          0.55814              0.69453
Generic Sentiment Features + BoW     10000        0.76098              0.65962           0.56952              0.68489
Syntactic Sentiment Features + BoW   5000         0.76923              0.676533          0.58424              0.69132
Twitter Sentiment Features + BoW     5000         0.76905              0.68288           0.58791              0.6881