Does liwcalike() handle proper regular expressions? #31

cjvanlissa · 2020-06-04T16:22:13Z

Dear Dr. Benoit,

I tried to run the following:

txt <- c("The red-shirted lawyer gave her yellow-haired, red nose ex-boyfriend $300
            out of pity:(.")
dict <- quanteda::dictionary(list(lawyer = c("\\blawyer\\b", "law.er")))
liwcalike(txt, dict, what = "word", valuetype = "regex")

But the word lawyer is not matched:

  docname Segment WPS WC Sixltr Dic lawyer AllPunc Period Comma Colon SemiC QMark Exclam Dash Quote
1   text1       1  24 24   8.33   0      0   29.17   4.17  4.17  4.17     0     0      0 12.5     0
  Apostro Parenth OtherP
1       0       0   12.5

Is this expected behavior? To what extent are regular expressions supported by liwcalike() and, downstream, tokens_lookup.tokens()?

Thank you sincerely,
Caspar


kbenoit · 2020-06-05T08:08:42Z

Currently, liwcalike() only takes "glob" dictionary patterns, but it would be a reasonable feature request to add valuetype to the function.

To get the equivalent patterns, you would use:

library("quanteda.dictionaries")

txt <- c("The red-shirted lawyer gave her yellow-haired, 
          red nose ex-boyfriend $300 out of pity:(.")
dict <- quanteda::dictionary(list(lawyer = c("lawyer", "law?er")))
liwcalike(txt, dict)
##   docname Segment WPS WC Sixltr  Dic lawyer AllPunc Period Comma Colon SemiC
## 1   text1       1  24 24   8.33 4.17   4.17   29.17   4.17  4.17  4.17     0
##   QMark Exclam Dash Quote Apostro Parenth OtherP
## 1     0      0 12.5     0       0       0   12.5
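
To make the glob/regex correspondence concrete, here is a minimal sketch in base R only. It assumes that a glob pattern matches a whole token, which is what `utils::glob2rx()` encodes by anchoring the translated pattern; whether quanteda converts globs in exactly this way internally is an assumption here. The token vector `toks` is hand-split for illustration and is not the output of quanteda's tokenizer.

```r
# Translate the glob patterns from the dictionary above into
# anchored regular expressions using base R's utils::glob2rx().
glob_patterns <- c("lawyer", "law?er")
regex_patterns <- utils::glob2rx(glob_patterns)
print(regex_patterns)

# Hypothetical, hand-split tokens (illustration only).
toks <- c("The", "red-shirted", "lawyer", "gave", "her")

# A token counts as a match if any translated pattern matches it.
matched <- vapply(
  toks,
  function(tok) any(vapply(regex_patterns, grepl, logical(1), x = tok)),
  logical(1)
)
print(toks[matched])
```

In this sketch the glob `?` becomes the regex `.`, and each pattern is anchored with `^` and `$`, so only the complete token "lawyer" matches.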

cjvanlissa · 2020-06-05T11:29:49Z

Thank you for clarifying! I have a dictionary that makes extensive use of perl regex, so indeed, I would like to put my name down for this feature request :)

Sincerely,
Caspar

kbenoit · 2020-06-06T10:45:34Z

Noted! This will not be hard to add.
