Parse tree of the phrase 'The Riddle of Literary Quality'

Academic homepage of Andreas van Cranenburgh

I am an assistant professor in digital humanities and information sciences at the University of Groningen. Previously I was a postdoc at Heinrich Heine Universität Düsseldorf in the Beyond CFG project, and a PhD candidate in the project The Riddle of Literary Quality. My primary interests are statistical parsing and the application of computational models to the analysis of literary novels.

Profiles: Google Scholar; Semantic Scholar.


Peer-reviewed publications (bibtex)

Andreas van Cranenburgh, Corina Koolen (2019).
The Literary Pepsi Challenge: intrinsic and extrinsic factors in judging literary quality.
Digital Humanities 2019, Utrecht, The Netherlands, 9-12 July.

Andreas van Cranenburgh, Karina van Dalen-Oskam, Joris van Zundert (2019).
Vector space explorations of literary language.
Language Resources & Evaluation, 26 pp. (code)

Tatiana Bladier, Andreas van Cranenburgh, Kilian Evang, Laura Kallmeyer, Robin Möllemann, Rainer Osswald (2018).
RRGbank: a Role and Reference Grammar Corpus of Syntactic Structures Extracted from the Penn Treebank.
Proceedings of Treebanks and Linguistic Theories, pp. 5-16.

Andreas van Cranenburgh (2018).
Cliche expressions in literary and genre novels.
Proceedings of LaTeCH-CLfL workshop. (code)

Andreas van Cranenburgh (2018).
Active DOP: A constituency treebank annotation tool with online learning.
Proceedings of COLING 2018 demonstrations track. (code)

Tatiana Bladier, Andreas van Cranenburgh, Younes Samih, Laura Kallmeyer (2018).
German and French Neural Supertagging Experiments for LTAG Parsing.
ACL 2018 student research workshop.

Corina Koolen, Andreas van Cranenburgh (2018).
Blue eyes and porcelain cheeks: Computational extraction of physical descriptions from Dutch chick lit and literary novels.
Digital Scholarship in the Humanities, vol. 33, no. 1, pp. 59–71.

Corina Koolen, Andreas van Cranenburgh (2017).
These are not the Stereotypes You are Looking For: Bias and Fairness in Authorial Gender Attribution.
Proceedings of the First Ethics in NLP workshop, pp. 12-22. (notebook)

Andreas van Cranenburgh, Rens Bod (2017).
A Data-Oriented Model of Literary Language.
Proceedings of EACL, pp. 1228-1238. (code; slides; Q&A)

Andreas van Cranenburgh, Remko Scha, Rens Bod (2016).
Data-Oriented Parsing with Discontinuous Constituents and Function Tags.
Journal of Language Modelling, vol. 4, no. 1, pp. 57-111. (code; grammars)

Kim Jautze, Andreas van Cranenburgh, Corina Koolen (2016).
Topic Modeling Literary Quality.
Digital Humanities 2016, Krakow, Poland, 11-16 July.

Andreas van Cranenburgh (2016).
Machine Learning Literature using Textual Features.
Tiny Transactions on Computer Science, vol. 4.

Andreas van Cranenburgh, Corina Koolen (2015).
Identifying Literary Novels with Bigrams.
Proceedings of the Fourth Workshop on Computational Linguistics for Literature, pp. 58-67. (poster)

Federico Sangati, Andreas van Cranenburgh (2015).
Multiword Expression Identification with Recurring Tree Fragments and Association Measures.
Proceedings of the 11th Workshop on Multiword Expressions, pp. 10-18. (slides)

Andreas van Cranenburgh (2014).
Extraction of Phrase-Structure Fragments with a Linear Average Time Tree Kernel.
Computational Linguistics in the Netherlands Journal, vol. 4, pp. 3-16.

Dirk Roorda, Gino Kalkman, Martijn Naaijer, Andreas van Cranenburgh (2014).
LAF-Fabric: a data analysis tool for Linguistic Annotation Framework with an application to the Hebrew Bible.
Computational Linguistics in the Netherlands Journal, vol. 4, pp. 105-120.

Andreas van Cranenburgh, Rens Bod (2013).
Discontinuous Parsing with an Efficient and Accurate DOP Model.
Proceedings of the International Conference on Parsing Technologies, Nara, Japan, 27-29 November. (paper; slides; code; notes)

Kim Jautze, Corina Koolen, Andreas van Cranenburgh, Hayco de Jong (2013).
From high heels to weed attics: a syntactic investigation of chick lit and literature.
Proceedings of the Computational Linguistics for Literature workshop, Atlanta, Georgia, June 14. (slides)

Andreas van Cranenburgh (2012).
Literary authorship attribution with phrase-structure fragments.
Proceedings of the Computational Linguistics for Literature workshop, pp. 59-63. (code; slides; revised paper, which includes results on the Federalist papers)

Andreas van Cranenburgh (2012).
Efficient parsing with linear context-free rewriting systems.
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL), Avignon, France, April 23–27. (poster; errata; corrected version; code)

Maria Aloni, Andreas van Cranenburgh, Raquel Fernández, Marta Sznajder (2012).
Building a Corpus of Indefinite Uses Annotated with Fine-grained Semantic Functions.
The eighth international conference on Language Resources and Evaluation (LREC), Istanbul, Turkey, May 23–25.

Andreas van Cranenburgh, Remko Scha, Federico Sangati (2011).
Discontinuous Data-Oriented Parsing: A mildly context-sensitive all-fragments grammar.
Proceedings of the 2nd Workshop on Statistical Parsing of Morphologically-Rich Languages (SPMRL), pp. 34–44, Dublin, Ireland, October 6. (slides; template for slides; code)

Andreas van Cranenburgh, Galit Sassoon, Raquel Fernández (2010).
Invented antonyms: Esperanto as a semantic lab.
Proceedings of the 26th Annual Meeting of the Israel Association for Theoretical Linguistics (IATL 26).


Andreas van Cranenburgh (2012).
Extracting tree fragments in linear average time.
ILLC technical report.



Academic service