Syntax Matters for Rhetorical Structure: The Case of Chiasmus
Chiasmus is a rhetorical figure involving the repetition of a pair of words in reverse order, as in “all for one, one for all”. Previous work on detecting chiasmus in running text has only considered superficial features such as words and punctuation. In this presentation, we explore the use of syntactic features as a means to improve the quality of chiasmus detection. Our results show that taking syntactic structure into account may increase average precision from about 40% to 65% on texts taken from European Parliament proceedings. To show the generality of the approach, we also evaluate it on literary text and observe a similar improvement and a slightly better overall result.
Fast & Frugal Detection of Chiasm-like Protofigures in English Subsection of CHILDES Corpus
This presentation shall focus on the identification and extraction of instances of repetition-based rhetorical figures (e.g. epanaphora, epiphora, chiasm, antimetabole) from the English-language sections of the Child Language Data Exchange System (CHILDES). CHILDES is the biggest publicly available repository of language acquisition data, and its English section is sizeable: 1841729 motherese utterances and 1673351 utterances produced by children between 0 and 10 years of age. After introducing the corpus, we shall present a fast & frugal method that exploits the backreference facility of Perl Compatible Regular Expressions (PCREs) and allows us to identify repetition-involving utterances in the complete CHILDES in just a few seconds. We shall subsequently assess the frequencies of occurrence of chiasm-like expressions and estimate their distributions for different age groups. We shall conclude with a discussion aiming to elucidate cognitive and ontogenetic aspects of surface, significant-encoded figures of speech. All experiments shall be immediately reproducible by workshop participants with access to a standard UNIX shell and GNU tools (wc, sort, grep, uniq and perl).
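To make the backreference idea concrete, the following is a minimal sketch of how an A ... B ... B ... A word pattern can be matched with a single regular expression. It is not the pattern used in the presentation: it uses Python's re module rather than PCRE, and the names CHIASM_LIKE and count_chiasm_like, as well as the sample utterances, are assumptions introduced purely for illustration.

```python
import re

# Hypothetical pattern (an assumption, not the authors' expression):
# group 1 captures word A, group 2 captures a different word B, and the
# backreferences \2 and \1 require B and then A to reappear, in that
# order, later in the same utterance.
CHIASM_LIKE = re.compile(
    r"\b(\w+)\b\W+(?:\w+\W+)*?"           # A, then optional intervening words
    r"\b(?!\1\b)(\w+)\b\W+(?:\w+\W+)*?"   # B (must differ from A), more words
    r"\b\2\b\W+(?:\w+\W+)*?"              # B again
    r"\b\1\b",                            # A again
    re.IGNORECASE,
)


def count_chiasm_like(utterances):
    """Count the utterances that contain at least one A..B..B..A pattern."""
    return sum(1 for u in utterances if CHIASM_LIKE.search(u))


if __name__ == "__main__":
    # Illustrative utterances only; these are not taken from CHILDES.
    sample = [
        "all for one, one for all",
        "the dog eats the bone",
        "more juice please",
    ]
    print(count_chiasm_like(sample))
```

Such a pattern deliberately over-generates: it accepts any reversed pair of repeated words regardless of the distance between them, so its matches are better read as chiasm-like candidates than as confirmed figures.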