Habemus Corpora: Reapproaching Philological Problems in the Age of ‘Big’ Data

This paper demonstrates the potential of new methodologies for using existing corpora of medieval English to better contextualise linguistic variants. Contextualising variants is a major task of philology and a key underpinning of our ability to answer major literary-historical questions, such as when, where and for what purpose medieval texts and manuscripts were produced. The paper focuses primarily on the assistance these methods can offer in dating the composition of texts, which it illustrates with a case study of the “Old” English Life of St Neot, uniquely preserved in the mid-twelfth-century South-Eastern homiliary, London, British Library, Cotton Vespasian D.xiv, fols. 4–169. The Life has recently been dated to around 1100. However, when new techniques for searching the Dictionary of Old English Corpus are used to compare its orthography, lexis, syntax and style with those of all other English-language texts surviving from before 1150, so late a date appears very unlikely. The paper closes with some reflections on what book-historical research should prioritise as it further evolves into the digital age.

Pre-print available 1 April 2022

Reference: Mark Faulkner, ‘Habemus Corpora: Reapproaching Philological Problems in the Age of “Big” Data’, Anglia: Zeitschrift für Englische Philologie 139 (2021), 94–127.