Brooklyn Law Review


Over the last 5–10 years, corpus-linguistic applications have slowly become more widespread in matters of legal interpretation. Specifically, we see more court cases in which corpus-linguistic data are brought to bear (in briefs and judicial opinions) on the (original) ordinary/public meaning of expressions in legal texts, but also more academic research on whether and how corpus-linguistic methods can shed light on the plain/ordinary meaning of words in a legal text. While this development is welcome, it also comes with shortcomings/risks, some of which are hotly debated in recent and forthcoming law review articles. In particular, there is a whole family of currently debated shortcomings/risks that is almost exclusively due to the fact that several early adopters/promoters of corpus methods for legal applications have massively simplified the field of corpus linguistics to what they know and what seems convenient. This is not useful for several reasons, one of which is that it makes corpus-linguistic applications in the legal field more vulnerable to various lines of attack in the legal literature. More important, however, is that reductionist corpus-linguistic applications also undermine the strength of the cases that corpus linguistics can (help) make. In this paper, I discuss two applications that showcase the wider range of methods that proper/full-fledged corpus analysis has to offer: the first is a case study on historical trends in corpus data based on frequencies augmented with required but never-used additional statistics such as dispersion and uncertainty/robustness estimates; the second applies semantic vector spaces and word embeddings to explore (heuristically) the scope of terms.
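To make concrete what "frequencies augmented with dispersion and uncertainty estimates" can look like, here is a minimal Python sketch. It rests on assumptions that are not in the text above: the dispersion measure shown is the deviation of proportions (DP), one commonly used option, and the uncertainty estimate is a simple percentile bootstrap over corpus parts. The text does not say which measures the paper itself uses, and the function names and toy counts are hypothetical.

```python
import random


def deviation_of_proportions(word_counts, part_sizes):
    """Dispersion as DP: 0 = spread evenly across parts, near 1 = clumped.

    word_counts[i] is the word's frequency in corpus part i;
    part_sizes[i] is the size (in words) of corpus part i.
    """
    total_hits = sum(word_counts)
    total_size = sum(part_sizes)
    if total_hits == 0 or total_size == 0:
        raise ValueError("need at least one occurrence and a non-empty corpus")
    # Compare the observed share of occurrences in each part with the
    # share expected from the part's size alone.
    return 0.5 * sum(
        abs(count / total_hits - size / total_size)
        for count, size in zip(word_counts, part_sizes)
    )


def bootstrap_frequency_interval(word_counts, part_sizes, n_boot=2000, seed=0):
    """95% percentile bootstrap interval for frequency per million words.

    Corpus parts are resampled with replacement (an illustrative choice).
    """
    rng = random.Random(seed)
    parts = list(zip(word_counts, part_sizes))
    estimates = []
    for _ in range(n_boot):
        sample = [rng.choice(parts) for _ in parts]
        hits = sum(count for count, _ in sample)
        size = sum(part_size for _, part_size in sample)
        estimates.append(1_000_000 * hits / size)
    estimates.sort()
    lower = estimates[int(0.025 * n_boot)]
    upper = estimates[int(0.975 * n_boot) - 1]
    return lower, upper


# Toy data (hypothetical): four equally sized corpus parts.
sizes = [100_000, 100_000, 100_000, 100_000]
even_counts = [10, 12, 9, 11]    # word occurs throughout the corpus
clumped_counts = [42, 0, 0, 0]   # same total, but all in one part

dp_even = deviation_of_proportions(even_counts, sizes)
dp_clumped = deviation_of_proportions(clumped_counts, sizes)
interval = bootstrap_frequency_interval(even_counts, sizes)
```

Both toy words have the same overall frequency (42 occurrences in 400,000 words), yet their DP values differ sharply. That is the kind of information a raw frequency count alone hides.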