Corpus Linguistics and Law: Reflections of a Legal Scholar and recent Master’s Graduate from Lancaster University

Written by Adrian Hemler, University of Konstanz, Germany


Just a couple of months ago, I found myself in a curious situation: Having studied Corpus Linguistics (MA) remotely and part-time at Lancaster University’s Linguistics Department for two years, I visited the campus for the first time for an occasion that usually means farewell for most other students: My graduation ceremony on 12 December 2024.

On that day, I also had the privilege of receiving the Chancellor’s Medal and the Geoffrey Leech Award of the Linguistics Department. I was particularly excited to be selected for the Geoffrey Leech Award since he was, among many other things, a trailblazer in the fields of Corpus Linguistics, Semantics and Pragmatics. Therefore, his academic work is an excellent inspiration for my own research goals, which I would like to briefly outline in the following.

Professor Luke Harding (Head of Department) and Adrian Hemler

Before studying Corpus Linguistics, I had already completed my education as a fully qualified lawyer in Germany. I also obtained a PhD in Conflict of Laws in 2019 and graduated from the University of Cambridge with an LL.M. degree in 2021. Since then, I have been working as a PostDoc legal scholar and junior lecturer at the University of Konstanz in Germany.

Now, if one wishes to become a full professor in Germany, one usually needs to hand in the infamous “second big book” called the Habilitationsschrift (habilitation thesis). When I had to decide on the topic of my habilitation thesis, I found myself at a crossroads: Should I choose a potentially less exciting but safe issue within legal doctrine, or should I choose a potentially more exciting but risky cutting-edge topic?

With a focus on empirical methods in the law, I chose the second option and soon identified Corpus Linguistics as a fascinating domain that might prove highly fruitful in the law. It also seemed to be a potential remedy to several issues within the legal sciences that had bothered me for a long time already: First, a fundamental disconnect between legal sciences and other disciplines, quite often marked by a real disinterest in interdisciplinarity.

Second, overburdening legal complexity, which frequently makes it difficult and expensive to determine suitable and adequate legal solutions, leading to power imbalances between wealthy and less affluent litigants. And third, an “anything-goes” mentality when it comes to determining legal meaning, which only adds to some people’s feelings that legal decisions are somewhat arbitrary rolls of the dice. It sometimes seems that the elaborate legal-doctrinal constructions that we tend to focus on in the legal sciences only add to these problems instead of paving the way to just and accessible legal verdicts.

Adrian Hemler and Dr Dana Gablasova (Director of studies MACL)

Against this backdrop, combining Law and Corpus Linguistics appeared to me as a potential way to enrich legal arguments with a new layer of empirical precision and an attractive path to better integrate scientific methods into the law. Since I knew that just reading a few books on my own would probably not be enough to understand Corpus Linguistics to a sufficient degree, I decided to study the subject remotely at Lancaster University, which was, thankfully, possible due to an award and a scholarship I received for my habilitation project from my home university.

Overall, studying Corpus Linguistics at Lancaster University proved to be an excellent decision: The high-quality teaching methods, which really emphasise active learning and immediate, practical implementation, as well as the excellent lecturers, provided me with more understanding and ideas for my research than I could have ever hoped to gain on my own. After graduating, I can now say that nearly all topics and assignments within the programme turned out to be highly beneficial for my research.

Be it the determination of the legal meaning of terms such as “private law” in a Corpus of Decisions of the International Court of Justice, the construction of a Corpus of Arbitral Awards of the International Centre for the Settlement of Investment Disputes, the comparison of collocation association measures with respect to their suitability as a means to uncover nuances in legal discourse, the analysis of the formulaicity of English court decisions, the analysis of metaphors within the jurisprudence of the German constitutional court or the assessment of equivalence of legal translations – all these topics proved over and over again how corpus methods can provide the search for the meaning of legal terms and concepts with an empirical basis.

The aforementioned MA research projects were also the foundation of other corpus-based research, which I presented at conferences and in guest lectures over the last two years. I now consider myself to be in an excellent position to put it all together in my habilitation thesis, which is due to be published by mid-2026 (in German, unfortunately, but I hope to provide an English version soon after).

My MA studies also helped me recognise that the intersections between law and linguistics are more far-reaching than I anticipated: I now hold the opinion that the determination of legal propositions – the most central task of lawyers – can, in fact, be reconstructed as a linguistic enterprise, best positioned within the linguistic subdisciplines of semantics and pragmatics. For example, the procedural conditions for the legal validity of a court verdict might just as well be considered felicity conditions of the speech acts that a court verdict contains. This means that using corpus methods is a potentially universal tool for determining the meaning of legal propositions. Its potential fields of application within the law are, therefore, far greater than the sparse existing research, which usually focuses on the determination of general language for legal purposes, might imply.

I like to think that this extension of practical applications of corpus-based semantics and pragmatics into other disciplines, such as the law, is an endeavour that Geoffrey might have approved of – and I sincerely hope that I can come back to Lancaster soon to report on my progress.