What’s wrong with “a bunch of migrants”? Looking at the linguistic evidence

This week at Prime Minister’s Questions, David Cameron used the term “a bunch of migrants” to describe refugees at a camp in Calais. He was subsequently criticised by Labour MPs and members of the general public on Twitter, and the story was reported in mainstream newspapers like the Guardian and the Telegraph. Critics described his comments as “dehumanising”, “callous” and “inflammatory”.

Something about David Cameron saying the words “bunch of” to describe a group of people caused a furore – but what was it? Is this how people normally use this phrase, or is this a noteworthy departure from the norm?

Here at CASS we have the unique opportunity to analyse a very large set of everyday conversations between speakers of British English from all over the UK, which participants have been recording in their homes and sending to us to be transcribed. Using the transcriptions, we can use computer software to analyse how words and phrases are used commonly across the entire country.

I searched through 4.5 million words of present day conversation to find out how people in the UK normally use the phrase “bunch of”. I found that “people”, “flowers” and “things” are the most likely words to be described in this way. Beyond this, there are several other words which refer to groups of people:

“kids”, “volunteers”, “retards”, “losers”, “lads”, “individuals”, “friends”, “dickheads”, “dancers”, “Aussies”, “alcoholics”, “thieving sods” and “thieving fuckers”.

Absent from this list is the word “migrants”, which does not occur in this context. The evidence suggests that people do often use “bunch of” to describe groups of people negatively or with distaste. Therefore the upset caused by Cameron’s use of the phrase “a bunch of migrants” is perhaps understandable.
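For readers curious about what this kind of search involves, here is a minimal sketch in Python. It is not the software we used on the Spoken BNC2014; the function name `bunch_of_collocates` and the sample sentences are invented purely for illustration. It simply counts the word that immediately follows each occurrence of “bunch of”.

```python
import re
from collections import Counter

def bunch_of_collocates(text):
    """Count the word that immediately follows each 'bunch of' in text."""
    # Case-insensitive match; captures one word (letters/apostrophes) after 'bunch of'.
    pattern = re.compile(r"\bbunch of ([a-z']+)", re.IGNORECASE)
    return Counter(match.group(1).lower() for match in pattern.finditer(text))

# Invented sample sentences, NOT real Spoken BNC2014 data.
sample = (
    "She bought a bunch of flowers. There was a bunch of people outside. "
    "He had a bunch of things to do. A whole bunch of people turned up."
)

counts = bunch_of_collocates(sample)
print(counts.most_common(3))
```

On a real corpus the same idea is applied to millions of words, and the resulting counts are what allow us to say which nouns are most typically described as “a bunch of”.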

We are still collecting recordings from speakers all over the UK. For information on how to contribute to this project, which is led by Lancaster University and Cambridge University Press, please visit the Spoken BNC2014 website.

Welcome Jens Zinn – Marie Skłodowska-Curie Fellow

CASS is delighted to welcome Jens Zinn to the centre following the award of his Marie Skłodowska-Curie Fellowship! This is an extremely prestigious award, named after the double Nobel Prize-winning Polish-French scientist famed for her work on radioactivity. The fellowships support outstanding scholars at all stages of their careers, irrespective of nationality.

Jens has studied and taught at many universities in Germany, and in 2009 he was appointed Associate Professor and Reader in Sociology at The University of Melbourne. Jens has founded a number of international research networks on the Sociology of Risk and Uncertainty (SoRU). The joint internet portal of these groups is open to everyone to contribute to current debates and ongoing activities. His research activities include a number of studies on people’s management of risk and uncertainty during the course of their life (e.g. youth transitions into the labour market; certainty constructions in reflexive modernity; British veterans’ management of risk and uncertainty). He led a collaborative research initiative ‘Risk, Social Inclusion and the Life Course – A Social Policy Perspective’ at the University of Melbourne and a research project ‘Decision Taking in Times of Uncertainty. Towards an efficient strategy to manage risk and uncertainty in climate change adaptation’ funded by the Victorian Centre for Climate Change Adaptation Research. Most recently he has worked with Daniel Mcdonald on a project examining the change of the risk semantic in the New York Times from a historical perspective, combining corpus linguistics with sociology.

Here at CASS, Jens will be working with Professor Tony McEnery on a project which aims to advance our understanding of the forces that have driven the proliferation of risk discourses in the UK and Germany since World War Two. Working at the boundaries of risk sociology and corpus linguistics, this is a highly innovative enterprise, both theoretically and methodologically. It will examine the contribution made by mainstream risk theories to explaining the increasing use of the risk semantic in media coverage during the last 50 years, and it will develop an empirically grounded theory of the observable shift towards risk. Jens will utilise cutting-edge corpus-based research strategies to systematically reconstruct the changing use of the discourse-semantics of risk and will complement these with interviews of media experts to examine how these changes are linked to institutional and socio-cultural changes and historically significant events.

CASS would like to congratulate Jens on securing this highly esteemed fellowship, and we are very much looking forward to working with Jens on this exciting project!

Check back soon for more updates!

Workshop on Corpus Linguistics in Ghana

Back in 2014, a team from CASS ran a well-received introductory workshop on Corpus Linguistics in Accra, Ghana – a country where Lancaster University has a number of longstanding academic partnerships and has recently established a campus.

We’re pleased to announce that in February of this year, we will be returning to Ghana and running two more introductory one-day events. Both events are free to attend, each consisting of a series of introductory lectures and practical sessions on topics in corpus linguistics and the use of corpus tools.

Since the 2014 workshop was attended by some participants from a long way away, this time we are running events in two different locations in Ghana. The first workshop, on Tuesday 23rd February 2016, will be in Cape Coast, organised jointly with the University of Cape Coast: click this link for details. The second workshop, on Friday 26th February 2016, will be in Legon (nr. Accra), organised jointly with the University of Ghana: click this link for details. The same material will be covered at both workshops.

The workshop in 2014 was built largely around the use of our online corpus tools, particularly CQPweb. In the 2016 events, we’re going to focus instead on a pair of programs that you can run on your own computer to analyse your own data: AntConc and GraphColl. For that reason we will be encouraging participants who have their own corpora to bring them along to analyse in the workshop. These can be in any language – not just English! Don’t worry however – we will also provide sample datasets that participants who don’t have their own data can work with.

We invite anyone in Ghana who wants to learn more about the versatile methodology for language analysis that is corpus linguistics to attend! While the events are free, registration in advance is required, as places are limited.

Spoken BNC2014 Early Access Data Grant Scheme – winning proposals

Lancaster University’s ESRC funded Centre for Corpus Approaches to Social Science (CASS) and Cambridge University Press are pleased to announce the recipients of the Spoken BNC2014 Early Access Data Grants. These successful applicants will receive exclusive early access to approximately five million words of the Spoken BNC2014 via CQPweb. They will be the first to conduct research using the data and produce papers to be published in 2017, coinciding with the release of the full corpus.

The successful applicants, their institutions, and the research they intend to undertake, are:


Karin Aijmer


Investigating intensifiers in the Spoken BNC2014


Karin Axelsson


Canonical and non-canonical tag questions in the Spoken BNC2014: What has happened since the original BNC?


Andrew Caines (Cambridge), Michael McCarthy (Nottingham) and Paula Buttery (Cambridge)


‘You still talking to me?’ The zero auxiliary progressive in spoken British English, twenty years on


Andreea Simona Calude


Sociolinguistic Variation in Cleft Constructions – a quantitative corpus study of spontaneous conversation


Jonathan Culpeper


Politeness variation in England


Robert Fuchs


Recent Change in the sociolinguistics of intensifiers in British English


Kazuki Hata, Yun Pan and Steve Walsh


Talking the talk, walking the walk: interactional competence in and out


Tanja Hessner and Ira Gawlitzek


Women speak in an emotional manner; men show their authority through speech! – A corpus-based study on linguistic differences showing which gender clichés are (still) true by analysing boosters in the Spoken BNC2014


Barbara McGillivray (Oxford), Jenset Gard (Oxford) and Michael Rundell (Lexicography MasterClass)


The dative alternation revisited: fresh insights from contemporary spoken data


Laura Paterson


‘You can just give those documents to myself’:  Untriggered reflexive pronouns in 21st century spoken British English


Chris Ryder, Jacqueline Laws and Sylvia Jaworska


From oldies to selfies: A diachronic corpus-based study into changing productivity patterns in British English suffixation


Tanja Säily (Helsinki), Victoria González-Díaz (Liverpool) and Jukka Suomela (Aalto)


Variation in the productivity of adjective comparison


Deanna Wong


Investigating British English backchannels in the Spoken BNC2014


Thank you to everyone who applied, and congratulations to the winning proposals. Check back soon for more details on the Early Access Data Grant Scheme research.


Encyclopaedia of Shakespeare’s Language Project: A methodological journey

Just before Christmas 2015, the AHRC announced that it was going to fund the £1 million Encyclopaedia of Shakespeare’s Language project. I actually had the idea for the project 20 years ago. The fact that it took so long has much to do with method.

The approach I envisaged for Shakespeare’s language is analogous to more recent developments in dictionaries of general English, and, specifically, the departure from the philological tradition that resulted in the Collins Cobuild Dictionary of the English Language, the first full corpus-based dictionary. Being corpus-based implies both a particular methodology for revealing meanings, and a particular theoretical approach to meaning. There is less reliance on the vagaries and biases of editors, and a greater focus on the evidence of actual usage. The question ‘what does X mean?’ is pursued through another question: ‘how is X used?’

But I wanted more from the encyclopaedia than this. I wanted it to be comparative, to reveal not just the usage of words and other linguistic units in Shakespeare but also in the general language of the period. This way, we can tap into issues such as what is distinctive about Shakespeare’s language, and, more particularly, how Shakespeare’s language would have been perceived by his contemporary audience.

For example, the play Henry V contains Welsh, Irish and Scottish characters. A pilot examination I conducted with Alison Findlay (English and Creative Writing) of the words Welsh, Irish and Scottish used in over 100 million words written in Shakespeare’s time revealed that: (1) the Welsh barely registered on the Elizabethan consciousness, being considered a harmless in-group, only noteworthy for their curious language, (2) the Irish were wild, savage, rebels, viewed positively only in relation to Irish rugs (an important colonial import), and (3) the Scottish, whilst also rebels, were respected for their political power. (Current Shakespearean dictionaries do not contain entries for any of these three words.)

The problem 20 years ago was the lack of comparative data. Back in the early 1990s, the leading historical corpus of English was without doubt the Helsinki Corpus of English Texts, completed in 1991. This corpus amounted to 1.5 million words – an impressive figure in those days! Moreover, it had been put together with great care; it was reliable. But those 1.5 million words covered the period 730 to 1710. The section contemporaneous with Shakespeare amounted to less than half a million words, and was thus far short of what is required for serious comparative work.

To solve the problem, I set about, with Merja Kytö, creating the Corpus of English Dialogues. The reason for the focus on dialogues is that this would provide an interesting comparison for the dialogues of Shakespeare’s plays. This project soaked up 10 or more years, not just in creating the corpus but also in publishing the various insights it afforded into early modern dialogues along the way.

I was then overtaken – in a positive way! – by other events, notably, the advent of a fully-searchable 1.2 billion word transcribed version of Early English Books Online (EEBO) (i.e. EEBO-TCP). For years, EEBO, which contains pretty much all early modern printed output, had been of limited value to linguists because the texts were only available as images, and language searches relied on OCR, with all its inaccuracies. Now, however, I have a 321 million word fully searchable corpus of texts written by Shakespeare’s contemporaries.

In addition, solutions, or at least partial solutions, had evolved for the various problems associated with the computational analysis of historical language data. Early modern spelling variation had been a major stumbling block (e.g. the word would could be spelt would, wold, wolde, woolde, wuld, vvold, etc.). This problem has been largely solved by the Variant Detector (VARD), devised by scholars at Lancaster, especially Alistair Baron. The Lancaster-developed CLAWS part-of-speech annotation system, which works well for present-day English, has been adapted for Early Modern English (though more work will be necessary). Similarly, semantic annotation has received attention from generations of researchers at Lancaster University, and has been (and is being) adapted for Early Modern English, most recently within the AHRC-funded SAMUELS project, involving a consortium of universities, including Lancaster.
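To give a flavour of the spelling problem, here is a deliberately simple sketch in Python. It is not how VARD works, which is far more sophisticated; the lookup table `VARIANTS` and the function `normalise` are hypothetical names, and the table is hand-built from the example spellings listed above. The point is only that, until variants are mapped to a single form, a search for would misses most of its occurrences.

```python
# Hypothetical, hand-built variant table based on the spellings listed in the text.
# This is an illustration only and does NOT reflect VARD's actual method.
VARIANTS = {
    "wold": "would",
    "wolde": "would",
    "woolde": "would",
    "wuld": "would",
    "vvold": "would",
}

def normalise(tokens):
    """Replace each known variant spelling with its modern form; leave other tokens as-is."""
    return [VARIANTS.get(token.lower(), token) for token in tokens]

# Invented example sentence, not taken from any early modern text.
tokens = "I wolde that he vvold come".split()
print(normalise(tokens))
```

A fixed table like this only handles variants someone has already listed, which is exactly why a dedicated tool is needed for hundreds of millions of words.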

I don’t doubt that there will be many more twists and turns, lumps and bumps in the future methodological journey. But I am cheered by the fact that I will not be facing them alone but in the company of a wonderful group of people who are part of the project: Andrew Hardie and Tony McEnery (both LAEL), Paul Rayson (Computing and Communications), Alison Findlay (English & Creative Writing) and Dawn Archer (Manchester Metropolitan).

For a brief project description, see: AHRC award to create a new Encyclopaedia of Shakespeare’s Language

Remembering Richard Xiao, 1966-2016

I first met Richard in 2000, when he came to Lancaster to be my PhD student. He was initially interested in doing a PhD in the area of translation studies, but I spoke to him about corpus research and, slowly as the months passed, he decided to use corpora to look at an interesting issue in linguistics – aspect. This was the first of many areas where we happily worked together. Over weeks and months we slowly worked on the problem of integrating corpora and theory, finally arriving at what we both felt was a very satisfactory outcome: a PhD for Richard, a book we wrote on the topic and one or two nice papers.

Early on Richard showed real promise as a researcher so, as I often do with my students, I set Richard onto a few side projects which we pursued together. The first project we worked on was on the F-word in English. I had analysed bad language in the spoken and written BNC, but my book on swearing in English only used the spoken material. So we worked together on the written data and produced the paper ‘Swearing in Modern British English’ which was published in Language and Literature.

That started something of a wave of publications from us – we worked together very well. We had similar interests and personalities, but, most importantly, we felt very comfortable about disagreeing with one another. Those disagreements were always purely intellectual – a cross word never passed between us. They were also not fruitless – we would always debate the point until one or the other of us would change our minds. Working with Richard was a pleasure.

On finishing his PhD Richard started to work as my research assistant. Courtesy of a grant from the UK ESRC we carried on our work on the grammar of Chinese. When I went on secondment from Lancaster University to the UK AHRC, I continued to work with Richard who remained my research assistant. Without Richard working pretty independently of me most of the time while I was on secondment, my time at the AHRC would have been much tougher. As it was, I could focus on the research council work during the day and then check in with Richard in the evening to see how our work was going. The end result was a series of papers on Chinese grammar that I am very proud to be associated with and the book Corpus-Based Contrastive Studies of English and Chinese.

After the grant we were working on finished we hit a snag – we had a very interesting project on Chinese split words lined up, but as I was working for the research council at that time I could not apply to them for a grant and as a research assistant Richard was ineligible to apply. So we wrote the proposal and persuaded our colleague Anna Siewierska to take on the supervisor role on the project. The project was funded and Richard and Anna worked together very well, though I will always regret not being able to be part of that work as it is so interesting. Look at this paper, for example:


Around the time that this grant was awarded Richard got his first lecturing position at the University of Central Lancashire, moving on to Edge Hill University and finally, to my delight, in 2012 he moved back to Lancaster University, where he was swiftly promoted to Reader.

In the fourteen years from when I first met him to the point where he retired on ill health grounds, if Richard had only done the work described above he would have had a good career. However, he did so much more as his Google Scholar profile shows:


In addition to what he did with me, he also undertook a great range of excellent research on his own, especially in the area of translation studies. Importantly, he contributed to the construction of a wide range of corpora of Mandarin Chinese as can be seen here:


I was delighted when Richard successfully applied to become a British citizen and was honoured to be asked to support his application. I was so pleased to be able to help Richard, his wife Lyn and his daughter in this way.

Sadly, Richard was diagnosed with cancer in 2013. Through surgery, chemotherapy and sheer will power he survived to the 2nd January 2016. The length of his illness, while distressing, did allow us time to publicly celebrate his work:


Throughout his illness he was unfailingly cheerful and optimistic. He was also still brimming with ideas – he was writing and undertaking journal and research council reviews until a few months before he left his suffering behind. I have no doubt that if he had survived longer he would have written many more books and papers well worth reading. As it was, when we last spoke together, just before Christmas 2015, we had a lovely time remembering what we had achieved together. Indeed this brief remembrance of Richard contains many of the things we recalled in that conversation. One thing we did was to decide upon our favourite three publications that we had written together. It seems appropriate to share these in his memory – we both thought they were well worth a read! They are:

McEnery, A. M. & Xiao, R. Z. (2004) ‘Swearing in modern British English: the case of fuck in the BNC.’ Language and Literature. 13, 3, pp. 235-268.


McEnery, A. M. & Xiao, R. Z. (2005) ‘HELP or HELP to: What do corpora have to say?’ English Studies. 86, 2, pp. 161-187.


Xiao, R. Z. & McEnery, A. M. (2006) ‘Collocation, semantic prosody and near synonymy: A cross-linguistic perspective.’ Applied Linguistics. 27, 1, pp. 103-129.


We spent a pleasant time discussing these papers and then we said farewell to each other. I can imagine no better a final conversation between two scholars and friends who worked together so well. I am so happy that we had the chance to have this final meeting of minds. Not only will it be a precious memory for me, I know that it meant a great deal to him. I will miss Richard very much, as will others. However, through his writing his thoughts will live on and as further studies are produced by others on the basis of his corpora, the energy, kindness and ingenuity of Richard Xiao will blaze forth afresh.

CASS represented at Winter Reception of the All-Party Parliamentary Group Against Antisemitism

Paul Iganski and Cat Smith MP for Lancaster & Fleetwood, and member of the All-Party Parliamentary Group against Antisemitism, at the Winter Reception.

On Wednesday, 16th December, Paul Iganski and Abe Sweiry attended the Winter Reception of the All-Party Parliamentary Group Against Antisemitism in the Terrace Pavilion at the Houses of Parliament. Attendees heard speeches from John Mann MP, the chair of the Group, Commander Dean Haydon from the Metropolitan Police Service and Baroness Williams of Trafford, Parliamentary Under Secretary of State at the Department for Communities and Local Government.

The event ended a significant year for the APPG against antisemitism, in which it published its second major inquiry into antisemitism. John Mann MP instigated the report into the lessons that could be learned from the upsurge of anti-Jewish incidents associated with last year’s conflict in Gaza.

Professor Iganski and Dr Sweiry, as part of a team from Lancaster University’s ESRC Centre for Corpus Approaches to Social Science (CASS), were commissioned by the APPG to provide a rapid-response analysis of antisemitism on Twitter during the conflict to inform the Inquiry’s report.

In highlighting the findings from CASS in the Inquiry report, the APPG called the analysis of Tweets ‘a unique piece of research which provides valuable and important early indications of trends that occurred during the summer’. [1]

The report recommended further research of the kind offered by CASS stating that ‘the importance of this research should not be underestimated. It helps identify some of the themes in discourse and with time could help to detect patterns of antisemitism and therefore to better direct resources to combat it’. [2]

In the months between the report’s publication and the Winter Reception, a progress review of the implementation of the APPG’s recommendations noted that ‘the CPS has pledged to review its guidance relating to communications sent via social media and review the handling of such cases within CPS Areas.’ [3]


[1] All-Party Parliamentary Group Against Antisemitism (APPG) (2015) Report of the All-Party Parliamentary Inquiry into Antisemitism, London: APPG, page 51.

[2] All-Party Parliamentary Group Against Antisemitism (APPG) (2015) Report of the All-Party Parliamentary Inquiry into Antisemitism, London: APPG, page 53.

[3] All-Party Parliamentary Group Against Antisemitism (APPG) (2015) Implementation of the All-Party Parliamentary Report into Antisemitism: feedback and responses, London: APPG, page 4.


Spoken BNC2014 meets FOLK

On Thursday 3rd December I visited the Institut für Deutsche Sprache (Institute for German Language) in Mannheim. The IDS is Germany’s national, non-university institution for the research and documentation of the German language in both the present day and the past.

I was thrilled to be invited there by Swantje Westpfahl, a PhD student at the Institute, who is working on the compilation of a large spoken corpus of German known as the FOLK (Forschungs- und Lehrkorpus Gesprochenes Deutsch; research and teaching corpus of spoken German). With the similarities between FOLK and the Spoken BNC2014 (my own PhD research project) apparent, we spent a day at the IDS learning about each other’s work.

In the morning, I gave an hour-long talk about the Spoken BNC2014, including an overview of our data collection and transcription methods as well as an investigation into speaker identification which I conducted earlier this year. I explained that, with a small budget, we (CASS and our partner Cambridge University Press) have very much favoured size and speed of production over minute detail of transcription; a decision that has allowed us to produce approximately 8 million words of orthographic transcription so far in only 18 months.

After lunch, I attended a workshop entitled “Spoken BNC2014 meets FOLK”, where Dr Thomas Schmidt gave an equivalent talk to my own about the FOLK project, followed by Swantje, whose specific focus is on the annotation of the transcribed corpus data. In terms of general design, the FOLK is fairly similar to the Spoken BNC2014; it contains transcripts of audio recordings of conversations held between speakers in a variety of settings. The major differences, as I learned, lie in the approach to transcription and the release of data. I learned about the incredible level of detail with which the FOLK recordings are transcribed, using Thomas’ own transcription software FOLKER. I was impressed by the affordances of this tool and the dedication to detail that was evident at the IDS, including the transcription of breathing, pauses measured to the millisecond and direct alignment to the (anonymized) audio recordings. All of this work takes a long time (on average, one hour of recording takes 100 hours to prepare in this way!), and as such the FOLK is much smaller than the Spoken BNC2014 (1.3 million words after three years), but extremely rich in terms of potential for analysis.

The IDS was in turn impressed by the Spoken BNC2014’s approach to data collection, where we ‘crowd-source’ participants and invite them, through media engagement and other means, to make recordings using their smartphones in exchange for payment. I suggested that they might like to try putting out a press release about marmalade to see whether the German media respond in the same way that the British media did.

Overall, my visit to Mannheim was a fantastic opportunity to learn about the FOLK project and to have some really interesting discussions about the aims of spoken corpus linguistics, and I would like to thank all at the IDS for their hospitality. I look forward to seeing Swantje again when CASS hosts her in Lancaster for a research visit in the Spring next year.

Beyond the checkbox – understanding what patients say in feedback on NHS services

In 2016 I will be working on a new project in CASS, which has received funding from the ESRC (£61,532 FEC). The purpose of this project is to help the National Health Service better understand the results of patient feedback so that they can improve their services. The NHS gathers a great deal of user feedback on its services from patients. Much of this is in “free text” format and represents a rich dataset, although the amount of text generated in the thousands of feedback forms patients fill in each year makes it unfeasible to undertake a close qualitative analysis of all of it. Categorisation-based approaches like sentiment analysis have been tried on the dataset but have not been found to be revealing. In this project we will be working with the NHS to first identify a set of research questions they would like to be answered from the data, and then we will use corpus-based discourse analysis to draw out the main themes and issues arising from the data. We will focus on four key NHS services – dentists, GP practices, hospitals and pharmacies. From these services alone we have around 423,418 comments to analyse, totalling 105,380,697 words. Some of the issues we are likely to be focussing on include: what matters most for patients, the key drivers for positive and negative feedback, indicators in comments that might trigger an alert or urgent review and differences across providers/services or by socio-demographic group.

Language Matters: Communication, Culture and Society

On 12th November, the CASS team made their way over to the International Anthony Burgess Foundation in Manchester for the ESRC Festival of Social Science 2015. The theme for this year’s event was “Language Matters: Communication, Culture and Society,” and it featured a series of four informal talks by CASS researchers based at Lancaster. The talks were pitched to a general audience, and gave the public the opportunity to hear renowned scholars talk about their lives, their work, and what they find most interesting about the relationship between language and society.

The first talk was by Robbie Love, a current PhD student in CASS who is working on the Spoken BNC2014 project along with Cambridge University Press. The researchers are collecting 10 million words for the project, and it has received a great deal of media attention since it was announced last year. Robbie delved into some preliminary findings from the corpus, and explained how “fortnight, cheerio, catalogue, marvellous, and marmalade” are all on the decrease, whilst “treadmill, essentially, Internet, Google, and Facebook” are all on the rise. Some of these findings might be expected, but these subtle differences say a great deal about how our language usage has changed over the past 20 years.

The second presentation was by Jonathan Culpeper who discussed “Impoliteness: The Language of Offence”. He drew upon several pieces of corpus research, and argued that impoliteness doesn’t necessarily stem from what people say, but rather the way in which they say it. The use of 3rd person constructions, sarcastic remarks, and reduced eye-contact can all signal impoliteness. He argues that impoliteness often fits into a category, such as insults, negative evaluations, dismissals, silencers, threats, or condescensions. By using corpus-based methods, Jonathan is able to determine the most common constructions which signal impoliteness, and then consider the subtle pragmatic cues that may accompany them.

The third of the mini-lectures came from Paul Iganski (Law School). The presentation was entitled: “Vile words. What is the case for criminalising everyday hate speech as hate crime?” Considering that well over half of the racially and religiously aggravated offences in England and Wales in 2010-11 were categorised as “hate speech,” Paul considers both the legal and societal implications when the state criminalises language. He is firmly of the view that there is no such thing as “free speech,” as every nation state in the EU criminalises a form of hate speech. Furthermore, he argues that hate crimes hurt more than otherwise motivated crimes as they send a message striking at the core of the victim’s identity, and the restriction on “free speech” goes a long way towards protecting minority groups in society.

The fourth and final talk was from Claire Hardaker who discussed “The ethics of investigating online aggression”. Claire started by discussing the media’s culture when stories regarding online abuse arise. They tend to have an appetite for exposing online trolls, and want to put a real-life face to the otherwise faceless online character. Claire went on to describe how easy it is to track someone down based solely on the information they give away on their Twitter accounts. Academics, for example, often promote their position and institution on their profile, and a quick internet search can lead someone straight to the individual’s office. This information can be used by the media, and Claire discussed how the media essentially witch-hunted a 63-year-old troll called Brenda Leyland, despite her comments not actually being considered criminal under British law. There’s a constant, sensitive conflict between ethics and the online environment, and Claire argues that whilst we need to publish about our research, we must do so without endangering anyone involved.

After the four presentations came to a close, visitors had the opportunity to meet with the speakers, talk about their research, and network with other attendees.

I’d like to personally offer my thanks not only to the speakers for offering their time throughout the day, but also to everyone who joined us in the audience. I think you’ll agree that it was a huge success, and the day really highlighted why corpus-based research is so important for uncovering the fascinating relationship between language and society.