PhD conference prize for Mark Wilkinson

At the 2020 Corpora and Discourse International Conference, I was very honoured to receive an award for the conference paper “showing the greatest methodological innovation or reflexivity by a student researcher”. The award was sponsored by the Applied Corpus Linguistics journal and included a prize of £250. This year’s online conference, hosted by the University of Sussex, featured a wide variety of brilliant research from students around the world. That I was nominated for the award makes me truly humble and I am especially grateful to my supervisor, Professor Paul Baker, for all his support and guidance during my doctoral research. I would like to take this opportunity to share with you a summary of my talk which is titled: “Black or gay or Jewish or whatever”: A diachronic corpus-based discourse analysis of how the UK’s LGBTQI population came to be represented as secular, cisgender, gay, white and male (available to watch here:

This talk emerged from my PhD research in which I aim to map how The Times has used language to discursively construct LGBTQI identities in the UK over the past 60 years. I’m particularly interested in the histories of identity and this is why I’ve chosen to take a diachronic approach, collecting many decades of language data from one of the UK’s most influential broadsheets. This focus on history is based on the assumption adapted from post-structuralist discourse theory (Laclau & Mouffe 1985) that all identities are partially the result of consistent choices in representation made over a sustained period of time.

In order to garner a sense of which discourses have been consistent, I decided to look at both consistent keywords and consistent collocates. This revealed several currents running through the corpus. First, in spite of the fact that the search terms used to build the corpus reflected the inherent diversity within the LGBTQI population, the majority of key terms pertained to gay men. This indicates that the history of queer representation in The Times is primarily their history while the histories of lesbian, bisexual, trans and gender non-conforming people have been largely erased or obfuscated. Second, an analysis of consistent collocates for the word gay showed that additional identifications such as Black and Jewish were statistically significant from the 1980s onward. A closer analysis of the newspaper articles that featured this usage showed that such terms were used in one of two ways. First, Black and Jewish were often used as marked terms which implied that such intersecting identities were exceptional. I would therefore argue that this markedness implies the presumed whiteness and non-Jewishness of the archetypal gay man as presumed by The Times. Secondly, the terms Black and gay as well as Jewish and gay were often presented as mutually exclusive categories. In other words, individuals were represented as being either black or gay, but never both. Cumulatively, it was argued that the history of LGBTQI representation in The Times suggests that through consistent choices in representation over a sustained period of time, the queer population of the UK came to be represented as secular, cisgender, gay, white and male. But, as there was never any use of the term white, how could I make this claim?

Drawing on the intellectual tradition of critical race theory (Baldwin 1963; Crenshaw 1990; Morrison 1992; Hall 1997), I argued that ‘race’ – while certainly a lived experience with material consequences – is not simply a neutral taxonomy of phenotypical differences between people, but is rather an ideological construct that functions as a structuring force in society such that certain bodies are given more value than others. Within this racialised matrix, whiteness is not only privileged, but is passed off as neutral and universal – an unmarked category that functions largely by ‘erasing its own tracks’ (Trechter and Bucholtz 2001:10). From a linguistic perspective then, whiteness functions ‘much like a linguistic sign, taking its meaning from those surrounding categories to which it is structurally opposed’ (Trechter and Bucholtz 2001:5). Therefore, in the data from The Times, the racialisation of some gay men as Black necessarily implies that whiteness is the assumed universal for all other gay men.

In conclusion, it was argued that these cumulative processes are not benign, but rather indicate how the power of language can erase entire groups of people from popular discourse. Furthermore, the combination of corpus data with theories from both within and beyond linguistics is essential in mapping the discursive construction and representation of identities.
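For readers who want to see what "consistent collocates" might look like in practice, here is a minimal sketch in Python. It assumes that a collocate counts as consistent when it appears among a word's collocates in at least a chosen share of the time periods; the function name, the threshold and the toy data are all illustrative assumptions, not the procedure or the results from the talk.

```python
from collections import Counter

def consistent_collocates(collocates_by_period, min_share=1.0):
    """Return collocates present in at least `min_share` of the periods.

    `collocates_by_period` maps a period label to the set of collocates
    found for the node word in that period (hypothetical input format).
    """
    n_periods = len(collocates_by_period)
    counts = Counter()
    for collocates in collocates_by_period.values():
        counts.update(set(collocates))  # count each word once per period
    return {word for word, n in counts.items() if n / n_periods >= min_share}

# Toy, invented data for illustration only: NOT findings from The Times.
toy = {
    "1980s": {"men", "rights", "community"},
    "1990s": {"men", "rights", "pride"},
    "2000s": {"men", "marriage", "rights"},
}
print(sorted(consistent_collocates(toy)))
```

Lowering `min_share` relaxes the requirement from "every period" to "most periods", which is a judgement call for the analyst rather than a fixed rule.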


Baldwin, J. (1963). The Fire Next Time. New York: Dial Press.

Crenshaw, K. (1990). Mapping the margins: Intersectionality, identity politics, and violence against women of color. Stanford Law Review, 43, p.1241.

Hall, S. (1997). ‘The spectacle of ‘the other’’. In Hall, S. (ed) (1997) Representation: cultural representations and signifying practices. London: Sage.

Laclau, E. and Mouffe, C. (1985). Hegemony and socialist strategy: Towards a radical democratic politics. London: Verso.

Morrison, T. (1992). Playing in the Dark: Whiteness and the literary imagination. Cambridge: Harvard University Press.

Trechter, S. and Bucholtz, M. (2001). ‘Introduction: White noise: Bringing language into whiteness studies’. Journal of Linguistic Anthropology, 11(1), pp.3-21.




Isobelle Clarke Receives Leverhulme Trust’s Early Career Fellowship

I am so unbelievably pleased to announce that the Research Awards Advisory Committee at the Leverhulme Trust have granted me, Dr. Isobelle Clarke, the Leverhulme Trust’s Early Career Fellowship to conduct my research entitled “Understanding the linguistic repertoires across anti-science narratives” at Lancaster University in the Centre for Corpus Approaches to Social Sciences.

Science improves our everyday lives, especially science that is aimed at safeguarding and protecting public health and safety, such as by improving the air we breathe and the water we drink (Carter et al., 2019). Because of science, food is safe and plentiful, and diseases can be treated, cured, isolated and prevented from spreading (Siegel, 2017). Science also anticipates threats to the environment and natural disasters, like hurricanes and storms (Carter et al., 2019). Although scientific advancements can be misused accidentally or for ill, science nevertheless has led to the development of new technologies, which have enhanced many individuals’ quality of life to a level that could never have been expected previously (Siegel, 2017).

Despite these advances, in this modern world we live in, value judgements and personal experience can (and often do) take precedence over scientifically-accepted facts. Throughout Brexit and Donald Trump’s presidency, we have witnessed an acceleration in the demonization of experts and knowledge. Throughout this COVID-19 pandemic we have heard the phrase “following the science”, but really it should be “following the science when it suits us” as our leaders ignore findings or utilise parts and not the whole to suit them (which, as many will already anticipate, will eventually lead to them pointing the finger at scientists demanding an explanation for where it all went wrong in order to demonise scientists and experts further). Making matters worse is a context of competing, manipulative and persuasive anti-science narratives, each claiming to be truth. These undermine the public’s chances of distinguishing fact from fiction. For example, after a conspiracy theory suggested the outbreak of COVID-19 was a result of 5G, many of us witnessed or read reports of the 5G masts in the West Midlands being vandalised and even burned down by members of the public. Stories such as these demonstrate that the advances in humanity’s safety and prosperity created by science are being significantly undermined and threatened by anti-science discourse and actions (Carter et al., 2019).

Anti-science views are not new. For example, the Leicester anti-vaccination movement began in the 19th century. But with limited public access to scientific sources and increasing access to non-scientific sources, especially via the internet, anti-science positions are becoming more pervasive, and include claims that i) the earth is flat; ii) a female biological mechanism exists to prevent pregnancy post-rape; iii) alternative medicines like homeopathy are effective; iv) climate change science is false; v) genetically modified organisms (GMOs) are dangerous to human health; vi) stem-cell research has various pseudo-benefits; vii) autism can be cured with diet; and viii) evolution theory is untrue (for further examples see Achenbach, 2015). Scientists have been asked to communicate their findings clearly and counter anti-scientism (Oreskes and Conway, 2010). But before we can begin to counter it, we first need to identify and understand its discourse. Anti-science discourse has been investigated through the optic of particular governments (Carter et al., 2019) or specific topics, such as anti-vaccination (Davis, 2019), anti-GMO (Cook et al., 2004), stem-cell research (Marcon, Murdoch and Caulfield, 2017), and climate denial discourse (Park, 2015). This research often details the development and the content of the anti-science position and discourses. However, little is known about the linguistic repertoires of contemporary anti-science discourses more generally and how they compare across topics. What are the linguistic mechanisms underpinning the persuasiveness of anti-science? Are there anti-science discourses that are shared across the topics, or does the discourse vary with the topic? How much linguistic variation do the anti-science topics display? To what degree are the anti-science communicative strategies more or less typical of particular topics? This fellowship will directly address these questions.

In this fellowship I will be developing and introducing a new methodological technique which combines corpus-assisted discourse studies (CADS) (Baker, 2006) – a methodological technique developed and used by researchers at Lancaster University to investigate the representations and discourses of social phenomena and groups – with the approach to corpus data I specialise in, Multi-Dimensional Analysis (MDA) (Biber, 1988).

I am honoured to receive this prestigious fellowship and I am truly grateful to Lancaster University in their support of me, especially my fellow colleagues in CASS.

As we all try to seek out help, advice and guidance in these unprecedented times, it has never been more important to understand how anti-science works across topics. So, wish me luck on this journey and I’ll see you at the end with my tin foil hat on.

I hope everyone is keeping safe and healthy.

My best wishes to you and your family.


Achenbach, A. (2015) Why do many reasonable people doubt science? National Geographic. Retrieved on 21/01/2020 from:

Baker, P. (2006) Using Corpora in Discourse Analysis. London: Continuum.

Biber, D. (1988) Variation across speech and writing. Cambridge: Cambridge University Press.

Carter, J., Berman, E., Desikan, A., Johnson, C. and Goldman, G. (2019) The State of Science in the Trump Era: Damage Done, Lessons Learned, and a Path to Progress. Center for Science and Democracy at the UCS.

Cook, G., Pieri, E. and Robbins, P. T. (2004) ‘The scientists think and the public feels’: Expert perceptions of the discourse of GM food. Discourse and Society 15(4): 433-449.

Davis, M. (2019) ‘Globalist war against humanity shifts into high gear’: Online anti-vaccination websites and ‘anti-public’ discourse. Public Understanding of Science 28(3): 357-371.

Marcon, A. R., Murdoch, B. and Caulfield, T. (2017) Fake news portrayals of stem cells and stem cell research. Regenerative Medicine 12(7): 765-775.

Oreskes, N. and Conway, E. M. (2010) Defeating the merchants of doubt. Nature 465(10): 686-687.

Park, J. T. (2015) Climate change and capitalism. Consilience 14: 189-206.

Siegel, E. (2017) Humanity needs science to survive and thrive. Retrieved on 21/01/2020 from:

New partnership between the ESRC Centre for Corpus Approaches to Social Science and the Sydney Corpus Lab

We’re excited to announce that the University of Sydney, Australia and the University of Lancaster, UK have signed a memorandum of understanding (MOU) to work on collaborative research in corpus linguistics. This new partnership builds on existing connections between the newly established Sydney Corpus Lab and the Centre for Corpus Approaches to Social Science (CASS), which was founded in 2013. In March 2019, a CASS contingent attended the launch of the Sydney Corpus Lab and, in June 2019, A/Prof Monika Bednarek from Sydney was a Visiting Researcher at CASS. During her visit, we made plans for a new collaboration on representations of obesity in the Australian Press. This MOU now allows us to formalise this collaboration and to strengthen our existing research links.

Caption: CASS director Elena Semino (left) and Sydney Corpus Lab director Monika Bednarek (right) at the launch event in Sydney in March 2019

In the immediate future, CASS will build a corpus of Australian news items about obesity, and will advise on the analysis, based on a current project on representations of obesity in the UK Press. The Sydney Corpus Lab will analyse the Australian corpus, with the help of a new postgraduate research scholarship funded by the Charles Perkins Centre and the Faculty of Arts and Social Sciences at the University of Sydney.

The project will explore:

  • Existing media guidelines around the reporting of obesity
  • The use of the words obesity and obese in the Australian news media
  • The impact of the Obesity Collective’s campaign to shift the narrative away from stigma and blame
  • How obesity is represented more generally in the Australian news media, over time and across newspapers

Researchers from the Sydney Corpus Lab and CASS will collaborate on the dissemination of findings, including engagement with research users. We will work with the Obesity Collective and health journalism expert and educator Dr Catriona Bonfiglioli (University of Technology, Sydney) to help steer our analytical focus and to achieve impact outside academia.

The MOU also includes mutual visits, and CASS Senior Research Associate Gavin Brookes is already planning to visit Sydney in July 2020, for research meetings and to present a talk at the Corpus Linguistics Down Under symposium.

Watch this space for updates on these activities and announcements of future collaborative initiatives between Lancaster and Sydney!


Representing trans people in the UK press – a follow-up study

I do not identify as trans, nor did I carry out this research for profit or because I am an activist. I approached the subject from the position of allowing the data to speak for itself, and the corpus methods I use rely on computational techniques that are unbiased – computer software identifies the most frequent words, phrases and combinations of words, which then have to be accounted for by the analyst.


A few years ago I published the “corpus linguistics” chapter in an edited collection relating to different methods of carrying out critical discourse studies. As a case study for the chapter, I decided to look at the representation of trans people in the British press. At the time there had been a disapproving article in The Daily Mail about a trans person who was also a school-teacher, and who committed suicide three months later, while another article published in the Observer, one of the more respectable Sunday broadsheet newspapers, had used pejorative phrases about trans people like ‘a bunch of bed-wetters in bad wigs’ and ‘screaming mimis’. I wanted to use corpus approaches to see whether these articles were typical of the general press discussion around trans people or whether they stood out as unusually harsh. I built a (small by corpus linguistics standards) corpus of around 900 articles, just from 2012, and used traditional corpus methods (keywords, collocates, concordancing) to examine a range of words like transgender, transsexual and trannie. My analysis found that the two articles mentioned above were at the extreme end of a continuum, although:

“the analysis did find a great deal of evidence to support the view that trans people are regularly represented in reasonably large sections of the press as receiving special treatment lest they be offended, as victims or villains, as involved in transient relationships or sex scandals, as the object of jokes about their appearance or sexual organs and as attention-seeking freakish objects. There were a scattering of more positive representations but they were not as easy to locate and tended to appear as isolated cases, rather than occurring repeatedly as trends.” (Baker 2014)

I was recently approached by the charity Mermaids UK who asked me if I would carry out an updated analysis of more recent press representation. This time I collected data from the previous 2 years (21 October 2017 to 21 October 2019), resulting in a larger corpus of around 6,400 articles, indicating that there were around 3 and a half times as many articles per year written about trans people in this later period. In terms of news values, trans people are seen as rather more newsworthy these days. So has the discourse around them changed?
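The comparison above only works once the two corpora are put on the same footing, because the 2012 corpus covers one year and the later one covers two. A quick sketch of that arithmetic, using the rounded article counts given in this post (the variable names are just for illustration):

```python
# Rounded figures from the post: both counts are approximate.
articles_2012 = 900        # one year of data
articles_2017_19 = 6400    # two years of data (Oct 2017 to Oct 2019)
years_2012 = 1
years_2017_19 = 2

# Normalise to articles per year before comparing the two periods.
rate_2012 = articles_2012 / years_2012
rate_2017_19 = articles_2017_19 / years_2017_19
ratio = rate_2017_19 / rate_2012

print(f"{rate_2017_19:.0f} articles per year, {ratio:.1f} times the 2012 rate")
```

With these rounded inputs the later period comes out at about 3,200 articles per year, a little over three and a half times the 2012 figure.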

Changing Labels

In terms of how the press refer to trans people, in 2012, the most common term by far was transgender. In 2018-19, transgender and trans were about equally frequent, this being mostly an effect of the Guardian and Observer showing a strong preference for trans. Terms that I had expected to have died out, like sex-change and transsexual, had decreased somewhat but were still being used about once every other day, with the Mail, Telegraph and Times accounting for the bulk of such cases. Another decreasing term, tranny, occurred about once a fortnight. In 2012 it was used to imply bad taste, outlandishness, sex romps or the subject of jokes. The term was a particular favourite of journalist AA Gill (who used it in bizarre ways like tranny panto and tranny centaur night out). However, in 2018-19 it was now mainly acknowledged as a bullying term (AA Gill died in 2016). The rather jarring use of transgender(s) as a noun (“How about One Guy, A Girl, A Transgender and Two Nonbinary Persons” (The Sun)) occurred 37 times in 2018-19 (there was only 1 such usage in 2012).

Collocates of trans(gender)

Examining the contexts in which trans and transgender people were written about showed one of the most notable changes though. I’d noted in 2012 that transgender people were implied to be quick to take offence – in that year there were 8 cases of trans(gender) co-occurring with words like angry, clash, complaint, fury, offended, outrage, row, spat, upset and wrath. There was an enormous increase in this representation in 2018-19 though – 586 cases. While a small number of these cases don’t cast trans people as the ones who are angry or complaining, the vast majority do – and the wider point is that trans people are being discussed as being at the centre of controversy. A similar set of words which relate to conflict, including aggressive, demand, harassed, bullied, confronted, lunge, militant, outspoken, pressure and threat, saw a similar pattern – 5 cases of these kinds of words appearing near trans(gender) in 2012, but 334 cases in 2018-19. The result is that trans people are constructed as newsworthy because they are difficult, angry and easily offended (and often unreasonably so).

Scout leaders have been told to avoid referring to children as boys and girls to ensure transgender members are not offended. (Mail on Sunday)

A transgender woman is demanding an apology and £2,500 compensation after claiming she was called “sir” by rail company staff. (Times, March 16, 2019)

It’s not a new representation. I saw the same thing when I looked at news stories about gay people in the early 2000s, Muslims in the 2000s and feminists in the 1990s and 2000s. Another representation (also applied to gay people) was to link trans people with crime, connecting them to words like killer, prisoner, lag, criminal, murderer, rapist, jail and kill. These words occurred with trans(gender) 3 times in 2012, but 608 times in 2018-19.

‘It’s crazy to give trans prisoners everything they say they want,’ said chair Janice Williams. ‘Why wouldn’t they lie in the circumstances?’ (Daily Mail)

Women’s jail holds trans lag born lad (The Sun, September 13, 2019)

Some of the trans brigade advocate the murder of Terfs as the best course. (Telegraph, 12 January 2019)
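The co-occurrence counts reported in this section come from corpus software, but the underlying idea can be sketched in a few lines of Python. What follows is only an illustration under stated assumptions: the function name, the window of five words either side, the simple lower-case tokenisation and the lack of lemmatisation are all choices made for the sketch, not a description of the settings actually used in the study.

```python
import re

def count_cooccurrences(texts, node_words, target_words, window=5):
    """Count target words within `window` tokens either side of a node word."""
    node_words = set(node_words)
    target_words = set(target_words)
    total = 0
    for text in texts:
        # Crude tokenisation: lower-case alphabetic strings only.
        tokens = re.findall(r"[a-z]+", text.lower())
        for i, token in enumerate(tokens):
            if token in node_words:
                left = tokens[max(0, i - window):i]
                right = tokens[i + 1:i + 1 + window]
                total += sum(1 for t in left + right if t in target_words)
    return total

# Two of the press examples quoted above, used here as a tiny test corpus.
scout = ("Scout leaders have been told to avoid referring to children as boys "
         "and girls to ensure transgender members are not offended.")
sun = "Women's jail holds trans lag born lad"

nodes = {"trans", "transgender"}
print(count_cooccurrences([scout], nodes, {"offended", "outrage", "fury"}))
print(count_cooccurrences([sun], nodes, {"jail", "lag", "prisoner"}))
```

Because the sketch matches exact word forms, a form like demanding would not be caught by the word demand; real corpus tools usually offer lemma or wildcard searches for that reason.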

Transphobia, trans children and the trans lobby

What about more general contexts? Which topics are trans people discussed in relation to more, or less, these days? Here we see potentially a change for the better. Topics that now take up less space in the overall debate involve references to transvestites and ladyboys as well as discussion of implants, the clothing worn by trans people and their ability to “pass” as a particular gender. There’s less of the inappropriate prurient interest in trans people that’s associated with sitcom characters like Alan Partridge. In its place, the biggest area of growth is in stories relating to transphobia and discrimination, although there were also increases in references to transitioning, inclusivity and gender-neutral pronouns.

Lest we think that references to transphobia indicate that the press are overall more concerned about trans people being abused, a closer look indicates this is not always the case. Although such references are 112 times more frequent in 2018-19 compared to 2012, 15% of the 2018-19 mentions put the word transphobia in quotes, implying authorial distance or even rejection of the term.

A transgender teenager who demanded the removal of a female Labour member from her post as women’s officer over her allegedly “transphobic” views has been elected to the post in her local Labour party. (The Times, November 20, 2017)

I took 100 random cases of transphobia and related words like transphobe and looked at them in more detail. Approximately half (47) used the term to raise questions about its validity – either using the distancing quotes, referring to “supposed” or “alleged” transphobia, mentioning the way that the accusers behave (e.g. “howled down as transphobia”) or simply baldly stating that something is not transphobia.

An analysis of the term trans(gender) children found a slightly better picture. That term doesn’t occur in the distancing scare quotes – so the concept of trans(gender) children appears to be more accepted in the press than the concept of transphobia. An analysis of 100 random cases found 56 that accepted the existence of trans children and/or advocated that they should receive support. Thirty-seven cases were more disapproving, either suggesting that children who identify as trans should not be supported in transitioning or that efforts to support them (e.g. through pronoun stickers or gender-neutral toilets) are unnecessary, even unhelpful. A further seven cases appear more neutral, noting that this is an issue which divides people but not clearly coming down on either side. It’s very rare to find voices of trans(gender) children in these press articles.
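The "100 random cases" step used in the last two analyses is simple to reproduce. Here is a minimal sketch, assuming the concordance lines are held in a Python list; the function names and the category labels are hypothetical, and the coding itself is still done by a human reader, not by the code.

```python
import random
from collections import Counter

def sample_for_coding(concordance_lines, k=100, seed=0):
    """Draw k lines at random, without replacement, for manual coding."""
    rng = random.Random(seed)  # fixed seed so the sample can be re-drawn
    return rng.sample(concordance_lines, k)

def summarise_codes(codes):
    """Return {label: (count, percentage)} for a list of manual codes."""
    counts = Counter(codes)
    total = len(codes)
    return {label: (n, 100 * n / total) for label, n in counts.items()}

# The split reported above for trans(gender) children; labels are illustrative.
codes = ["supportive"] * 56 + ["disapproving"] * 37 + ["neutral"] * 7
for label, (n, pct) in summarise_codes(codes).items():
    print(f"{label}: {n} cases ({pct:.0f}%)")
```

Sampling without replacement matters here: it guarantees that no concordance line is read and coded twice.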

A final change relates to the increase in the phrase trans(gender) lobby. There were no mentions of this phrase in 2012. In contrast, 2018-19 saw 151 mentions of it, with over 90% of such cases writing about it in a negative way (e.g. as silencing debate, peddling politically-correct fallacies, being deranged or aggressively militant). The transgender lobby is described in somewhat contradictory terms across the press. At times, journalists go out of their way to stress that it is unimportant, referring to it as minuscule and doomed, yet at other times it is described as powerful, hegemonic and influential (with the implication that it should not be these things).


The UK press wrote over 6,000 articles about trans people in 2018-19. On the surface there appear to have been improvements – the more sexualising and joking uses of language around trans people have reduced since 2012 and there are many more stories around transphobia and inclusivity. However, there are large swathes of the press which write about these topics in order to be critical of trans people and many articles which consequently paint trans people as unreasonable and aggressive. The picture suggests that the conservative press and most of the tabloids have shifted from an openly hostile and ridiculing stance on trans people towards a carefully worded but still very negative stance.


Baker, P. (2014) ‘“Bad wigs and screaming mimis”: Using corpus-assisted techniques to carry out critical discourse analysis of the representation of trans people in the British press.’ In C. Hart and P. Cap (eds) Contemporary Critical Discourse Studies. London, Bloomsbury: 211-236.

Time to Celebrate: Trinity Lancaster Corpus

On Wednesday 30 October, The ESRC Centre for Corpus Approaches (CASS) organised a small get-together in its new location, Bailrigg House, to celebrate the research that is being carried out at the centre. Specifically, on this occasion, we wanted to highlight the Trinity Lancaster Corpus, a corpus of spoken learner English built in collaboration between Lancaster University and Trinity College London.

Caption: Cutting the cake with the Trinity Lancaster Corpus logo

We are really proud of the corpus, which is the largest learner corpus of its kind. It took us over five years to complete this part of the project. Here are a few numbers that describe the Trinity Lancaster Corpus:

  • Over 2,000 transcripts
  • Over 4.2 million words
  • Over 3,500 hours of transcription time
  • Over 10 L1 and cultural backgrounds
  • Up to four speaking tasks

A balanced sample of the corpus is now available for online searching via TLC Hub (password: Lancaster1964). To read more about the corpus and its development, check out this article in the International Journal of Learner Corpus Research:

Gablasova, D., Brezina, V., & McEnery, T. (2019). The Trinity Lancaster Corpus: Development, Description and Application. International Journal of Learner Corpus Research, 5(2), 126-158. [open access]

A new special issue of the journal featuring articles on various aspects of learner language, which use the Trinity Lancaster Corpus as their primary data source, is available from this link.

Caption: Table of contents of the special issue of the International Journal of Learner Corpus Research

Caption: A cake to celebrate the Trinity Lancaster Corpus

Caption: Celebrations at CASS (posters featuring research on TLC in the background)

New Senior Research Associate in CASS: Isobelle Clarke

My name is Isobelle Clarke. I am the newest member of CASS. This is my first academic position outside of education. I am so excited about being a part of CASS, not just because I can tell all my family that I FINALLY have a job, but also because the research environment here is buzzing and thriving (but I don’t need to tell you that)! My major research areas include corpus linguistics, forensic linguistics, discourse analysis and uncovering patterns of language variation and change. Lancaster University, especially CASS, is certainly the place that covers all of these areas… I can honestly say I feel at home.

I have been appointed here as a Senior Research Fellow on the project investigating the Discourses of Islam in the press with Tony McEnery and Gavin Brookes that is partially funded by the Aziz Foundation. Our task is to extend the work of Baker, Gabrielatos and McEnery (2013) and Baker and McEnery (2019) that investigated the Discourses of Islam in the UK National Press from 1998-2014. We will be bringing that research up to the present day, examining the extent to which the Discourses have changed and how. Then, we will be comparing the representation of Islam in the national press with the local press and tracking the representations over time. We will also be looking at the representation of Islam on Twitter using various corpus and computational techniques. I am very excited to be working on this important project and in partnership with the Aziz Foundation. The aim is not just to scrutinise the media’s representation of Islam, but also to be proactive, making suggestions about the ways in which the media can report on Islam in a more neutral way. It is hoped that the findings from the project will also help to provide British Muslims with the tools to critically engage with public narratives, as we identify and describe why and how particular depictions of Islam can be biased and damaging.

I am passionate about understanding language that harms and so most of my research has focused on online abuse. I have just submitted my PhD dissertation under the supervision of Jack Grieve. I investigated and compared the major communicative styles of trolling tweets and general tweets. Spoiler alert: they are considerably more similar than they are different.  “Why?” You may ask. I have a few theories but I’ll save them until I hear the opinions of my examiners!

I have conducted research with Jack Grieve looking at the major communicative styles of abusive language and hate speech on Twitter, and we have also investigated the major communicative styles of Donald Trump’s tweets and tracked their use over time, especially during the campaign.

On a more personal note, here are some facts about me: I love hummus… like big time! I make my own and I have a recipe, which can be found here. My other love is birds. My favourite bird is the marsh warbler because it is the bird that can mimic the most bird songs. Essentially, it is the bird version of Robin Williams in Mrs. Doubtfire (“I do voices”) or Ariana Grande (for the younger generation who haven’t seen Mrs. Doubtfire – #shameonyou).

I also love Harry Potter – for any language lover let me take you to the scene in the final Deathly Hallows film where Albus Dumbledore and Harry Potter are in the spiritual world (King’s Cross Station) after Harry has met his fate with Voldemort. Dumbledore says the most amazing thing. He says: “Words are in my not so humble opinion our most inexhaustible source of magic, capable of both inflicting injury and remedying it.”


So long story short… Spread hummus not hate!


Baker, P., Gabrielatos, C. and McEnery, T. (2013) Discourse Analysis and Media Attitudes. Cambridge: Cambridge University Press.

Baker, P. and McEnery, T. (2019) The value of revisiting and extending previous studies: The case of Islam in the UK press. In R. Scholz (ed.) Quantifying Approaches to Discourse for Social Scientists, pp. 215-250. Palgrave Macmillan.

CASS is strengthening its links with colleagues at the University of Mosul in Iraq

As reported in the media, in recent months we have been delighted to support staff and students at the University of Mosul in Iraq who are rebuilding the Department of English after the devastation caused by the so-called Islamic State group. Via the CorpusMOOC and other forms of long-distance support, we have begun to interact with colleagues in Mosul, and to appreciate both the size of the task ahead of them and their determination to succeed. We are now in the process of arranging a month-long visit to Lancaster from two Mosul academics, so that we can strengthen our ties, including by exploring joint projects. Watch this space for updates on the visit and our future joint activities.

What is corpus stats about? A new book on Statistics in Corpus Linguistics has been published

This practical guide will equip the reader to understand the key principles of statistical thinking and apply these concepts to their own research, without the need for prior statistical knowledge. The book provides step-by-step guidance through the process of statistical analysis and offers multiple examples of how statistical techniques can be used to analyse and visualize linguistic data. It also includes a useful selection of discussion questions and exercises. The book comes with a Companion website, which provides additional materials (answers to exercises, datasets, advanced materials, teaching slides etc.) and Lancaster Stats Tools online, a free click-and-analyse statistical tool for easy calculation of the statistical measures discussed in the book.

British National Corpus 2014: A sociolinguistic book is out

Have you ever wondered what real spoken English looks like? Have you ever asked whether people from different backgrounds (based on gender, age, social class etc.) use language differently? Have you ever thought it would be interesting to investigate how much English has changed over the last twenty years? All these questions can be answered by looking at language corpora such as the Spoken BNC 2014 and analysing them from a sociolinguistic perspective. Corpus Approaches to Contemporary British Speech: Sociolinguistic Studies of the Spoken BNC2014 is a book which offers a series of studies that provide a unique insight into a number of topics ranging from Discourse, Pragmatics and Interaction to Morphology and Syntax.

This is, however, only the first step. We are hoping that there will be many more studies to come based on this wonderful dataset. If you want to start exploring the Spoken BNC 2014 corpus, it is just three mouse clicks away:

Get access to the BNC2014 Spoken

  1. Register for free and log on to CQPweb.
  2. Sign up for access to the BNC2014 Spoken.
  3. Select ‘BNC2014’ in the main CQPweb menu.

Also, right now there is a great opportunity to take part in the written BNC 2014 project, a written counterpart to the Spoken BNC2014. If you’d like to contribute to the written BNC2014, please check out the project’s website for more information.

CASS: Five more years

We are delighted to announce that CASS has been awarded £2.5 million funding from the Economic and Social Research Council (ESRC) and Lancaster University to continue existing activities and pursue a new research programme for five more years, from April 2018 to March 2023.

The funding, which includes £750,000 from the ESRC, will be used to maximise the economic and societal impact of the research carried out in the first phase of the Centre, particularly in the areas of: Corporate Communications; Climate Change and Maritime Security; Language Development, Disorders and Environment; and Spoken Learner Language.

In addition, a new research programme will extend the facilitative and transformative power of corpus methods to the study of health (care) communication, in the following areas:

  • Language and mental health (including: communication about anxiety disorder; presentation and diagnosis of psychosis; depression in users of social media);
  • Communicating and diagnosing chronic pain;
  • Media representations of obesity;
  • English language assessment and training for medical professionals.

The Centre will also continue to create new openly accessible corpora, extend the existing programme of methodological and technological innovation, especially through #LancsBox and CQPweb, and continue to disseminate methods and tools through the Corpus MOOC, Summer Schools and free workshops in the UK and internationally.

The new CASS team brings together 15 scholars from different disciplines at Lancaster University and two collaborating institutions: Durham University and University College London (see below).

Two postdoctoral Research Associates will also be recruited to work with the rest of the team for the next five years.

CASS Director Professor Elena Semino said: “We are absolutely delighted to have been awarded five more years of funding by the ESRC and grateful to the University for its part in supporting the Centre.

“This award will ensure that the work we have done so far achieves its full potential in terms of societal impact, and will enable us to carry out new research on communication about illness and healthcare.”

CASS is one of eight established research centres awarded a total of £6.9m to continue their work under a new funding model designed to secure the long-term sustainability of social science research excellence in the UK.

Watch this space for updates on the Centre’s work and the release of new tools and corpora!

The CASS team from April 2018:

Principal Investigator:
Elena Semino – Linguistics and English Language (Lancaster University)

Andrew Hardie – Linguistics and English Language (Lancaster University)
Paul Baker – Linguistics and English Language (Lancaster University)
Vaclav Brezina – Linguistics and English Language (Lancaster University)
Dana Gablasova – Linguistics and English Language (Lancaster University)
Claire Hardaker – Linguistics and English Language (Lancaster University)
John Pill – Linguistics and English Language (Lancaster University)
Dimitrinka Atanasova – Linguistics and English Language (Lancaster University)

Basil Germond – Politics, Philosophy and Religion (Lancaster University)
Garrath Williams – Politics, Philosophy and Religion (Lancaster University)

Kate Cain – Psychology (Lancaster University)
Steve Young – Accounting and Finance (Lancaster University)

Angela Woods – English Studies and Hearing the Voice project (Durham University)
Joanna Zakrzewska – University College London Hospitals

Zsófia Demjén – UCL Centre for Applied Linguistics (University College London)