Coming this year: Corpora and Discourse Studies (Palgrave Advances in Language and Linguistics)

Three members of CASS have contributed chapters to a new volume in the Palgrave Advances in Language and Linguistics series. Corpora and Discourse Studies will be released later this year.


The growing availability of large collections of language texts has expanded our horizons for language analysis, enabling the swift analysis of millions of words of data, aided by computational methods. This edited collection of chapters contains examples of such contemporary research which uses corpus linguistics to carry out discourse analysis. The book takes an inclusive view of the meaning of discourse, covering different text-types or modes of language, including discourse as both social practice and as ideology or representation. Authors examine a range of spoken, written, multimodal and electronic corpora covering themes which include health, academic writing, social class, ethnicity, gender, television narrative, news, Early Modern English and political speech. The chapters showcase the variety of qualitative and quantitative tools and methods that this new generation of discourse analysts are combining, offering a set of compelling models for future corpus-based research in discourse.

Table of Contents:

  1. Introduction; Paul Baker and Tony McEnery
  2. E-Language: Communication in the Digital Age; Dawn Knight
  3. Beyond Monomodal Spoken Corpora: Using a Field Tracker to Analyse Participants’ Speech at the British Art Show; Svenja Adolphs, Dawn Knight and Ronald Carter
  4. Corpus-assisted Multimodal Discourse Analysis of Television and Film Narratives; Monika Bednarek
  5. Analysing Discourse Markers in Spoken Corpora: Actually as a Case Study; Karin Aijmer
  6. Discursive Constructions of the Environment in American Presidential Speeches 1960-2013: A Diachronic Corpus-assisted Study; Cinzia Bevitori
  7. Health Communication and Corpus Linguistics: Using Corpus Tools to Analyse Eating Disorder Discourse Online; Daniel Hunt and Kevin Harvey
  8. Multi-Dimensional Analysis of Academic Discourse; Jack A. Hardy
  9. Thinking About the News: Thought Presentation in Early Modern English News Writing; Brian Walker and Dan McIntyre
  10. The Use of Corpus Analysis in a Multi-perspectival Study of Creative Practice; Darryl Hocking
  11. Corpus-assisted Comparative Case Studies of Representations of the Arab World; Alan Partington
  12. Who Benefits When Discourse Gets Democratised? Analysing a Twitter Corpus Around the British Benefits Street Debate; Paul Baker and Tony McEnery
  13. Representations of Gender and Agency in the Harry Potter Series; Sally Hunt
  14. Filtering the Flood: Semantic Tagging as a Method of Identifying Salient Discourse Topics in a Large Corpus of Hurricane Katrina Reportage; Amanda Potts

Centre Vacancy: Senior Research Associate

Linguistics & English Language
Salary: £32,277 to £37,394
Closing Date: Sunday 03 May 2015
Interview Date: To be confirmed
Reference: A1198

The Centre for Corpus Approaches to Social Science, funded by the ESRC, is seeking to appoint to an 18-month research position to work on ‘discourses on distressed communities’. This position is available from 1 May 2015 or as soon as possible thereafter.

You must have relevant research experience in corpus linguistics and an ability to engage with research within human geography.

You will pursue research on developing and applying existing and new approaches to the use of corpus linguistics within the social sciences. This will focus on discourses around the UK’s distressed communities, how these are represented in the news media, and whether areas with high levels of poverty and marginalization are represented differently from other areas. The project will centre on creating and analyzing a corpus of contemporary newspaper material. It will additionally draw on techniques being developed by the Spatial Humanities project, so knowledge of geographical information systems (GIS) or of analyzing census data is desirable, although training can be provided.

You will join an interdisciplinary team of internationally renowned researchers within the Departments of Linguistics and English Language and History. This project is supervised by Prof Ian Gregory within the overall Centre. You will be offered excellent career progression opportunities through the ESRC Centre.

Informal enquiries may be made to Professor Ian Gregory, i.gregory(Replace this parenthesis with the @ sign)lancaster.ac.uk

Further information on the Centre for Corpus Approaches to Social Science is available from: http://cass.lancs.ac.uk. The History Department’s website can be found at: http://www.lancaster.ac.uk/fass/history and the Spatial Humanities project’s website is at: http://www.lancaster.ac.uk/spatialhum

We welcome applications from people in all diversity groups.

Further details:

Apply through the Lancaster University website. 

Three CASS articles for special issue of Discourse & Communication available Open Access now

Discourse & Communication 9(2) will be an exciting Special Issue containing a number of articles which examine corpus-based approaches to the analysis of media discourse. CASS members Tony McEnery, Paul Baker, Amanda Potts, Mark McGlashan, and Robbie Love have contributed to three of these articles, all of which are now available for Open Access early download. Read abstracts of the articles below and follow links to download full PDFs of the works. More interesting papers are also available OnlineFirst for those with subscriptions to Discourse & Communication.


Picking the right cherries? A comparison of corpus-based and qualitative analyses of news articles about masculinity 

Paul Baker (Lancaster University, UK) and Erez Levon (Queen Mary University of London, UK)

As a way of comparing qualitative and quantitative approaches to critical discourse analysis (CDA), two analysts independently examined similar datasets of newspaper articles in order to address the research question ‘How are different types of men represented in the British press?’. One analyst used a 41.5 million word corpus of articles, while the other focused on a down-sampled set of 51 articles from the same corpus. The two ensuing research reports were then critically compared in order to elicit shared and unique findings and to highlight strengths and weaknesses between the two approaches. This article concludes that an effective form of CDA would be one where different forms of researcher expertise are carried out as separate components of a larger project, then combined as a way of triangulation.


How can computer-based methods help researchers to investigate news values in large datasets? A corpus linguistic study of the construction of newsworthiness in the reporting on Hurricane Katrina

Amanda Potts (Lancaster University, UK), Monika Bednarek (University of Sydney, Australia), and Helen Caple (University of New South Wales, Australia)

This article uses a 36-million word corpus of news reporting on Hurricane Katrina in the United States to explore how computer-based methods can help researchers to investigate the construction of newsworthiness. It makes use of Bednarek and Caple’s discursive approach to the analysis of news values, and is both exploratory and evaluative in nature. One aim is to test and evaluate the integration of corpus techniques in applying discursive news values analysis (DNVA). We employ and evaluate corpus techniques that have not been tested previously in relation to the large-scale analysis of news values. These techniques include tagged lemma frequencies, collocation, key part-of-speech tags (POStags) and key semantic tags. A secondary aim is to gain insights into how a specific happening – Hurricane Katrina – was linguistically constructed as newsworthy in major American news media outlets, thus also making a contribution to ecolinguistics.


Press and social media reaction to ideologically inspired murder: The case of Lee Rigby

Tony McEnery (Lancaster University, UK), Mark McGlashan (Lancaster University, UK), and Robbie Love (Lancaster University, UK)

This article analyses reaction to the ideologically inspired murder of a soldier, Lee Rigby, in central London by two converts to Islam, Michael Adebowale and Michael Adebolajo. The focus of the analysis is upon the contrast between how the event was reacted to by the UK National Press and on social media. To explore this contrast, we undertook a corpus-assisted discourse analysis to look at three periods during the event: the initial attack, the verdict of the subsequent trial and the sentencing of the murderers. To do this, we constructed and analysed corpora of press and Twitter coverage of the attack, the conviction of the suspects and the sentencing of them. The analysis shows that social media and the press are intertwined, with the press exerting a notable influence through social media, but social media not always being led by the press. When looking at social media reaction to such an event as this, analysts should always consider the role that the press are playing in forming that discourse.

New CASS Briefing now available — A ‘battle’ or a ‘journey’? Metaphors and cancer

A ‘battle’ or a ‘journey’? Metaphors and cancer. Metaphors matter because they ‘frame’ topics in different ways, which can affect our perception of ourselves and our experiences. The ‘battle’ metaphor for cancer has become controversial because of the framing it may impose on the patient’s experience; the ‘journey’ metaphor frames the cancer experience very differently. We were particularly concerned with whether and how different metaphors may place the patient in an ‘empowered’ or a ‘disempowered’ position, and with the resulting emotional associations.


New resources are being added regularly to the new CASS: Briefings tab above, so check back soon.

Big data media analysis and the representation of urban violence in Brazil: Kick-off meeting

The first meeting of the project took place earlier this month at CASS, Lancaster. This kick-off meeting brought together the Brazilian researchers Professors Heloísa Pedroso de Moraes Feltes (UCS) and Ana Cristina Pelosi (UNISC/UFC) and the CASS team (Professors Elena Semino and Tony McEnery, and Dr Carmen Dayrell) to plan the project’s activities and discuss the next steps.

The meeting was an excellent opportunity to discuss the partners’ role and activities in the project and to clarify how CASS can provide the Brazilian researchers with the expertise needed in a corpus investigation. A key decision towards this goal was to run a two-day Workshop in Corpus Linguistics in Brazil. This will be run by the CASS team, with additional expertise from Dr Vaclav Brezina, in the last week of May.

The workshop aims to reach a wider audience than the Brazilian research team alone. It will be open to their colleagues, graduate and undergraduate students, and anyone interested in learning and using corpus linguistics methods and tools in their research.

We are all looking forward to that!

Mahmoud El-Haj has recently joined CASS working on the ESRC funded project “Understanding Corporate Communications”

The project is a comprehensive analysis of the form, content and impact of communications between large, publicly traded corporations and their key stakeholder groups concerning the following three key aspects of corporate governance: i) compliance with governance requirements and recommendations (e.g. The Combined Code in the UK); ii) executive remuneration; and iii) senior management turnover.

Mahmoud is a Senior Research Associate at Lancaster University. His main research interests are natural language processing, corpus linguistics, information extraction, machine learning and computational linguistics. In his research he has worked with multidisciplinary, multilingual big data including financial narratives, news articles, medical journals, and data from social science and humanities. Mahmoud is also working with the School of Computing and Communications at Lancaster University on a UCREL-funded project on VardSourcing and SenseSourcing: the use of crowdsourcing to build lexicons and check spelling variation in historical data.

Recent publications and presentations related to this project include:

El-Haj, M., Rayson, P., Young, S., and Walker, M. “Detecting Document Structure in a Very Large Corpus of UK Financial Reports”. In The 9th edition of the Language Resources and Evaluation Conference, 26-31 May 2014, Reykjavik, Iceland. http://ucrel.lancs.ac.uk/cfie/El-HajEtAl_lrec14.pdf

Athanasakou, V., El-Haj, M., Rayson, P., Young, S., and Walker, M. “Computer-based Analysis of the Strategic Content of UK Annual Report Narratives”. In American Accounting Association Annual Meeting, August 2-6, 2014, Atlanta, USA. http://ucrel.lancs.ac.uk/cfie/AthanasakouEtAl_AAA_outline.pdf

IR Group Glasgow University, 2015 / School of Computing Science: Analysing UK Annual Report Narratives using Text Analysis and Natural Language Processing, Glasgow, Scotland. http://www.lancaster.ac.uk/staff/elhaj/docs/GlasgowTalk.pdf

Bangor University: PhD Training Session at Bangor Business School: Analysing Annual Report Narratives (co-presented with Steve Young, LUMS, Lancaster University), 2014, Bangor, Wales. http://www.lancaster.ac.uk/staff/elhaj/docs/bangorSlides.pdf

The 8th LSE/LUMS/MBS Conference 2014 / London School of Economics: Natural Language Processing of UK Annual Report Narratives (co-presented with Paul Rayson, SCC, Lancaster University), London, England. http://www.lancaster.ac.uk/staff/elhaj/docs/RaysonElHaj.pdf

New CASS Briefing now available — How to communicate successfully in English?

How to communicate successfully in English? An exploration of the Trinity Lancaster Corpus. Many speakers use English as their non-native language (L2) to communicate in a variety of situations: at school, at work or in other everyday situations. As well as needing to master the grammar and vocabulary of the English language, L2 users of English need to know how to react appropriately in different communicative situations. In linguistics, this aspect of language is studied under the label of “pragmatics”. This briefing offers an exploration of the pragmatic features of L2 speech in the Trinity Lancaster Corpus of spoken L2 production.

The spectre of Nazism haunts social media

Each time there is an upsurge in the Israel-Palestine conflict there is a rise in violent and other abusive incidents against Jews around the world. This phenomenon is now well-known. So it was in 2014 with Israel’s military operation ‘Protective Edge’ in July and August. Numerous backlash incidents against Jews in the UK and elsewhere in the world were reported by news media.

The conflict between Israelis and Palestinians has become a global phenomenon spreading from Gaza and the Occupied Territories of the West Bank into some of Europe’s major cities and other cities around the world. Jews are seemingly targeted as representatives for the State of Israel and attacked as proxies for the Israel Defence Force. It is a crude form of political violence.

In the UK we have the most robust data collected internationally on the problem of anti-Jewish incidents. Last year, such incidents reportedly more than doubled compared to 2013, according to a report published by the Community Security Trust.[1]

What was noticeable this last time around in the Israel-Gaza conflict of July and August 2014 was an apparent upsurge of abuse against Jews on social media. By the end of July 2014, some of the press were reporting an “explosion” of such abuse.

John Mann MP, the chair of the All-Party Parliamentary Group Against Antisemitism, instigated a parliamentary inquiry into the lessons that could be learned from the upsurge of anti-Jewish incidents associated with last year’s conflict. The report of that inquiry was published this week. It includes some of the key findings concerning anti-Jewish abuse on social media produced by a rapid response analysis commissioned from a team at Lancaster University — Paul Iganski and Abe Sweiry from the Lancaster University Law School, along with Mark McGlashan — as part of their work with the Lancaster University ESRC Centre for Corpus Approaches to Social Science.

We downloaded a sample of 22 million Tweets from July and August 2014 and carried out a detailed analysis of a sub-sample of 38,460 Tweets containing the words “Israel” or “Gaza”, along with the words “Jew”, “Jews” or “Jewish”.
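The sub-sampling step described above can be illustrated with a minimal Python sketch. It is not the team's actual pipeline: the function names (`in_subsample`, `build_subsample`) are hypothetical, and the choice of case-insensitive, whole-word matching is an assumption, since the post does not say how the search terms were matched. Under that assumption a form such as "Israeli" would not count as a match for "Israel".

```python
import re

# Search terms taken from the description above. Case-insensitive,
# whole-word matching is an assumption, not something the post specifies.
PLACE_TERMS = re.compile(r"\b(israel|gaza)\b", re.IGNORECASE)
GROUP_TERMS = re.compile(r"\b(jew|jews|jewish)\b", re.IGNORECASE)


def in_subsample(tweet: str) -> bool:
    """Return True if the tweet contains a place term AND a group term."""
    return bool(PLACE_TERMS.search(tweet)) and bool(GROUP_TERMS.search(tweet))


def build_subsample(tweets):
    """Keep only the tweets that satisfy both conditions, preserving order."""
    return [tweet for tweet in tweets if in_subsample(tweet)]
```

Calling `build_subsample` on a list of tweet texts returns the subset that would then be passed on to keyword, hashtag and collocation analysis.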

The results were very telling:

  • A keyword analysis – one of the core methods of corpus linguistics – showed that in the sub-sample analysed, the spectre of Nazism, with words such as “Hitler”, “Holocaust”, “Nazi” and “Nazis”, was present in the top 35 keywords for the downloaded sample. “Hitler” was mentioned 1117 times; “Holocaust” was mentioned in 505 tweets; and “Nazi” or “Nazis” were mentioned in 851 tweets.
  • The Nazi theme was also evident in hashtags analysed for the sub-sample, with the high frequency of the hashtags #hitler, #hitlerwasright, and #genocide.
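For readers unfamiliar with keyword analysis, the sketch below shows one common way of computing it: each word's frequency in a study corpus is compared with its frequency in a reference corpus, and words are ranked by a keyness score. The post does not say which statistic or reference corpus was used, so the log-likelihood measure here is an assumption, and the function names and the `top_n=35` default (echoing the "top 35 keywords" mentioned above) are purely illustrative.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Keyness score for one word (log-likelihood; an assumed statistic).

    a = frequency in the study corpus, b = frequency in the reference corpus,
    c = total tokens in the study corpus, d = total tokens in the reference corpus.
    """
    expected_study = c * (a + b) / (c + d)
    expected_ref = d * (a + b) / (c + d)
    score = 0.0
    if a > 0:
        score += a * math.log(a / expected_study)
    if b > 0:
        score += b * math.log(b / expected_ref)
    return 2 * score


def keywords(study_tokens, reference_tokens, top_n=35):
    """Rank words that are relatively more frequent in the study corpus."""
    study = Counter(study_tokens)
    reference = Counter(reference_tokens)
    c, d = sum(study.values()), sum(reference.values())
    scored = []
    for word, a in study.items():
        b = reference.get(word, 0)
        # Keep positive keywords only: over-represented in the study corpus.
        if a / c > b / d:
            scored.append((word, log_likelihood(a, b, c, d)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

In practice both token lists would come from tokenised, lower-cased text. As the post goes on to stress, a ranked list like this only indicates patterns, and the tweets themselves still have to be read in context.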

While they provide a very useful indication of patterns of discourse, keyword analysis and hashtag analysis alone are never sufficient: the contexts of the tweets in which the keywords and hashtags are situated need to be interpreted. Using the linguistic technique of collocation analysis, tweets that seemed to express negative sentiment targeted explicitly at ‘Jews’ were isolated and subjected to a closer reading. Sadly, little interpretation needed to be applied to our sample. The sentiments conveyed were stark:

  • Some contained explicit anti-Jewish invective which if shouted out on the streets – as does happen in many incidents – would clearly be racially or religiously aggravated public order offences.
  • Others wished violence upon Jews as proxies for Israelis, or simply just as Jews.
  • A number expressed the type of sentiment that “Hitler should have finished the job”. Some of these invoked Hitler to return for the task.
  • In other tweets, the use of gas chambers for Jews was invoked.
  • Others simply included Nazi-slogans.

Deep wounds are scratched when the Nazi-card is played in this way in discourse against Jews. Playing the Nazi-card is not simply abusive. It invokes painful collective memories for Jews and for many others. By using those memories against Jews it inflicts profound hurts. Those who play the Nazi-card know exactly what it means.

Reaction to the military practices of the Israeli state can be expressed in a variety of forceful and trenchant ways – none of which would be antisemitic. The hurts inflicted against Jews when the Nazi card is played cannot be written-off as collateral damage in the protest against Israel, just as the deaths and injuries of innocent Palestinian civilians cannot be written-off as the inevitable casualties of war. As Professor David Feldman, Director of the Pears Institute for the Study of Antisemitism, stated in his written evidence to the All-Party Parliamentary Inquiry Against Antisemitism, playing the Nazi-card with a statement such as ‘Hitler was right’, “invokes both a set of antisemitic stereotypes and a genocidal project targeted at Jews”.[2]

In the UK a sufficient statutory framework is arguably in place to prosecute the types of anti-Jewish abuse we identified, through proceedings under the Malicious Communications Act 1988 or the Communications Act 2003.[3] In such proceedings courts can treat the anti-Jewish abuse as racial or religious aggravation according to the Criminal Justice Act 2003. The inquiry’s recommendation that the Crown Prosecution Service should give consideration “to the suitability of existing guidance on communications sent via social media” and “that hate crime guidance material on grossly offensive speech be reviewed to clarify what amounts to ‘criminal acts’ that ‘will be prosecuted’”[4] is therefore opportune.


[1] Community Security Trust (2015) Antisemitic Incidents Report 2014, London: Community Security Trust, page 4.

[2] All-Party Parliamentary Group Against Antisemitism (APPG) (2015) Report of the All-Party Parliamentary Inquiry into Antisemitism, London: APPG, page 103.

[4] All-Party Parliamentary Group Against Antisemitism (APPG) (2015) Report of the All-Party Parliamentary Inquiry into Antisemitism, London: APPG, para. 13, page 114.