Corpus compilation: working paper now available

We are pleased to announce that the CASS Corpus on Urban Violence in Brazil is now ready to be analysed. It contains a total of 5,127 articles (1,778,282 words) published between January and December 2014 by four Brazilian newspapers: Folha de São Paulo, Estado de São Paulo, Zero Hora and Pioneiro.

This working paper explains the process of compiling the corpus. It describes the selection of sources and individual texts and the preparation of the texts so that they can be processed using corpus linguistics techniques, and it concludes with an overview of the corpus’s content.

Changing Climates and the Media: Lancaster workshop

The Lancaster workshop on Changing Climates and the Media took place last Monday (21st Sep 2015). This was a joint event organised by the ESRC Centre for Corpus Approaches to Social Science (CASS) and the Department of Sociology, Lancaster University.

The workshop brought together leading academics from a wide range of disciplines – sociology, media studies, political and environmental sciences, psychology, and linguistics – as well as community experts from the Environment Agency and the Green Alliance. The result was a lively debate on the interaction between the news media and British society, and a critical reflection on people’s perception of the problem and on effective ways to communicate the issue and promote changes in behaviour and practices.

Professor John Urry from Lancaster University opened the event with a brief overview of the major challenges posed by climate change. He also introduced the CASS project on Changing Climates, a corpus-based study of how climate change issues have been debated in the British and Brazilian news media in the past decade. This contrastive analysis is interesting for various reasons, among them the striking differences in public perception of the problem. While climate-change scepticism is prominent within the public debate in Britain, Brazil is a leading country in terms of concern about climate change, with nine in ten Brazilians considering global warming a very serious problem. Dr Carmen Dayrell presented some examples of fundamental differences between the media debate in these two countries. Unlike the British press, Brazilian newspapers articulate the discourse along the same lines as those advocated by the IPCC. This includes stressing the position of developed and developing nations and the projected consequences of the impact of climate change on the Earth’s system, such as the melting of polar icefields, loss of biodiversity and increased frequency of extreme weather events.

The Changing Climates project is currently being extended to Germany and Italy. Dr Marcus Müller from the Technische Universität Darmstadt discussed his preliminary findings on how the German news media has represented climate change issues. Dr M. Cristina Caimotto and Dr Osman Arrobbio from the University of Turin presented their initial observations of the Italian context and data. The Changing Climates presentation concluded with insightful comments by Dr Glenn Watts, the Environment Agency’s research lead on climate change and resource use, and Lancaster’s primary partner in the Changing Climates project.

The afternoon session explored climate change from various perspectives. It started with Professor Reiner Grundmann from the University of Nottingham, who presented corpus research on the media coverage of climate change across Britain, Germany, France and the US. Dr James Painter from the University of Oxford and Dr Neil Gavin from the University of Liverpool focused on the coverage of the UN IPCC reports in the news media and on television, respectively.

The focus then turned to the British parliament and the 2009 debate on the Climate Change Bill. How do politicians talk about climate change in public? This question was addressed by Rebecca Willis, a PhD candidate at Lancaster University and a member of the Green Alliance. Following that, Dr Neil Simcock, also from Lancaster University, explored the representations of ‘essential’ energy use in the UK media. The session concluded with a talk by Professor Alison Anderson from Plymouth University on the role of local news media in communicating climate change issues.

Our sincere thanks to all participants of the Lancaster workshop for making it a unique and very special event. This was an excellent opportunity to exchange ideas and share experiences which we hope will foster enhanced collaboration between the various disciplines.


Jonathan Culpeper talking ‘Sarcasm’ tonight on The One Show

Sarcasm is one of the phenomena that seem to have endless fascination for British people, partly because they are stereotypically associated with it. When did sarcasm first start? Is there something about British culture that makes it flourish? And what is sarcasm anyway? These are some of the questions that Gyles Brandreth of BBC 1’s The One Show puts to Jonathan Culpeper in an item reflecting on sarcasm in Britain.

Tune in to BBC1 tonight at 19:00 to hear CASS co-investigator Prof. Jonathan Culpeper discussing sarcasm on The One Show.

Update on Changing Climates

The Changing Climates project is a corpus-based investigation of discourses around climate change. It aims to examine how climate change has been framed in the media coverage across Britain and Brazil in the past decade. Here, we look at two different scenarios. Recent surveys have shown that climate change is currently considered a high-priority concern within Brazil, with the country showing a higher degree of concern than almost anywhere else. By contrast, climate change scepticism is increasingly prominent in the British public sphere.

We are pleased to announce that we have just finished collecting the data. The Brazilian corpus contains about 8 million words, comprising texts from 12 newspapers. The British corpus is much larger. It has nearly 80 million words and includes texts published by all major British broadsheet and tabloid papers.

CASS affiliated papers to be given at the upcoming 5th International Language in the Media Conference

In two weeks, several scholars affiliated with the Centre will be heading south to attend the 5th International Language in the Media Conference, taking place this year at Queen Mary, University of London. We are particularly excited about the theme — “Redefining journalism: Participation, practice, change” — as well as the conference’s continued prioritization of papers on “language and class, dis/ability, race/ethnicity, gender/sexuality and age; political discourse, commerce and global capitalism” (among other important themes). As a taster for those of you who will be joining us in London and an overview for those who are unfortunately unable to make it this year, abstracts of the CASS affiliated papers to be given at the conference are reproduced below.

“I hate that tranny look”: a corpus-based analysis of the representation of trans people in the national UK press

Paul Baker

In early 2013, two high-profile incidents involving press representation of trans people resulted in claims that the British press were transphobic. For example, Jane Fae wrote in The Independent that ‘the trans community… is now a stand-in for various minorities… and a useful whipping girl for the national press… trans stories are only of interest when trans folk star as villains’ (1/13/13). This paper examines Fae’s claims by using methods from corpus linguistics in order to identify the most frequent and salient representations of trans people in the national UK press. Corpus approaches use computational tools as an aid in human research, offering a good balance between quantitative and qualitative analyses. My analysis is based upon previous corpus-based research where I have examined the construction of gay people, refugees and asylum seekers, and Muslims in similar contexts.

Using a 660,000-word corpus of news articles about trans people published in 2012, I employ concordancing techniques to examine collocates and discourse prosodies of terms like transgender, transsexual and tranny, in order to identify repetitive patterns of representation that occur across newspapers. I compare such patterns to sets of guidelines on language use by groups like The Beaumont Society, and discuss how certain representations can be enabled by the Press Complaints Commission’s Code of Practice. While the analysis found that there are very different patterns of representation around the three labels under investigation, all of them showed a general preference for negative representations, with occasional glimpses of more positive journalism.
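For readers unfamiliar with concordancing, the following is a minimal sketch of a keyword-in-context (KWIC) display, the basic view from which collocates and discourse prosodies are usually read off. It is an illustration only and not the tool used in the paper: the function name `kwic`, the crude regular-expression tokeniser and the two sample sentences are all invented for this example.

```python
import re


def kwic(text, node, width=4):
    """Return (left context, node, right context) for each hit of `node`.

    Tokenisation is a deliberately crude assumption: lower-cased runs of
    letters and apostrophes. Dedicated concordancers tokenise more carefully.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Invented sample text, not taken from the corpus described above.
sample = (
    "The report used the term transgender throughout. "
    "Campaigners said the word transgender was preferred by readers."
)

# Print each hit with the node word aligned in a central column.
for left, node, right in kwic(sample, "transgender", width=3):
    print(f"{left:>30} | {node} | {right}")
```

Aligning every occurrence of the node word in one column is what lets an analyst scan hundreds of lines for recurring patterns to its left and right.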

“I think we’d rather be called survivors”: A corpus-based critical discourse analysis of the semantic preferences of referential strategies in Hurricane Katrina news articles as indicators of ideology

Amanda Potts

In times of great crisis, people often rely upon the discourse of powerful institutions to help frame experiences and reinforce established ideologies (van Dijk 1985). Selection of referential strategies in such discourses can reveal much about our society; for instance, some words have the power to comfort addressees but further oppress the referents. Taking a corpus-based critical discourse analytical approach, in this paper I explore the discursive cues of underlying ideology (of both the publications and perhaps the assumed audience) with special attention to journalists’ referential and predicational strategies (Reisigl and Wodak 2000). Analysis is based on a custom-compiled 36.7-million-word corpus of American print news articles concerning Hurricane Katrina.

A variety of forms of reference have been identified in the corpus using part-of-speech tagged word lists. Collocates of each form of reference have been calculated and automatically assigned a semantic tag by the UCREL USAS tagger (Archer et al. 2002). Semantic categories represented by the highest proportion of collocates overall have been identified as the most salient indicators of ideology.
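The procedure described above (collect collocates of a form of reference, assign each a semantic tag, then rank categories by their share of collocates) can be sketched roughly as follows. This is a toy illustration under loud assumptions: `TOY_TAGS` is a hand-made stand-in lexicon and not the UCREL USAS tagger, the sample sentences are invented, collocates are taken as raw window co-occurrences with no statistical association measure, and proportions are computed over collocate tokens, which may differ from the counting used in the paper.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon standing in for a semantic tagger.
# The letters echo broad category labels (N: numbers, M: movement,
# S: social), but the word-to-tag assignments are invented for this sketch.
TOY_TAGS = {
    "thousands": "N", "hundreds": "N", "many": "N",
    "fled": "M", "returned": "M", "moved": "M",
    "helped": "S", "stranded": "S",
}


def collocates(tokens, node, span=3):
    """Count words occurring within `span` tokens either side of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts


def category_proportions(colloc_counts, lexicon):
    """Share of tagged collocate tokens that fall in each category."""
    tagged = Counter()
    for word, n in colloc_counts.items():
        if word in lexicon:
            tagged[lexicon[word]] += n
    total = sum(tagged.values())
    return {cat: n / total for cat, n in tagged.items()} if total else {}


# Invented sample text, not taken from the Hurricane Katrina corpus.
sample = (
    "Thousands of evacuees fled the city. "
    "Hundreds of evacuees returned later. "
    "Many evacuees were helped by volunteers."
)
tokens = re.findall(r"[a-z]+", sample.lower())
counts = collocates(tokens, "evacuees", span=2)
props = category_proportions(counts, TOY_TAGS)

# The category with the largest share is the one this sketch would
# report as the node word's semantic preference.
print(max(props, key=props.get), props)
```

A real analysis would filter collocates with an association statistic and a frequency threshold before tagging. The sketch only shows how the tagging and ranking steps fit together.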

The semantic preferences of the referential strategies are found to be quite distinct. For instance, resident prefers the M: Movement semantic category, whereas collocates of evacuee tend to fall under N: Numbers. This may prime readers to interpret Gulf residents and evacuees as large, threatening, ‘invading’ masses (often in conjunction with negative water metaphors such as flood). The highest collocate semantic category for victim, displaced, and survivor is S: Social actions, states and processes, indicating that the [social] experiences of these referents, such as being helped or stranded, or linked to social identities such as wife, are foregrounded rather than their numbers or movement.

Finally, the plummeting frequency of refugee following a unique debate in the media over the word’s meaning and even its semantic preference will also be discussed as an illustrative example of how unconscious language patterns can sometimes come to the fore in contested usage and influence the journalistic lexicon. Following from this, a more considered use of referential strategies is recommended, particularly in the media, where this could encourage heightened compassion for, and understanding of, those gravely affected by catastrophic events.

Journalism through the Guardian’s goggles

Anna Marchi

‘Journalism is an intensely reflexive occupation, which constantly talks to and about itself’ (Aldridge and Evetts 2003: 560). Journalists create interpretative communities (Zelizer 2004) through the discourses they circulate about their profession; the meaning and role of journalism are constituted through daily performance (Matheson 2003) and can be studied by means of the self-reflexive traces in texts. That is, they can be detected and studied in a newspaper corpus.

This paper proposes a corpus-assisted discourse analysis (Partington 2009) of the ways journalists represent their trade in their own news-work. The focus of the research is on one newspaper in particular: the Guardian. Previous research (Marchi and Taylor 2009) suggested that among British broadsheets the Guardian is by far the most interested in other media, as well as the most inclined to talk about itself. Using newspaper data from 2005, a particularly relevant year in the newspaper’s biography (it changed format from traditional broadsheet to Berliner) and rich with self-reflexivity, I examine the discursive behaviour of media-related lexical items in the corpus (such as journalist, reporter, hack, media, newspaper, press, tabloid), exploring the ways in which the Guardian conceptualises the role of the news media, how it represents professional values and the divide between good and bad journalism, and, ultimately, how it constructs its own identity. The study relies on the typical tools of corpus linguistics research – collocation analysis, keywords analysis, concordance analysis – and aims at a comprehensive description of the data, following the principle of total accountability (McEnery and Hardie 2012: 17), while keeping track of the broader extralinguistic context. From a methodological point of view, this work encourages interdisciplinary contamination and a serendipitous approach to the data, and wishes to offer an example of how corpus-based research can contribute to the academic investigation of journalism across disciplines.

Writing for the press: the deleted scenes

In late July and early August 2013, the stories of Caroline Criado-Perez, the bomb threats, and latterly, the horrific tragedy of Hannah Smith broke across the media, and as a result, the behaviour supposedly known as “trolling” was pitched squarely into the limelight. There was the inevitable flurry of dissections, analyses, and opinion pieces, and no doubt like any number of academics in similar lines of work, I was asked to write various articles on this behaviour. Some I turned down for different reasons, but one that I accepted was for the Observer. (Here’s the final version that came out in both the Observer and the Guardian.)

Like the majority of people, I have been mostly in the dark about how the media works behind the scenes. That said, throughout my time at university, I have studied areas like Critical Discourse Analysis and the language of the media, and over the past three years, my work has been picked up a few times in small ways by the media, so I probably had a better idea than many. I realise now, however, that even with this prior knowledge, I was still pretty naive about the process. I wasn’t too surprised, then, when I got a number of comments on the Observer article raising exactly the sorts of questions I too would have asked before I’d gone through what I can only describe as a steep media learning curve. There were, essentially, three main issues that kept recurring:

(1) Why didn’t you talk about [insert related issue here]? This other thing is also important!

(2) Why didn’t you define trolling properly? This isn’t what I’d call trolling!

(3) Why did you only mention the negative types of trolling? There are good kinds too!

All three questions are interrelated in various ways, but I’ve artificially separated them out because each gives me a chance to explain something that I’ve learned about what happens behind the scenes during the process of producing media content.

