Corpus compilation: working paper now available

We are pleased to announce that the CASS Corpus on Urban Violence in Brazil is now ready to be analysed. It contains a total of 5,127 articles (1,778,282 words) published between January and December 2014 by four Brazilian newspapers: Folha de São Paulo, Estado de São Paulo, Zero Hora and Pioneiro.

This working paper explains the process of compiling the corpus. It describes the selection of sources and individual texts and the preparation of the texts so that they can be processed with corpus linguistics techniques, and it concludes with an overview of the corpus’s contents.

Big data media analysis and the representation of urban violence in Brazil: Kick-off meeting


The first meeting of the project took place earlier this month at CASS, Lancaster. This kick-off meeting brought together the Brazilian researchers Professors Heloísa Pedroso de Moraes Feltes (UCS) and Ana Cristina Pelosi (UNISC/UFC) and the CASS team (Professors Elena Semino and Tony McEnery, and Dr Carmen Dayrell) to plan the project’s activities and discuss the next steps.

The meeting was an excellent opportunity to discuss the partners’ roles and activities in the project and to clarify how CASS can provide the Brazilian researchers with the expertise needed for a corpus investigation. A key decision towards this goal was to run a two-day Workshop in Corpus Linguistics in Brazil. The workshop will be run by the CASS team, who will also draw on the expertise of Dr Vaclav Brezina, in the last week of May.

The workshop aims to reach a wider audience, not only the Brazilian research team. It will be open to their colleagues, graduate and undergraduate students, and anyone interested in learning about corpus linguistics methods and tools and using them in their research.

We are all looking forward to that!

New CASS project: Big data media analysis and the representation of urban violence in Brazil

A new project at CASS has been funded jointly by the UK’s Economic and Social Research Council and the Brazilian research agency CONFAP. The project will involve a collaboration between two Lancaster academics (Professors Elena Semino and Tony McEnery) and two Brazilian academics: Professor Heloísa Pedroso de Moraes Feltes (University of Caxias do Sul) and Professor Ana Cristina Pelosi (University of Santa Cruz do Sul and Federal University of Ceará). The team will employ corpus methods to investigate the linguistic representation of urban violence in Brazil.

Urban violence is a major problem in Brazil: the average citizen is affected by acts of violence, more or less directly, on a daily basis. This creates a general state of fear and insecurity among the population, but, at the same time, may promote a sense of empathy with the less privileged classes in Brazil. Urban violence is also a regular topic in daily conversations and news media, so that people’s perceptions of the nature of this phenomenon are partly mediated by discourse. In particular, daily press reports of acts of violence may affect people’s views and attitudes in ways which may or may not be consistent with the actual incidence, forms and causes of violence.

This collaborative project will investigate the linguistic representation of urban violence in Brazil by applying the methods of Corpus Linguistics to two corpora:

  1. The existing transcripts of two focus groups on living with urban violence conducted in Fortaleza, Brazil, for a total of approximately 20,000 words;
  2. A new 2-million-word corpus of news reports in the Brazilian press, to be constructed as part of the partnership.

The linguistic representation of urban violence in the two corpora will be investigated through the analysis of lexical and semantic concordances, collocational patterns and keywords. The two corpora will also be compared in order to identify similarities and differences in the types of violence that are primarily talked about and in how they are linguistically represented.
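
For readers new to these techniques, the following is a minimal sketch of what a concordance and a collocate count look like in practice. It is an illustration only: the announcement does not say which tools the project will use, the function names (tokenize, kwic, collocates) are hypothetical, and the sample sentence is an invented English toy text rather than material from either corpus. Keyword analysis, which compares word frequencies against a reference corpus, is not covered here.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def kwic(tokens, node, window=3):
    """Return (left context, node, right context) for each hit of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines


def collocates(tokens, node, window=3):
    """Count the words that occur within `window` tokens of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            span = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            counts.update(span)
    return counts


# Invented toy text for illustration only; not taken from either corpus.
sample = "Urban violence is rising. Urban violence frightens the population."
tokens = tokenize(sample)

# Print each occurrence of the node word with two words of context per side.
for left, node, right in kwic(tokens, "violence", window=2):
    print(f"{left:>15} [{node}] {right}")

# List the words most often found near the node word.
print(collocates(tokens, "violence", window=2).most_common(3))
```

Raw co-occurrence counts like these are only a starting point; corpus studies usually rank collocates with an association measure rather than by frequency alone.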

The comparative analysis of the two corpora will make it possible to explore in detail the relationships between official statistics about urban violence, media representations and citizens’ views. A better understanding of these relationships can help to alleviate the consequences of urban violence on citizens’ lives, and to foster attitudes conducive to the solution of the social problems that cause the violence in the first place.