Coming this year: Corpora and Discourse Studies (Palgrave Advances in Language and Linguistics)

Three members of CASS have contributed chapters to a new volume in the Palgrave Advances in Language and Linguistics series. Corpora and Discourse Studies will be released later this year.


The growing availability of large collections of language texts has expanded our horizons for language analysis, enabling the swift analysis of millions of words of data, aided by computational methods. This edited collection of chapters contains examples of such contemporary research which uses corpus linguistics to carry out discourse analysis. The book takes an inclusive view of the meaning of discourse, covering different text-types or modes of language, including discourse as both social practice and as ideology or representation. Authors examine a range of spoken, written, multimodal and electronic corpora covering themes which include health, academic writing, social class, ethnicity, gender, television narrative, news, Early Modern English and political speech. The chapters showcase the variety of qualitative and quantitative tools and methods that this new generation of discourse analysts are combining, offering a set of compelling models for future corpus-based research in discourse.

Table of Contents:

  1. Introduction; Paul Baker and Tony McEnery
  2. E-Language: Communication in the Digital Age; Dawn Knight
  3. Beyond Monomodal Spoken Corpora: Using a Field Tracker to Analyse Participants’ Speech at the British Art Show; Svenja Adolphs, Dawn Knight and Ronald Carter
  4. Corpus-assisted Multimodal Discourse Analysis of Television and Film Narratives; Monika Bednarek
  5. Analysing Discourse Markers in Spoken Corpora: Actually as a Case Study; Karin Aijmer
  6. Discursive Constructions of the Environment in American Presidential Speeches 1960-2013: A Diachronic Corpus-assisted Study; Cinzia Bevitori
  7. Health Communication and Corpus Linguistics: Using Corpus Tools to Analyse Eating Disorder Discourse Online; Daniel Hunt and Kevin Harvey
  8. Multi-Dimensional Analysis of Academic Discourse; Jack A. Hardy
  9. Thinking About the News: Thought Presentation in Early Modern English News Writing; Brian Walker and Dan McIntyre
  10. The Use of Corpus Analysis in a Multi-perspectival Study of Creative Practice; Darryl Hocking
  11. Corpus-assisted Comparative Case Studies of Representations of the Arab World; Alan Partington
  12. Who Benefits When Discourse Gets Democratised? Analysing a Twitter Corpus Around the British Benefits Street Debate; Paul Baker and Tony McEnery
  13. Representations of Gender and Agency in the Harry Potter Series; Sally Hunt
  14. Filtering the Flood: Semantic Tagging as a Method of Identifying Salient Discourse Topics in a Large Corpus of Hurricane Katrina Reportage; Amanda Potts

Big data media analysis and the representation of urban violence in Brazil: Kick-off meeting

The first meeting of the project took place earlier this month at CASS, Lancaster. This kick-off meeting brought together the Brazilian researchers Professors Heloísa Pedroso de Moraes Feltes (UCS) and Ana Cristina Pelosi (UNISC/UFC) and the CASS team (Professors Elena Semino and Tony McEnery, and Dr Carmen Dayrell) to plan the project’s activities and discuss the next steps.

The meeting was an excellent opportunity to discuss the partners’ roles and activities in the project and to clarify how CASS can provide the Brazilian researchers with the expertise needed in a corpus investigation. A key decision towards this goal was to run a two-day Workshop in Corpus Linguistics in Brazil. This will be run by the CASS team, with the additional expertise of Dr Vaclav Brezina, in the last week of May.

The workshop aims to reach a wider audience than the Brazilian research team alone. It will be open to their colleagues, graduate and undergraduate students, and anyone interested in learning and using corpus linguistics methods and tools in their research.

We are all looking forward to that!

Participate in our ESRC Festival of Social Sciences “Language Matters” event online

We are very pleased to announce an event that we are live streaming on YouTube and Google+ next week. We hope you can find time to attend online*; if not, the recording will be available on YouTube afterwards.

From 17:30 to 19:00 GMT on 4 November, the ESRC Centre for Corpus Approaches to Social Science is hosting a live event in association with the ESRC Festival of Social Sciences and in tandem with our popular FutureLearn course. We would be thrilled if you could ‘tune in’ and collaborate with us during “Language Matters: Communication, Culture, and Society”.

This evening is a mini-series of four informal talks showcasing the impact of language on society. These are presented by some leading names in corpus linguistics (including the CASS Principal Investigator, Tony McEnery) and their talks draw upon the most popular themes in our corpus MOOC:

– What can corpora tell us about learning a foreign language? (with Vaclav Brezina)
– A ‘battle’, a ‘journey’, or none of these? Metaphors for cancer (with Elena Semino)
– Wolves in the wires: online abuse from people to press (with Claire Hardaker)
– Words ‘yesterday and today’ (with Tony McEnery, Claire Dembry, and Robbie Love)

Though we pride ourselves on bringing interesting, accessible material to people on the go, what really brings these events to life is the interaction that we have with attendees. That’s why we invite you to log in and contribute to the discussions taking place after each presentation.

There are two ways to virtually attend.

First, if you have a Google account, you can join via Google Hangout. Sign up at https://plus.google.com/events/ca15afbicmmeiu6d25pn1qbverg and then log in from 17:15 GMT on 4 November to greet your fellow participants.

If you don’t have a Google account, you can watch us on YouTube at https://www.youtube.com/watch?v=hF_fl95tiSk with no registration.

We’ll be taking questions from the Google Hangout and from the #corpusMOOC hashtag on Twitter (particularly for those viewing on YouTube) and mixing these in with questions from our live audience.

We hope that you can take advantage of this event by participating online.


* If you are available, located in the London area, and would like to attend in person, please visit our event website to register.

Spoken BNC2014 project announcement

We are excited to announce that the ESRC-funded Centre for Corpus Approaches to Social Science (CASS) at Lancaster University and Cambridge University Press have agreed to collaborate on the compilation of a new, publicly accessible corpus of spoken British English called the ‘Spoken British National Corpus 2014’ (the Spoken BNC2014).

The aim of the Spoken BNC2014 project, which will be led jointly by Lancaster University’s Professor Tony McEnery and Cambridge University Press’ Dr Claire Dembry, is to compile a very large collection of recordings of real-life, informal, spoken interactions between people whose first language is British English. These will then be transcribed and made available publicly for a wide range of research purposes.

We aim to encourage people from all over the UK to record their interactions and send them to us as MP3 files. For each hour of good quality recordings we receive, along with all associated consent forms and information sheets completed correctly, we will pay £18. Recordings do not each have to be one hour long; participants may submit two 30-minute recordings, or three 20-minute recordings, and will receive £18 for each hour in total.
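
As a rough illustration of the payment arithmetic described above, the following Python sketch totals the lengths of a participant's recordings and applies the £18 rate. The function name `payment_for` is hypothetical, and the decision to count only complete hours is an assumption made for the example; the announcement does not say how a leftover part-hour is treated.

```python
RATE_PER_HOUR_GBP = 18  # rate stated in the announcement


def payment_for(recording_minutes):
    """Return the payment in pounds for a list of recording lengths in minutes.

    Assumption (not stated in the announcement): only complete hours of
    total recording time are paid; any leftover part-hour is ignored.
    """
    total_minutes = sum(recording_minutes)
    complete_hours = total_minutes // 60
    return complete_hours * RATE_PER_HOUR_GBP


# Two 30-minute recordings and three 20-minute recordings each total one hour.
print(payment_for([30, 30]))
print(payment_for([20, 20, 20]))
```

Both calls print 18, matching the two examples given in the announcement.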

The collaboration between CASS at Lancaster University and Cambridge University Press brings together the best resources available for this task. Cambridge University Press has extensive experience of collecting very large English corpora, and it already has the infrastructure in place to undertake such a large compilation project. CASS at Lancaster University has the linguistic research expertise necessary to ensure that the Spoken BNC2014 will be as useful and accessible as possible for a wide range of purposes. The academic community will benefit from access to a new large spoken British English corpus that is balanced according to a selection of useful demographic criteria, including gender, age, and socio-economic status. This opens the door for all kinds of research projects, including the comparison of the Spoken BNC2014 with older spoken corpora.

CASS at Lancaster University and Cambridge University Press are very excited to launch the Spoken BNC2014 project, and we look forward to sharing the corpus as widely as possible once it is complete.

To contribute to the Spoken BNC2014 project as a participant, please email corpus@cambridge.org for more information.

Newby Fellow appointed to CASS

The Department of Linguistics and English Language has recently appointed a Newby Fellow, Dr. Helen Baker, to work on the CASS project entitled ‘Newspapers, Poverty and Long-Term Change. A Corpus Analysis of Five Centuries of Texts’.

Dr. Baker is a social historian who was awarded her Ph.D. in Russian History at the University of Leeds in 2002. Her thesis examined popular reactions to the Khodynka disaster, a stampede which took place during the coronation celebrations of Nicholas II in 1896. She taught Russian and European history at the University of Bradford before working as a teaching assistant in the Department of Russian and Slavonic Studies at the University of Leeds between 2003 and 2007.

Helen Baker has previously worked as a transcriber and historical researcher for the Department of Linguistics and English Language, completing a historical chronology of the Scottish Glencairn Uprising of 1653 for the British Academy-funded ‘Newsbooks at Lancaster’ project. This research sparked an interest in early modern history and she went on to investigate the lives of seventeenth-century English prostitutes. Her first book, co-authored with CASS Centre Director, Professor Tony McEnery, is forthcoming and uses early-modern prostitution as a case study to illustrate that historians and corpus linguists have much to gain through academic collaboration.

The project ‘Newspapers, Poverty and Long-Term Change’, which is funded by the Newby Trust, aims to assemble the largest ever corpora of newspapers and related material from 1473 to 1900 and use them to investigate changing discourses on poverty across this period. Dr. Baker will officially join the project on 1 July 2014, working with Professor Tony McEnery, Dr. Andrew Hardie, and Professor Ian Gregory.

The appointment will mean something of a home-coming for Helen Baker, who studied for her undergraduate degree in the History Department at Lancaster University between 1994 and 1997.

Coming to CASS to code: The first two months

After working at Waseda University in Japan for exactly 10 years, I was granted a one-year sabbatical in 2014 to concentrate on my corpus linguistics research. As my first choice of destination was Lancaster University, I was overjoyed to hear from Tony McEnery that the Centre for Corpus Approaches to Social Science (CASS) would be able to offer me office space and access to some of the best corpus resources in the world. I have now been at CASS for two months and thought this would be a good time to report on my experience here to date.

Since arriving at CASS, I have been working on several projects. My main project here is the development of a new database architecture that will allow AntConc, my freeware corpus analysis toolkit, to process very large corpora in a fast and resource-light way. The strong connection between applied linguistics and computer science at Lancaster has allowed me to work closely with some excellent computer science faculty and graduate students, including Paul Rayson, John Mariani, Stephen Wattam, and John Vidler. We just presented our first results at LREC 2014 in Reykjavik.

I’ve also been working closely with CASS members, including Amanda Potts and Robbie Love, to develop a set of ‘mini’ corpus tools to help with the collection, cleaning, and processing of corpora. I have now released VariAnt, which is a tool that finds spelling variants in a corpus, and SarAnt, which allows multiple search-and-replace functions to be carried out in a corpus as a batch process. I am also just about to release TagAnt, which will finally give corpus linguists a simple and intuitive interface to popular freeware Part-Of-Speech (POS) tagging tools such as TreeTagger. I am hoping to develop more of these tools to help corpus linguists at CASS and around the world with the complex and time-consuming tasks that they have to perform each day.

I always expected that I would enjoy my time at Lancaster, but did not anticipate that I would enjoy it as much as I am. Lancaster University has a great campus, the research facilities are some of the best in the world, the CASS members have treated me like family since the day I arrived, and even the weather has been kind to me, with sunny days throughout April and May. I look forward to writing more about my projects here at CASS.

Call for Participation: ESRC Summer School in Corpus Approaches to Social Science

The ESRC Summer School in Corpus Approaches to Social Science was inaugurated in 2013; the 2014 event is the second in the series. It will take place from 15th to 18th July 2014, at Lancaster University, UK.

This free-to-attend summer school takes place under the aegis of CASS (https://cass.lancs.ac.uk), an ESRC research centre bringing a new method in the study of language – the corpus approach – to a range of social sciences. CASS is investigating the use and manipulation of language in society in a host of areas of pressing concern, including climate change, hate crime and education.

Who can attend?

A crucial part of the CASS remit is to provide researchers across the social sciences with the skills needed to apply the tools and techniques of corpus linguistics to the research questions that matter in their own discipline. This event is aimed at junior social scientists – especially PhD students and postdoctoral researchers – in any of the social science disciplines. Anyone with an interest in the analysis of social issues via text and discourse – especially on a large scale – will find this summer school of interest.

Programme

The programme consists of a series of intensive two-hour sessions, some involving practical work, others more discussion-oriented.

Topics include: Introduction to corpus linguistics; Corpus tools and techniques; Collecting corpus data; Foundational techniques for social science data – keywords and collocation; Understanding statistics for corpus analysis; Discourse analysis for the social sciences; Semantic annotation and key domains; Corpus-based approaches to metaphor in discourse; Pragmatics, politeness and impoliteness in the corpus.

Speakers include Tony McEnery, Paul Baker, Jonathan Culpeper, and Elena Semino.

The CASS Summer School is one of the three co-located Lancaster Summer Schools in Interdisciplinary Digital Methods; see the website for further information:

http://ucrel.lancs.ac.uk/summerschool

How to apply

The CASS Summer School is free to attend, but registration in advance is compulsory, as places are limited.

The deadline for registrations is Sunday 8th June 2014.

The application form is available on the event website, as is further information on the programme.

Changing Climates: Crossing Boundaries

Last Friday (28th), CASS had the pleasure of hosting a cordial meeting at which researchers from CASS and the University of Bergen got together to discuss their ongoing research on discourses surrounding climate change.

The Norwegian team runs the NTAP project (Networks of Texts and People) which aims to explore the flow of information across online social networks with a view to understanding how knowledge develops and how opinion is shaped. Among other topics, the project examines the dynamics of discussions in the blogosphere around the various issues related to climate change.

Dag Elgesem and Andrew Salway – the principal investigator and scientific co-ordinator of the NTAP project respectively – provided an overview of the project’s main goals, current state, expectations and next steps. The technical consultant and programmer for the project, Knut Hofland, talked about the data and the process of collecting it, describing various issues encountered and decisions made along the way. Lubos Steskal, the project’s post-doctoral fellow, presented an interesting graphical representation of bloggers’ interactions which gives the researcher a clear indication of how communities are formed as well as whether and how they interact with each other. Samia Touileb presented a sample of her ongoing PhD project, which uses grammar induction techniques to capture typical expressions used in blogs that discuss climate change.

Tony McEnery and Carmen Dayrell represented CASS. Tony McEnery first provided a broad overview of the centre’s activities and staff by briefly mentioning its various projects. He also talked about some techniques commonly used in corpus-based discourse analysis to extract and manipulate the data. As expected, more attention was paid to the Changing Climates project. With the sociologist John Urry as its principal investigator, the project aims to contrast how climate change is discussed in the print news media in Britain and Brazil. Carmen Dayrell presented the current stage of the project. Her talk revolved around the composition of the corpora used in this study and a preliminary analysis of the data.

This was an excellent opportunity for these researchers to exchange ideas and experiences, expand horizons and learn about other approaches, perspectives, and views. We hope this first meeting will encourage and foster fruitful collaboration between these research teams.

Introducing CASS 1+3 Research Student: Robbie Love

In 2013, the ESRC Centre for Corpus Approaches to Social Science was pleased to award its inaugural 1+3 (Masters to PhD) studentship to Robbie Love. Read a bit about the first year of his postgraduate experience, in Robbie’s own words, below.


I am a Research Student at CASS in the first year of a 1+3 PhD studentship. My main role is to investigate methodological issues in the collection of spoken corpora, but I also have interests in corpus-assisted critical discourse analysis.

I grew up in the north east of England in Blyth, Northumberland and Forest Hall on the outskirts of Newcastle. At school I found equal enjoyment in studying both English language and mathematics, but when deciding what to take at university I couldn’t think of something that would satisfy both, so I went with language.

I moved to Lancaster in 2010 to study my BA in English Language, which I soon converted to Linguistics. It was only in my third year that I was introduced to corpus linguistics, and became fascinated with its potential for revealing things about the way we communicate which I would never have predicted. I also liked its combination of quantitative and qualitative analysis, so it seemed like the perfect way to reengage with my enjoyment of maths. I had always been open to the idea of postgraduate study so when the opportunity came up to join CASS under the supervision of Tony McEnery it felt like the best thing for me to do.

Since joining CASS in the summer last year I have worked on several interesting projects including the changing language of gay rights opposition in Parliamentary debates (with Paul Baker), comments on online newspaper articles (with Amanda Potts), and the representation of Muslim people and Islam in the press reaction to the 2013 Woolwich incident (with Tony McEnery). I will be presenting findings on the Woolwich project at the upcoming Young Linguists’ Meeting in Poznań.

When I’m not playing with words on a computer, I am usually found rehearsing for a play or musical, playing my keyboard or eating any and all varieties of hummus.


See our People page for a full list of the centre’s investigators, researchers, and students.

CASS awarded £200,000 from landmark ESRC Urgency Grant Scheme

CASS is delighted to announce a successful ESRC application for funding on a project entitled “Twitter rape threats and the discourse of online misogyny” (ES/L008874/1). The award of £191,245.25 was one of the first (possibly even the first) to be made as part of the ESRC’s new Urgency Grants scheme. Under this scheme, applications are assessed very quickly, and projects start within four weeks of a successful award. This particular project will begin in November and run for fourteen months. It will be part of CASS, and the team will comprise Claire Hardaker (PI), Tony McEnery (CI), Paul Baker (CI), Andrew Hardie (CI), Paul Iganski (CI), and two CASS-hosted research assistants.

This project will investigate the rape and death threats sent on Twitter in July and August 2013 to a number of high profile individuals, including MP Stella Creasy and journalist Caroline Criado-Perez. This project seeks to address the remarkable lack of research into such behaviour, especially in light of the fact that policymakers and legislators are under intense pressure to make quick, long-term decisions on relevant policy and procedure to allow enforcement agencies to act on this issue. Specifically, the project will investigate what the language used by those who send rape/death threats on Twitter reveals about…

  1. their concerns, interests, and ideologies; what concept do they seem to have of themselves and their role in society?
  2. their motivations and goals; what seems to trigger them? What do they seem to be seeking?
  3. the links between them and other individuals, topics, and behaviours; do they only produce misogynistic threats or do they engage in other hate-speech? Do they act alone or within networks?

The project will take a corpus approach, incorporating several innovative aspects, and it will produce results that should be relevant to several social sciences including sociology, criminology, politics, psychology, and law. It will also offer timely insight into an area where policy, practice, legislation, and enforcement are currently under intense scrutiny and require such research to help shape future developments. As such, the results will likely be of interest to legislators, policymakers, investigative bodies, and law enforcement agencies, as well as the study participants, media, and general public.