Spoken BNC2014 project announcement

We are excited to announce that the ESRC-funded Centre for Corpus Approaches to Social Science (CASS) at Lancaster University and Cambridge University Press have agreed to collaborate on the compilation of a new, publicly accessible corpus of spoken British English called the ‘Spoken British National Corpus 2014’ (the Spoken BNC2014).

The aim of the Spoken BNC2014 project, which will be led jointly by Lancaster University’s Professor Tony McEnery and Cambridge University Press’ Dr Claire Dembry, is to compile a very large collection of recordings of real-life, informal, spoken interactions between people whose first language is British English. These will then be transcribed and made available publicly for a wide range of research purposes.

We aim to encourage people from all over the UK to record their interactions and send them to us as MP3 files. For each hour of good-quality recordings we receive, along with all associated consent forms and information sheets completed correctly, we will pay £18. A recording does not have to be one hour long; participants may submit two 30-minute recordings, or three 20-minute recordings, and they will receive £18 for each hour in total.

The collaboration between CASS at Lancaster University and Cambridge University Press brings together the best resources available for this task. Cambridge University Press is greatly experienced at collecting very large English corpora, and it already has the infrastructure in place to undertake such a large compilation project. CASS at Lancaster University has the linguistic research expertise necessary to ensure that the Spoken BNC2014 will be as useful and accessible as possible for a wide range of purposes. The academic community will benefit from access to a new large spoken British English corpus that is balanced according to a selection of useful demographic criteria, including gender, age, and socio-economic status. This opens the door for all kinds of research projects, including the comparison of the Spoken BNC2014 with older spoken corpora.

CASS at Lancaster University and Cambridge University Press are very excited to launch the Spoken BNC2014 project, and we look forward to sharing the corpus as widely as possible once it is complete.

To contribute to the Spoken BNC2014 project as a participant, please email corpus(Replace this parenthesis with the @ sign)cambridge.org for more information.

CASS visit to Ghana

From June 24th, three other members of CASS and I spent a week in Accra, Ghana, demonstrating corpus methods and our own research at two universities: the University of Ghana and the recently established Lancaster University Ghana campus in Accra. From the UK it’s just over a six-hour flight, although thankfully there is only one hour of time difference. However, travel did involve some advance preparation, with jabs for yellow fever (and a few other things), visa applications and taking anti-malarial pills for a month after the trip. Fortunately, we only encountered one mosquito during the whole trip and none of us were bitten.

Although close together, the two universities we visited have a very different feel to them. The former is a large university spread out over a lot of land, with many departments and buildings, while the latter is, at the moment, a three-storey modern-looking grey and red building with the familiar Lancaster logo on it.

Our first trip was to the University of Ghana, where Andrew, Tony and I each gave a lecture to about 90 members of staff and students. Tony talked about the theoretical principles behind corpus linguistics, I discussed (and problematized) sex differences in the British National Corpus, and Andrew showed applications of corpus linguistics to field linguistics using Corpus Workbench. The University of Ghana has some alumni members of Lancaster University and it was great to run into Clement Appah and Grace Diabah (formerly Bota) again.

Over the following two days, we gave corpus linguistics workshops, which included a two-hour lab session where Andrew walked students through setting up a CQPweb account and doing some analysis of the Brown Family of corpora. I suspect this was the highlight of the day for those who attended, who were pleased to get access to many of the corpora we have at Lancaster. Each day we taught about 35 people, including some who had travelled quite long distances to get to us. Four students had driven in that morning from Cape Coast – a journey that we covered part of when we went to Kakum National Park on our day off, and that took us over three hours – so we were impressed by their dedication. Tony gave an introduction to corpus linguistics, and Vaclav talked about the General Service List for English words and let the students use a tool he had developed for exploring it. I ended each day with a talk on corpus linguistics and discourse analysis.

As I mentioned, we had a day off, when we visited Kakum National Park. This gave us an opportunity to see more of Ghana on the drive there, and then we had a great experience in the park, walking across a 350 m network of rope bridges (the Kakum Canopy Walk) suspended high above the ground – you literally got a bird’s eye view of the tropical rainforest below. It was one of the most memorable experiences I’ve had, and I think we all came away with very positive feelings about our trip and are looking forward to our next visit to Ghana. I also hope that we managed to inspire people to incorporate some corpus linguistics methods into their own research.