Spoken BNC2014 Symposium

On the afternoon of Monday 26th June, CASS hosted a special symposium to celebrate the upcoming public launch of the Spoken British National Corpus 2014, a corpus which members of CASS and Cambridge University Press have spent the last three years compiling.

More than fifty guests attended, representing a mixture of Lancaster Summer Schools participants, members of the CASS Challenge Panel, and those who travelled to Lancaster just for the day.

To kick off the symposium, CASS Centre Director Andrew Hardie said a few words about the history of Corpus Linguistics at Lancaster University, and put the compilation of a new BNC into context against previous developments in the field. He expressed his delight at the interest in the Spoken BNC2014 project as evidenced by the number of guests who were in attendance for the symposium.

I then gave the first talk alongside Claire Dembry (from Cambridge University Press) and Andrew Hardie, as representatives of the Spoken BNC2014 research team which also includes Vaclav Brezina and Tony McEnery. We discussed the main methodological decisions we made when thinking about the design, data collection, transcription and processing of the corpus. Andrew then gave a quick demonstration of the corpus in CQPweb, showing how features including speaker IDs, overlaps and attribution confidence are displayed in the interface.

Following our talk came the first of four research presentations, all of which used (the early access subset of) the Spoken BNC2014. The first of these was a talk by Karin Aijmer (University of Gothenburg) about the intensifier fucking, which went down very well with the audience. Karin's Spoken BNC2014 research, which also includes other intensifiers, will be published as a chapter in Brezina et al. (forthcoming).

After a short break for refreshments, Jacqueline Laws (University of Reading) presented research into verb-forming suffixation which she had undertaken with Chris Ryder and Sylvia Jaworska. Comparing the demographically-sampled component of the Spoken BNC1994 to the new Spoken BNC2014, she found that females now appear to produce more neologisms (e.g. favouritize, popify) than males. Laws et al.'s research will be published in a forthcoming special issue of the International Journal of Corpus Linguistics.

Susan Reichelt (Lancaster University) was next to present her work on producing sociolinguistically comparable subsets of both the original and new Spoken British National Corpora. She highlighted a point which I had touched upon in my earlier talk: that the compilation of the Spoken BNC2014 sought to strike a balance between direct comparability with the original corpus on the one hand, and methodological improvement on the other. The areas where improvement was favoured over comparability (e.g. the classification of speaker socio-economic status) ought to be considered especially when thinking about sociolinguistic analysis. Susan's work is associated with the recently announced CASS SDA project.

Finally, Jonathan Culpeper and Mathew Gillings (Lancaster University) presented their work on politeness variation between the north and south of England. They aimed to assess the extent to which commonly held stereotypes about differences between northern and southern politeness were reflected in language use, treating the original and new corpora as a single dataset. Their work will be published as a chapter in Brezina et al. (forthcoming).

My reaction as the organiser of the symposium was that there is definitely a sense of anticipation about the release of the Spoken BNC2014, which is planned to take place in the autumn. Furthermore, it was lovely to meet so many friendly and enthusiastic attendees. I am very grateful to each of the speakers for giving such interesting talks, and to all who attended, especially those who tweeted their reactions to the talks using the #BNC2014 hashtag! As one of my final duties as a member of CASS before moving on to pastures new, I am very glad that the symposium went as well as it did.