New CASS Briefing now available — How to communicate successfully in English?

How to communicate successfully in English? An exploration of the Trinity Lancaster Corpus. Many speakers use English as their non-native language (L2) to communicate in a variety of situations: at school, at work or in other everyday situations. As well as needing to master the grammar and vocabulary of the English language, L2 users of English need to know how to react appropriately in different communicative situations. In linguistics, this aspect of language is studied under the label of “pragmatics”. This briefing offers an exploration of the pragmatic features of L2 speech in the Trinity Lancaster Corpus of spoken L2 production.

New resources are being added regularly to the new CASS: Briefings tab above, so check back soon.

Trinity Lancaster Corpus at the International ESOL Examiner Training Conference 2015

On Friday 30th January 2015, I gave a talk at the International ESOL Examiner Training Conference 2015 in Stafford. Every year, Trinity College London, CASS’s research partner, organises a large conference for all its examiners, which consists of plenary lectures and individual training sessions. This year, I was invited to speak in front of an audience of over 300 examiners about the latest developments in the learner corpus project. For me, this was a great opportunity not only to share some of the exciting results from the early research based on this unique resource, but also to meet the Trinity examiners, many of whom have been involved in collecting the data for the corpus. This talk was therefore also an opportunity to thank everyone for their hard work and wonderful support.

It was very reassuring to see the high level of interest in the corpus project among the examiners, who have a deep insight into the examination process from their everyday professional experience. The corpus, as a body of transcripts from the Trinity spoken tests, in some way reflects this rich experience, offering a holistic picture of the exam and, ultimately, of L2 speech in a variety of communicative contexts.

Currently, the Trinity Lancaster Corpus consists of over 2.5 million running words sampling the speech of over 1,200 L2 speakers from eight different L1 and cultural backgrounds. Its size alone makes the Trinity Lancaster Corpus the largest corpus of its kind. However, size is not all that the corpus has to offer. In cooperation with Trinity (and with great help from the Trinity examiners) we were able to collect detailed background information about each speaker in our 2014 dataset. In addition, the corpus covers a range of proficiency levels (B1–C2 levels of the Common European Framework), which allows us to research spoken language development in a way that has not previously been possible. The Trinity Lancaster Corpus, which is still being developed with an average growth of 40,000 words a week, is an ambitious project. Using this robust dataset, we can now start exploring crucial aspects of L2 speech and communicative competence and thus help language learners, teachers and material developers to make the process of L2 learning more efficient and also (hopefully) more enjoyable. Needless to say, without Trinity as a strong research partner and the support from the Trinity examiners, this project wouldn’t be possible.

Trinity oral test corpus: The first hurdle

At Trinity we are wildly excited – yes, wildly – to finally have our corpus project set up with CASS. It’s a unique opportunity to create a learner corpus of English based on some fairly free-flowing L2 language which is not too constrained by the testing context. All Trinity oral tests are recorded and most of the tests include one or two tasks where the candidate has free rein to talk about their own interests in their own way – very much their own contributions, expressed as themselves. We have been hoping to use what is referred to as our ‘gold dust’ for research that will be meaningful – not just to the corpus community but also in terms of the impact on our tests and our feedback to learners and teachers. Working with CASS has now given us this golden opportunity.

The project is now up and running and in the corpus-building stage, and we have moved from the heady excitement of imagining what we could do with all the data to the grindstone of pulling together all the strands of metadata needed to make the corpus robust and useful. The challenges are real – for example, we need to log first languages, but how do we ensure reliability? Providing metadata is now opt-in in most countries, so how do we capture everyone? Even when the data boxes are completed, how do we know the information is true? The only way is the very non-technological method of contacting the students again and following up in person.

A related concern is whether the metadata we need has shifted. We would normally be interested in what kind of input students have had to their learning, for example how many years of study they have completed. In the past, part of this data gathering was to ask about time learners had spent in an English-speaking country. Should this now be shifted to time spent watching videos online in English, using social media, or reading online sources? What is relevant – and also collectable?

The challenges in what might be considered non-core information are forcing us to re-examine how sure we are about influences on learning – not just from our perspective but from the learner’s perspective as well.