Trinity Lancaster Spoken Learner Corpus: A milestone to celebrate

On Monday 19 May we came together to celebrate the completion of the first part of the Trinity Lancaster Spoken Learner Corpus project. The transcription of our 2012 dataset is now complete, and the corpus comprises 1.5 million running words. The Trinity Lancaster Spoken Learner Corpus represents a balanced sample of learner speech from six different countries (Italy, Spain, Mexico, India, China and Sri Lanka), covering levels B1.2 to C2 of the Common European Framework of Reference for Languages (CEFR). Below are some pictures from our small celebration.


[Photos: two pictures from the celebration]

We are continuing to develop the corpus by adding more data from our 2014 dataset, so there is still a lot of work to be done. However, we are really excited about the possibilities for applied linguistics and language testing research based on this unique dataset.

You can read more about the Trinity Lancaster Spoken Learner Corpus in the AEA-Europe newsletter report.

Introducing Challenge Panel Member: Karin Aijmer

We are very pleased to announce Karin Aijmer’s membership on the CASS Challenge Panel. An introduction, in her own words:


I am Professor Emerita in English Linguistics at the University of Gothenburg. I have been using corpora and corpus-linguistic methods to study topics in several different areas.

One of my research areas involves the study of spoken English. This research deals, for example, with speech acts, discourse coherence, and words and constructions which have special functions in spoken language (pragmatic markers, and speech management phenomena such as pausing and self-repair). In my own research I have used the Lund Corpus of Spoken English to study speech acts which have been conventionalized and can therefore be studied on the basis of particular speech act words. We can use more or less fixed ‘conversational routines’ for thanking, apologising and requesting, in addition to less conventionalized ways of expressing the same acts. Another topic which I have dealt with in several publications is connectives and pragmatic markers (such as you know, well, oh). These are essential for successful communication. However, they are notoriously difficult to describe, since they do not have meanings in the same way as lexical words but instead contribute to discourse coherence or express attitudes and feelings.

I have also been part of a research team compiling the English-Swedish Parallel Corpus. A parallel corpus consists of translations from one language into another and vice versa. By confronting two languages in translation we can get a fine-grained picture of similarities and differences between the languages. Cross-linguistic studies can therefore contribute to our knowledge of what is universal and what is language-specific. The area is also of interest for translation studies and for the training of translators.

Another branch of my research deals with learner corpora and is closely associated with foreign language teaching and second language acquisition. The Swedish learner corpus was compiled as part of an international project (International Corpus of Learner English) and consists of argumentative essays written by advanced Swedish learners of English. On the basis of learner corpora we can study learner problems which do not involve errors and are therefore difficult to detect unless we use quantitative methods. In addition, a spoken learner corpus has been compiled with Swedish learners of English. We can therefore compare the English spoken by non-native speakers and native speakers.

My publications in these areas include several books where I use corpora to study spoken English. I have also co-edited Handbooks of Socio-Pragmatics and Corpus Pragmatics, as well as corpus-based studies in contrastive linguistics, on pragmatic markers in contrast and on the use of corpora in language teaching. I am also the co-author of a textbook on pragmatics.

I much enjoyed meeting members of the challenge panel at the Corpus Linguistics 2013 conference at Lancaster in the summer.  I am looking forward to future cooperation at the Centre for Corpus Approaches to Social Science.

Did you miss our previous introductions? Click through to the Challenge Panel page to see profiles, and check back soon for updates.