Spoken Learner Corpus (SLC) Project

This project is a collaboration between CASS and Trinity College London, a major international exam board. The aim of the project is to create a large corpus of learner (and examiner) speech which will be used in a wide range of research contexts, including Second Language Acquisition, language testing, L2 pedagogy and materials development. The Trinity Lancaster Learner Corpus will be made freely available to the research community (planned release 2017).

The corpus will be a unique research resource for investigating learner speech at different proficiency levels (advanced, intermediate and lower intermediate/threshold) and will provide an insight into spoken learner production across different tasks (both monologic and interactive). The corpus will also sample the language of learners with a variety of L1 backgrounds, representing English speakers from Italy, Spain, Mexico, China, India, Sri Lanka and Russia.

*** We are currently accepting applications for the Trinity Lancaster Corpus Early Access Data Grant Scheme. Click here for more information. ***


Principal Investigator: Tony McEnery


Trinity Team:

  • Elaine Boyd
  • Cathy Taylor
  • Avril Ikeda-Wood

Senior Research Associate: Dana Gablasova

Audio transcriber: Ruth Avon

Read the latest updates on this project:

  • Further Trinity Lancaster Corpus research: Examiner strategies (25 August 2016)

    This month saw a further development in the corpus analyses: the examiners. Let me introduce myself: my name is Cathy Taylor and I’m responsible for examiner training at Trinity. I was very pleased to be asked to do some corpus research into the strategies the examiners use when communicating with the test takers. In the GESE ...

  • TLC and innovation in language testing (26 May 2016)

    One of the objectives of Trinity College London investing in the Trinity Lancaster Spoken Corpus has been to share findings with the language assessment community. The corpus allows us to develop an innovative approach to validating test constructs and offers a window into the exam room so we can see how test takers utilise their ...

  • From Corpus to Classroom 2 (16 March 2016)

    There is great delight that the Trinity Lancaster Corpus is providing so much interesting data that can be used to enhance communicative competences in the classroom. From Corpus to Classroom 1 described some of these findings. But how exactly do we go about ‘translating’ this for classroom use so that it can be used by ...

  • Syntactic structures in the Trinity Lancaster Corpus (3 March 2016)

    We are proud to announce a collaboration with Markus Dickinson and Paul Richards from the Department of Linguistics, Indiana University, on a project that will analyse syntactic structures in the Trinity Lancaster Corpus. The focus of the project is to develop a syntactic annotation scheme of spoken learner language and apply this scheme to the Trinity ...

  • From Corpus to Classroom 1 (17 February 2016)

    The Trinity Lancaster Corpus of Spoken Learner English is providing multiple sets of data that can be used not only for validating the quality of our tests but also – and most importantly – to feed back important features of language that can be utilised in the classroom. It is essential that some of our research ...

  • The heart of the matter … (30 April 2015)

    How wonderful it is to get to the inner workings of the creature you helped bring to life! I’ve just spent a week with the wonderful – and superbly helpful – team at CASS devoting time to matters on the Trinity Lancaster Spoken Corpus. Normally I work from London situated in the very 21st century environment ...

  • New CASS Briefing now available — How to communicate successfully in English? (26 February 2015)

    How to communicate successfully in English? An exploration of the Trinity Lancaster Corpus. Many speakers use English as their non-native language (L2) to communicate in a variety of situations: at school, at work or in other everyday situations. As well as needing to master the grammar and vocabulary of the English language, L2 users of English need to know how to ...

  • The arrival of the Trinity Lancaster Learner Corpus logo (12 February 2015)

    We at the Trinity Lancaster Learner Corpus team are very pleased to announce that we have a logo for our lovely corpus. We very much hope that it represents the corpus by capturing its key features. We knew we wanted to portray what we feel are the unique aspects of our corpus – interactive L2 ...

  • Trinity Lancaster Corpus at the International ESOL Examiner Training Conference 2015 (10 February 2015)

    On Friday 30th January 2015, I gave a talk at the International ESOL Examiner Training Conference 2015 in Stafford. Every year, Trinity College London, CASS’s research partner, organises a large conference for all its examiners, which consists of plenary lectures and individual training sessions. This year, I was invited to speak in front of ...

  • A Journey into Transcription, Part 4: The Question Question (3 February 2015)

    question: (NOUN) A sentence worded or expressed so as to elicit information. Since we speak in utterances (not sentences), most forms of punctuation are omitted in this corpus of learner language; the exceptions being apostrophes, hyphens and question marks. This blog concerns question marks. (Warning: there are not many jokes!) When we started transcription, the convention seemed simple and straightforward: ...