Data-driven learning: learning from assessment

The process of converting valuable spoken corpus data into classroom materials is not necessarily straightforward, as a recent project conducted by Trinity College London reveals.

One of the buzzwords we increasingly hear from teacher trainers in English Language Teaching (ELT) is data-driven learning. This ties in with other contemporary pedagogies, such as discovery learning. A key component of this is how data from a corpus can be used to inform learning. One of our long-running projects with the Trinity Lancaster Corpus has been to see how we could use the spoken data in the classroom so that students could learn from assessment as well as for assessment. We have reported before (From Corpus to Classroom 1 and From Corpus to Classroom 2) on the research focus on pragmatic and strategic features. These linguistic features and competences are often not practised – or are only superficially addressed – in course books, and yet they can be significant in enhancing learners’ communication skills, especially across cultures. Our ambition is to translate the data findings for classroom use, specifically to help teachers improve learners’ wider speaking competences.

We developed a process of constructing sample worksheets based on, and including, the corpus data. The data was contextualized and presented to teachers in order to give them an opportunity to use their expertise in guiding how this data could be developed for, and utilized in, the classroom. So, essentially, we asked teachers to collaborate on checking how useful the data and tasks were and on potentially improving them. We also asked teachers to develop their own tasks based on the data, and we now have the results of this project.

Overwhelmingly, the teachers were very appreciative of the data, and they each produced some great tasks. All of these were very useful for the classroom, but they did not really exploit the unique information we identified as being captured in the data. We have started exploring why this might be the case.

What the teachers did was the following:

  • Created noticing and learner autonomy activities with the data (though most tasks would need much more scaffolding).
  • Focused on traditional information about phrases identified in the data, e.g. the strength and weakness of expressions of agreement.
  • Created activities that reflected traditional course book approaches.
  • Created reflective, contextual practice related to the data although this sometimes became lost in the addition of extra non-corpus texts.

We had expectations that the data would inspire activities which:

  • showed new ways of approaching the data
  • supported discovery learning tasks with meaningful outcomes
  • explored the context and pragmatic functions of the data
  • reflected pragmatic usage; perhaps even referring to L1 as a resource for this
  • focused on the listener and interpersonal aspects rather than just the speaker

It was clear that the teachers were intellectually engaged and excited, so we considered the reasons why their tasks had taken a more traditional path than expected. Many of these reasons have been raised in the past by Tim Johns and Simon Borg. There is no doubt that heavy workloads affect how far teachers feel they can be innovative with materials; there is a surety in doing what you know and what you know works. Also, many teachers, despite being in the classroom every day, need a certain confidence to design input, since this has traditionally been left to syllabus and course book creators. Another issue was that teachers would probably need more support in understanding corpus data, and many don’t have the time for extra training. Finally, with this particular data, teachers may not be fully aware of the importance of pragmatic and strategic competences. These are often seen as an ‘add-on’ rather than a core competence, especially in contexts where English is largely used as a lingua franca.

Ultimately, there was a difference between what the researchers ‘saw’ and what the teachers ‘saw’. As an alternative, we asked a group of expert material writers to produce new tasks and they have produced some innovative material. We concluded that maybe this is a fairer approach. In other words, instead of expecting each of the roles involved in language teaching (SLA researchers, teachers, materials designers) to find the time to become experts in new skills, it may sometimes be better to use each other as a resource. This would still be a learning experience as we draw on each other’s expertise.

In future, if we want teachers to collaborate on designing materials, we must make sure we discuss the philosophy or pedagogy behind our objectives (Rapti, 2013) with our collaborators, show how the data maps to relevant curricula, and recognise the restrictions caused by practical issues such as a lack of time or training opportunities.

The series of worksheets is now available from the Trinity College London website. More will follow, so keep checking.

Corpus-based insights into spoken L2 English: Introducing eight projects that use the Trinity Lancaster Corpus

In November 2016, we announced the Early Data Grant Scheme in which researchers could apply for access to the Trinity Lancaster Corpus (TLC) before its official release in 2018.  The Early Data subset of the corpus contains 2.83 million words from 1,244 L2 speakers.

The Trinity Lancaster Corpus project is a product of an ongoing collaboration between The ESRC Centre for Corpus Approaches to Social Science (CASS), Lancaster University, and Trinity College London, a major international examination board. The Trinity Lancaster Corpus contains several features (rich metadata, a range of proficiency levels, L1s and age groups) that make it an important resource for studying L2 English. Soon after we started working on the corpus development in 2013, we realised the great potential of the dataset for researchers in language learning and language testing. We were very excited to receive a number of outstanding applications from around the world (Belgium, China, Germany, Italy, Spain, UK and US).  The selected projects cover a wide range of topics focusing on different aspects of learner language use. In the rest of this blog post we introduce the successful projects and their authors.

  1. Listener response in examiner-EFL examinee interactions

Erik Castello and Sara Gesuato, University of Padua

The term listener response is used to denote (non-)verbal behaviour produced in reaction to an interlocutor’s talk and sharing a non-turn status, e.g. short verbalisations, sentence completion, requests for clarifications, restatements, shakes, frowns (Xudong 2009). Listener response is a form of confluence-oriented behaviour (McCarthy 2006) which contributes to the construction and smooth handling of conversation (Krauss et al. 1982). Response practices can vary within the same language/culture in terms of placement and function in the turn sequence and the roles played by the same listener response types (Schiffrin 1987; Gardner 2007). They can also vary across cultures/groups (Cutrone 2005; Tottie 1991) and between the sexes (Makri-Tsilipakou 1994; Rühlemann 2010). Therefore, interlocutors from different linguistic/cultural backgrounds may experience communication breakdown, social friction and the emergence of negative attitudes (Wieland 1991; Li 2006), including participants in examiner-EFL examinee interactions (Götz 2013) and in EFL peer-to-peer interactions (Castello 2013). This paper explores the listener response behaviour of EFL examinees in the Trinity Lancaster Corpus (Gablasova et al. 2015), which may display interference from the examinees’ L1s and affect the examiners’ impression of their fluency. It aims to: identify forms of verbal listener responses in examinee turns and classify them in terms of conventions of form (mainly following Clancy et al. 1996) and conventions of function (mainly following Maynard 1997); identify strategies for co-constructing turn-taking, if any (Clancy/McCarthy 2015); and determine the frequencies of occurrence of the above phenomena across types of interaction, examinees’ perceived proficiency levels and between the sexes.

Erik Castello is Assistant Professor of English Language and Translation at the University of Padua, Italy. His research interests include (learner) corpus linguistics, discourse analysis, language testing, academic English and SFL. He has co-edited two volumes and published two books and several articles on these topics.

Sara Gesuato is Associate Professor of English language at the University of Padua, Italy. Her research interests include pragmatics, genre analysis, verbal aspect, and corpus linguistics. She has co-edited two volumes on pragmatic issues in language teaching, and is currently investigating sociopragmatic aspects of L2 written speech acts.

  2. Formulaic expressions in learner speech: New insights from the Trinity Lancaster Corpus

Francesca Coccetta, Ca’ Foscari University of Venice

This study investigates the use of formulaic expressions in the dialogic component of the Trinity Lancaster Corpus. Formulaic expressions are multi-word units serving pragmatic or discourse structuring functions (e.g. discourse markers, indirect forms performing speech acts, and hedges), and their mastery is essential for language learners to sound more native-like. The study explores the extent to which the Trinity exam candidates use formulaic expressions at the various proficiency levels (B1, B2 and C1/C2), and the differences in their use between successful and less successful candidates. In addition, it investigates how the exam candidates compare with native speakers in the use of formulaic expressions. To do this, recurrent multi-word units consisting of two to five words will be automatically extracted from the corpus using Sketch Engine; then, the data will be manually filtered to eliminate unintentional repetitions, phrase and clause fragments (e.g. in the, it and, of the), and the multi-word units that do not perform any pragmatic or discourse function. The high-frequency formulaic expressions of each proficiency level will be provided and compared with each other and with the ones identified in previous studies on native speech. The results will offer new insights into learners’ use of prefabricated expressions in spoken language, particularly in an exam setting.
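
The extraction step described above – pulling out recurrent two- to five-word units before manual filtering – can be sketched in a few lines of Python. This is an illustrative stand-in for the Sketch Engine workflow, not the project’s actual pipeline; the toy token list and the frequency threshold are invented for the example.

```python
from collections import Counter

def recurrent_ngrams(tokens, n_min=2, n_max=5, min_freq=2):
    """Count every n-gram of length n_min..n_max and keep the recurrent ones."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return {ngram: freq for ngram, freq in counts.items() if freq >= min_freq}

# toy utterance; real input would be the tokenised dialogic component of the TLC
tokens = "you know I mean you know it is sort of hard you know".split()
freqs = recurrent_ngrams(tokens)
# ('you', 'know') recurs three times; one-off fragments such as ('sort', 'of') are dropped
```

The manual filtering stage described in the study (removing fragments such as *in the* or *of the* that perform no pragmatic or discourse function) would then operate on this frequency list.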

Francesca Coccetta is a tenured Assistant Professor at Ca’ Foscari University of Venice. She holds a doctorate in English Linguistics from Padua University where she specialised in multimodal corpus linguistics. Her research interests include multimodal discourse analysis, learner corpus research, and the use of e-learning in language learning and teaching. 

  3. The development of high-frequency verbs in spoken EFL and ESL

Gaëtanelle Gilquin, Université catholique de Louvain

This project aims to contribute to the recent effort to bridge the paradigm gap between second language acquisition research and corpus linguistics. While most such studies have relied on written corpus data to compare English as a Foreign Language (EFL) and English as a Second Language (ESL), the present study will take advantage of a new resource, the Trinity Lancaster Corpus, to compare speech in an EFL variety (Chinese English) and in an ESL variety (Indian English). The focus will be on high-frequency verbs and how their use develops across proficiency levels in the two varieties, as indicated by the CEFR scores provided in the corpus. Various aspects of language will be considered, taking high-frequency verbs as a starting point, among which grammatical complexity (e.g. through the use of infinitival constructions of the causative type), idiomaticity (e.g. through the degree of typicality of object nouns) and fluency (e.g. through the presence of filled pauses in the immediate environment). The assumption is that, given the different acquisitional contexts of EFL and ESL, one and the same score in EFL and ESL may correspond to different linguistic realities, and that similar developments in scores (e.g. from B1 to B2) may correspond to different developments in language usage. More particularly, it is hypothesised that EFL speakers will progress more rapidly in aspects that can benefit from instruction (e.g. those involving grammatical rules), whereas ESL speakers will progress more rapidly in aspects that can benefit from exposure to naturalistic language (like phraseology).

Gaëtanelle Gilquin is a Lecturer in English Language and Linguistics at the University of Louvain. She is the coordinator of LINDSEI and one of the editors of The Cambridge Handbook of Learner Corpus Research. Her research interests include spoken learner English, the link between EFL and ESL, and applied construction grammar.

  4. Describing fluency across proficiency levels: From ‘can-do statements’ towards learner-corpus-informed descriptions of proficiency

Sandra Götz, Justus Liebig University Giessen

While it has been noted that current assessment scales (e.g. the Common European Framework of Reference; CEF; Council of Europe 2009), which describe learners’ proficiency levels in ‘can-do statements’, are often formulated somewhat vaguely (e.g. North 2014), researchers and CEF developers have pointed out the benefits of including more specific linguistic descriptors emerging from learner corpus analyses (e.g. McCarthy 2013; Park 2014). In this project, I will test how, and whether, descriptions of fluency in learner language such as the CEF can benefit from analyzing learner data at different proficiency levels in the Trinity Lancaster Corpus. More specifically, I will test whether the learners’ proficiency levels can serve as robust predictors of their use of core fluency variables, such as filled and unfilled pauses (e.g. er, erm, eh, ehm), discourse markers (e.g. you know, like, well), or small words (e.g. sort of, kind of). Also, I will test if learners show similar or different paths in their developmental stages of fluency from the B1 to the C2 level, regardless of (or dependent on) their L1. Through the meta-information available on the learners in the Trinity Lancaster Corpus, sociolinguistic and learning context variables (such as the learners’ age, gender or the task type) will also be taken into consideration in developing data-driven descriptor scales on fluency at different proficiency levels. Thus, it will be possible to differentiate between L1-specific and universal learner features in fluency development.
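
One of the fluency variables named above, filled pauses, lends itself to a simple per-100-tokens rate. The sketch below is a minimal illustration under an assumed tokenisation (clitics such as *'m* split off); the pause inventory is taken from the examples in the project description.

```python
FILLED_PAUSES = {"er", "erm", "eh", "ehm"}  # pause tokens named in the project description

def filled_pause_rate(tokens):
    """Filled pauses per 100 tokens, a crude fluency indicator."""
    pauses = sum(1 for t in tokens if t.lower() in FILLED_PAUSES)
    return 100 * pauses / len(tokens)

# a candidate turn, pre-tokenised
sample = "er I 'm learning English for work erm I 'm a statistician".split()
rate = filled_pause_rate(sample)  # 2 pauses in 12 tokens
```

A rate like this would be one predictor among several; the project also considers unfilled pauses, discourse markers and small words, which need annotation rather than a simple token lookup.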

Sandra Götz obtained her PhD from Justus Liebig University Giessen and Macquarie University Sydney in 2011. Since then, she has been working as a Senior Lecturer in English Linguistics at University of Giessen. Her main research interests include (learner) corpus linguistics and its application to language teaching and testing, applied linguistics and World Englishes.

  5. Self-repetition in the spoken English of L2 English learners: The effects of task type and proficiency levels

Lalita Murty, University of York

Self-repetition (SR), where the speaker repeats a word or phrase, is a much-observed phenomenon in spoken discourse. SR serves a range of distinct communicative and interactive functions, such as expressing agreement or disagreement or adding emphasis to what the speaker wants to say, as the following example shows: ‘Yes, I know I know and I certainly think that limits are…’ (expressing agreement with the previous speaker) (Gablasova et al., 2015). Self-repetitions also help in creating coherence (Bublitz, 1989, as cited in Fung, 2007: 224), enhancing the clarity of the message (Kaur, 2012), keeping the floor, maintaining the smooth flow of conversation, linking the speaker’s ideas to the previous speaker’s ideas (Tannen, 1989), and initiating self- and other-repairs (Björkman, 2011; Robinson and Kevoe-Feldman, 2010). This paper will use Sketch Engine to extract instances of single content-word self-repetitions in the Trinity Lancaster Corpus and examine the effect of (i) L2 proficiency levels and (ii) task types on the frequency and functions of the different types of self-repetitions made by speakers at varying proficiency levels in the different tasks. A quantitative and qualitative analysis of the extracted data will be conducted using a mix of Norrick’s (1987) framework and CA approaches.
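
A first-pass detector for the single content-word self-repetitions targeted here might simply look for immediately repeated tokens while skipping function words and fillers. This is only a sketch: the stop list is invented for the example, and the study’s own extraction is done with Sketch Engine queries.

```python
STOPWORDS = {"the", "a", "an", "and", "or", "er", "erm", "i", "'s"}  # illustrative only

def self_repetitions(tokens):
    """Return (index, word) pairs for immediate repeats of content words."""
    hits = []
    for i in range(len(tokens) - 1):
        word = tokens[i].lower()
        if word == tokens[i + 1].lower() and word not in STOPWORDS:
            hits.append((i, word))
    return hits

# 'and and' is ignored as a function-word repeat; 'way way' is flagged
utterance = "it 's a good way and and this way I have learned way way too much"
hits = self_repetitions(utterance.split())
```

The interesting analytical work then lies in classifying each hit by function (floor-keeping, emphasis, repair initiation), which the frameworks cited above are designed to support.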

Lalita Murty is a Lecturer at the Norwegian Study Centre, University of York.  Her previous research focused on spoken word recognition and call centre language. Currently she is working on Reduplication and Iconicity in Telugu, a South Indian language.

  6. Certainty adverbs in learner language: The role of tasks and proficiency

Pascual Pérez-Paredes, University of Cambridge and María Belén Díez-Bedmar, University of Jaén

When comparing native and non-native use of stance adverbs, the effect of task has been largely ignored. An exception is Gablasova et al. (2015). The authors researched the effect of different speaking tasks on L2 speakers’ use of epistemic stance markers and concluded that there was a significant difference between the monologic prepared task and every other task, and between the dialogic general topic and the dialogic pre-selected topic (p < .05). This study suggests that the type of speaking task conditions speakers’ repertoire of markers, including certainty markers. Pérez-Paredes & Bueno (forthcoming) looked at how certainty stance adverbs were employed during the picture description task in the LINDSEI and the extended LOCNEC (Aguado et al., 2012). In particular, the authors discussed the contexts of use of obviously, really and actually by native and non-native speakers across the same speaking task in the four datasets when expressing the range of meanings associated with certainty. The authors found that different groups of speakers used these adverbs differently, both quantitatively and qualitatively. Our research seeks to expand the findings of Gablasova et al. (2015) and Pérez-Paredes & Bueno (forthcoming) and examine the uses of certainty adverbs across the L1s, proficiency levels and tasks represented in the Trinity Lancaster Corpus. We believe that the use of this corpus, together with the findings from the LINDSEI, will help us reach a better understanding of the uses of certainty adverbs in spoken learner language.

Pascual Pérez-Paredes is a Lecturer in Research in Second Language Education at the Faculty of Education, University of Cambridge. His main research interests are learner language variation, the use of corpora in language education and corpus-assisted discourse analysis.

María Belén Díez-Bedmar is Associate Professor at the University of Jaén (Spain). Her main research interests include Learner Corpus Research, error-tagging, the learning of English as a Foreign Language, language testing and assessment, the CEFR and CMC.  She is currently involved in national and international corpus-based projects.

  7. Emerging verb constructions in spoken learner English

Ute Römer and James Garner, Georgia State University

Recent research in first language (L1) and second language (L2) acquisition has demonstrated that we learn language by learning constructions, defined as conventionalized form-meaning pairings. While studies in L2 English acquisition have begun to examine construction development in learner production data, these studies have been based on rather small corpora. Using a larger set of data from the Trinity Lancaster Corpus (TLC), this study investigates how verb-argument constructions (VACs; e.g. ‘V about n’) emerge in the spoken English of L2 learners at different proficiency levels. We will systematically and exhaustively extract a small set of VACs (‘V about n’, ‘V for n’, ‘V in n’, ‘V like n’, and ‘V with n’) from the L1 Italian and L1 Spanish subsets of the TLC, separately for three CEFR proficiency levels. For each VAC and L1-proficiency combination (e.g. Italian-B1), we will create frequency-sorted verb lists, allowing us to determine how learners’ verb-construction knowledge develops with increasing proficiency. We will also examine in what ways VAC emergence in the TLC data is influenced by VAC usage as captured in a large native-speaker reference corpus (the BNC). We will use chi-square tests to compare VAC type and token frequencies across L1 subsets and proficiency levels. We will use path analysis (a type of structural equation modeling) including the predictor variables L1 status, proficiency level, and BNC usage information to gain insights into how learner characteristics and variables concerning L1 construction usage affect the emergence of the target VACs in spoken L2 learner English.
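
The planned chi-square comparison of VAC frequencies can be illustrated with a small contingency table. The counts below are invented for the example, and the function computes only the Pearson statistic; in practice one would use a statistics package that also returns a p-value and degrees of freedom.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    return stat

# hypothetical VAC token counts: rows = L1 subsets, columns = 'V about n', 'V for n'
table = [[30, 70],   # L1 Italian
         [45, 55]]   # L1 Spanish
stat = chi_square(table)  # compare against the chi-square critical value for df = 1
```

The same table shape extends naturally to the study’s design: five VAC columns per L1-proficiency row.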

Ute Römer is currently Assistant Professor in the Department of Applied Linguistics and ESL at Georgia State University. Her research interests include corpus linguistics, phraseology, second language acquisition, discourse analysis, and the application of corpora in language teaching. She serves on a range of editorial and advisory boards of professional journals and organizations, and is general editor of the Studies in Corpus Linguistics book series.

James Garner is currently a PhD student in the Department of Applied Linguistics and ESL at Georgia State University. His current research interests include learner corpus research, phraseology, usage-based second language acquisition, and data-driven learning.

  8. Verb-argument constructions in Chinese EFL learners’ spoken English production

Jiajin Xu and Yang Liu, Beijing Foreign Studies University

The widespread recognition of the usage-based approach to constructions has made corpus linguistics one of the most viable methodologies for scrutinising frequent morpho-syntactic patterns such as verb-argument constructions (VACs) in learner language. The present study examines the use of VACs in Chinese EFL learners’ spoken English. Our focus will be on the semantics of the verbal constructions in light of collostructional statistics (Stefanowitsch & Gries, 2003), as well as on comparisons across learners’ proficiency levels and task types. Twenty VACs were collected from COBUILD Grammar Patterns 1: Verbs (Francis, Hunston & Manning, 1996). On the basis of the VAC concordances retrieved from the Trinity Lancaster Corpus, the semantic prototypicality of the VACs will be analysed according to the collocational strength of verbs with their host constructions. Comparisons will be made between Chinese EFL learners and native speakers, and also across different task types. It is hoped that our findings will shed light on Chinese EFL learners’ knowledge of VACs and the crosslinguistic influence on verb semantics in learners’ spoken English. We also consider language proficiency and task type as potential factors that may account for differences across CEFR groups within the Chinese EFL learner data.
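
Collostructional strength is standardly computed from a 2×2 table of verb-by-construction counts. Stefanowitsch & Gries (2003) use the Fisher-Yates exact test; the log-likelihood (G²) score sketched below is a common, simpler substitute, shown here only to illustrate the shape of the calculation, with invented counts.

```python
import math

def log_likelihood(a, b, c, d):
    """G-squared association score for a 2x2 table:
    a = target verb in the construction,   b = target verb elsewhere,
    c = other verbs in the construction,   d = other verbs elsewhere."""
    n = a + b + c + d
    g2 = 0.0
    for observed, row, col in [(a, a + b, a + c), (b, a + b, b + d),
                               (c, c + d, a + c), (d, c + d, b + d)]:
        expected = row * col / n
        if observed > 0:
            g2 += 2 * observed * math.log(observed / expected)
    return g2

# invented counts for one verb-construction pair; higher scores = stronger attraction
score = log_likelihood(30, 10, 10, 30)
```

Ranking verbs by such a score, rather than by raw frequency, is what lets the study speak about the semantic prototypicality of each construction.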

Jiajin Xu is Professor of Linguistics at the National Research Centre for Foreign Language Education, Beijing Foreign Studies University as well as secretary general and a founding member of the Corpus Linguistics Society of China. His research interests include discourse studies, second language acquisition, contrastive linguistics and translation studies, and corpus linguistics.

Yang Liu is currently a PhD candidate at Beijing Foreign Studies University. His research focus is on the corpus-based study of construction acquisition of Chinese EFL learners.

Further Trinity Lancaster Corpus research: Examiner strategies

This month saw a further development in the corpus analyses: the examiners. Let me introduce myself: my name is Cathy Taylor, I’m responsible for examiner training at Trinity, and I was very pleased to be asked to do some corpus research into the strategies the examiners use when communicating with the test takers.

In the GESE exams the examiner and candidate co-construct the interaction throughout the exam. The examiner doesn’t work from a rigid interlocutor framework provided by Trinity but instead has a flexible test plan which allows them to choose from a variety of questioning and elicitation strategies. They can then respond more meaningfully to the candidate and cover the language requirements and communication skills appropriate for the level. The rationale behind this approach is to reflect as closely as possible what happens in conversations in real life. Another benefit of the flexible framework is that the examiner can use a variety of techniques to probe the extent of the candidate’s competence in English and allow them to demonstrate what they can do with the language. If you’re interested, more information can be found in Trinity’s speaking and listening tests: Theoretical background and research.

After some deliberation and very useful tips from the corpus transcriber, Ruth Avon, I decided to concentrate my research on the opening gambit for the conversation task at Grade 6 (B1 CEFR). There is a standard rubric the examiner says to introduce the subject area: ‘Now we’re going to talk about something different, let’s talk about… learning a foreign language.’ Following this, the examiner uses their test plan to select the most appropriate opening strategy for each candidate. There’s a choice of six subject areas for the conversation task listed for each grade in the Exam information booklet.

Before beginning the conversation examiners have strategies to check that the candidate has understood and to give them thinking time. The approaches below are typical.

  1. E: ‘Let’s talk about learning a foreign language…’
     C: ‘yes’
     E: ‘Do you think English is an easy language?’
  2. E: ‘Let’s talk about learning a foreign language’
     C: ‘It’s an interesting topic’
     E: ‘Yes uhu do you need a teacher?’
  3. It’s very common for the examiner to use pausing strategies, which give thinking time:
     E: ‘Let’s talk about learning a foreign language erm why are you learning English?’
     C: ‘Er I’m learning English for work erm I’m a statistician.’

There is a range of opening strategies for the conversation task:

  • Personal questions: ‘Why are you learning English?’ ‘Why is English important to you?’
  • More general question: ‘How important is it to learn a foreign language these days?’
  • The examiner gives a personal statement to frame the question: ‘I want to learn Chinese (to a Chinese candidate)…what do I have to do to learn Chinese?’
  • The examiner may choose a more discursive statement to start the conversation: ‘Some people say that English is not going to be important in the future and we should learn Chinese (to a Chinese candidate).’
  • The candidate sometimes takes the lead:
    Examiner: ‘Let’s talk about learning a foreign language’
    Candidate: ‘Okay, okay I really want to learn a lo = er learn a lot of = foreign languages’

A salient feature of all the interactions is the amount of back-channelling the examiners do, e.g. ‘erm’, ‘mm’, etc. This indicates that the examiner is actively listening to the candidate and encouraging them to continue. For example:

E: ‘Let’s talk about learning a foreign language, if you want to improve your English what is the best way?’
C: ‘Well I think that when you see programmes in English’
E: ‘mm’
C: ‘without the subtitles’
E: ‘mm’
C: ‘it’s a good way or listening to music in other language’
E: ‘mm’
C: ‘it’s a good way and and this way I have learned too much’
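
Back-channelling of this kind is easy to quantify once a transcript is split into (speaker, text) turns. The sketch below is a minimal illustration with an invented inventory of back-channel tokens, applied to the exchange above; real corpus work would need a principled inventory and would distinguish back-channels from full turns more carefully.

```python
BACKCHANNELS = {"mm", "uhu", "erm", "yeah"}  # illustrative inventory, not from the corpus

def count_backchannels(turns, speaker="E"):
    """Count turns by the given speaker that consist solely of a back-channel token."""
    return sum(1 for who, text in turns
               if who == speaker and text.strip().lower() in BACKCHANNELS)

# the exchange above, as (speaker, text) pairs
turns = [("E", "if you want to improve your English what is the best way?"),
         ("C", "well I think that when you see programmes in English"),
         ("E", "mm"),
         ("C", "without the subtitles"),
         ("E", "mm"),
         ("C", "it's a good way or listening to music in other language"),
         ("E", "mm"),
         ("C", "it's a good way and and this way I have learned too much")]
count = count_backchannels(turns)  # three of the examiner's four turns are back-channels
```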

When the corpus was initially discussed, it was clear that one of the aims should be to use the findings for our examiner professional development programme. Using this very small dataset, we can develop worksheets which prompt examiners to reflect on their exam techniques using real examples of examiner and candidate interaction.

My research is in its initial stages and the next step is to analyse different strategies and how these validate the exam construct. I’m also interested in examiner strategies at the same transition point at the higher levels, i.e. grade 7 and above, B2, C1 and C2 CEFR. Do the strategies change and if so, how?

It’s been fascinating working with the corpus data and I look forward to doing more in the future.

Continue reading

TLC and innovation in language testing

One of the objectives of Trinity College London’s investment in the Trinity Lancaster Spoken Corpus has been to share findings with the language assessment community. The corpus allows us to develop an innovative approach to validating test constructs and offers a window into the exam room, so we can see how test takers utilise their language skills in managing the series of test tasks.

Recent work by the CASS team in Lancaster has thrown up a variety of features that illustrate how test takers voice their identity in the test, how they manage interaction through a range of strategic competences and how they use epistemic markers to express their point of view and negotiate a relationship with the examiner (for more information see Gablasova et al. 2015). I have spent the last few months disseminating these findings at a range of language testing conferences and have found that the audiences have been fascinated by the findings.

We have presented findings at BAAL TEASIG in Reading, at EAQUALS in Lisbon and at EALTA in Valencia. Audiences ranged from assessment experts to teacher educators and classroom practitioners, and there was great interest both in how the test takers manage the exam and in the manifestations of L2 language. Each presentation was tailored to the audience and the theme of the conference. In separate presentations, we covered how assessments can inform classroom practice, how the data could inform the type of feedback we give learners, and how the data can be used to help validate aspects of the test construct. The feedback has been very positive, urging us to investigate further. Comments have praised the extent and quality of the corpus, and range from the observation that the evidence “is something that we have long been waiting for” (Dr Parvaneh Tavakoli, University of Reading) to musings on what some of the data might mean both for how we assess spoken language and for the implications for the classroom. It has certainly opened the door to the importance of strategic and pragmatic competences, as well as validating Trinity’s aim to allow the test taker to bring themselves into the test. The excitement spilled over into some great tweets. There is general recognition that the data offers something new – sometimes confirming what we suspected and sometimes – as with all corpora – refuting our beliefs!

We have always recognised that the data is constrained by the semi-formal context of the test, but each test is structured rather than scripted, and its tasks represent language pertinent to communicative events in the wider world; this allows the test taker to produce language which is more reflective of naturally occurring speech than in many other oral tests. It has been enormously helpful to have feedback from the audiences, who have fully engaged with the issues raised, highlighted aspects we can investigate in greater depth, and raised features they would like to know more about. These features are precisely those that the research team wishes to explore in order to develop ‘a more fine-grained and comprehensive understanding of spoken pragmatic ability and communicative competence’ (Gablasova et al. 2015: 21).

One of the next steps is to show how this data can be used to develop and support performance descriptors. Trinity is confident that the features of communication which the test takers display are captured in its new Integrated Skills in English exam, validating claims that Trinity assesses real-world communication.

From Corpus to Classroom 2

There is great delight that the Trinity Lancaster Corpus is providing so much interesting data that can be used to enhance communicative competences in the classroom. From Corpus to Classroom 1 described some of these findings. But how exactly do we go about ‘translating’ this for classroom use so that busy teachers with high-pressure curricula can make use of it? How can we be sure we enhance, rather than problematize, the communicative feature we want to highlight?

Although the corpus data comes from a spoken test, we want to use it to illustrate wider pragmatic features of communication. The data fascinates students, who are entranced to see what their fellow learners do, but how does it help their learning? The first step is to send the research outputs to an experienced classroom materials author to see what they suggest.

Here’s how our materials writer, Jeanne Perrett, went about this challenging task:

As soon as I saw the research outputs from TLC, I knew that this was something really special: proper, data-driven learning on how to be a more successful speaker. I could also see that the corpus scripts, as they were, might look very alien and quirky to most teachers and students. Speaking and listening texts in coursebooks don’t usually include sounds of hesitation, people repeating themselves, people self-correcting or even asking ‘rising intonation’ questions. But all of those things are a big part of how we actually communicate, so I wanted to use the original scripts as much as possible. I also thought that learners would be encouraged by seeing that you don’t have to speak in perfectly grammatical sentences, that you can hesitate and you can make some mistakes but still be communicating well.

Trinity College London commissioned me to write a series of short worksheets, each one dealing with one of the main research findings from the Corpus, and intended for use in the classroom to help students prepare for GESE and ISE exams at a B1 or B2 level.

I started each time with extracts from the original scripts in the data. Where I thought that the candidates’ mistakes would hinder learners’ comprehension (unfinished sentences, for example), I edited them slightly (e.g. with punctuation). But these scripts were not there for comprehension exercises; they were there to show students something that they might never have been taught before.

For example, sounds of hesitation: we all know how annoying it is to listen to someone (native and non-native speakers alike) continually erm-ing and er-ing in their speech, and the data showed that candidates were hesitating too much. But we rarely, if ever, teach our students that it is in fact okay, and indeed natural, to hesitate while we are thinking of what we want to say and how we want to say it. What they need to know is that, like the more successful candidates in the data, they can use other words and phrases instead of erm and er. So one of the worksheets shows how we can use hedging phrases such as ‘well…’, ‘like…’, ‘okay…’, ‘I mean…’ or ‘you know…’.
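For readers who work with transcripts themselves, the contrast between hesitation sounds and hedging phrases is easy to count with a few lines of code. This is a minimal sketch only: the utterance and the word lists are invented for illustration, not taken from the corpus or its annotation scheme.

```python
import re

# An invented utterance in the style of a learner transcript
utterance = ("erm I think er well you know it was erm "
             "I mean quite difficult to er decide")

FILLERS = {"er", "erm"}                                   # sounds of hesitation
HEDGES = ["well", "you know", "I mean", "like", "okay"]   # hedging phrases

# Fillers are single tokens, so a token-by-token match is enough
filler_count = sum(1 for tok in utterance.split() if tok in FILLERS)

# Hedges can be multi-word ("you know"), so search the whole string
hedge_count = sum(len(re.findall(r"\b" + re.escape(h) + r"\b", utterance))
                  for h in HEDGES)

print(f"{filler_count} fillers, {hedge_count} hedging phrases")
# prints: 4 fillers, 3 hedging phrases
```

Counts like these, run over many transcripts, are the kind of evidence behind the finding that less successful candidates lean on er and erm where stronger ones reach for hedging phrases.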

The importance of taking responsibility for a conversation was another feature to emerge from the data, and again I felt that these corpus findings were very freeing for students: taking responsibility doesn’t, of course, mean that you have to speak all the time, but it does mean creating opportunities for the other person to speak, and there are specific ways in which you can do that, such as making active listening sounds (ah, right, yeah), asking questions, and making short comments and suggestions.

Then there is the whole matter of how you ask questions. The corpus findings show that there is far less confusion in a conversation when properly formed questions are used. When someone says ‘You like going to the mountains?’ the question is not as clear as when they say ‘Do you like going to the mountains?’ This might seem obvious, but pointing it out, and showing that less checking of what has been asked is needed when questions are direct ones, is, I think, very helpful to students. It might also be a consolation: all those years of grammar exercises really were worth it! ‘Do you know how to ask a direct question?’ ‘Yes, I do!’

These worksheets are intended for EFL exam candidates but the more I work on them, the more I think that the Corpus findings could have a far wider reach. How you make sure you have understood what someone is saying, how you can be a supportive listener, how you can make yourself clear, even if you want to be clear about being uncertain; these are all communication skills which everyone needs in any language.

Syntactic structures in the Trinity Lancaster Corpus

We are proud to announce a collaboration with Markus Dickinson and Paul Richards from the Department of Linguistics, Indiana University, on a project that will analyse syntactic structures in the Trinity Lancaster Corpus. The focus of the project is to develop a syntactic annotation scheme for spoken learner language and apply it to the Trinity Lancaster Corpus, which is being compiled at Lancaster University in collaboration with Trinity College London. The aim is to provide an annotation layer for the corpus that will allow sophisticated exploration of the morphosyntactic and syntactic structures in learner speech. The project will have an impact both on the theoretical understanding of spoken language production at different proficiency levels and on the development of practical NLP solutions for annotating learner speech. More specific goals include:

  • Identification of units of spoken production and their automatic recognition.
  • Annotation and visualization of morphosyntactic and syntactic structures in learner speech.
  • Contribution to the development of syntactic complexity measures for learner speech.
  • Description of the syntactic development of spoken learner production.
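To give a flavour of what an annotation layer means in practice, here is a toy sketch of morphosyntactic tagging applied to a learner utterance. Everything in it is invented for illustration: the tagset, the tiny lexicon and the utterance are not the project’s actual scheme, which is still under development. The one point it does reflect is that features of speech such as hesitations are kept and tagged rather than discarded.

```python
# Toy lexicon mapping word forms to invented part-of-speech tags
LEXICON = {
    "i": "PRON", "you": "PRON",
    "go": "VERB", "went": "VERB",
    "to": "PART", "the": "DET",
    "mountains": "NOUN", "cinema": "NOUN",
    "er": "FILLER", "erm": "FILLER",  # hesitations are tagged, not removed
}

def annotate(utterance: str) -> list[tuple[str, str]]:
    """Attach a (token, tag) pair to every token, defaulting to UNK."""
    return [(tok, LEXICON.get(tok.lower(), "UNK"))
            for tok in utterance.split()]

print(annotate("erm I went to the mountains"))
# [('erm', 'FILLER'), ('I', 'PRON'), ('went', 'VERB'),
#  ('to', 'PART'), ('the', 'DET'), ('mountains', 'NOUN')]
```

A real system would, of course, resolve tags in context rather than from a lookup table; the sketch only shows the shape of the output such a layer adds to each utterance.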

From Corpus to Classroom 1

The Trinity Lancaster Corpus of Spoken Learner English is providing multiple sets of data that can not only be used to validate the quality of our tests but also – and most importantly – to feed back important features of language that can be utilised in the classroom. It is essential that some of our research focuses on how Trinity informs and supports teachers in improving communicative competences in their learners, and this forms part of an ongoing project the research team are setting up in order to give teachers access to this information.

Trinity has always been focused on communicative approaches to language teaching, and the heart of the tests is communicative competence. The research team are especially excited to see that the data reveals the many ways in which test takers use these communicative competences to manage their interaction in the spoken tests. It is very pleasing to see that the corpus evidence not only supports claims that the Trinity tests of spoken language are highly interactive but also establishes some very clear features of effective communication that can be utilised by teachers in the classroom.

The strategies which test takers use to communicate successfully include:

  • Asking more questions

Here the test taker relies less on declarative sentences to move a conversation forward and instead asks clear questions (direct and indirect) that are more immediately accessible to the listener.

  • Demonstrating active listenership through backchannelling

This involves offering more support to the conversational partner by using signals such as okay, yes, uhu, oh, etc. to demonstrate engaged listenership.

  • Taking responsibility for the conversation through their contributions

Successful test takers help move the conversation along by creating opportunities (e.g. questions, comments or suggestions) that their partner can easily react to.

  • Using fewer hesitation markers

Here the speaker makes sure they keep talking and uses fewer markers such as er and erm, which can interrupt fluency.

  • Clarifying what is said to them before they respond

This involves the test taker checking through questions that they have understood exactly what has been said to them.

Trinity is hopeful that these types of communicative strategies can be investigated across the tests and across the various levels in order to extract information which can be fed back into the classroom. Teachers – and their learners – are interested to see what actually happens when the learner has the opportunity to put their language into practice in a live performance situation. It makes what goes on in the classroom much more real and gives pointers to how a speaker can cope in these situations.

More details about these points can be found on the Trinity corpus website and classroom teaching materials will be uploaded shortly to support teachers in developing these important strategies in their learners.

Also see CASS briefings for more information on successful communication strategies in L2.