Event Information:

  • Wed

    Research seminar, "Big Educational Data: any good for SLA research?"

    4:00 pm, County South C89, Lancaster University, UK

    Dora Alexopoulou (Cambridge)
    Joint work with J Geertzen, A Korhonen and D Meurers

    The emergence of online EFL teaching platforms offering teaching and learning to students around the globe results in unprecedented amounts of learner production data: data can come from rich task sets across the proficiency spectrum and from learners with a variety of linguistic, educational and cultural backgrounds. Exploiting such datasets opens important opportunities for SLA research and, in particular, for linking SLA findings to second language teaching. But at the same time, such datasets have all the pitfalls of big data: a range of variables standardly controlled for in carefully designed data collections (e.g. task sets) are not considered. Access to unprecedented numbers of learners is set against the lack of the rich learner metadata targeted in typical data collections. In addition, the very context of production imposes arbitrary constraints (e.g. word limits on writings). Last, but not least, the size of such datasets brings new challenges for extracting information and addressing the noisy aspects of the data.

    Can we then use such data for SLA research and, crucially, to link SLA findings to second language teaching? I will argue that Natural Language Processing (NLP) tools can help us address many of the methodological issues and will show that we can obtain valuable information for SLA research. I will use the EF-Cambridge Open Language Database (EFCAMDAT) as an example of a big data resource. I will focus on the developmental trajectory of Relative Clauses (RCs) as a case study and consider specific issues that can affect the developmental picture, such as task effects, formulaic language and national language effects. I will conclude by showing that not only can we arrive at reliable generalisations about RC development based on a resource like EFCAMDAT, but we can also obtain new generalisations, a fact strongly indicating the potential of big educational data for SLA research.