From Corpus to Classroom 1

The Trinity Lancaster Corpus of Spoken Learner English is providing multiple sets of data that can not only be used for validating the quality of our tests but also – and most importantly – to feedback important features of language that can be utilised in the classroom. It is essential that some of our research is focused on how Trinity informs and supports teachers in improving communicative competences in their learners and this is forming part of an ongoing project the research team are setting up in order to give teachers access to this information.

Trinity has always been focused on communicative approaches to language teaching and the heart of the tests is about communicative competences. The research team are especially excited to see that the data is revealing the many ways in which test takers use these communicative competences to manage their interaction in the spoken tests. It is very pleasing to see that not only does the corpus evidence support claims that the Trinity tests of spoken language are highly interactive but also it establishes some very clear features of effective communicative that can be utilised by teachers in the classroom.

The strategies which test takers use to communicate successfully include:

  • Asking more questions

Here the test taker relies less on declarative sentences to move a conversation forward but asks clear questions (direct and indirect) that are more immediately accessible to the listener.

  • Demonstrating active listenership through backchannelling

This involves offering more support to the conversational partner by using signals such as okay, yes, uhu, oh, etc to demonstrate engaged listenership.

  • Taking responsibility for the conversation through their contributions

Successful test takers help move the conversation along by by creating opportunities with e.g. questions, comments or suggestions that their partner can easily react to.

  • Using fewer hesitation markers

Here the speaker makes sure they keep talking and uses fewer markers such as er, erm which can interrupt fluency.

  • Clarifying what is said to them before they respond

This involves the test taker checking through questions that they have understood exactly what has been said to them.

Trinity is hopeful that these types of communicative strategies can be investigated across the tests and across the various levels in order to extract information which can be fed back into the classroom.  Teachers – and their learners – are interested to see what actually happens when the learner has the opportunity to put their language into practice in a live performance situation. It makes what goes on in the classroom much more real and gives pointers to how a speaker can cope in these situations.

More details about these points can be found on the Trinity corpus website and classroom teaching materials will be uploaded shortly to support teachers in developing these important strategies in their learners.

Also see CASS briefings for more information on successful communication strategies in L2.

Introducing the Corpus of Translational English (COTE)

We are pleased to announce that CASS has recently compiled another new corpus, the Corpus of Translational English (COTE). The construction of COTE is supported by the joint ESRC (UK) – RGC (Hong Kong) research project, “Comparable and Parallel Corpus Approaches to the Third Code: English and Chinese Perspectives” (ES/K010107/1). The project is led by Dr Richard Xiao and Dr Andrew Hardie at CASS in collaboration with Dr Dechao Li and Professor Chu-Ren Huang of the Hong Kong Polytechnic University.

COTE is a one-million-word balanced comparable corpus of translated English texts, which is designed as a translational counterpart of the Freiburg–LOB Corpus of British English (F-LOB). The new corpus is intended to match F-LOB as closely as possible in size and composition, but is supposed to represent translational English published in the 1990s. Like the F-LOB corpus, COTE comprises five hundred text samples of around 2,000 words each, which are distributed across 15 text categories. The corpus is created with the explicit aim of providing a reliable empirical basis for identifying the typical common features of translated English texts and investigating variations in such features across different types of text on the basis of quantitative analyses of the balanced corpus of translational English in contrast with comparable corpora of native English.

Like many balanced native English corpora such as F-LOB, COTE includes metadata information such as text type and date of publication as well as linguistic annotation such as part-of-speech tagging. But as a translational English corpus, COTE additionally includes various translation-specific metadata, e.g. the source language, translator, date and source of publication in the header of each text sample, which makes it possible to categorize the texts to suit different research purposes. The corpus is currently restricted for in-house use by the project team. It will be released and made accessible online when the project is completed.

Related outputs:

Hu, X.  (2014) Does the Style of Translation Exist? A corpus-based Multidimensional Analysis of the stylistic features of the translated Chinese. Paper presented at the 2nd Second Asia Pacific Corpus Linguistics Conference. 7 – 9 March, the Hong Kong Polytechnic University.

Hu, X. & Xiao, R. (2014). How different is English translation from native writings of English? A multi-feature statistical model for linguistic variation analysis. Paper presented at the 35th ICAME conference. 30 April to 4 May, the University of Nottingham.

Hu, X. & Xiao, R. (2014). What role do Source Languages play in the variation of translational English? A corpus-based survey of Source Language interference. Paper presented at the 7th IVACS conference, 19-21 June 2014, Newcastle University.

Xiao, R. & Hu, X.  (2014). General tendencies and variations of translational English across registers. Paper presented at the 4th UCCTS conference, 24-26 July 2014, Lancaster University.

McEnery, A. & Xiao, R. (2014). The development of corpus linguistics in English and Chinese contexts. In Ishikawa, S. (ed.) Learner Corpus studies in Asia and the World: Papers from LCSAW2014, Vol. 2, pp. 7-45. Kobe, Japan: Kobe University.

Hu, X., Xiao, R. & Hardie, A. (under preparation). How do English translations differ from native English writings? A multi-feature statistical model for linguistic variation analysis.

Translation and contrastive linguistic studies at the interface of English and Chinese

A forthcoming special issue of Corpus Linguistics and Linguistics Theory, which is guest-edited by Dr Richard Xiao and Professor Naixing Wei, President of the Corpus Linguistics Society of China, is now available online as Ahead of Print at the journal website.

This special issue focuses on corpus-based translation and contrastive linguistic studies involving two genetically different languages, namely English and Chinese, which we believe have formed an important interface with its unique features as a result of the mutual interaction between the two languages.

Corpora have tremendously benefited translation and contrastive studies, and in the meantime, corpus-based translation and contrastive linguistic studies have also significantly expanded the scope of corpus linguistic research. While contrastive linguistics and translation studies have traditionally been accepted as two separate disciplines within applied linguistics, there are many contact points between the two; and with the common corpus-based approach and the usually shared type of data (e.g. comparable and parallel corpora), corpus-based translation and contrastive linguistic studies have become even more closely interconnected, as demonstrated by the articles included in this special issue.

This special issue of Corpus Linguistics and Linguistics Theory includes five research articles together with an extensive introduction written by the guest editors.

These studies combine contrastive analysis and translation studies on the basis of comparable corpora (either multilingual or monolingual) and parallel corpora of English and Chinese, two most widely spoken world languages that differ genetically. While the decision to involve English and Chinese in the research reported in this volume was largely based on the authors’ strong languages (they are all competently bilingual in Chinese and English), the significance of the typological distance between the two languages covered in these studies cannot be underestimated. In comparison with studies of typologically related languages, translation and cross-linguistic studies of genetically distant languages such as English and Chinese can have more important theoretical implications for linguistic theorization. Studying such language pairs help us gain a better appreciation of the scale of variability in the human language system while theories and observations based on closely related language pairs can give rise to conclusions which seem certain but which, when studied in the context of a language pair such as English and Chinese, become not merely problematized afresh, but significantly more challenging to resolve (cf. Xiao and McEnery 2010).

Studies reported on in this special issue embody features at the interface of English and Chinese, which can be expected to have important significance and practical implications for linguistic theorizing.