Introducing the Corpus of Translational English (COTE)

We are pleased to announce that CASS has recently compiled another new corpus, the Corpus of Translational English (COTE). The construction of COTE is supported by the joint ESRC (UK) – RGC (Hong Kong) research project, “Comparable and Parallel Corpus Approaches to the Third Code: English and Chinese Perspectives” (ES/K010107/1). The project is led by Dr Richard Xiao and Dr Andrew Hardie at CASS in collaboration with Dr Dechao Li and Professor Chu-Ren Huang of the Hong Kong Polytechnic University.

COTE is a one-million-word balanced comparable corpus of translated English texts, designed as a translational counterpart of the Freiburg–LOB Corpus of British English (F-LOB). The new corpus is intended to match F-LOB as closely as possible in size and composition, while representing translational English published in the 1990s. Like F-LOB, COTE comprises five hundred text samples of around 2,000 words each, distributed across 15 text categories. The corpus was created with the explicit aim of providing a reliable empirical basis for two tasks: identifying the typical features common to translated English texts, and investigating how those features vary across different types of text. Both rest on quantitative analyses of this balanced corpus of translational English in contrast with comparable corpora of native English.

Like many balanced native English corpora such as F-LOB, COTE includes metadata such as text type and date of publication, as well as linguistic annotation such as part-of-speech tagging. As a translational English corpus, however, COTE additionally records translation-specific metadata in the header of each text sample, e.g. the source language, the translator, and the date and source of publication, which makes it possible to categorize the texts to suit different research purposes. The corpus is currently restricted to in-house use by the project team. It will be released and made accessible online when the project is completed.
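As a rough illustration of how header metadata of this kind supports categorizing texts, the Python sketch below groups sample identifiers by source language. It does not reflect COTE's actual header format, which is not described here: the `SampleHeader` class, its field names, the helper `group_by_source_language`, and every value in the example list are hypothetical placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleHeader:
    """Hypothetical header record; the field names are assumptions, not COTE's schema."""
    sample_id: str
    text_category: str
    source_language: str
    translator: str
    publication_year: int


def group_by_source_language(headers):
    """Map each source language to the list of sample ids translated from it."""
    groups = defaultdict(list)
    for header in headers:
        groups[header.source_language].append(header.sample_id)
    return dict(groups)


# Invented placeholder records, used only to exercise the function.
headers = [
    SampleHeader("S001", "category-1", "French", "translator-a", 1992),
    SampleHeader("S002", "category-1", "German", "translator-b", 1995),
    SampleHeader("S003", "category-2", "French", "translator-c", 1998),
]

groups = group_by_source_language(headers)
for language, sample_ids in groups.items():
    print(language, sample_ids)
```

The same pattern would apply to any other header field, such as text category or year of publication, by changing the attribute used as the grouping key.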

Related outputs:

Hu, X. (2014). Does the Style of Translation Exist? A corpus-based Multidimensional Analysis of the stylistic features of the translated Chinese. Paper presented at the 2nd Asia Pacific Corpus Linguistics Conference, 7-9 March, the Hong Kong Polytechnic University.

Hu, X. & Xiao, R. (2014). How different is English translation from native writings of English? A multi-feature statistical model for linguistic variation analysis. Paper presented at the 35th ICAME conference. 30 April to 4 May, the University of Nottingham.

Hu, X. & Xiao, R. (2014). What role do Source Languages play in the variation of translational English? A corpus-based survey of Source Language interference. Paper presented at the 7th IVACS conference, 19-21 June 2014, Newcastle University.

Xiao, R. & Hu, X. (2014). General tendencies and variations of translational English across registers. Paper presented at the 4th UCCTS conference, 24-26 July 2014, Lancaster University.

McEnery, A. & Xiao, R. (2014). The development of corpus linguistics in English and Chinese contexts. In Ishikawa, S. (ed.) Learner Corpus studies in Asia and the World: Papers from LCSAW2014, Vol. 2, pp. 7-45. Kobe, Japan: Kobe University.

Hu, X., Xiao, R. & Hardie, A. (under preparation). How do English translations differ from native English writings? A multi-feature statistical model for linguistic variation analysis.

Translation and contrastive linguistic studies at the interface of English and Chinese

A forthcoming special issue of Corpus Linguistics and Linguistic Theory, guest-edited by Dr Richard Xiao and Professor Naixing Wei, President of the Corpus Linguistics Society of China, is now available online as Ahead of Print at the journal website.

This special issue focuses on corpus-based translation and contrastive linguistic studies involving two genetically different languages, namely English and Chinese, which we believe have formed an important interface, with unique features of its own, as a result of the mutual interaction between the two languages.

Corpora have tremendously benefited translation and contrastive studies, and at the same time, corpus-based translation and contrastive linguistic studies have significantly expanded the scope of corpus linguistic research. While contrastive linguistics and translation studies have traditionally been accepted as two separate disciplines within applied linguistics, there are many contact points between the two. With a common corpus-based approach and a usually shared type of data (e.g. comparable and parallel corpora), corpus-based translation and contrastive linguistic studies have become even more closely interconnected, as demonstrated by the articles included in this special issue.

This special issue of Corpus Linguistics and Linguistic Theory includes five research articles together with an extensive introduction written by the guest editors.

These studies combine contrastive analysis and translation studies on the basis of comparable corpora (either multilingual or monolingual) and parallel corpora of English and Chinese, two of the most widely spoken world languages, which differ genetically. While the decision to involve English and Chinese in the research reported in this volume was largely based on the authors’ strong languages (they are all competently bilingual in Chinese and English), the significance of the typological distance between the two languages covered in these studies should not be underestimated. In comparison with studies of typologically related languages, translation and cross-linguistic studies of genetically distant languages such as English and Chinese can have more important implications for linguistic theorization. Studying such language pairs helps us gain a better appreciation of the scale of variability in the human language system. Theories and observations based on closely related language pairs can give rise to conclusions which seem certain but which, when studied in the context of a language pair such as English and Chinese, become not merely problematized afresh, but significantly more challenging to resolve (cf. Xiao and McEnery 2010).

The studies reported in this special issue embody features at the interface of English and Chinese, which can be expected to be of considerable significance and to have practical implications for linguistic theorizing.