Remembering Richard Xiao, 1966-2016

I first met Richard in 2000, when he came to Lancaster to be my PhD student. He was initially interested in doing a PhD in the area of translation studies, but I spoke to him about corpus research and, slowly as the months passed, he decided to use corpora to look at an interesting issue in linguistics – aspect. This was the first of many areas where we happily worked together. Over weeks and months we slowly worked on the problem of integrating corpora and theory, finally arriving at what we both felt was a very satisfactory outcome: a PhD for Richard, a book we wrote on the topic and one or two nice papers.

Early on Richard showed real promise as a researcher so, as I often do with my students, I set Richard onto a few side projects which we pursued together. The first project we worked on was on the F-word in English. I had analysed bad language in the spoken and written BNC, but my book on swearing in English only used the spoken material. So we worked together on the written data and produced the paper ‘Swearing in Modern British English’ which was published in Language and Literature.

That started something of a wave of publications from us – we worked together very well. We had similar interests and personalities, but, most importantly, we felt very comfortable about disagreeing with one another. Those disagreements were always purely intellectual – a cross word never passed between us. They were also not fruitless – we would always debate the point until one or the other of us changed his mind. Working with Richard was a pleasure.

On finishing his PhD Richard started to work as my research assistant. Courtesy of a grant from the UK ESRC we carried on our work on the grammar of Chinese. When I went on secondment from Lancaster University to the UK AHRC, I continued to work with Richard who remained my research assistant. Without Richard working pretty independently of me most of the time while I was on secondment, my time at the AHRC would have been much tougher. As it was, I could focus on the research council work during the day and then check in with Richard in the evening to see how our work was going. The end result was a series of papers on Chinese grammar that I am very proud to be associated with and the book Corpus-Based Contrastive Studies of English and Chinese.

After the grant we were working on finished we hit a snag – we had a very interesting project on Chinese split words lined up, but as I was working for the research council at that time I could not apply to them for a grant and as a research assistant Richard was ineligible to apply. So we wrote the proposal and persuaded our colleague Anna Siewierska to take on the supervisor role on the project. The project was funded and Richard and Anna worked together very well, though I will always regret not being able to be part of that work as it is so interesting.

Around the time that this grant was awarded Richard got his first lecturing position at the University of Central Lancashire, moving on to Edge Hill University and finally, to my delight, in 2012 he moved back to Lancaster University, where he was swiftly promoted to Reader.

In the fourteen years from when I first met him to the point where he retired on ill health grounds, if Richard had only done the work described above he would have had a good career. However, he did so much more, as his Google Scholar profile shows.

In addition to what he did with me, he also undertook a great range of excellent research on his own, especially in the area of translation studies. Importantly, he contributed to the construction of a wide range of corpora of Mandarin Chinese.

I was delighted when Richard successfully applied to become a British citizen and was honoured to be asked to support his application. I was so pleased to be able to help Richard, his wife Lyn and his daughter in this way.

Sadly, Richard was diagnosed with cancer in 2013. Through surgery, chemotherapy and sheer willpower he survived until 2 January 2016. The length of his illness, while distressing, did allow us time to publicly celebrate his work.

Throughout his illness he was unfailingly cheerful and optimistic. He was also still brimming with ideas – he was writing and undertaking journal and research council reviews until a few months before he left his suffering behind. I have no doubt that if he had survived longer he would have written many more books and papers well worth reading. As it was, when we last spoke together, just before Christmas 2015, we had a lovely time remembering what we had achieved together. Indeed this brief remembrance of Richard contains many of the things we recalled in that conversation. One thing we did was to decide upon our favourite three publications that we had written together. It seems appropriate to share these in his memory – we both thought they were well worth a read! They are:

McEnery, A. M. & Xiao, R. Z. (2004) ‘Swearing in modern British English: the case of fuck in the BNC.’ Language and Literature. 13, 3, pp. 235-268.


McEnery, A. M. & Xiao, R. Z. (2005) ‘HELP or HELP to: What do corpora have to say?’ English Studies. 86, 2, pp. 161-187.


Xiao, R. Z. & McEnery, A. M. (2006) ‘Collocation, semantic prosody and near synonymy: A cross-linguistic perspective.’ Applied Linguistics. 27, 1, pp. 103-129.


We spent a pleasant time discussing these papers and then we said farewell to each other. I can imagine no better a final conversation between two scholars and friends who worked together so well. I am so happy that we had the chance to have this final meeting of minds. Not only will it be a precious memory for me, I know that it meant a great deal to him. I will miss Richard very much, as will others. However, through his writing his thoughts will live on and as further studies are produced by others on the basis of his corpora, the energy, kindness and ingenuity of Richard Xiao will blaze forth afresh.

2014/15 in retrospective: Perspectives on Chinese

Looking back over the academic year as it draws to a close, one of the highlights for us here at CASS was the one-day seminar we hosted in January on Perspectives on Chinese: Talks in Honour of Richard Xiao. This event celebrated the contributions to linguistics of CASS co-investigator Dr. Richard Zhonghua Xiao, on the occasion of both his retirement in October 2014 (and simultaneous taking-up of an honorary position with the University!), and the completion of the two funded research projects which Richard has led under the aegis of CASS.

The speakers included present and former collaborators with Richard – some (including myself) from here at Lancaster, others from around the world – as well as other eminent scholars working in the areas that Richard has made his own: Chinese corpus linguistics (especially, but not only, comparative work), and the allied area of the methodologies that Richard’s work has both utilised and promulgated.

In the first presentation, Prof. Hongyin Tao of UCLA took a classic observation of corpus-based studies – the existence, and frequent occurrence, of highly predictable strings or structures – and pointed out a little-noticed aspect of these highly predictable elements: they often involve lacunae, or null elements, where some key component of the meaning is simply left unstated and assumed. An example of this is the English expression under the influence, where “the influence of what?” is often implicit, but understood to be drugs/alcohol. It was pointed out that collocation patterns may identify the null elements, but that a simplistic application of collocation analysis may fail to yield useful results for expressions containing null elements. Finally, an extension of the analysis to yinxiang, the Chinese equivalent of influence, showed much the same tendencies – including, crucially, the importance of null elements – at work.

The following presentation came from Prof. Gu Yueguo of the Chinese Academy of Social Sciences. Gu is well-known in the field of corpus linguistics for his many projects over the years to develop not just new corpora, but also new types of corpus resources – for example, his exciting development in recent years of novel types of ontology. His presentation at the seminar was very much in this tradition, arguing for a novel type of multimodal corpus for use in the study of child language acquisition.

At this point in proceedings, I was deeply honoured to give my own presentation. One of Richard’s recently-concluded projects involved the application of Douglas Biber’s method of Multidimensional Analysis to translational English as the “Third Code”. In my talk, I presented methodological work which, together with Xianyao Hu, I have recently undertaken to assist this kind of analysis by embedding tools for the MD approach in CQPweb. A shorter version of this talk was subsequently presented at the ICAME conference in Trier at the end of May.

Prof. Xu Hai of Guangdong University of Foreign Studies gave a presentation on the study of Learner Chinese, an issue which was prominent among Richard’s concerns as director of the Lancaster University Confucius Institute. As noted above, Richard has led a project funded by the British Academy, looking at the acquisition of Mandarin Chinese as a foreign language; as a partner on that project, Xu’s presentation of a preliminary report on the Guangwai Lancaster Chinese Learner Corpus was timely indeed. This new learner corpus – already in excess of a million words in size, and consisting of a roughly 60-40 split between written and spoken materials – follows the tradition of the best learner corpora for English by sampling learners with many different national backgrounds, but also, interestingly, includes some longitudinal data. Once complete, the value of this resource for the study of L2 Chinese interlanguage will be incalculable.

The next presentation was another from colleagues of Richard here at Lancaster: Dr. Paul Rayson and Dr. Scott Piao gave a talk on the extension of the UCREL Semantic Analysis System (USAS) to Chinese. This has been accomplished by means of mapping the vast semantic lexicon originally created for English across to Chinese, initially by automatic matching, and secondarily by manual editing. Scott and Paul, with other colleagues including CASS’s Carmen Dayrell, went on to present this work – along with work on other languages – at the prestigious NAACL HLT 2015 conference, in whose proceedings a write-up has been published.

Prof. Jiajin Xu (Beijing Foreign Studies University) then made a presentation on corpus construction for Chinese. This area has, of course, been a major locus of activity by Richard over the years: his Lancaster Corpus of Mandarin Chinese (LCMC), a Mandarin match for the Brown corpus family, is one of the best openly-available linguistic resources for that language, and his ZJU Corpus of Translational Chinese (ZCTC) was a key contribution of his research on translation in Chinese. Xu’s talk presented a range of current work building on that foundation, especially the ToRCH (“Texts of Recent Chinese”) family of corpora – a planned Brown-family-style diachronic sequence of snapshot corpora in Chinese from BFSU, starting with the ToRCH2009 edition. Xu rounded out the talk with some case studies of applications for ToRCH, looking first at recent lexical change in Chinese by comparing ToRCH2009 and LCMC, and then at features of translated language in Chinese by comparing ToRCH2009 and ZCTC.

The last presentation of the day was from Dr. Vittorio Tantucci, who has recently completed his PhD at the Department of Linguistics and English Language at Lancaster, and who specialises in a number of issues in cognitive linguistic analysis including intersubjectivity and evidentiality. His talk addressed specifically the Mandarin evidential marker 过 guo, and the path it took from a verb meaning ‘to get through, to pass by’ to becoming a verbal grammatical element. He argued that this exemplified a path for an evidential marker to originate from a traversative structure – a phenomenon not noted in the literature on this kind of grammaticalisation, which focuses on two other paths of development, from verbal constructions conveying a result or a completion. Vittorio’s work is extremely valuable, not only in its own right but as a demonstration of the role that corpus-based analysis, and cross-linguistic evidence, have to play in linguistic theory. Given Richard’s own work on the grammar and semantics of aspect in Chinese, a celebration of Richard’s career would not have been complete without an illustration of how this trend in current linguistics continues to develop.

All in all, the event was a magnificent tribute to Richard and his highly productive research career, and a potent reminder of how diverse his contributions to the field have actually been, and of their far-reaching impact among practitioners of Chinese corpus linguistics. The large and lively audience certainly seemed to agree with our assessment!

Our deep thanks go out to all the invited speakers, especially those who travelled long distances to attend – our speaker roster stretched from California in the west, to China in the east.

Introducing the Corpus of Translational English (COTE)

We are pleased to announce that CASS has recently compiled another new corpus, the Corpus of Translational English (COTE). The construction of COTE is supported by the joint ESRC (UK) – RGC (Hong Kong) research project, “Comparable and Parallel Corpus Approaches to the Third Code: English and Chinese Perspectives” (ES/K010107/1). The project is led by Dr Richard Xiao and Dr Andrew Hardie at CASS in collaboration with Dr Dechao Li and Professor Chu-Ren Huang of the Hong Kong Polytechnic University.

COTE is a one-million-word balanced comparable corpus of translated English texts, which is designed as a translational counterpart of the Freiburg–LOB Corpus of British English (F-LOB). The new corpus is intended to match F-LOB as closely as possible in size and composition, but is designed to represent translational English published in the 1990s. Like the F-LOB corpus, COTE comprises five hundred text samples of around 2,000 words each, which are distributed across 15 text categories. The corpus is created with the explicit aim of providing a reliable empirical basis for two tasks: identifying the typical common features of translated English texts, and investigating how those features vary across different types of text. Both rest on quantitative analyses of the balanced corpus of translational English in contrast with comparable corpora of native English.

Like many balanced native English corpora such as F-LOB, COTE includes metadata such as text type and date of publication as well as linguistic annotation such as part-of-speech tagging. But as a translational English corpus, COTE additionally includes various translation-specific metadata, e.g. the source language, translator, date and source of publication in the header of each text sample, which makes it possible to categorize the texts to suit different research purposes. The corpus is currently restricted to in-house use by the project team. It will be released and made accessible online when the project is completed.

Is translated Chinese still Chinese?

Is translated Chinese still Chinese? Do translated English and translated Chinese have anything in common? Can the properties observed on the basis of translational English in contrast to comparable non-translated English be generalised to other translational languages? These interesting questions were explored in Dr Richard Xiao’s keynote lecture entitled “Translation universal hypotheses reevaluated from the Chinese perspective”, delivered at the joint meeting of the 11th International Congress of the Brazilian Association of Researchers in Translation (ABRAPT) and the 5th International Congress of Translators, held at the Federal University of Santa Catarina (UFSC), Florianópolis on 23-26 September 2013.

Corpus-based Translation Studies focuses on translation as a product by comparing comparable corpora of translated and non-translated texts. A number of distinctive features of translations have been posited including, for example, explicitation, simplification, normalisation, levelling out (convergence), source language interference, and under-representation of target language unique items.

Nevertheless, research in this area has until recently been confined largely to translational English and closely related European languages. If the features of translational language that have been reported on the basis of these languages are to be generalised as “translation universals”, the language pairs involved must not be restricted to English and closely related European languages. Evidence from a genetically distant language pair such as English and Chinese is arguably more convincing, if not indispensable.

Richard’s presentation reevaluated the largely English-based translation universal hypotheses from the perspective of translational Chinese, on the basis of a systematic empirical study of the lexical and grammatical properties of translational Chinese represented in a one-million-word balanced corpus of translated texts in contrast with a comparable corpus of native Chinese texts.

The conference was organised by the Brazilian Association of Researchers in Translation (ABRAPT). During the conference, Richard also gave a talk about corpus-based translation studies at the Roundtable “Translation and Interdisciplinarity”.