2014/15 in retrospective: Perspectives on Chinese

Looking back over the academic year as it draws to a close, one of the highlights for us here at CASS was the one-day seminar we hosted in January on Perspectives on Chinese: Talks in Honour of Richard Xiao. This event celebrated the contributions to linguistics of CASS co-investigator Dr. Richard Zhonghua Xiao, on the occasion of both his retirement in October 2014 (and simultaneous taking-up of an honorary position with the University!), and the completion of the two funded research projects which Richard has led under the aegis of CASS.

The speakers included present and former collaborators with Richard – some (including myself) from here at Lancaster, others from around the world – as well as other eminent scholars working in the areas that Richard has made his own: Chinese corpus linguistics (especially, but not only, comparative work), and the allied area of the methodologies that Richard’s work has both utilised and promulgated.

In the first presentation, Prof. Hongyin Tao of UCLA took a classic observation of corpus-based studies – the existence, and frequent occurrence, of highly predictable strings or structures – and pointed out a little-noticed aspect of these highly predictable elements: they often involve lacunae, or null elements, where some key component of the meaning is simply left unstated and assumed. An example of this is the English expression under the influence, where “the influence of what?” is often implicit, but understood to be drugs/alcohol. It was pointed out that collocation patterns may identify the null elements, but that a simplistic application of collocation analysis may fail to yield useful results for expressions containing null elements. Finally, an extension of the analysis to yinxiang, the Chinese equivalent of influence, showed much the same tendencies – including, crucially, the importance of null elements – at work.

The following presentation came from Prof. Gu Yueguo of the Chinese Academy of Social Sciences. Gu is well-known in the field of corpus linguistics for his many projects over the years to develop not just new corpora, but also new types of corpus resources – for example, his exciting development in recent years of novel types of ontology. His presentation at the seminar was very much in this tradition, arguing for a novel type of multimodal corpus for use in the study of child language acquisition.

At this point in proceedings, I was deeply honoured to give my own presentation. One of Richard’s recently-concluded projects involved the application of Douglas Biber’s method of Multidimensional Analysis to translational English as the “Third Code”. In my talk, I presented methodological work which, together with Xianyao Hu, I have recently undertaken to assist this kind of analysis by embedding tools for the MD approach in CQPweb. A shorter version of this talk was subsequently presented at the ICAME conference in Trier at the end of May.

Prof. Xu Hai of Guangdong University of Foreign Studies gave a presentation on the study of Learner Chinese, an issue which was prominent among Richard’s concerns as director of the Lancaster University Confucius Institute. As noted above, Richard has led a project funded by the British Academy, looking at the acquisition of Mandarin Chinese as a foreign language; as a partner on that project, Xu’s presentation of a preliminary report on the Guangwai Lancaster Chinese Learner Corpus was timely indeed. This new learner corpus – already in excess of a million words in size, and consisting of a roughly 60-40 split between written and spoken materials – follows the tradition of the best learner corpora for English by sampling learners with many different national backgrounds, but also, interestingly, includes some longitudinal data. Once complete, the value of this resource for the study of L2 Chinese interlanguage will be incalculable.

The next presentation was another from colleagues of Richard here at Lancaster: Dr. Paul Rayson and Dr. Scott Piao gave a talk on the extension of the UCREL Semantic Analysis System (USAS) to Chinese. This has been accomplished by means of mapping the vast semantic lexicon originally created for English across to Chinese, initially by automatic matching, and secondarily by manual editing. Scott and Paul, with other colleagues including CASS’s Carmen Dayrell, went on to present this work – along with work on other languages – at the prestigious NAACL HLT 2015 conference, in whose proceedings a write-up has been published.

Prof. Jiajin Xu (Beijing Foreign Studies University) then made a presentation on corpus construction for Chinese. This area has, of course, been a major locus of activity by Richard over the years: his Lancaster Corpus of Mandarin Chinese (LCMC), a Mandarin match for the Brown corpus family, is one of the best openly-available linguistic resources for that language, and his ZJU Corpus of Translational Chinese (ZCTC) was a key contribution of his research on translation in Chinese. Xu’s talk presented a range of current work building on that foundation, especially the ToRCH (“Texts of Recent Chinese”) family of corpora – a planned Brown-family-style diachronic sequence of snapshot corpora in Chinese from BFSU, starting with the ToRCH2009 edition. Xu rounded out the talk with some case studies of applications for ToRCH, looking first at recent lexical change in Chinese by comparing ToRCH2009 and LCMC, and then at features of translated language in Chinese by comparing ToRCH2009 and ZCTC.
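For readers who have not run this kind of corpus-to-corpus comparison, here is a minimal sketch of one common way to do it: scoring each word with the log-likelihood (G2) keyness statistic. This post does not record which statistic was used in the talk, so the choice of log-likelihood, the function names and the assumption of pre-tokenised input are all illustrative; neither ToRCH2009 nor LCMC is loaded here.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """G2 keyness for a word seen `a` times in a corpus of `c` tokens
    and `b` times in a second corpus of `d` tokens."""
    # Expected frequencies if the word were equally likely in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    # A zero observed frequency contributes nothing (0 * ln 0 is taken as 0).
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keywords(tokens_new, tokens_old, top=5):
    """Rank words by keyness between two token lists (hypothetical helper).

    Returns (word, score, overused_in_new) triples, highest score first.
    """
    f_new, f_old = Counter(tokens_new), Counter(tokens_old)
    c, d = len(tokens_new), len(tokens_old)
    scored = []
    for word in set(f_new) | set(f_old):
        a, b = f_new[word], f_old[word]
        overused_in_new = a / c > b / d
        scored.append((word, log_likelihood(a, b, c, d), overused_in_new))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top]
```

Calling `keywords(newer_tokens, older_tokens)` on two segmented corpora would list the words whose relative frequency differs most between them, which is the starting point for a manual look at possible lexical change. Chinese text would need word segmentation first, a step this sketch leaves out.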

The last presentation of the day was from Dr. Vittorio Tantucci, who has recently completed his PhD at the Department of Linguistics and English Language at Lancaster, and who specialises in a number of issues in cognitive linguistic analysis including intersubjectivity and evidentiality. His talk addressed specifically the Mandarin evidential marker 过 guo, and the path it took from a verb meaning ‘to get through, to pass by’ to becoming a verbal grammatical element. He argued that this exemplified a path for an evidential marker to originate from a traversative structure – a phenomenon not noted in the literature on this kind of grammaticalisation, which focuses on two other paths of development, from verbal constructions conveying a result or a completion. Vittorio’s work is extremely valuable, not only in its own right but as a demonstration of the role that corpus-based analysis, and cross-linguistic evidence, have to play in linguistic theory. Given Richard’s own work on the grammar and semantics of aspect in Chinese, a celebration of Richard’s career would not have been complete without an illustration of how this trend in current linguistics continues to develop.

All in all, the event was a magnificent tribute to Richard and his highly productive research career, and a potent reminder of how diverse his contributions to the field have actually been, and of their far-reaching impact among practitioners of Chinese corpus linguistics. The large and lively audience certainly seemed to agree with our assessment!

Our deep thanks go out to all the invited speakers, especially those who travelled long distances to attend – our speaker roster stretched from California in the west, to China in the east.

CASS Corpus Linguistics workshop at the University of Caxias do Sul (UCS, Brazil)

Last month at UCS (Brazil), the CASS Corpus Linguistics workshop found a receptive audience who participated actively and engaged enthusiastically in the discussion. The workshop was run on 27-28 May by CASS members Elena Semino, Vaclav Brezina and Carmen Dayrell, and perfectly organised by the local committee, Heloísa Feltes and Ana Pelosi.


From left to right: Carmen Dayrell, Heloísa Feltes, Vaclav Brezina, Elena Semino, and Ana Pelosi

This workshop brought together lecturers, researchers, and PhD and MA research students from various Brazilian universities. It was a positive, invigorating experience for the CASS team and a golden opportunity to discuss the various applications of corpus linguistics methods. We would like to thank UCS for providing everything needed to make this workshop run so smoothly.

The workshop was part of a collaborative project between UK and Brazilian scholars funded by the UK’s ESRC and the Brazilian research agency CONFAP (FAPERGS) which will make use of corpus linguistics techniques to investigate the linguistic representation of urban violence in Brazil. Further details of this project can be found at http://cass.lancs.ac.uk/?page_id=1501.

CASS PhD student in Moscow to attend the XVI April International Academic Conference on Economic and Social Development

I recently got the opportunity to travel to Moscow to attend the XVI April International Academic Conference on Economic and Social Development at the National Research University – Higher School of Economics (HSE). This conference covered a wide variety of fields including Sociology, Geography, and Technology, and, on the last day of the conference, there was a seminar specifically for Linguistics PhD students. The aim of this seminar was to allow students from Russia and other countries to exchange ideas, and to introduce students from around the world to HSE.

At the seminar, there were presentations from 10 PhD students and these covered a variety of Linguistics topics including Grammar, Semantics, Sign Language, and Cognitive Linguistics. There were also some presentations on Corpus Linguistics: one which discussed semantic role labelling for the Russian language based on the Russian FrameBank, and another which discussed building a corpus of Soviet poetry. I found it interesting to see corpus analyses based on the Russian language, and it was also interesting to see the use of the ‘web as corpus’. This introduced me to tools that I haven’t used before, such as the Google N-Gram Viewer.

In the afternoon, I gave a presentation entitled The collocation hypothesis: Evidence from self-paced reading. This was the first time I had ever given a conference presentation and I was really pleased to have an audience that seemed interested in my work. The audience was composed of PhD students, some undergraduate students from the Linguistics Department at HSE, researchers from other fields who had presented at the conference on the previous days, as well as a few senior academics who gave me some really useful feedback.

The conference was held at the central building of HSE and, the day before the seminar, an MA student in Computational Linguistics kindly gave me a tour of the Linguistics Department. It was interesting to see that their classes are all seminar-based and I particularly liked the way they had a common room where all members of the department, including undergraduates, postgraduates, and lecturers, go between classes in order to socialise or do work. Here, I got the chance to speak to some undergraduates and postgraduates and I was shown some of the corpora that were compiled at that department, such as the Corpus of Modern Yiddish, the Bashkir Poetic Corpus, and the Russian Learner Corpus of Academic Writing. I was also told about a project called Tolstoy Digital, which involved making a corpus of Tolstoy’s works. It was interesting to hear about the unique problems that were faced when compiling this corpus. For instance, Tolstoy used an older orthography so this had to be translated to the modern form before the corpus could be tagged and parsed.

When speaking to members of the department, it was also interesting to discuss how some of their work links to some of the work carried out at CASS and the Linguistics Department at Lancaster University. For example, Elena Semino’s work on pain questionnaires seemed to link closely to an article written by members of HSE entitled Towards a typology of pain predicates (Reznikova et al. 2012). This article discusses the way in which the semantic domain of pain is largely composed of words borrowed from other semantic domains.

After showing me around the department, the MA student, Natalia, showed me around some of the main sights in central Moscow. I really appreciated this as I got to see some of Moscow from a local’s perspective as well as getting to visit some of the key sights that I was looking forward to seeing, such as the Bolshoi Theatre. Whilst in Moscow, I also went to see Swan Lake at the Kremlin Theatre of Classical Russian Ballet. This was an amazing experience because I had always wanted to see a Russian ballet and, although I had already seen Swan Lake several times, this was definitely the best version I had ever seen. Overall I had a brilliant time in Moscow and I am really grateful to the Higher School of Economics for funding and organising the trip.

Workshop on ‘Metaphor in end of life care’ at St Joseph’s Hospice, London

On 26th September 2014, three members of the CASS-affiliated ‘Metaphor in end of life care’ project team were invited to run a workshop at St Joseph’s Hospice in London. The workshop was attended by 27 participants, including clinical staff, non-clinical staff and volunteers.

Veronika Koller (Lancaster University) introduced the project, including its background, rationale, research questions, data and use of corpus methods in combination with qualitative analysis. Zsófia Demjén (The Open University) and Elena Semino (Lancaster University) presented the findings from the project that are particularly relevant to communication between healthcare professionals and patients nearing the end of their lives. These findings include: how patients diagnosed with terminal cancer use Violence and Journey metaphors to talk about their experiences of illness and treatment; and how patients and healthcare professionals use a variety of metaphors to talk about their mutual relationships. The project team pointed out the different ‘framings’ provided by different uses of metaphor, particularly in terms of the empowerment and disempowerment of patients. They provided evidence that no metaphor is inherently good or bad for all patients, but rather suggested that different metaphors work differently for different people, or even for the same person at different times. In the final session, Veronika Koller introduced the ‘Metaphor Menu’ – a collection of metaphors used by cancer sufferers, which the team are planning to pilot as a resource for newly-diagnosed patients.

A lively discussion followed each presentation, with many members of the audience asking questions and contributing their personal and professional experiences. The workshop received very positive evaluations in anonymous feedback questionnaires: 83% of participants rated the session at 4 or 5 on a 5-point scale (where 1 corresponds to ‘Very poor’ and 5 to ‘Excellent’). Comments included: ‘Very interesting research & resonated with my experience. Food for thought!’ and ‘Will help with my area of care, will help me understand and think about what my patients and relatives are actually telling me. Will make me reflect and respond more appropriately’.

CASS visit to Ghana

From June 24th, I and three other members of CASS spent a week in Accra, Ghana, demonstrating corpus methods and our own research at two universities, the University of Ghana and the recently established Lancaster University Ghana campus in Accra. From the UK it’s just over a six-hour flight, although thankfully there is only one hour of time difference. However, travel did involve some advance preparation, with jabs for yellow fever (and a few other things), visa applications and taking anti-malarial pills for a month after the trip. Fortunately, we only encountered one mosquito during the whole trip and none of us were bitten.

Although close together, the two universities we visited have a very different feel to them: the former is a large university spread out over a lot of land, with many departments and buildings, while the latter is (at the moment) a three-storey modern-looking grey and red building with the familiar Lancaster logo on it.


Our first trip was to the University of Ghana, where Andrew, Tony and I each gave a lecture to about 90 members of staff and students. Tony talked about the theoretical principles behind corpus linguistics, I discussed (and problematized) sex differences in the British National Corpus and Andrew showed applications of corpus linguistics to field linguistics using Corpus Workbench. The University of Ghana has some alumni members of Lancaster University and it was great to run into Clement Appah and Grace Diabah (formerly Bota) again.


Over the following two days, we gave corpus linguistics workshops, which included a two hour lab session where Andrew walked students through setting up a CQPweb account and doing some analysis of the Brown Family of corpora. I suspect this was the highlight of the day for those who attended, who were pleased to get access to many of the corpora we have at Lancaster. Each day we taught about 35 people, including some who had travelled quite long distances to get to us. Four students had driven in that morning from Cape Coast – a journey that we did some of when we went to Kakum National Park on our day off, and that took us over three hours – so we were impressed by their dedication. Tony gave an introduction to corpus linguistics and Vaclav talked about the General Service List for English words and let the students use a tool he had developed for exploring it. I ended each day with a talk on corpus linguistics and discourse analysis.


As I’d mentioned, we had a day off, where we visited Kakum National Park. This gave us an opportunity to see more of Ghana on the drive there, and then we had a great experience in the park, walking across a 350m network of rope bridges (the Kakum Canopy Walk) that were suspended high above the ground – you literally got a bird’s eye view of the tropical rainforest below. It was one of the most memorable experiences I’ve had and I think we all came away with very positive feelings about our trip, and are looking forward to our next visit to Ghana. I also hope that we managed to inspire people to incorporate some corpus linguistics methods into their own research.

Reflections from the Front Line: Sarah Russell on MELC and Twitter

Sarah Russell (Director of Education and Research, Peace Hospice Care and the Hospice of St Francis) attended this month’s Language in End-of-Life Care event, where an audience of approximately 40 healthcare professionals and researchers specialising in palliative and end-of-life care gathered to share their perspectives.

In a new blog post on eHospice, she reflects on this experience, as well as sharing some insight into a tweet chat with @WeNurses, where 128 participants came together to discuss individual experiences, symptom control, communication, recognising dying, family and patient needs, caring, and denial as a coping mechanism.

Read more to learn about Sarah’s experience, and to hear her challenge for everyone (including researchers and health care professionals) by visiting eHospice now.

‘Language in End-of-Life Care’: A user engagement event

On 8th May 2014, the main findings of the CASS-affiliated project ‘Metaphor in End-of-Life Care’ were presented to potential users of the research at the Work Foundation in central London. The event, entitled ‘Language in End-of-Life Care’ attracted an audience of approximately forty participants, consisting primarily of healthcare professionals and researchers specialising in palliative and end-of-life care. Although most participants are based in the UK, international guests joined us from Germany, the Netherlands, Spain and the US.

Professor Sheila Payne (Co-Investigator on the project and Co-Director of Lancaster’s International Observatory on End-of-Life Care) opened proceedings and acted as chair for the day’s activities. Two high-profile invited speakers shared their perspectives on communication in end-of-life care. Professor Lukas Radbruch (Chair of Palliative Medicine, University of Bonn) gave a presentation entitled ‘The search for a final sense of meaning in end-of-life discourses’. Among other things, he emphasized the influence of language and culture on perceptions and attitudes towards end of life and end-of-life care. Professor Dame Barbara Monroe (Chief Executive of St Christopher’s Hospice, London) discussed the main current challenges in hospice care in a talk entitled ‘Listening to patient and professional voices in end-of-life care’. These challenges, she argued, include those posed by a variety of linguistic and communicative barriers.


The methods, data and findings of the ‘Metaphor in End-of-Life Care’ project were introduced by four members of the team: Professor Elena Semino (Principal Investigator), Dr Veronika Koller (Co-Investigator), Dr Jane Demmen (Research Associate) and Dr Zsófia Demjén (former Research Associate, currently at the Open University). The project involves a combination of ‘manual’ and corpus-based methods to investigate the metaphors used to talk about end-of-life care in a 1.5-million-word corpus consisting of interviews with and online forum posts by terminally ill patients, family carers and health professionals. The team introduced the findings from the analysis that are particularly relevant to practitioners in end-of-life care, namely: the use of ‘violence’ and ‘journey’ metaphors by terminally ill patients, and the narratives of ‘good’ and ‘bad’ deaths told by hospice managers in semi-structured interviews. The implications of these findings for end-of-life care were suggested by the team and discussed with the audience. Participants were also invited to discuss selected uses of metaphors from the health professionals’ data, and to consider the potential value of some creative, alternative metaphors for cancer in particular.

The richness of the interactions on the day and the liveliness of the event’s hashtag on Twitter (#melc14) suggest that the event was a success. In the words of a hospice director: ‘everybody at the conference was truly inspired by the potential for change in practice and training!’ Although the funded phase of the project is coming to an end, the contacts made on the day are likely to lead to further collaborative research between the Lancaster team and healthcare professionals in the UK and beyond.

Dispatch from YLMP2014


I recently had the pleasure of travelling to Poland to attend the Young Linguists’ Meeting in Poznań (YLMP), a congress for young linguists who are interested in interdisciplinary research and stepping beyond the realm of traditional linguistic study. Hosted over three days by the Faculty of English at Adam Mickiewicz University, the congress featured over 100 talks by linguists young and old, including plenary lectures by Lancaster’s very own Paul Baker and Jane Sunderland. I was one of three Lancaster students to attend the congress, along with undergraduate Agnes Szafranski and fellow MA student Charis Yang Zhang.

What struck me about the congress, aside from the warm hospitality of the organisers, was the sheer breadth of topics that were covered over the weekend. All of the presenters were more than qualified to describe their work as linguistics, but perhaps for the first time I saw within just how many domains such a discipline can be applied. At least four sessions ran in parallel at any given time, and themes ranged from gender and sexuality to EFL and even psycholinguistics. There were optional workshops as well as six plenary talks. On the second day of the conference, as part of the language and society stream, I presented a corpus-assisted critical discourse analysis of the UK national press reporting of the immediate aftermath of the May 2013 murder of soldier Lee Rigby. I was happy to have a lively and engaged audience who had some really interesting questions for me at the end, and I enjoyed the conversations that followed this at the reception in the evening!

What was most encouraging about the congress was the drive and enthusiasm shared by all of the ‘young linguists’ in attendance. I now feel part of a generation of young minds who are hungry to improve not only our own work but hopefully, in time, the field(s) of linguistics as a whole. After my fantastic experience at the Boya Forum at Beijing Foreign Studies University last autumn, I was happy to spend time again celebrating the work of undergraduate and postgraduate students, and early-career linguists. There was a willingness to listen, to share ideas, and to (constructively) criticise where appropriate, and as a result I left Poznań feeling very optimistic about the future of linguistic study. I look forward to returning to the next edition of YLMP, because from what I saw at this one, there is a new generation of linguists eager to push the investigation of language to the next level.

Discourse, Gender and Sexuality South-South Dialogues Conference

Last week was spent at Witwatersrand (Wits) University in Johannesburg, where I had been invited to give a workshop on corpus methods, as well as a talk on some of my own research. The week was topped off by the first Discourse, Gender and Sexuality South-South Dialogues Conference, which was organised by Tommaso Milani. Many of the papers at the conference used qualitative methods (analyses of visual data seemed particularly popular) but there were a few papers, including my own, which used corpus methods.

These included a paper by Megan Edwards, who combined a corpus approach with CDA and visual analysis to examine a small corpus of pamphlets found around Johannesburg. These pamphlets advertise remedies for sexual and relationship problems, and Megan demonstrated that embedded within the adverts were gendered discourses relating to notions of ideal masculinity and femininity. This is probably one of the few corpora in existence where the top lexical word is penis.

Another interesting paper was by Sally Hunt who examined corpora of articles about sex work in two South African newspapers, focussing on the period when SA hosted the World Cup. She found that while there was a more balanced set of representations of sex workers than expected, they were still largely represented as immoral and criminalised for their actions while the agency of their clients was largely obscured. Sally is a lecturer at Rhodes University, Grahamstown, and has recently completed the construction of a 1 million word South African corpus, using the Brown family sampling frame.

During the workshop that I hosted at the university I got participants to use AntConc to examine a small corpus of recent newspaper articles about feminists, and a number of interesting patterns emerged from the analyses of concordances and collocates that took place. For example, a representation of feminists as war-mongers or vocally annoying/fierce e.g. shrill, strident etc was very prevalent and perhaps expected, although we were surprised to see a sub-set of words which related feminists to Islam like feminist Taleban and feminist fatwas (killing two ideological birds with one stone). Additionally, it was interesting to see how these negative discourses shouldn’t always be taken at face value. They were sometimes quoted in order to be critical of them, although it was often only with expanded concordance lines that this could be seen. In all, a productive week, and it was good to meet so many people who were interested in finding out more about corpus linguistics.
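The point about expanded concordance lines can be illustrated with a small sketch. This is not how AntConc works internally; it is a bare-bones keyword-in-context (KWIC) and collocate counter written for illustration, and the example sentence, the function names and the window sizes are all invented here rather than taken from the workshop data.

```python
from collections import Counter

# Invented example sentence, not from the workshop corpus.
EXAMPLE = ("the columnist dismissed the claim that feminists "
           "are shrill as lazy stereotyping").split()


def kwic(tokens, node, width=4):
    """Return (left context, node, right context) for each hit of `node`."""
    node = node.lower()
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


def collocates(tokens, node, window=4):
    """Count words occurring within `window` tokens either side of `node`."""
    node = node.lower()
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok.lower() == node:
            span = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            counts.update(w.lower() for w in span)
    return counts


if __name__ == "__main__":
    # A narrow window shows only the negative collocate...
    print(kwic(EXAMPLE, "feminists", width=2))
    # ...while a wider one shows the writer distancing themselves from it.
    print(kwic(EXAMPLE, "feminists", width=6))
```

With a two-word window the line pairs feminists with shrill and nothing else; widening it brings in "dismissed the claim" and "lazy stereotyping", which is the kind of context that changes how a collocate count should be read.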


Keynote at the House of Lords

On 17th October 2013 I spent the afternoon at the House of Lords, giving a keynote for the British Federation of Women Graduates (BFWG). Founded in 1907, BFWG has been providing scholarships for women in their final year of degree study since 1912, and it regularly makes awards from its charity to women graduates undertaking postgraduate study and research. BFWG is committed to promoting women’s opportunities in education and public life; fostering local, national, and international friendships; and improving the lives of women and girls worldwide. As such, it was a great honour to be asked by this wonderful organisation to give a keynote at their annual House of Lords seminar, sponsored by Baroness Randerson of Roath Park. Each year the seminar has a theme, and this year’s was, “A woman’s right to know”. The three invited speakers were:

Dr Shuruq Naguib (Lancaster University): “Muslim women: Gender and religious authority”. This talk discussed how women are represented in the Qur’an and in Islamic thought throughout history.

Sian West (University of Kent): “Restorative justice: Does it work?” This talk considered the benefits of restorative justice and the role of women as victims or perpetrators in the social context in which they find themselves.

Dr Claire Hardaker (Lancaster University): “Meaning and meanness: Disconnecting the online threat from the offline reality”. In this talk, I covered four major areas: (1) What does the term trolling mean? (2) What motivations seem to prompt individuals to troll? (3) How is trolling carried out? And (4) How do those who troll “rationalise” their behaviour? (The slides for this talk can be accessed here.)

My many thanks to BFWG President Jenny Morley, to Vice-President Gabrielle Suff, to The Baroness Randerson, and to all the guests and attendees who made my visit especially warm, friendly, and hospitable. (Pictures of the seminar and lunch can be found here.)