Exploring New Horizons in Corpus Linguistics: Lectures, Workshops and Partnerships in Shanghai

At the end of October 2024, I had the privilege of visiting Shanghai and Suzhou. This trip was more than just a visit: it was an inspiring journey through the vibrant academic and research landscape of corpus linguistics in Eastern China. I had the chance to engage with leading research institutions such as Shanghai Jiao Tong University, Xi’an Jiaotong-Liverpool University and Shanghai University of International Studies, all at the forefront of applied language research and digital humanities. Together, we explored potential collaborations and how our shared expertise could jointly shape the field of corpus linguistics in China, in the UK and internationally.

Empowering Analytical Skills with #LancsBox X
One of the highlights of this trip was a series of training events co-organized with the ESRC Centre for Corpus Approaches to Social Science, Lancaster University (CASS). These sessions introduced participants to #LancsBox X, an innovative software tool, developed at Lancaster University, for analysing large amounts of language data. The workshops focused on applications in the digital humanities and data-driven learning. As the demand for skills in data analytics and linguistic analysis continues to grow, tools like #LancsBox X provide essential capabilities for researchers and students, equipping them to address the language and communication challenges of future workplaces.

The Future of English?
English as a medium of instruction (EMI) is a global phenomenon with particular relevance in Chinese higher education. It was a valuable opportunity to discuss early findings of an externally funded project, led by Dr Dana Gablasova, with one of the international collaborators, Dr Tanjun Liu, and her team at Xi’an Jiaotong-Liverpool University. This project, supported by the British Council, explores the evolving role of English in education through corpus linguistic techniques.
In my talk on the Future of English, I demonstrated how corpus methods were used to analyse student writing and reading, revealing key language patterns, challenges and practices across different academic contexts. These insights will be used to improve language teaching and adapt English education to future needs globally.

Keynote at the 5th Asia Pacific Corpus Linguistics Conference
A final point on my agenda was delivering a plenary lecture at the 5th Asia Pacific Corpus Linguistics Conference. My talk, entitled Corpus Linguistics at a Crossroads: Data, Contexts, and Frequencies, explored the evolving identity of corpus linguistics in today’s landscape, especially as large language models (LLMs) and generative AI reshape how we approach language data. As the field adapts to these technological advances, corpus linguistics finds itself at a crossroads. The lecture outlined three foundational pillars of corpus linguistics that continue to define the discipline:

  1. Collecting representative samples of language: Creating representative and balanced corpora that reflect real-world language use.
  2. Understanding words in context: Recognizing the importance of language in specific situations and individual contexts (think of critical reading of concordance lines) to grasp different levels of meaning.
  3. Providing frequency information: Analysing word and phrase frequencies to capture usage patterns in large datasets.
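For readers who like to see ideas in code, the second and third pillars can be illustrated with a few lines of Python. This is only a minimal sketch under simple assumptions, not how #LancsBox X or any other corpus tool works internally: the function names (`tokenize`, `relative_frequencies`, `concordance`), the crude regular-expression tokenizer and the two-sentence sample text are all invented for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (a crude, illustrative tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def relative_frequencies(tokens, per=1000):
    """Return each word's frequency normalised to `per` tokens."""
    counts = Counter(tokens)
    total = len(tokens)
    return {word: count * per / total for word, count in counts.items()}


def concordance(tokens, node, window=2):
    """Return (left context, node, right context) for every occurrence of `node`."""
    lines = []
    for i, token in enumerate(tokens):
        if token == node:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, token, right))
    return lines


# Hypothetical sample text, made up for this sketch.
sample = "The students read the text. The teacher asked the students to write."
tokens = tokenize(sample)

# Frequency information: how often each word occurs per 1,000 tokens.
frequencies = relative_frequencies(tokens)

# Words in context: concordance lines for the node word "students".
for left, node, right in concordance(tokens, "students"):
    print(f"{left:>12} | {node} | {right}")
```

Normalising to a fixed base (here per 1,000 tokens, an arbitrary choice) is what allows frequencies to be compared across corpora of different sizes, while the concordance lines bring the analyst back to the individual contexts in which a word occurs.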

In Shanghai, we took steps toward a future in which language, data analytics and technology converge, bringing new insights into language and society. I look forward to continuing this journey, fostering partnerships, and pushing the boundaries of corpus linguistics on a global scale.