Welcome to our newest CASS PhD student!

It’s the start of a new academic year, and the offices of CASS continue to get busier and busier! This week we welcomed our newest PhD student, Ruth Byrne, to the team. Here’s a bit about Ruth and her research, in her own words:

I’ve just begun the first year of my ESRC-funded PhD, and will be using the British Library’s 19th Century newspaper collection to explore historic attitudes to immigration. I completed my undergraduate and master’s degrees within the History department at Lancaster.

I’ve always been an avid reader and thrived on close textual analysis. So, although my background has firm roots in History, and not Linguistics, the study of language has naturally woven its way through much of my research. The main focus of my undergraduate study was the shifting media language surrounding the struggle for Indian Independence. Without realising it, I effectively conducted a manual hunt for collocates within concordance lines, terms I was not to encounter until I heard about the work of CASS during my MA. Unaware of Corpus Linguistics as an approach, and of how much it could have increased my efficiency and speed, I was frequently frustrated at the laborious nature of the process which I had chosen to undertake.

Perhaps because I’ve found my own work and interests so hard to categorise, I’ve long been fascinated by the concept of interdisciplinary research. I was thrilled to find out that I’d be joining an experienced team who are pushing the boundaries of Corpus Linguistics as an interdisciplinary research tool, and that I’d be working at the intersection of two departments. I am keen to compare the challenges which face researchers working with corpora to those traditionally faced by historians working with large archives.

Some extra-academic trivia: I’m from a family of wine-merchants and spent most childhood holidays being dragged unwillingly around vineyards. As a result I’ve accumulated a lot of odd knowledge about grape varieties and whisky distilleries. When not working on my thesis, I’ll most likely be hiking up a hill in the Lake District.

MA students all pass with Distinction!

Róisín, Gillian, and I were delighted to find out last week that we all passed our MA Language and Linguistics degrees with Distinction. Our degree programme included taking a wide range of modules, followed by two terms spent researching and writing a 25,000 word dissertation. All three of us used this opportunity to conduct pilot or exploratory studies in preparation for our PhD studies, which we are excited to be commencing now! You can see the titles and abstracts of our dissertations below:

Abi Hawtin

Methodological issues in the compilation of written corpora: an exploratory study for Written BNC2014

The Centre for Corpus Approaches to Social Science (CASS) at Lancaster University and Cambridge University Press have made an agreement to collaborate on the creation of a new, publicly accessible corpus of contemporary British English. The corpus will be called BNC2014, and will have two sub-sections: Spoken BNC2014 and Written BNC2014. BNC2014 aims to be an updated version of BNC1994 which, despite its age, is still used as a proxy for present-day English. This dissertation is an exploratory study for Written BNC2014. I aim to address several methodological issues which will arise in the construction of Written BNC2014: balance and representativeness, copyright, and e-language. These issues are explored, and decisions are reached about how they will be dealt with when construction of the corpus begins.

Róisín Knight

Constructing a corpus of children’s writing for researching creative writing assessment: Methodological issues

In my upcoming PhD project, I wish to explore applications of corpus stylistics to Key Stage 3 creative writing assessment in the UK secondary National Curriculum. In order to carry out this research, it is necessary to have access to a corpus of Key Stage 3 students’ writing that has been marked using the National Curriculum criteria. Prior to this MA project, no corpus fulfilled all of these criteria.

This dissertation explores the methodological issues surrounding the construction of such a corpus by achieving three aims. Firstly, all of the design decisions required to construct the corpus are made and justified. These decisions relate to the three main aspects of the corpus construction: corpus design; transcription; and metadata, textual markup and annotation. Secondly, the methodological problems relating to these design decisions are discussed. It is argued that, although several problems exist, the majority can be overcome or mitigated in some way. The impact of problems that cannot be overcome is fairly limited. Thirdly, these design decisions are implemented, through undertaking the construction of the corpus, so far as was possible within the time constraints of the project.

Gillian Smith

Using Corpus Methods to Identify Scaffolding in Special Education Needs (SEN) Classrooms

Much research addresses teaching methods in Special Education Needs (SEN) classrooms, where language interventions are vital in providing children with developmental language disorders with language and social skills. Research in this field, however, is often limited by its use of small-scale samples and manual analysis. This study aims to address this problem, through applying a corpus-based method to the study of one teaching method, scaffolding, in SEN classrooms. Not only does this provide a large and therefore more representative sample of language use in SEN classrooms, but the main body of this dissertation also attempts to clarify and demonstrate how corpus methods may be used to search for scaffolding features within the corpus. This study, therefore, presents a systematic and objective way of searching for the linguistic features of scaffolding, namely questions, predictions and repetitions, within a large body of data. In most cases, however, this was challenging, as definitions of these features are vague in the psychological and educational literature. Hence, I focus on first clarifying linguistic specifications of these features in teacher language, before identifying how these may be searched for within a corpus. This study demonstrates that corpus-based methods can provide new ways of assessing language use in the SEN classroom, allowing systematic, objective searches for teaching methods in a larger body of data.

CASS PhD student in Moscow to attend the XVI April International Academic Conference on Economic and Social Development

I recently got the opportunity to travel to Moscow to attend the XVI April International Academic Conference on Economic and Social Development at the National Research University – Higher School of Economics (HSE). This conference covered a wide variety of fields including Sociology, Geography, and Technology, and, on the last day of the conference, there was a seminar specifically for Linguistics PhD students. The aim of this seminar was to allow students from Russia and other countries to exchange ideas, and to introduce students from around the world to HSE.

At the seminar, there were presentations from 10 PhD students, covering a variety of Linguistics topics including Grammar, Semantics, Sign Language, and Cognitive Linguistics. There were also some presentations on Corpus Linguistics: one which discussed semantic role labelling for the Russian language based on the Russian FrameBank, and another which discussed building a corpus of Soviet poetry. I found it interesting to see corpus analyses based on the Russian language, and it was also interesting to see the use of the ‘web as corpus’. This introduced me to tools that I haven’t used before, such as the Google Books Ngram Viewer.

In the afternoon, I gave a presentation entitled The collocation hypothesis: Evidence from self-paced reading. This was the first time I had ever given a conference presentation and I was really pleased to have an audience that seemed interested in my work. The audience was composed of PhD students, some undergraduate students from the Linguistics Department at HSE, researchers from other fields who had presented at the conference on the previous days, as well as a few senior academics who gave me some really useful feedback.

The conference was held at the central building of HSE and, the day before the seminar, an MA student in Computational Linguistics kindly gave me a tour of the Linguistics Department. It was interesting to see that their classes are all seminar-based and I particularly liked the way they had a common room where all members of the department, including undergraduates, postgraduates, and lecturers, go between classes in order to socialise or do work. Here, I got the chance to speak to some undergraduates and postgraduates and I was shown some of the corpora that were compiled at that department, such as the Corpus of Modern Yiddish, the Bashkir Poetic Corpus, and the Russian Learner Corpus of Academic Writing. I was also told about a project called Tolstoy Digital, which involved making a corpus of Tolstoy’s works. It was interesting to hear about the unique problems that were faced when compiling this corpus. For instance, Tolstoy used an older orthography so this had to be translated to the modern form before the corpus could be tagged and parsed.

When speaking to members of the department, it was also interesting to discuss how some of their work links to some of the work carried out at CASS and the Linguistics Department at Lancaster University. For example, Elena Semino’s work on pain questionnaires seemed to link closely to an article written by members of HSE entitled Towards a typology of pain predicates (Reznikova et al. 2012). This article discusses the way in which the semantic domain of pain is largely composed of words borrowed from other semantic domains.

After showing me around the department, the MA student, Natalia, showed me around some of the main sights in central Moscow. I really appreciated this as I got to see some of Moscow from a local’s perspective as well as getting to visit some of the key sights that I was looking forward to seeing, such as the Bolshoi Theatre. Whilst in Moscow, I also went to see Swan Lake at the Kremlin Theatre of Classical Russian Ballet. This was an amazing experience because I had always wanted to see a Russian ballet and, although I had already seen Swan Lake several times, this was definitely the best version I had ever seen. Overall I had a brilliant time in Moscow and I am really grateful to the Higher School of Economics for funding and organising the trip.