Introducing the CASS Guided Reading Project (Part 1)

In collaboration with the Department of Psychology, CASS is investigating the critical features of guided reading that can benefit the language and literacy skills of typically developing children.

What is guided reading?

Guided reading is a technique used by teachers to support literacy development. The teacher works with a small group of children, typically not more than 6, who are grouped according to ability and who work together on the same text. This ability-grouping enables the teacher to focus on the specific needs of those children, and to provide opportunities for them to develop their understanding of what they read through discussion, as well as their reading fluency. In this project we are investigating the features of effective guided reading, with a particular emphasis on reading comprehension.

Features of guided reading

Teachers aim to bridge the gap between children’s current and potential ability. Research indicates that this is best achieved by using methods that facilitate interaction, rather than by providing explicit instruction alone (e.g., Pianta et al., 2007).

The strategies that teachers can use to support and develop understanding of the text are best described as lying on a continuum, from low challenge strategies – for example, asking children simple yes/no or closed-answer questions – to high challenge strategies, that might require children to explain a character’s motivation and evaluate the text. Low challenge strategies pose more limited constraints on possible answers: they may simply require children to repeat back part of the text or provide a one word response, such as a character’s name. High challenge strategies provide greater opportunity for children to express their own interpretation of the text.

Low challenge questions can be used by the teacher to assess children’s basic level of understanding and are also a good way to encourage children to participate in the session. High challenge questions assess a deeper understanding and more sophisticated comprehension skills. Skilled teachers will adapt questions and their challenge depending on the group and individual children’s level of understanding and responsiveness, with the intent of gradually increasing the responsibility for the children to take turns in leading the discussion. This technique is used to scaffold the discussion.

Our investigation: How is guided reading effective?

Previous studies observing guided reading highlight substantial variability in what teachers do and, therefore, in our understanding of how guided reading can be used to best foster language and literacy skills. A more fine-grained and detailed examination of teacher input and its relation to children’s responses is needed to determine the teacher strategies that are most effective in achieving specific positive outcomes (see Burkins & Croft, 2010; Ford, 2015).

Previous research on this topic has typically taken the form of observational studies, in which researchers have had to laboriously parse and hand-code transcriptions of the teacher-children interactions (a corpus) to identify teacher strategies of interest. Because this is a long and painstaking process, it limits the size of the corpus to one that can be coded within a realistic time window. In this project, we aim to maximise interpretation of these naturalistic classroom interactions using powerful corpus search tools. These enable precise computer searches for a wide range of language features, and are much faster and more reliable than hand-coding. This enables us to create and explore a much larger corpus of guided reading sessions than in previous studies, making a fine-grained analysis possible. For an introduction to corpus search methods, check out this CASS document.
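
To give a flavour of what a computer search over a transcript involves, here is a minimal Python sketch. It is not the project's tooling: the transcript lines, the `T:`/`C:` speaker labels, and the two regular expressions are all assumptions made up for illustration, and a real corpus query would be far more refined.

```python
import re

# Hypothetical transcript: "T:" marks the teacher, "C:" a child (labels are assumed).
transcript = [
    "T: Did the fox eat the cake?",
    "C: Yes.",
    "T: Why do you think he hid it?",
    "C: Because he wanted it all for himself.",
    "T: Is the fox happy?",
    "T: How would you feel if that happened to you?",
]

# Very rough proxies: closed questions open with an auxiliary verb,
# open questions with "why" or "how". Real coding schemes are richer.
CLOSED = re.compile(r"^(is|are|was|were|do|does|did|can|has|have)\b", re.IGNORECASE)
OPEN = re.compile(r"^(why|how)\b", re.IGNORECASE)


def count_teacher_questions(lines):
    """Count teacher questions matching each rough pattern."""
    counts = {"closed": 0, "open": 0}
    for line in lines:
        speaker, _, utterance = line.partition(":")
        utterance = utterance.strip()
        # Skip children's turns and anything that is not a question.
        if speaker != "T" or not utterance.endswith("?"):
            continue
        if CLOSED.match(utterance):
            counts["closed"] += 1
        elif OPEN.match(utterance):
            counts["open"] += 1
    return counts


print(count_teacher_questions(transcript))
```

The same function runs unchanged on six lines or six thousand, which is the practical advantage over hand-coding, although the patterns themselves would still need checking against human judgements.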

Future blogs will provide more detail about the specific corpus search measurements that CASS are using to identify what makes for effective guided reading. The next blog, however, will explain the motivation for using corpus methods to investigate the effective outcomes of guided reading.

Meet the Author of this blog: Liam Blything

Since July 2016, I have been working as a Senior Research Associate on the CASS guided reading project. My Psychology PhD, awarded by Lancaster University, focused on language acquisition. It is a great privilege to be working on such an exciting project, which answers psychological questions using advanced corpus linguistics methods. I look forward to providing future updates!

References

Burkins, J. & Croft, M. M. (2010). Preventing misguided reading: new strategies for guided reading teachers. Thousand Oaks CA: Corwin.

Pianta, R. C., Belsky, J., Houts, R., Morrison, F., & the National Institute of Child Health and Human Development Early Child Care Research Network. (2007). Opportunities to learn in America’s elementary classrooms. Science, 315, 1795–1796.

Ford, M. P. (2015). Guided Reading: What’s New, and What’s Next? North Mankato, MN: Capstone.

Controlling the scale and pace of immigration: changes in UK press coverage about migration

The issue of immigration prominently featured in debates leading up to the June 2016 EU Referendum vote. It was argued that too many people were entering the UK, largely from other EU member states. Politicians and media also talked about ‘taking back control’—notably in the contexts of deciding who can enter Britain and enforcing borders. But, as our new Migration Observatory report ‘A Decade of Immigration in the British Press’ reveals through corpus linguistic methods, such language wasn’t necessarily new: in fact, under the coalition government from 2010-2015, the press was increasingly casting migration in terms of its scale or pace. And, the relative importance of ‘limiting’ or ‘controlling’ migration rose over this period, too.

Our report aimed to understand how British press coverage of immigration had changed in the decade leading up to the May 2015 General Election. We built upon previous research done at Lancaster University (headed by CASS Deputy Director Paul Baker) into portrayals of migrant groups. Our corpus of 171,401 items comes from all 19 national UK newspapers (including Sunday versions) that continuously published between January 2006 and May 2015. Using the Sketch Engine, we identified the kinds of modifiers (adjectives) and actions (verbs) associated with the terms ‘immigration’ and ‘migration’.

The modifiers that were most frequently associated with either of these terms included ‘mass’ (making up 15.7% of all modifiers appearing with either word), ‘net’ (15.6%), and ‘illegal’ (11.9%). Closer examination of the top 50 modifiers revealed a group of words related to the scale or pace of migration: in addition to ‘mass’ and ‘net’, these included terms such as ‘uncontrolled’, ‘large-scale’, ‘high’, and ‘unlimited’. Grouping these terms together and tracking their proportion of all modifiers, compared to those related to illegality (another prominent way of referring to immigrants), reveals how these terms made up an increasingly large share of modifiers under both the Labour and coalition governments since 2006. Figure 1 shows how these words made up nearly 40% of all modifiers in 2006, but over 60% in the first five months of 2015. Meanwhile, the share of modifiers referring to legal aspects of immigration (‘illegal’, ‘legal’, ‘unlawful’, or ‘irregular’) declined from 22% in 2006 to less than 10% in January-May 2015.

Figure 1.
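
The grouping-and-share calculation behind a chart like this can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the report's actual pipeline: the counts below are invented, and the group names `scale_or_pace` and `legality` are labels chosen here for convenience.

```python
# Hypothetical modifier counts for a single year. The numbers are invented
# purely to illustrate the calculation and are NOT taken from the report.
modifier_counts = {
    "mass": 40, "net": 35, "uncontrolled": 5,
    "illegal": 15, "legal": 3,
    "european": 2,
}

# Groupings follow the example terms named in the text.
GROUPS = {
    "scale_or_pace": {"mass", "net", "uncontrolled", "large-scale", "high", "unlimited"},
    "legality": {"illegal", "legal", "unlawful", "irregular"},
}


def group_shares(counts, groups):
    """Return each group's percentage share of all modifier occurrences."""
    total = sum(counts.values())
    return {
        name: 100 * sum(n for word, n in counts.items() if word in words) / total
        for name, words in groups.items()
    }


shares = group_shares(modifier_counts, GROUPS)
print(shares)
```

Repeating this for each year's counts would give one point per group per year, which is the kind of series a figure like Figure 1 plots.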

Another way of examining this dimension of ‘scale’ or ‘pace’ is to look at the kinds of actions (verbs) done to ‘immigration’ or ‘migration’. For example, in the sentences ‘the government is reducing migration’ and ‘we should encourage more highly-skilled immigration’, the verbs ‘reduce’ and ‘encourage’ signal some kind of action being done to ‘immigration’ and ‘migration’. In a similar way to Figure 1, we looked at the most frequent verbs associated with either term. A category of words expressing efforts to limit or control movement—what we call ‘limit’ verbs in the report—emerged from the top 50 verbs. These included examples such as ‘control’, ‘tackle’, ‘reduce’, and ‘cap’.

Figure 2 shows how the overall frequency of these limit verbs, indicated by the solid line, rose by about five times between 2006 and the high point in 2014—most notably from 2013. But, as a share of all verbs expressing some action towards ‘immigration’ or ‘migration’, this category was consistently making up 30-40% from 2010 onwards. This suggests that, although these kinds of words weren’t that frequent in absolute terms until 2014, the press had already started moving towards using them from 2010.

Figure 2.
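
The difference between absolute frequency and share is easy to see with a toy calculation. The Python sketch below uses invented yearly counts (they are not the report's figures) in which the raw number of ‘limit’ verbs rises sharply while their share of all verbs stays in a similar band.

```python
# Invented yearly counts, NOT from the report: occurrences of 'limit' verbs
# and of all verbs acting on 'immigration' or 'migration'.
yearly = {
    2010: {"limit": 30, "all_verbs": 100},
    2014: {"limit": 140, "all_verbs": 400},
}


def limit_share(row):
    """Share of 'limit' verbs among all verbs, as a percentage."""
    return 100 * row["limit"] / row["all_verbs"]


for year, row in yearly.items():
    # Print the absolute count next to the relative share for comparison.
    print(year, row["limit"], f"{limit_share(row):.0f}%")
```

In this made-up case the absolute count more than quadruples, yet the share only moves from 30% to 35%, which is why the two measures are worth reporting side by side.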

These results show how the kind of language around immigration has changed since 2006. Corpus methods allow us to look at a large amount of text—in this case, over a significant period of time in British politics—in order to put recent rhetoric in its longer context. By doing so, researchers contribute concrete evidence about how the British press has actually talked about migrants and migration. Such evidence opens timely and important debates about the role of the press in public discussion (how does information presented through media impact public opinion?) and the extent to which press outputs should be scrutinised.

About the author: William Allen is a Research Officer with The Migration Observatory and the Centre on Migration, Policy, and Society (COMPAS), both based at the University of Oxford. His research focuses on the ways that media, public opinion, and policymaking on migration interact. He is also interested in the ways that migration statistics and research evidence are used in non-academic settings, especially through data visualisations.

New CASS PhD student!

CASS is delighted to welcome new PhD student Andressa Gomide to the centre, where she will be working on data visualization in corpus linguistics. Continue reading to find out more about Andressa!


I am in the first year of my PhD in Linguistics, which is focused on data visualizations for corpus tools. As a research student at CASS, I am looking forward to gaining a better understanding of how different fields of study use corpus tools in their research.

I’ve been involved with corpus linguistics since 2011, when I started my undergraduate research program on learner corpora. Since then, I have developed a strong interest in corpus studies, which led me to devote my BA and my MA to this theme. I completed both my BA and my MA at the Universidade Federal de Minas Gerais in Brazil.

Aside from my interest in linguistics, I also enjoy outdoor activities such as cycling and hiking.

CASS goes to the Wellcome Trust!

Earlier this month I represented CASS in a workshop, hosted by the Wellcome Trust, which was designed to explore the language surrounding patient data. The remit of this workshop was to report back to the Trust on what might be the best ways to communicate with patients about their data, their rights respecting their data, and issues surrounding privacy and anonymity. The workshop comprised nine participants who all communicated with the public as part of their jobs, including journalists, bloggers, a speech writer, a poet, and a linguist (no prizes for guessing who the last was…). On a personal note, I had prepared for this event from the perspective of a researcher of health communication. However, the backgrounds of the other participants meant that I realised very quickly that my role in this event would not be so specific, so niche, but was instead much broader, as “the linguist” or even “the academic”.

Our remit was to come up with a vocabulary for communication about patient data that would be easier for patients to understand. As it turned out, this wasn’t too difficult, since most of the language surrounding patient data is waffly at its best, and overly technical and incomprehensible at its worst. One of the most notable recommendations we made concerned the phrase ‘patient data’ itself, which we thought might carry connotations of science and research, and perhaps disengage the public. We recommended the phrase ‘patient health information’ instead, which might sound less technical and more transparent. We undertook a series of tasks which ranged from sticking post-it notes on whiteboards and windows, to role play exercises and editing official documents and newspaper articles. What struck me, and what the diversity of these tasks demonstrated particularly well, was how the suitability of our suggested terms could only really be assessed once we took the words off the post-it notes and inserted them into real-life communicative situations, such as medical consultations, patient information leaflets, newspaper articles, and even talk shows.

The most powerful message I took away from the workshop was that close consideration of linguistic choices in the rhetoric surrounding health is vital for health care providers to improve the ways that they communicate with the public. To this end, as a collection of methods that facilitate the analysis of large amounts of authentic language data in and across a variety of texts and contexts, corpus linguistics has an important role to play in providing such knowledge in the future. Corpus linguistic studies of health-related communication are currently small in number, but continue to grow apace. Although the health-related research that is being undertaken within CASS, such as Beyond the Checkbox and Metaphor in End of Life Care, goes some way to showcasing the rich fruits that corpus-based studies of health communication can bear, there is still a long way to go. In particular, future projects in this area should strive to engage consumers of health research not only in terms of our findings, but also the (corpus) methods that we have used to get there.

Further Trinity Lancaster Corpus research: Examiner strategies

This month saw a further development in the corpus analyses: the examiners. Let me introduce myself: my name is Cathy Taylor and I’m responsible for examiner training at Trinity. I was very pleased to be asked to do some corpus research into the strategies the examiners use when communicating with the test takers.

In the GESE exams the examiner and candidate co-construct the interaction throughout the exam. The examiner doesn’t work from a rigid interlocutor framework provided by Trinity but instead has a flexible test plan which allows them to choose from a variety of questioning and elicitation strategies. They can then respond more meaningfully to the candidate and cover the language requirements and communication skills appropriate for the level. The rationale behind this approach is to reflect as closely as possible what happens in conversations in real life. Another benefit of the flexible framework is that the examiner can use a variety of techniques to probe the extent of the candidate’s competence in English and allow them to demonstrate what they can do with the language. If you’re interested, more information can be found in Trinity’s speaking and listening tests: Theoretical background and research.

After some deliberation and very useful tips from the corpus transcriber, Ruth Avon, I decided to concentrate my research on the opening gambit for the conversation task at Grade 6, B1 CEFR. There is a standard rubric the examiner says to introduce the subject area: ‘Now we’re going to talk about something different, let’s talk about…learning a foreign language.’ Following this, the examiner uses their test plan to select the most appropriate opening strategy for each candidate. There’s a choice of six subject areas for the conversation task listed for each grade in the Exam information booklet.

Before beginning the conversation examiners have strategies to check that the candidate has understood and to give them thinking time. The approaches below are typical.

  1. E: ‘Let’s talk about learning a foreign language…’
    C: ‘yes’
    E: ‘Do you think English is an easy language?’
  2. E: ‘Let’s talk about learning a foreign language’
    C: ‘It’s an interesting topic’
    E: ‘Yes uhu do you need a teacher?’
  3. It’s very common for the examiner to use pausing strategies which give thinking time:
    E: ‘Let’s talk about learning a foreign language erm why are you learning English?’
    C: ‘Er I’m learning English for work erm I’m a statistician.’

There are a range of opening strategies for the conversation task:

  • Personal questions: ‘Why are you learning English?’ ‘Why is English important to you?’
  • More general question: ‘How important is it to learn a foreign language these days?’
  • The examiner gives a personal statement to frame the question: ‘I want to learn Chinese (to a Chinese candidate)…what do I have to do to learn Chinese?’
  • The examiner may choose a more discursive statement to start the conversation: ‘Some people say that English is not going to be important in the future and we should learn Chinese (to a Chinese candidate).’
  • The candidate sometimes takes the lead:
    Examiner: ‘Let’s talk about learning a foreign language’
    Candidate: ‘Okay, okay I really want to learn a lo = er learn a lot of = foreign languages’

A salient feature of all the interactions is the amount of back channelling the examiners do, e.g. ‘erm’, ‘mm’. This indicates that the examiner is actively listening to the candidate and encouraging them to continue. For example:

E: ‘Let’s talk about learning a foreign language, if you want to improve your English what is the best way?’
C: ‘Well I think that when you see programmes in English’
E: ‘mm’
C: ‘without the subtitles’
E: ‘mm’
C: ‘it’s a good way or listening to music in other language’
E: ‘mm’
C: ‘it’s a good way and and this way I have learned too much’
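
As a rough illustration of how the amount of back channelling could be counted automatically, here is a minimal Python sketch over the exchange quoted above. The `BACKCHANNELS` set and the turn format are assumptions made for this example only; a real analysis would need a carefully defined token list and would work from the corpus annotation rather than a hand-typed list.

```python
# The exchange quoted above, as (speaker, utterance) pairs.
turns = [
    ("E", "Let's talk about learning a foreign language, if you want to "
          "improve your English what is the best way?"),
    ("C", "Well I think that when you see programmes in English"),
    ("E", "mm"),
    ("C", "without the subtitles"),
    ("E", "mm"),
    ("C", "it's a good way or listening to music in other language"),
    ("E", "mm"),
    ("C", "it's a good way and and this way I have learned too much"),
]

# Assumed set of back-channel tokens (illustrative, not a Trinity definition).
BACKCHANNELS = {"mm", "erm", "uhu"}


def count_backchannel_turns(turns, speaker="E"):
    """Count turns by `speaker` made up entirely of back-channel tokens."""
    total = 0
    for who, text in turns:
        words = text.lower().split()
        if who == speaker and words and all(w in BACKCHANNELS for w in words):
            total += 1
    return total


print(count_backchannel_turns(turns))
```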

When the corpus was initially discussed it was clear that one of the aims should be to use the findings for our examiner professional development programme.  Using this very small dataset we can develop worksheets which prompt examiners to reflect on their exam techniques using real examples of examiner and candidate interaction.

My research is in its initial stages and the next step is to analyse different strategies and how these validate the exam construct. I’m also interested in examiner strategies at the same transition point at the higher levels, i.e. grade 7 and above, B2, C1 and C2 CEFR. Do the strategies change and if so, how?

It’s been fascinating working with the corpus data and I look forward to doing more in the future.

Birmingham ERP Boot Camp

Last week I attended a 5-day ERP Boot Camp at the University of Birmingham, and this was an incredible opportunity for me to learn from ERP experts and get specific advice for running my next ERP experiments. The workshop was led by two of the most renowned ERP researchers in the world, namely Professor Steven Luck and Dr Emily Kappenman. Luck and Kappenman are both part of the Centre for Mind and Brain at the University of California, Davis, which is one of the world’s leading centres for research into cognitive neuroscience. They are both among the set of researchers who set the publication guidelines and recommendations for conducting EEG research (Keil et al. 2014), and Luck is also the developer of ERPLAB, which is a MATLAB Toolbox designed specifically for ERP data analysis. Moreover, Luck is the author of the authoritative book entitled An Introduction to the Event-Related Potential Technique. Before attending the ERP Boot Camp, most of the knowledge that I had about ERPs came from this book. Therefore, I am extremely grateful that I have had this opportunity to learn from the authorities in the field, especially since Luck and Kappenman bring the ERP Boot Camp to the University of Birmingham just once every three years.

There were two parts to the ERP Boot Camp: 2.5 days of lectures covering the theoretical aspects of ERP research (led by Steven Luck), and 2.5 days of practical workshops which involved demonstrations of the main data acquisition and analysis steps, followed by independent data analysis work using ERPLAB (led by Emily Kappenman). Day 1 of the Boot Camp provided an overview of different experimental paradigms and different ERP components, which are defined as voltage changes that reflect a particular neural or psychological process (e.g. the N400 component reflects the processing of meaning and the P600 component reflects the processing of structure). Most of the electrical activity in the brain that can be detected by scalp electrodes comes from the surface of the cortex but, in the lecture on ERP components, I was amazed to find out that there are some ERP components that actually reflect brain stem activity. These components are known as auditory brainstem responses. I also learnt about how individual differences between participants are typically the result of differences in cortical folding and differences in skull thickness, rather than reflecting any functional differences, and I learnt how ERP components from one domain such as language can be used to illuminate psychological processes in other domains such as memory. From this first day at the Boot Camp, I started to gain a much deeper conceptual understanding of the theoretical basis of ERP research, causing me to think of questions that hadn’t even occurred to me before.

Day 2 of the Boot Camp covered the principles of electricity and magnetism, the practical steps involved in processing an EEG dataset, and the most effective ways of circumventing and minimizing the problems that are inevitably faced by all ERP researchers. On this day I also learnt the importance of taking ERP measurements from difference waves rather than from the raw ERP waveforms. This is invaluable knowledge to have when analysing the data from my next experiments. In addition, I gained some concrete advice on stimulus presentation which I will take into account when editing my stimuli.
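
The idea of a difference wave can be sketched in a few lines of Python. This is only an illustration with made-up voltages, not ERPLAB code: a difference wave is obtained by subtracting the averaged waveform for one condition from that of another, point by point, and the measurement (here, mean amplitude in a time window) is then taken from the result. The condition names, sampling times, and window are all assumptions chosen for the example.

```python
# Made-up averaged ERP waveforms (microvolts) at one electrode,
# sampled every 100 ms from 0 to 600 ms. Values are illustrative only.
times_ms = [0, 100, 200, 300, 400, 500, 600]
unexpected = [0.0, 1.0, 0.5, -2.0, -4.0, -2.5, 0.0]
expected = [0.0, 1.0, 0.5, -0.5, -1.0, -0.5, 0.0]

# Difference wave: subtract one condition from the other at every time point.
difference = [u - e for u, e in zip(unexpected, expected)]


def mean_amplitude(wave, times, start_ms, end_ms):
    """Mean voltage of `wave` over the window [start_ms, end_ms] inclusive."""
    window = [v for v, t in zip(wave, times) if start_ms <= t <= end_ms]
    return sum(window) / len(window)


print(difference)
print(mean_amplitude(difference, times_ms, 300, 500))
```

Activity that is common to both conditions (the first three samples here) cancels out in the subtraction, so the measurement reflects only what differs between them.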

On day 3 of the Boot Camp, we were shown examples of ‘bad’ experimental designs and we were asked to identify the factors that made them problematic. Similarly, we discussed how to identify problematic results just by looking at the waveforms. These were really useful exercises in helping me to critically evaluate ERP studies, which will be useful both when reading published articles and when thinking about my own experimental design.

From the outset of the Boot Camp, we were encouraged to ask questions at any time, and this was particularly useful when it came to the practical sessions as we were able to use our own data and ask specific questions relating to our own experiments. I came prepared with questions that I had wanted to know the answers to for a long time, as well as additional questions that I had thought of throughout the Boot Camp, and I was given clear answers to every one of these questions.

Furthermore, as well as acquiring both theoretical and practical knowledge from the scheduled lectures and workshops, I also gained a lot from talking to the other ERP researchers who were attending the Boot Camp. A large proportion of attendees focused on language as their main research area, while others focused on clinical psychology or other areas of psychology such as memory or perception. I found it really interesting to hear the differences of opinion between those who were primarily linguists and those who were primarily psychologists. For instance, when discussing the word-by-word presentation of sentences in ERP experiments, the psychologists stated that each word should immediately replace the previous word, whereas the linguists concluded that it is best to present a blank white screen between each word. Conversations such as this made it very apparent that many of the aspects of ERP research are not standardised, and so it is up to the researcher to decide what is best for their experiment based on what is known about ERPs and what is conventional in their particular area of research.

Attending this ERP Boot Camp was a fantastic opportunity to learn from some of the best ERP researchers in the world. I now have a much more thorough understanding of the theoretical basis of ERP research, and I have an extensive list of practical suggestions that I can apply to my next experiments. I thoroughly enjoyed every aspect of the workshop and I am very grateful to CASS for funding the trip.

Spoken BNC2014 book announcement

We are excited to announce a forthcoming book which will be published as part of the Routledge Advances in Corpus Linguistics series. “Corpus Approaches to Contemporary British Speech: Sociolinguistic Studies of the Spoken BNC2014” (edited by Vaclav Brezina, Robbie Love and Karin Aijmer) will feature a collection of research which is currently being undertaken by the recipients of the Spoken BNC2014 Early Access data grants.

With exclusive early access to approximately five million words of Spoken BNC2014 data, the book’s contributors will present a range of innovative studies which each analyse the corpus from a sociolinguistic perspective.

The book is anticipated to follow shortly after the public release of the complete Spoken BNC2014 (approximately ten million words) in late 2017. The book agreement with Routledge joins a previously announced special issue of the International Journal of Corpus Linguistics (IJCL), which will feature a range of work by other recipients of the Spoken BNC2014 Early Access data grants.

Participants needed for psycholinguistic experiment!

My PhD research combines methods from corpus linguistics and psychology in order to find out more about how language is processed in the brain. The method that I use from psychology is known as electroencephalography (EEG), and this involves placing electrodes across a participant’s scalp in order to detect some of the electrical activity of the brain. More specifically, I use the event-related potential (ERP) technique, which involves measuring the electrical activity of the brain in response to particular stimuli. When I carried out my pilot study earlier this year, this was the first time the EEG/ERP method had been used in the Department of Linguistics and Language, making it a really exciting project to get involved with.

Having completed my pilot study and obtained some really interesting results, I have refined my methods and hypotheses and I am now ready to recruit participants for my next two experiments. For one experiment which will take place in late August, I am looking for 16 native speakers of Mandarin Chinese; for another experiment which will take place in October, I am looking for 16 native speakers of English. I would really appreciate hearing from anyone who is interested in taking part! The whole procedure takes about 1 hour; it takes about 20-30 minutes for me to attach all of the electrodes, and the experiment itself takes an additional 20-30 minutes.

If you do decide to take part, you will wear a headcap containing 64 plastic electrode holders which the electrodes are clipped into, as well as 6 electrodes around your eyes and 2 electrodes behind your ears. The electrodes make contact with your skin via a conductive gel which enables some of the electrical signals in your brain to propagate to the electrode wires and into the AD-box, where the electrical signal is amplified and converted from analog to digital format. The amplified signals are then transmitted to the USB2 receiver via a fibre-optic cable, before being relayed onto the data acquisition computer where your brainwaves can be viewed as a continuous waveform. Before starting the experiment, I will ask you to blink, clench your teeth, and move your head from left to right so that you can see how these movements affect the observed waveform.

The experiment itself involves reading real language data that has been extracted from the British National Corpus. This consists of sentences which are presented word-by-word on a computer screen. After reading each sentence, you will be asked to respond to a true/false statement based on the sentence that you have just read.

Before conducting my pilot study, I carried out a number of test-runs on other postgraduate students and each one of them found it to be a really interesting experience. For instance, Gillian Smith, another PhD research student in CASS, agreed to take part in one of my test-runs and here she describes her experience as a participant:

“Getting to be involved in Jen’s experiment was a great opportunity! Having never participated in such a study before, I found the whole process (which Jen explained extremely well) very interesting. I particularly enjoyed being able to look at my brainwaves after, which is something I have never experienced. Likewise, having electrodes on my head was a lovely novelty.”

I would really like to hear from any native speakers of Mandarin Chinese or native speakers of English who would be interested in taking part in one of these experiments. Please email j.j.hughes(Replace this parenthesis with the @ sign)lancaster.ac.uk to express interest and to receive more information.

CASS goes to Weihai!

Between the 28th July and the 2nd August, Carmen Dayrell and I represented CASS at the 3rd Sino-UK Summer School of Corpus Linguistics. The summer school was organised by Beijing Foreign Studies University and was hosted at the Weihai campus of the University of Shandong, China. A research symposium followed the summer school on the 3rd August where we presented our research to representatives from both universities. The research symposium gave us a taste of how corpus linguistics is used in a different culture and we heard papers on a range of different topics, such as Alzheimer’s research, work on translations, Chinese medicine, and analyses of media discourse.

Our summer school sessions introduced students to corpus linguistics and gave them an overview of the discipline’s development within a UK context. We also discussed the range of projects ongoing at CASS and foregrounded the interdisciplinary focus of the Centre’s work. After the formal lectures, we ran hands-on sessions demonstrating how to use GraphColl and CQPweb and conducted seminars using material from the Climate Change and Discourses of Distressed Communities projects to test the students’ frequency, keywords, and concordance analysis skills. The students really engaged with the sessions and were particularly taken with GraphColl. They enjoyed doing the practical sessions, which they said were different to how they usually learned. Everyone in the classroom worked really hard and asked great questions that showed how interested they were in Lancaster’s tools.

Weihai is an absolutely beautiful place. The university sits with a sandy beach on one side and a mountain on the other. Because of this, Weihai campus is considered to have good Feng Shui. The place itself was described as a small city by those who live there, but ‘small’ is relative when compared to cities the size of Lancaster. Carmen and I enjoyed our time in China (despite a long journey involving flight cancellations and a trip to a Beijing hotel in the middle of the night) and loved seeing how well the students took to corpus linguistics and the materials that we prepared for them. The trip was a great success and we look forward to future collaborations between Lancaster and Beijing Foreign Studies University.

Dealing with Optical Character Recognition errors in Victorian newspapers

CASS PhD student, Amelia Joulain-Jay, has been researching to what extent OCR errors are a problem when researching historical texts, and whether these errors can be corrected. Amelia’s work has recently been featured in a very interesting blog post on the British Library’s website – you can read the full post here.