ESRC Postdoctoral Fellowship: The psychological validity of non-adjacent collocations

Having recently completed my PhD in CASS, I am really excited to announce that I have been awarded an ESRC Postdoctoral Fellowship for the upcoming academic year.

My research focuses on finding neurophysiological evidence for the existence of collocations, i.e. sequences of two or more words where the words are statistically highly likely to occur together. There are a lot of different types of collocation, and the different types vary along the dimensions of fixedness and compositionality. Idioms, for example, are highly fixed in the sense that one word cannot typically be substituted for another word. They are also non-compositional, which means that the meaning of the expression cannot be derived from knowing the meaning of the component words.

Previous studies investigating the psychological validity of collocation have tended to focus on idioms and other highly fixed expressions. However, this massively limits the generalizability of the findings. In my research, I therefore use a much more fluid conceptualization of collocation, where sequences of words can be considered to be collocational even if they are not fixed, and even if the meaning of the expression is highly transparent. For example, the word pair "clinical trials" is a collocation, despite lacking the properties of fixedness and non-compositionality, because the word "trials" is highly likely to follow the word "clinical". In this way, I focus on the transition probabilities between words; the transition probability of "clinical trials" (as measured in a corpus) is much higher than the transition probability of "clinical devices", even though the latter word pair is completely acceptable in English, both in terms of meaning and grammar.
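As a rough illustration of what a transition probability is, the sketch below estimates the forward transition probability of a word pair as the number of times the pair occurs divided by the number of times the first word occurs. The toy corpus and the function name are invented for this example; this is a minimal sketch of the general idea, not the extraction procedure used in the thesis.

```python
from collections import Counter


def forward_transition_probability(tokens, first, second):
    """Estimate P(second | first) as count(first second) / count(first)."""
    # Count only positions that have a following word, so the final
    # token of the corpus does not inflate the denominator.
    first_word_counts = Counter(tokens[:-1])
    pair_counts = Counter(zip(tokens, tokens[1:]))
    if first_word_counts[first] == 0:
        return 0.0
    return pair_counts[(first, second)] / first_word_counts[first]


# Invented toy corpus (not BNC data): "clinical" is followed by
# "trials" three times and by "devices" once.
toy_corpus = (
    "the clinical trials began . the clinical trials ended . "
    "new clinical devices arrived . early clinical trials failed ."
).split()

p_trials = forward_transition_probability(toy_corpus, "clinical", "trials")
p_devices = forward_transition_probability(toy_corpus, "clinical", "devices")
print(p_trials, p_devices)
```

In this toy corpus both word pairs are attested, but "trials" is the far more probable continuation of "clinical", which is the kind of contrast described above.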

In my research, I extract collocational word pairs such as "clinical trials" from the written BNC1994. I then construct matched non-collocational word pairs such as "clinical devices", embed the two sets of word pairs into corpus-derived sentences, and ask participants to read these sentences on a computer screen while electrodes attached to their scalp detect some of their brain activity. This method of recording the electrical activity of the brain using scalp electrodes is known as electroencephalography, or EEG. More specifically, I use the event-related potential (ERP) technique of analysing brainwave data, where the brain activity is measured in response to a particular stimulus (in this case, collocational and non-collocational word pairs).
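To give a flavour of how an ERP is typically obtained from a continuous recording, the sketch below cuts out a short segment (an "epoch") around each stimulus onset, subtracts the mean of the pre-stimulus baseline, and averages the epochs. The function name, the single-channel signal and the sample values are all invented for illustration; real analyses use dedicated EEG software and many additional steps (filtering, artefact rejection), so this is only a simplified sketch of the averaging idea.

```python
def average_erp(signal, onsets, pre, post):
    """Average baseline-corrected epochs time-locked to stimulus onsets.

    signal: list of samples from one channel
    onsets: sample indices at which a stimulus appeared
    pre, post: number of samples to keep before and after each onset
    """
    epochs = []
    for onset in onsets:
        start, stop = onset - pre, onset + post
        if start < 0 or stop > len(signal):
            continue  # skip epochs that run off the edge of the recording
        epoch = signal[start:stop]
        # Baseline correction: subtract the mean of the pre-stimulus samples.
        baseline = sum(epoch[:pre]) / pre
        epochs.append([sample - baseline for sample in epoch])
    if not epochs:
        return []
    # Point-by-point average across all retained epochs.
    return [sum(epoch[k] for epoch in epochs) / len(epochs)
            for k in range(pre + post)]


# Invented single-channel recording with two stimulus onsets.
toy_signal = [1, 1, 3, 5, 1, 1, 2, 2, 4, 6, 2, 2]
erp = average_erp(toy_signal, onsets=[2, 8], pre=2, post=2)
print(erp)
```

Averaging over many trials is what allows the stimulus-locked response to emerge from activity that is unrelated to the stimulus.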

My PhD consisted of four ERP experiments. In the first two experiments, I investigated whether or not collocations and non-collocations are processed differently (at the neural level) by native speakers of English. In the third experiment, I did the same but with non-native speakers of English. Having found that there are indeed neurophysiological differences in the way that collocations and non-collocations are processed by both native and non-native speakers, I then conducted a fourth experiment to investigate which measures of collocation strength most closely correlate with the brain response. The results of this experiment have really important implications for the field of corpus linguistics, as I found that the two most widely used measures of collocation strength (namely log-likelihood and mutual information) are actually the two that seem to have the least psychological validity.
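For readers unfamiliar with these two measures, the sketch below computes them from a 2x2 contingency table of observed frequencies, using the common textbook formulations: mutual information as the base-2 logarithm of observed over expected co-occurrence frequency, and log-likelihood as twice the sum of observed times the natural log of observed over expected across the four cells. The function names and frequencies are invented, and the exact variants used in the thesis are not stated here, so treat this as an assumption-laden illustration.

```python
import math


def contingency_table(pair_freq, first_freq, second_freq, total):
    """Observed 2x2 table for a word pair, given its marginal frequencies."""
    return [
        [pair_freq, first_freq - pair_freq],
        [second_freq - pair_freq,
         total - first_freq - second_freq + pair_freq],
    ]


def expected_table(observed):
    """Frequencies expected if the two words were independent."""
    rows = [sum(row) for row in observed]
    cols = [observed[0][j] + observed[1][j] for j in range(2)]
    n = sum(rows)
    return [[rows[i] * cols[j] / n for j in range(2)] for i in range(2)]


def mutual_information(observed):
    """log2(observed / expected) for the pair cell; needs pair_freq > 0."""
    expected = expected_table(observed)
    return math.log2(observed[0][0] / expected[0][0])


def log_likelihood(observed):
    """2 * sum(O * ln(O / E)) over all four cells, skipping empty cells."""
    expected = expected_table(observed)
    total = 0.0
    for i in range(2):
        for j in range(2):
            if observed[i][j] > 0:
                total += observed[i][j] * math.log(
                    observed[i][j] / expected[i][j])
    return 2 * total


# Invented frequencies: the pair occurs 30 times, the first word 100 times,
# the second word 60 times, in a corpus of 10,000 word pairs.
table = contingency_table(30, 100, 60, 10_000)
mi_score = mutual_information(table)
ll_score = log_likelihood(table)
print(round(mi_score, 3), round(ll_score, 3))
```

Both scores compare what is observed with what independence would predict, but they weight frequency differently, which is one reason why different measures can rank the same word pairs differently.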

The ESRC Postdoctoral Fellowship is unique in that, although it allows for the completion of additional research, the main focus is actually on disseminating the results of the PhD. Thus, during my year as an ESRC Postdoctoral Fellow, I intend to publish the results of my PhD research in high-impact journals in the fields of corpus linguistics and cognitive neuroscience. I will also present my findings at conferences in both of these fields, and I will attend training workshops in other neuroscientific methods.

The additional research that I intend to do during the term of the Fellowship will build upon my PhD work by using the ERP technique to investigate whether or not the neurophysiological difference in the processing of collocations vs. non-collocations is still apparent when the (non-)collocations contain intervening words. For instance, I want to find out whether or not the collocation "take seriously" is still recognized as such by the brain when there is one intervening word (e.g. "take something seriously") or two intervening words (e.g. "take the matter seriously"), and so on.
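One simple way to see how often a word pair occurs with intervening words in a corpus is to count its occurrences separately for each gap size. The sketch below does this for a toy example; the function name and sentences are invented, and for simplicity it ignores sentence boundaries, which a real study would need to respect.

```python
from collections import Counter


def gap_counts(tokens, first, second, max_gap=2):
    """Count occurrences of `first ... second` by number of intervening words."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != first:
            continue
        for gap in range(max_gap + 1):
            j = i + gap + 1  # position of the candidate second word
            if j < len(tokens) and tokens[j] == second:
                counts[gap] += 1
    return counts


# Invented toy text with the pair at gaps of zero, one and two words.
toy_tokens = (
    "they take seriously every claim . we take it seriously . "
    "you take the matter seriously"
).split()

counts = gap_counts(toy_tokens, "take", "seriously")
print(dict(counts))
```

Counts like these could then be turned into gap-specific association scores, although how best to do that is an open design choice rather than something settled here.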

Investigating the processing of these non-adjacent collocations is important for the development of linguistic theory. My PhD thesis focused on word pairs rather than longer sequences of words in order to reduce the number of factors that might influence how the word sequences were processed, which made it feasible to conduct controlled experiments. However, this is actually a very narrow way of conceptualizing the notion of collocation; in practice, words are considered to form collocations when they occur in one another’s vicinity even if there are several intervening words, and even if the words do not always occur in the same order. I will therefore use the results of this additional research to inform the design of research questions and methods for future work engaging with yet more varied types of collocational pattern. This will have important implications for our understanding of how language works in the mind.

I would like to conclude by expressing my gratitude to the ESRC for providing funding for this Fellowship. I am very grateful to be given this opportunity to disseminate the results of my PhD thesis, and I am very excited to carry out further research on the psychological validity of collocation.