Discourse, Gender and Sexuality South-South Dialogues Conference

Last week was spent at Witwatersrand (Wits) University in Johannesburg, where I had been invited to give a workshop on corpus methods, as well as a talk on some of my own research. The week was topped off by the first Discourse, Gender and Sexuality South-South Dialogues Conference, which was organised by Tommaso Milani. Many of the papers at the conference used qualitative methods (analyses of visual data seemed particularly popular) but there were a few papers, including my own, which used corpus methods.

These included a paper by Megan Edwards who combined a corpus approach with CDA and visual analysis to examine a small corpus of pamphlets found around Johannesburg – these pamphlets advertise remedies for sexual and relationship problems and Megan demonstrated that embedded within the adverts were gendered discourses – relating to notions of ideal masculinity and femininity. This is probably one of the few corpora in existence where the top lexical word is penis.

Another interesting paper was by Sally Hunt who examined corpora of articles about sex work in two South African newspapers, focussing on the period when SA hosted the World Cup. She found that while there was a more balanced set of representations of sex workers than expected, they were still largely represented as immoral and criminalised for their actions while the agency of their clients was largely obscured. Sally is a lecturer at Rhodes University, Grahamstown, and has recently completed the construction of a 1 million word South African corpus, using the Brown family sampling frame.

During the workshop that I hosted at the university I got participants to use AntConc to examine a small corpus of recent newspaper articles about feminists, and a number of interesting patterns emerged from the analyses of concordances and collocates. For example, a representation of feminists as war-mongers or as vocally annoying or fierce (e.g. shrill, strident) was very prevalent and perhaps expected, although we were surprised to see a sub-set of words which related feminists to Islam, like feminist Taleban and feminist fatwas (killing two ideological birds with one stone). It was also interesting to see that these negative discourses shouldn’t always be taken at face value: they were sometimes quoted in order to be criticised, although it was often only with expanded concordance lines that this could be seen. In all, a productive week, and it was good to meet so many people who were interested in finding out more about corpus linguistics.

Keynote at the House of Lords

On 17th October 2013 I spent the afternoon at the House of Lords, giving a keynote for the British Federation of Women Graduates (BFWG). Founded in 1907, BFWG has been providing scholarships for women in their final year of degree study since 1912, and it regularly makes awards from its charity to women graduates undertaking postgraduate study and research. BFWG is committed to promoting women’s opportunities in education and public life; fostering local, national, and international friendships; and improving the lives of women and girls worldwide. As such, it was a great honour to be asked by this wonderful organisation to give a keynote at their annual House of Lords seminar, sponsored by Baroness Randerson of Roath Park. Each year the seminar has a theme, and this year’s was, “A woman’s right to know”. The three invited speakers were:

Dr Shuruq Naguib (Lancaster University): “Muslim women: Gender and religious authority”. This talk discussed how women are represented in the Qur’an and in Islamic thought throughout history.

Sian West (University of Kent): “Restorative justice: Does it work?” This talk considered the benefits of restorative justice and the role of women as victims or perpetrators in the social context in which they find themselves.

Dr Claire Hardaker (Lancaster University): “Meaning and meanness: Disconnecting the online threat from the offline reality”. In this talk, I covered four major areas: (1) What does the term trolling mean? (2) What motivations seem to prompt individuals to troll? (3) How is trolling carried out? And (4) How do those who troll “rationalise” their behaviour? (The slides for this talk can be accessed here.)

My many thanks to BFWG President Jenny Morley, to Vice-President Gabrielle Suff, to The Baroness Randerson, and to all the guests and attendees who made my visit especially warm, friendly, and hospitable. (Pictures of the seminar and lunch can be found here.)

Is translated Chinese still Chinese?

Is translated Chinese still Chinese? Do translated English and translated Chinese have anything in common? Can the properties observed on the basis of translational English in contrast to comparable non-translated English be generalised to other translational languages? These interesting questions were explored in Dr Richard Xiao’s keynote lecture entitled “Translation universal hypotheses reevaluated from the Chinese perspective”, delivered at the joint meeting of the 11th International Congress of the Brazilian Association of Researchers in Translation (ABRAPT) and the 5th International Congress of Translators, held at the Federal University of Santa Catarina (UFSC), Florianópolis on 23-26 September 2013.

Corpus-based Translation Studies focuses on translation as a product by comparing comparable corpora of translated and non-translated texts. A number of distinctive features of translations have been posited including, for example, explicitation, simplification, normalisation, levelling out (convergence), source language interference, and under-representation of target language unique items.

Nevertheless, research of this area has until recently been confined largely to translational English and closely related European languages. If the features of translational language that have been reported on the basis of these languages are to be generalised as “translation universals”, the language pairs involved must not be restricted to English and closely related European languages. Clearly, evidence from a genetically distant language pair such as English and Chinese is arguably more convincing, if not indispensable.

Richard’s presentation reevaluated the largely English-based translation universal hypotheses from the perspective of translational Chinese, on the basis of a systematic empirical study of the lexical and grammatical properties of translational Chinese represented in a one-million-word balanced corpus of translated texts in contrast with a comparable corpus of native Chinese texts.

The conference was organised by the Brazilian Association of Researchers in Translation (ABRAPT). During the conference, Richard also gave a talk about corpus-based translation studies at the Roundtable “Translation and Interdisciplinarity”.

Politeness and impoliteness in digital communication: Corpus-related explorations

Post-event review of the one-day workshop at Lancaster University

Topics don’t come much hotter than the forms of impoliteness or aggression that are associated with digital communication – flaming, trolling, cyberbullying, and so on. Yet academia has done surprisingly little to pull together experts in social interaction (especially (im)politeness) and experts in the new media, let alone experts in corpus-related work. That is, until last Friday, when the Corpus Approaches to Social Science Centre (@CorpusSocialSci) invited fifteen such people from diverse backgrounds (from law to psychology) to gather for an intense one-day workshop.

The scope of the workshop was broad. One cannot very well study impoliteness without considering politeness, since merely failing to be polite in a particular context could be taken as impoliteness. Similarly, the range of digital communication types – email, blogs, texts, tweets and so on – presents a varied terrain to navigate. And then there are plenty of corpus-related approaches and notions, including collocation, keywords, word sketches, etc.

Andrew Kehoe (@ayjaykay), Ursula Lutzky (@UrsulaLutzky) and Matt Gee (@mattbgee) kicked off the day with a talk on swearwords and swearing, based on their 628-million-word Birmingham Blog Corpus. Amongst other things, they showed how internet swearword/profanity filters would work rather better if they incorporated notions like collocation. For example, knowing the words that typically accompany items like balls and tart can help disambiguate neutral usages (e.g. “tennis balls”, “lemon tart”) from less salubrious usages! (See more research from Andrew here, from Ursula here, and from Matt here.)
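To make the collocation idea concrete, here is a minimal sketch in Python. It is not the Birmingham team's method: the function names, the window size and the list of "benign" collocates are all hypothetical, chosen purely for illustration. It counts the words that occur near a node word, then treats an occurrence as probably neutral if one of its neighbours is a known benign collocate.

```python
from collections import Counter

def collocates(tokens, node, window=2):
    """Count the words occurring within `window` tokens of each `node` hit."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts

def looks_neutral(tokens, index, benign, window=2):
    """Return True if any neighbour of tokens[index] is a benign collocate.

    `benign` is a hypothetical, hand-made set of collocates (an assumption
    for this sketch); a real filter would derive it from corpus evidence.
    """
    lo = max(0, index - window)
    hi = min(len(tokens), index + window + 1)
    context = tokens[lo:index] + tokens[index + 1:hi]
    return any(word in benign for word in context)
```

For instance, with `benign = {"tennis"}`, the occurrence of balls in "she bought tennis balls" would be passed as neutral, whereas the one in "what a load of balls" would not. A real filter would need far more evidence than a single neighbouring word.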

With Ruth Page’s (@ruthtweetpage) presentation came a switch from blogs to Twitter. Using corpus-related techniques, Ruth revealed the characteristics of corporate tweets. Given that the word sorry turns out to be the seventh most characteristic word, or keyword, for corporate tweets, it was not surprising that Ruth focused on apologies. She revealed that corporate tweets tend to avoid stating a problem or giving an explanation (thus avoiding damage to their reputation), but are accompanied by offers of repair and attempts to build – at least superficially – rapport. (See more research from Ruth here.)

Last of the morning was Caroline Tagg’s (@carotagg) presentation, and with this came another shift in medium, from Twitter to text messages. Focusing on convention and creativity, Caroline pointed out that, contrary to popular opinion, heavily abbreviated messages are not in fact the norm, and that when abbreviations do occur, they are often driven by communicative needs, e.g. using creativity to foster interest and engagement. Surveying the functions of texts, Caroline established that maintenance of friendship is key. And corpus-related techniques revealed the supporting evidence: politeness formulae were particularly frequent, including the salutation have a good one, the hedge a bit for the invitation, and for further contact, give us a bell. (See more research from Caroline here.)

With participants refuelled by lunch, Claire Hardaker (@DrClaireH) and I presented a smorgasbord of relevant issues. As an opening shot, we displayed frequencies showing that the stereotypical emblems of British politeness, words such as please, thank you, sorry, excuse me, can you X, tend not to be frequent in any digital media variety, relative to spoken conversation (as represented in the British National Corpus). Perhaps this accounts for why at least some sectors of the British public find digital media barren of politeness. This is not to say that politeness does not take place, but it seems to take place through different means – consider the list of politeness items derived by Caroline above. And there was an exception: sorry was the only item that occurred with greater frequency in some digital media. This, of course, nicely ties in with Ruth’s focus on apologies. The bulk of my and Claire’s presentation revolved around using corpus techniques to help establish: (1) definitions (e.g. what is trolling?), (2) strategies and formulae (e.g. what is the linguistic substance of trolling?) and (3) evaluations (e.g. what or who is considered rude?). Importantly, we showed that corpus-related approaches are not just lists of numbers, but can integrate qualitative analyses. (See more research from me here, and from Claire here.)

With presentation fatigue encroaching, the group decamped to a computer lab. Paul Rayson (@perayson) introduced some corpus tools, notably WMatrix, of which he is the architect. Amanda Potts (@watchedpotts) then put everybody through their paces – gently of course! – giving everybody the opportunity to gain valuable hands-on experience.

Back in our discussion room and refreshed by various caffeinated beverages, we spent an hour reflecting on a range of issues. The conversation moved towards corpora that include annotations (interpretative information). Such annotations could be a way of helping to analyse images, context, etc., creating an incredibly rich dataset that could only be interrogated by computer (see here, for instance). I noted that this end of corpus work was not far removed from using Atlas or Nudist. Snapchat came up in discussion, not only because it involves images (though they can include text), but also because it raises issues of data accessibility (how do you get hold of a record of this communication, if one of its essential features is that it dissolves within a narrow timeframe?). The thorny problem of ethics was discussed (e.g. data being used in ways that were not signaled when original user agreements were completed).

Though exhausting, it was a hugely rewarding and enjoyable day. Often those rewards came in the form of vibrant contributions from each and every participant. Darren Reed, for example, pointed out that sometimes what we were dealing with is neither digital text nor digital image, but a digital act. Retweeting somebody, for example, could be taken as a “tweet act” with politeness implications.

Notes from the 3rd annual Boya Forum 2013 Undergraduate Conference

If, six months ago, you had told me that an assignment I was writing during my undergraduate degree would eventually send me to China for the weekend, I wouldn’t have believed you. However, that is exactly what I found myself doing last weekend, when I travelled to Beijing Foreign Studies University to present at the 3rd annual Boya Forum 2013 undergraduate conference. I was one of two students from Lancaster University sent there to present at the event, which aimed to celebrate the undergraduate research abilities of students in the areas of English literature, translation studies, media and communication studies, cultural studies, international and area studies and, most relevant to my work, language studies. The participants represented a total of 27 universities, and coming from Lancaster I was from one of only three universities from outside of China; the others being Columbia University in New York and Rollins College in Florida.

The conference ran four concurrent panels of talks at any given time, meaning that in just one day we produced a total of 70 individual presentations. It was an intense day of talks and discussions that ran from the early morning right through into the evening, and my talk was right at the end of the day so I knew I would have a job of trying to keep my audience’s attention. I presented a corpus-based critical discourse analysis of a Parliament debate about the Marriage (Same Sex Couples) Bill, which seems to have been my party trick over the summer (I gave a poster of this at the Corpus Linguistics 2013 conference in July and presented about it at a PhD course in Copenhagen in August). Afterwards I was posed some really interesting questions about my work from both the professor who acted as “commentator” for the session and from other students in attendance. It was a great opportunity to reflect on my work and think about what I might do differently the next time I do a similar piece of analysis. It was also really great to see four or five other presentations from Chinese students who had used corpus-based techniques in their research, and to be able to discuss how our approaches differ.

At the end of the day there was a closing ceremony where the professors from BFSU awarded prizes for the best presentations of the conference, based on the ratings of the commentators from each panel. I was very happy to be one of nine recipients of a “First Prize for Best Presentation” award and an official BFSU jacket to match. I wore it proudly on the journey back to Lancaster.

The organisers of the Boya Forum 2013 undergraduate conference should be proud of what they are doing. As a recently graduated BA student I completely agree that the research potential of undergraduate students, particularly in arts and social science-based disciplines, should be valued and celebrated more. Events like this are a brilliant way of showing undergraduate students that their work is valued beyond the difference between a first and a 2:1. This was the first year of the conference’s short history that students from outside of China had contributed to the event, and it was great to hear that the organisers hope to invite an even wider international presence next year. Though, unfortunately, I will no longer qualify to present next time, I look forward to hearing about more undergraduate students from Lancaster and elsewhere travelling to Beijing to present at Boya Forum 2014. It certainly was a fantastic experience, and I am extremely grateful to CASS and BFSU for jointly funding my visit.

ESRC Summer School in Corpus Approaches to Social Science 2013: feedback

In the week of 16th – 19th July 2013, CASS organised the first Summer School for PhD students and post-doctoral researchers in social science disciplines with an interest in the methods of corpus linguistics. Twenty participants from 15 different Higher Education institutions from the UK and overseas (Israel, Brazil, Poland, Czech Republic, Italy) attended the event. The following is a summary of the feedback we received from the participants at the end of the event (this summary is based on 16 returned surveys).

  • All participants agreed that the quality of the Summer School sessions was high.
  • All participants agreed that the Summer School had a friendly atmosphere.
  • All but one participant said that they were confident to apply corpus methods in their own work after having attended the Summer School.

In particular, the participants appreciated the practical, hands-on approach (including lab sessions), engaging lectures, and the fact that the Summer school was free of charge.

Did you miss this year’s summer school? Check back regularly for information on dates for next year, as well as information on how to apply.

Challenging Homophobia and Homophobic Bullying through Children’s Literature: a Parliamentary event

On July 16th 2013 I hosted an event supported by ESRC/CASS and the Lancaster University FASS-Enterprise Centre on Challenging Homophobia and Homophobic Bullying through Children’s Literature.

The event aimed to start a conversation about the use of children’s literature as a resource for effectively challenging homophobia and homophobic bullying and included attendees ranging from MPs and charity spokespersons to prominent academics and educational practitioners to children’s publishers and literature retailers. All who attended were experienced in issues of homophobia and homophobic bullying or with issues relating to inclusive children’s literature.

The 2-hour event, which took around 6 months of organisation to bring together, included 6 presentations and a roundtable discussion, and turned out to be a success as an opportunity for both knowledge exchange and networking.

The presentations were structured into 2 sessions. The first session focussed on issues of homophobia and homophobic bullying. The second session focussed on issues of using children’s literature as a means for addressing issues of inclusion.

Official launch of the ESRC Centre for Corpus Approaches to Social Science

The official opening of the £4.1 million ESRC Centre for Corpus Approaches to Social Science (CASS) took place on Tuesday, 23 July 2013, at the start of the seventh international Corpus Linguistics 2013 conference attended by more than 300 delegates. Delegates representing dozens of universities around the world convened with civil servants to honour the past, promote the present, and celebrate the future of corpus methods in the social sciences.

Former Home Secretary Charles Clarke was among several special guests at the launch event including representatives from the Ministry of Justice, the Home Office and the Environment Agency. Mr. Clarke said a few words to the audience of scholars and other users of research, stressing the importance of investigating language in the context of society, as well as continuing to foster and nurture interdisciplinary collaborative links in social science research.

With such a large and influential crowd gathered, we took the opportunity to showcase to a wide network of people a variety of new and exciting research applying corpus methods to the social sciences. A range of researchers from Lancaster and much further afield were invited to give poster presentations highlighting their current work, which offers contributions ranging from methodological advances to increased social understanding and greater emphasis on interdisciplinarity in academia. Poster presenters included Mike Scott, Alan Partington, Ute Römer, Kevin Harvey, Elena Semino, Veronika Koller, Ramesh Krishnamurthy, Alison Sealey, Andrew Salway, Paul Rayson, Steve Young, Jonathan Culpeper, Paul Baker, Rachelle Vessey, Charlotte Taylor, Anna Marchi, Catherine Chorley, Costas Gabrielatos, and Robbie Love. The posters proved great fodder for stimulating conversation about the future potential of corpus linguistics and corpus approaches to social science.

Further explorations in ‘the Muslim world’

Doing a ten minute presentation is pretty tough – you have to be equally ruthless about what you leave out and what you include. But the benefits are potentially great – if you can present an idea well in ten minutes you are pretty sure that you will have your viewer’s attention. As anybody who has lectured knows, with longer talks, no matter how strong your delivery, attention starts to wander for some in the audience as the talk progresses! So when I had the opportunity to do a talk of 10-18 minutes for Lancaster TEDx, I immediately went for the option of 10 minutes. It was a nice challenge for me and I thought that the brevity of the talk would help me to get my message across. So I beavered away for a few weeks putting things in and taking things out, thinking about key messages and marshalling my data: if my TEDx talk looks spontaneous …. it was not. In fact I imagine few of them really are, in spite of them being presented in such a way as to make it appear that they are. A lot of work goes into them – and that is just from the speakers. The crew who organized and filmed the event at Lancaster worked amazingly hard as well.

So was it worth it? Well, I have had many kind notes since I did the talk thanking me for it. I have also had a fair number of views of my talk on-line and many, many more likes than dislikes. So for me the answer is an emphatic ‘yes’, it was worth it. Many thanks to all who have viewed and publicised my talk.

Reading the comments has been an interesting experience – many are appreciative. Yet some simply show that part of the argument was ignored or not picked up by the watcher – so a watcher asks if religious identity is important to athletic performance in response to a point I make about the failure of the UK press to report on Mo Farah’s Muslim identity. I thought I had made it clear that this identity is one Farah himself says is central to his athletic achievements and hence, yes, it is relevant; it seems that perhaps my optimism that a ten minute talk would deal with attention span issues was misplaced! For some of these mistaken queries other commenters set the record straight, which is kind of them.

Of slightly more interest are some of the questions that get thrown up – I will consider three here. Firstly: what about the term the West? I was glad this was picked up by a viewer as we discuss that in the book that my talk is based upon (Baker, Gabrielatos and McEnery, 2012:131-132). As a self-referential term it does have a role to play in setting up the ‘us’ that is opposed to the ‘them’ of the Muslim world. Another viewer asks whether Muslim world is just a neutral term used to define a culturally homogeneous region. This is a dangerous argument. It takes us to the precipice of the very ‘us and them’ distinction I was discussing. It is dangerous precisely because it is simplistic in nature, as it implies an homogeneous and distinct other (there are non-Muslims who live in the so-called Muslim world, for example – the area referred to is not homogeneous in oh so many ways). It also misses the point – if this was a simply neutral referring expression perhaps the ‘us and them’ distinction would not be so powerful. The problem is that it is a very powerful term for generating an ‘us and them’ distinction because it sets Muslims in opposition to non-Muslims in the language and, as noted, it homogenizes Muslims – they are all the same, and the reporting of the views of the Muslim world entrenches this monolithic view also (see Baker, Gabrielatos and McEnery, 2012:130). Finally, the same viewer wonders why I did not talk about the change of meaning of words over time. The answer to that one is easy – sadly, as shown in the later part of the talk, the attitudes I was talking about have not changed over time, even though I would have been happy to say that they had if this was true. The viewer also uses the word ‘gay’ as an interesting example of change in meaning over time – well, that would have been another talk to give.
A lot of nonsense is spoken about this word – it is usually presented as a word that had a simple, innocent meaning until another, less innocent meaning came along and spoilt it, a view hilariously lampooned by Stephen Fry and Hugh Laurie in this sketch:

However, this is not true – gay had far from innocent meanings in the past – a quick perusal of Jonathan Green’s excellent Chambers Slang Dictionary shows that. So yes, a discussion of word meaning change over time would have been interesting and debunking a few myths about the word gay would have been fun too – but that was not what my talk was about, so I shall leave the matter there. Maybe for a future TEDx? Who knows.

So – ten minute talks have their pluses and minuses. They are great for getting your message out and, by and large, I am happy with how my talk went. I found the experience of giving a TEDx talk a very positive one and many other people clearly enjoyed it also.  Best of all, it has made people think about and discuss their use of language, and that is something which always pleases me!

Watch my full TEDxLancasterU talk here:

Two approaches to keywords

On July 4th, 2013, I gave a presentation on keywords at a meeting of the Keywords Project at Jesus College, Cambridge University. The Keywords Project uses Raymond Williams’ concept of keywords as being socially prominent words (e.g. art, industry, media or society) that are capable of bearing interlocking, yet sometimes contradictory contemporary meanings, and the group meets a couple of times each year to discuss new keywords that have emerged in society. The group carry out analysis using a variety of different methods, involving deriving etymologies from the Oxford English Dictionary, making use of Google n-grams, referring to academic research on particular concepts and investigating corpora.

I was invited to give an alternative (or rather, complementary) perspective that was more focussed on corpus linguistics. I discussed how the concept of keywords differs greatly in CL, and how keyness can be extended to include tagged words, semantic or grammatical groups of words, multi-word units or even punctuation marks. Using various reference corpora, I showed how keyness techniques could be used to aid the identification of potential emerging keywords, while concordancing and collocational analysis could help to identify the range of meanings around a word at a given point in time.
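For readers unfamiliar with keyness in the corpus linguistic sense, the following is a minimal sketch in Python of one widely used way of calculating it: the log-likelihood statistic, comparing a word's frequency in a study corpus with its frequency in a reference corpus. It is an illustration only, not necessarily the calculation used in the presentation; the function names are hypothetical, and both corpora are assumed to be non-empty lists of tokens.

```python
import math
from collections import Counter

def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood keyness score for one word across two corpora."""
    total = freq_study + freq_ref
    # Expected frequencies if the word were spread evenly over both corpora
    expected_study = size_study * total / (size_study + size_ref)
    expected_ref = size_ref * total / (size_study + size_ref)
    ll = 0.0
    if freq_study > 0:
        ll += freq_study * math.log(freq_study / expected_study)
    if freq_ref > 0:
        ll += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * ll

def keywords(study_tokens, ref_tokens, top_n=10):
    """Rank words that are relatively more frequent in the study corpus."""
    study, ref = Counter(study_tokens), Counter(ref_tokens)
    n_study, n_ref = len(study_tokens), len(ref_tokens)
    scored = []
    for word, freq in study.items():
        # Keep only "positive" keywords: relatively over-used in the study corpus
        if freq / n_study > ref[word] / n_ref:
            scored.append((word, log_likelihood(freq, n_study, ref[word], n_ref)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
```

The same calculation applies unchanged if the "tokens" are part-of-speech tags, semantic tags, multi-word units or punctuation marks, which is how keyness extends beyond single words.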

For more information, see http://keywords.pitt.edu/index.html