-
Log Ratio – an informal introduction
The latest version of CQPweb (v 3.1.7) introduces a new statistic for keywords, collocations and lockwords, called Log Ratio. “Log Ratio” is actually my own made-up abbreviated title for something which is more precisely defined as either the binary log of the ratio of relative frequencies or the binary log of the relative…
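As a rough illustration of the first definition given above (the binary log of the ratio of relative frequencies), here is a minimal Python sketch. The function name `log_ratio`, its parameters, and the example counts are hypothetical and are not taken from CQPweb. Rejecting zero frequencies is a simplifying assumption of this sketch, not a description of how CQPweb handles them.

```python
import math


def log_ratio(freq_a, size_a, freq_b, size_b):
    """Binary log of the ratio of two relative frequencies.

    freq_a, freq_b: raw counts of the item in corpus A and corpus B.
    size_a, size_b: total token counts of corpus A and corpus B.
    """
    if size_a <= 0 or size_b <= 0:
        raise ValueError("corpus sizes must be positive")
    # Assumption of this sketch: zero counts are rejected rather than smoothed,
    # because the ratio (and its logarithm) is undefined for them.
    if freq_a <= 0 or freq_b <= 0:
        raise ValueError("frequencies must be positive in this sketch")

    rel_a = freq_a / size_a  # relative frequency in corpus A
    rel_b = freq_b / size_b  # relative frequency in corpus B
    return math.log2(rel_a / rel_b)


# Invented counts: 400 hits vs 100 hits in two corpora of one million tokens each.
example = log_ratio(400, 1_000_000, 100, 1_000_000)
print(round(example, 3))
```

With these invented counts the item is four times as frequent in corpus A as in corpus B, so the result is about 2: each additional point of Log Ratio corresponds to a doubling of the relative frequency.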
-
Using version control software for corpus construction
There are two problems that often come up in collaborative efforts towards corpus construction. First, how do two or more people pool their efforts simultaneously on this kind of work – sharing the data as it develops without working at cross-purposes, repeating effort, or ending up with incompatible versions of the corpus? Second, how do…
-
A new version of EEBO on CQPweb
The version of the EEBO-TCP data that has been available on Lancaster University’s CQPweb server is now rather old (the TCP project adds text to the collection on a rolling basis), and, more importantly, does not contain any annotations. Recently I have devoted some time to running a newer version through UCREL’s standard annotation tools and then mounting the resulting dataset…
-
Visiting With The Brown Family
In 2011 I gave a plenary talk on how American English is changing over time (contrasting it with British English), using the Brown Family of corpora. Each member of the Brown family consists of a corpus of 1 million words of written, published, standard English, divided into 500 files of about 2000 words each.…