Event Information:

  • Wed

    Readability analysis as an exploration of linguistic complexity

    2.00-3.00pm, Lancaster University, Management School LT7

    Lecture by Professor Detmar Meurers

    The analysis of readability has traditionally relied on surface properties of language, such as average sentence and word lengths and specific word lists. At the same time, there is a long tradition of analyzing the Complexity, Accuracy, and Fluency (CAF) of language produced by language learners in second language acquisition (SLA) research. Reusing SLA measures of learner language complexity to analyze readability, Sowmya Vajjala and I explored which aspects of linguistic modeling can successfully be employed to predict the readability of a native language text. Using various machine learning setups and corpora, we show that a broad range of linguistic properties are highly indicative of the readability of documents, from graded readers to web pages and TV programs targeting different age groups. The readability model using our full linguistic feature set is currently the best non-commercial readability model available for English (and second overall, with the commercial ETS model coming in first), based on performance on the Common Core State Standard data set.

    The fact that we found readability to be reflected in a wide range of linguistic aspects also has consequences for text simplification, where we are interested in identifying for which sentences which kind of simplification would be worthwhile. To support such research, we show that our text readability models can meaningfully be applied to individual sentences.

    The talk will try to trace the ideas sketched above based on the joint paper with Sowmya Vajjala listed below, which are downloadable from

    In case there is something you'd be particularly interested in, just send me an email so I can try to give it more time.


    Event website:


    Who can attend: Anyone