Wed 28 May 2014, 2.00-3.00pm, Lancaster University, Management School LT7
Readability analysis as an exploration of linguistic complexity
Lecture by Professor Detmar Meurers
The analysis of readability has traditionally relied on surface properties of language, such as average sentence and word lengths and specific word lists. At the same time, there is a long tradition of analyzing the Complexity, Accuracy, and Fluency (CAF) of language produced by language learners in second language acquisition (SLA) research. Reusing SLA measures of learner language complexity to analyze readability, Sowmya Vajjala and I explored which aspects of linguistic modeling can successfully be employed to predict the readability of a native language text. Using various machine learning setups and corpora, we show that a broad range of linguistic properties are highly indicative of the readability of documents, from graded readers to web pages and TV programs targeting different age groups. The readability model using our full linguistic feature set is currently the best non-commercial readability model available for English (and second overall, with the commercial ETS model coming in first), based on performance on the Common Core State Standard data set.
The fact that we found readability to be reflected in a wide range of linguistic aspects also has consequences for text simplification, where we are interested in identifying which kind of simplification would be worthwhile for which sentences. To support such research, we show that our text readability models can meaningfully be applied to individual sentences.
The talk will try to trace the ideas sketched above based on the joint papers with Sowmya Vajjala listed below, which are downloadable from http://purl.org/dm/papers. If there is something you would be particularly interested in, just send me an email so I can try to give it more time.
- Sowmya Vajjala and Detmar Meurers (to appear) "Readability Assessment for Text Simplification: From Analyzing Documents to Identifying Sentential Simplifications". International Journal of Applied Linguistics, Special Issue on Current Research in Readability and Text Simplification edited by Thomas François & Delphine Bernhard.
- Sowmya Vajjala and Detmar Meurers (2014) "Assessing the relative reading level of sentence pairs for text simplification". Proceedings of EACL. Gothenburg, Sweden.
- Sowmya Vajjala and Detmar Meurers (2014) "Exploring Measures of 'Readability' for Spoken Language: Analyzing linguistic features of subtitles to identify age-specific TV programs". Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR), EACL. Gothenburg, Sweden.
- Sowmya Vajjala and Detmar Meurers (2013) "On The Applicability of Readability Models to Web Texts". Proceedings of the Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR), ACL. Sofia, Bulgaria.
- Julia Hancke, Sowmya Vajjala and Detmar Meurers (2012) "Readability Classification for German using lexical, syntactic, and morphological features". Proceedings of COLING. Mumbai, India.
- Sowmya Vajjala and Detmar Meurers (2012) "On Improving the Accuracy of Readability Classification using Insights from Second Language Acquisition". Proceedings of BEA7, ACL. Montreal, Canada.
Event website: http://www.lancaster.ac.uk/fass/groups/sllat/programme.html
Who can attend: Anyone