40th Anniversary of the Language and Computation Group


Recently I was given the chance to attend the 40th anniversary of the Language and Computation (LAC) group at the University of Essex. As an Essex alumnus, I was invited to present my work with CASS on Financial Narrative Processing (FNP), part of the ESRC-funded project. Slides are available online here.

The event celebrated 40 years of the Language and Computation (LAC) group, an interdisciplinary group created to foster interaction between researchers working on Computational Linguistics within the University of Essex.

There were 16 talks by Essex University alumni and connections, including Yorick Wilks, Patrick Hanks, Stephen Pulman and Anne de Roeck. http://lac.essex.ac.uk/2016-computationallinguistics40

The two-day workshop opened with Doug Arnold from the Department of Language and Linguistics at Essex. He began by presenting the history and origins of the LAC group, which started with the arrival of Yorick Wilks in the late 70s, along with others from Language and Linguistics, including Stephen Pulman, Mike Bray, Ray Turner and Anne de Roeck. According to Doug, the introduction of the cognitive studies centre and the Eurotra project in the 80s led to the introduction of the Computational Linguistics MA, paving the way for the emergence of Language and Computation. This was something I had always wondered about.

The workshop touched on the beginnings of some of the most influential conferences and associations in computational linguistics, such as CoLing, EACL and ESSLLI. It also showed the influence of world events around that period and the struggles researchers and academics had to go through, especially during the Cold War and the many university crises around the UK during the 80s and 90s. Having finished my PhD in 2012, it had never crossed my mind how difficult it would have been for researchers and academics to progress under such trying circumstances at that time.

Doug went on to point out how the introduction of the World Wide Web in the mid 90s and the development of technology and computers helped to rapidly advance and reshape the field. This helped to close the gap between Computation and Linguistics and to ease the problem of field identity among Computational Linguists coming from either a Computing or a Linguistics background. We now live surrounded by fast-moving technologies and solid network infrastructure, which means communications and data processing are no longer a problem. I was astonished when Stephen Pulman mentioned how they used to wait a few days for the only machine in the department to compile a few lines of LISP code.

The arrival of Big Data processing in 2010 and the growing need for resourcing, crowd-sourcing and interpreting big data brought more challenges, but also interesting opportunities, for computational linguists. This is something I very much agree with, considering the vast amount of data available online these days.

Doug ended his talk by pointing out that Computational Linguistics is, in general, a difficult field: computational linguists are expected to be experts in many areas, which makes training them a challenging task. As a computational linguist, this rings a bell. For example, as someone from a computing background, I find it difficult to understand how part-of-speech taggers work without being versed in the grammar of the language under study.

Doug’s talk was followed by compelling and very informative talks from Yorick Wilks, Mike Rosner and Patrick Hanks.

Yorick opened with “Linguistics is still an interesting topic”, narrating his experience of moving from Linguistics towards Computing and the challenge posed by the UK system compared to other countries such as France, Russia and Italy, where Chomsky had little influence. This reminded me of Peter Norvig’s response to Chomsky’s criticism of empirical theory, where he said, and I quote: “I think Chomsky is wrong to push the needle so far towards theory over facts”.

In his talk, Yorick referred to Lancaster University and the remarkable work by Geoffrey Leech on the development of the CLAWS tagger, which was one of the earliest statistical taggers to reach the USA.

“What is meaning?” was the opening of Patrick Hanks’s talk, which went on to discuss word ambiguity, saying: “most words are hopelessly ambiguous!”. Patrick briefly discussed the ‘double helix’ rule system, or the Theory of Norms and Exploitations (TNE), which enables creative use of language when speakers and writers make new meanings, while at the same time relying on a core of shared conventions for mutual understanding. His work on patterns and phraseologies is of great interest as an attempt to answer the “why this perfectly valid English sentence fits in a single pattern?” question.

This was followed by interesting talks from ‘Essexians’ working in different universities and firms across the globe. These covered recent work on Computational Linguistics (CL), Natural Language Processing (NLP) and Machine Learning (ML). One of them was a collaboration between Essex University and Signal, a startup company in London.

The event closed with more socialising, drinks and dinner at a Nepalese restaurant in Colchester, courtesy of the LAC group.

In general, I found the event very interesting, well organised and rich in historical evidence on the beginnings of Language and Computation. It was also of great interest to learn about the current work and state of the art in CL, NLP and ML presented by the event attendees.

I would very much like to thank the Language and Computation group at Essex University for the invitation and for their time and effort in organising this wonderful event.

Mahmoud El-Haj

Senior Research Associate

CASS, Lancaster University