A Journey into Transcription, Part 3: Clarity

As audio transcribers we listen to sound.  Of primary importance is the clarity of the sound.



Clarity:
The quality of being clear (‘easy to perceive, understand, or interpret’), in particular:

  • The quality of being coherent and intelligible
  • The quality of being easy to hear; sharpness of sound
  • The quality of purity

Let’s consider these qualities and their relevance to the audio transcriber.

The quality of being coherent and intelligible

All of us, when engaged in discussion and conversation, want our language to be coherent and intelligible.  However, for the transcriber listening to a recording, its clarity in the sense of being coherent and intelligible is something of a paradox; it is simultaneously useful and yet also to be ignored.

Naturally, we know that our brains are programmed to attempt to organise and make sense of language.  In this sense, context can often present the transcriber with an invaluable clue to making out words which may be difficult to hear in a recording.

At the initial drafting stage of transcription what we hear at first can turn out to be quite different when we re-listen, edit and proofread the transcript with the glorious benefit of wider context to assist us.  Here are a few of the more entertaining examples:

you wear glasses becomes yoga classes

it’s among the becomes it’s a manga [comic]

yes she was becomes H G Wells

whisking gently becomes whiskey J&B [discussing a recipe!]

However, since the raison d’être of this corpus is to serve as a basis for research into the language of learners, part of the skill here lies in not being distracted by our knowledge of grammatical rules and the surrounding context.

The audio transcriber’s task is to hear what the learner actually says; this may not always be what they (or we) think or expect might be logical or appropriate (or desirable!).  Indeed, the transcription conventions are designed specifically to minimise the possibility of such expectations creeping in during the transcription process.  In the context of a Graded Examination in Spoken English (GESE) the students (and, on rare occasions, the examiners) can, and sometimes do, say anything!

Below are a few examples of wrong words and non-words which are to be transcribed, alongside words which may have been intended by the speaker:

A Journey into Transcription, Part 2: Getting Started

Training:
The action of teaching a person or animal a particular skill or type of behaviour.

So how to begin?  With experts as our guides (and thankfully no animals in sight!)…

The Context:  The first week was to be dedicated to training.  We began by watching a short video clip of a Trinity examination in progress.  Although our day-to-day work is based purely on audio recordings, we really appreciated having this quick peek into the world of the examination room.  Being able to picture the scene when listening to exam recordings somehow brings the spoken language to life.

Picture this: a desk with a friendly examiner seated at one side; tape recorder in situ and possibly a fan whirring (quietly, we hope) in the background; a pile of papers (perhaps held down by a paperweight); and then, most importantly for us in this research into learner language, a student seated on the other side of the desk; some nervous, some shy, some confident, some excited, some reluctant to speak and a rare few who might even have felt quite at home seated on the other side of the desk!

Time spent viewing this clip was truly a valuable introduction to the context of this research and the real world to which the audio transcriber is privy on a daily basis.

What next?  Enthusiastic to get started, headsets on, foot pedals down…

Practice File:  We started with a practice recording that had been transcribed previously, applying to it our first set of transcription conventions.  (These have subsequently been altered and updated on numerous occasions.)  This was an extremely valuable process – in listening separately and together to sections of the recording and in comparing our own transcripts with each other and with the original, we quickly realised the range of subtleties that are involved in this task.  The aim, of course, is for transcribers to do as little interpretation as possible and to be able to apply the conventions in a more or less uniform manner, thus making the transcription process as straightforward as possible.  This, after all, is what will enable us to build a reliable corpus of words that are actually uttered.  Whilst the technology now exists to generate text from spoken words, the accuracy of the text produced does not come close to that produced by a real-life human transcriber.

Key to this task is the fact that it is unlike transcription in other working environments; we are not seeking to produce grammatically correct punctuated documents such as you might find on a BBC website when you want to review that radio programme you heard, or perhaps missed.  In spoken language there are only utterances and our job is to record every utterance precisely by following the given conventions, the only punctuation in sight being apostrophes and the odd question mark.  So is that syllable a word ending, a false start to another word, perhaps a filler used intentionally to maintain a turn in conversation, or perhaps an involuntary sound? All these are natural features of spoken discourse.  Tackling this challenge and striving to produce a document that represents as accurately as is humanly possible the words actually uttered by each individual speaker – once again, here is the challenge that makes our job enjoyable and rewarding.

And finally… A Transcriber’s Thought For The Day:

I tried to catch some fog.  I mist.

A Journey into Transcription, Part 1: Our Approach

To Transcribe:
to put (thoughts, speech, or data) into written or printed form
mid 16th century (in the sense ‘make a copy in writing’):
from Latin transcribere, from trans- ‘across’ + scribere ‘write’

In September 2013 we applied for the post of Audio Transcriber in the CASS Office in the Department of Linguistics and English Language here at Lancaster University.  The job description seemed straightforward: to transcribe audio tape materials according to a predefined scheme and to undertake other appropriate duties as directed.  And the person specification?  As you would expect, a list of essential/desirable skills including working effectively as part of a team; the ability to learn and apply schemes (more of that later); and the ability to work with a range of accents and dialects of English (this is the fun part!).

We say the post of Audio Transcriber since, as far as we knew, only one post was available.  How wonderful to find ourselves both appointed (long may the funding last!); the opportunity to establish a slick working team, as well as to consult when problems arise and, not least, to celebrate the successes (yes, transcribing is a rewarding job!) is a huge benefit not only to ourselves in our work but also to the success of the project as a whole.  In the ESRC Centre for Corpus Approaches to Social Science, it must be the corpus that is at the heart of the centre.  Knowing that we play a key role within the team working together to develop this corpus, we take great pride in what we do.  After all, our listening skills, our focus on accuracy and our meticulous attention to detail have the potential to help develop a corpus of excellent quality, and this will make a vital contribution to the validity of all the research that will follow.  Quite simply, it is this which makes our job so enjoyable and rewarding.

Our day-to-day work involves transcribing recordings of oral examinations taken by learners of English as a second language at elementary, intermediate and advanced stages.  The examinations have been carried out by Trinity College London and have taken place in various countries: Spain, Mexico, Italy, China, India and Sri Lanka so far.  Each language and each stage has its own unique features.

Seven months and 1.5 million words later (Stage One completed and celebrated with colleagues and cake!), we were delighted to be invited to contribute a blog documenting our experience as transcribers.  Over the coming months we plan to describe and discuss various aspects of the job.  The aim is to offer other transcribers and researchers an insight into this particular process.

Look out for the next instalment on Getting Started!

And finally… A Transcriber’s Thought For The Day:

They told me I had type A blood, but it was a type-O.