Annotated Texts

Peter Austin has a great post about texts in grammars and how closely they correspond to what was actually said.

It strikes me that we are not being very explicit about genre, purpose of recording, and purpose of publication. If we were talking about a conversation, for example, there would be no question that *everything* would be reproduced – hesitations, back-channelling, repairs, pauses, code-switches, the lot. But such transcripts can be quite difficult to read, especially when interlinearisation and free translations are added.

Most descriptive linguists don’t work with conversation data, though – they work with narratives. Narratives are often recorded as much for the information they contain as for the linguistic structures they exhibit. I suspect that this is a hangover from the days of descriptive linguistic anthropology and the Boasian tradition.

Individual storytellers have different styles and different levels of fluency (independent of their fluency in the language). Speakers themselves may want to edit out some of the hesitations and code-switches in order to present a text which conforms more closely to the constraints of written genres (that is, making the text a written document, not a transcription of a spoken document).

The problems that Peter talks about arise when we treat a written text and a transcription as the same type of document (or mistake one for the other).

A question of this type has come up fairly often with the Laves materials. These are texts which were dictated; we do not have any original sound files. The texts have little punctuation. If we were to make a faithful reproduction of those texts, it would already be one or two steps removed from the original performance. It would be like trying to recreate an authentic 19th-century performance of a Bach cantata. There are reasons to do such a thing, but there are also reasons not to.

The approach I’ve taken in the Laves materials is a classical apparatus criticus – brief annotations at the bottom of the page for textual emendations, major spelling variants, etc., and endnotes for interpretive comments to guide the reader. It’s not ideal – I suspect it’s almost as irritating for those not familiar with this sort of textual work as reading a conversation transcript. But it is a compromise between an almost uninterpretable string of words and a glossed text with hazy relations to the ‘original’.

6 responses to “Annotated Texts”

  1. I think it’s important to remember that most speakers, when presenting their language to linguists, want to present the best possible examples of their languages by using good stories and reducing errors. So when Peter Austin was talking about a speaker telling him to fix this and fix that after he’d already spoken the text, I understand and think it’s entirely acceptable. Like Laves, I’ve also transcribed a dictated text that wasn’t recorded. We made a good little community resource out of it and no community members were perturbed that I had no idea where to punctuate and weren’t too bummed that there wasn’t a recording. What was more important was that the most authoritative speaker provided the text and we made a good resource out of it.

    It’s not all about what the linguists want to capture and how linguists present their stuff. Speakers aren’t passive – they are giving us the texts they are giving us for a reason and are telling us the form they want it in.

    Imagine if English was spoken by only a handful of people and a linguist came to document it… would we present to them our best literature and cinema to document English with, or tapes of Big Brother housemates talking rubbish?

    I’ve gone a bit off topic, but it’s something I’ve thought about in the past few years with the increase in popularity of documenting conversation and everyday language in Indigenous Languages. It’s become rather trendy, but I feel a bit sorry for the deadly orators out there in the bush who have great stories and texts to present to linguists, but the linguist might be more interested in recording the orator’s kids discuss the weekend football or humbugging each other for money!

  2. Sorry Wamut,
    This is a furphy.
    I recorded hundreds of formally elicited narratives when I was in the Kimberley and also in Wadeye. Some of them are wonderful. But more often than not the storytelling is not so interesting because I couldn’t understand the story. When I could understand the stories I tried not to ruin them by saying things in English. Or even in Language I tried not to say anything because I reckoned people don’t want to listen to a whitefella speaking language. You try telling a story to someone who doesn’t understand it, or even worse to someone who does understand it but sits there like a mute. Boring!!

    On the other hand about 60 or 70% of my conversational corpus consists of unelicited naturally occurring storytelling. One after the other. It is gold! Pure theatre. Like everything we do in conversation we design our talk for our recipients, stories included.
    Don’t get me wrong, formal narratives are a fundamentally important aspect of an oral tradition. But conversational narratives are every bit as important. It’s in natural conversation that great storytellers hone their craft.

  3. I hear what you’re saying and see your point.

    Naturally occurring narratives would be great.

    When you were starting out in your fieldwork, did informants have any difficulty in seeing the value of being recorded while ‘just talking’ rather than ‘performing narratives’? I guess what I’m thinking about is language speakers who aren’t so clear on the value of recording natural speech rather than recording stories from the canon of their oral tradition…

  4. Yeah, at first they did have trouble understanding the importance of unsolicited talking. But once they work with you on the texts, they pretty soon get it. I’m certainly not saying you shouldn’t collect formal narratives. Quite the contrary. See here for instance, http://blogs.usyd.edu.au/elac/2006/09/how_about_a_cuppa_tea_on_techn.html
    I think formal narratives and natural conversation should both be incorporated into language documentation, if possible.

  5. Wamut says: “Peter Austin was talking about a speaker telling him to fix this and fix that after he’d already spoken the text, I understand and think it’s entirely acceptable”

    My point was not to say that this is not acceptable, but rather to say that in the process of getting from transcription to publishable text we linguists (and speakers) should be documenting the decisions we make and how we make them, being explicit about why the published version differs from the spoken or dictated one. There can be various reasons why people want to ‘clean up’ transcriptions. For example, Susan Penfield once told me that modern speakers of a Native American language she works on who transcribe old legacy tapes leave out lots of clitics and other material because their variety does not include them. They told her the old people “used too many words”.

    Not to put too fine a point on it, I also suspect that some doctoring of texts happens as well when researchers find stuff that doesn’t fit the grammar they have constructed, for whatever reason. I know of one case in very recent history, in fact, where a student was told to do exactly this by a well-known researcher who was the student’s supervisor.

  6. There’s another issue of ‘unintended doctoring’ too. In early transcriptions it’s common to make a heap of mistakes. Those mistakes get fixed as the linguist internalises the grammar of the language. It’s not uncommon to transcribe based partly on one’s internalisations (that’s a vital part of processing; it’s how we perceive word chunks, how we can store strings in memory for transcription, etc.). If something doesn’t fit, then it could be an error, and if someone’s not sure of their transcription skills, they might well assume they got it wrong. (I blame the fact that people don’t get good ear training in phonetics classes anymore. The day they abolished the perception exams was a very very bad one for field linguistics.)
