Elan transcription mode

The good people who maintain Elan recently announced a new version, with a new transcription mode (and some other goodies too). I’ve been using it for a few days now.

For the most part, it’s working well. It is a definite improvement over the annotation mode for rapid transcription. Cutting down on the navigation between annotations and between tiers produces a noticeable time-saving, as well as a minor decrease in frustration with the program which is definitely worth it. I use Elan for all transcription now so I’m pretty pleased with this. The table interface is nice, navigation is easy, and starting and stopping the audio with tab is also very useful. I like being able to keep the current annotation in the centre too.

Segmentation mode has also improved and is more intuitive and easier to use. The program also seems (though it may be my imagination) to be running a little faster and coping better with audio on my Mac.

There are a couple of things that I think could be improved for usability. One is the barrier that the naming conventions in the program impose. When I first started using Elan I could never keep straight the differences between ‘included in’, ‘time subdivision’, and ‘symbolic subdivision’. The explanation in the manual was pretty difficult to follow. Now it turns out that I did not interpret the different types in the same way that the authors of the program did. This means that my free translations are not of the correct type to show up automatically in the transcription mode (they need to be ‘symbolic associations’). I can see the justification for treating them as ‘symbolic associations’ (though I maintain that this terminology is not user-friendly), but I can also see perfectly reasonable arguments for having free translations as ‘included in’ or ‘time subdivision’, depending on what the unit of the parent tier is. After all, the parent tier unit could be either intonation units (useful for transcription, but not full sentences) or sentences. Ideally one would represent both, but that’s a big job, especially when one can’t read off the ‘sentences’ from the waveform in segmentation mode. One has to transcribe the IUs first, then merge those annotations into a set of larger ones; but we would want the sentence-level annotations to be the parent of the IU translations, and that, I think, one can’t do. [Please tell me if I’m wrong…]

Changing the data type for free translations is not a problem for new transcriptions, but it is incredibly irritating for the large number of partially complete transcript files I have in my collection. I’ll need to change quite a few files by hand. Because you can’t change a tier type once the tier has information in it, and apparently you can’t paste information into a tier of type symbolic association, this is going to be a big job. I’m not sure if it’s worth doing, since almost all my searching is done on the language transcription tier anyway. Leaving the old files alone will introduce a major inconsistency into the transcripts, but unless there’s an easier way to change the type of a tier so that it is linked, I think a better use of my time is doing more transcription…
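Before deciding whether the conversion is worth the effort, it helps to know how many files are actually affected. The following is only a rough sketch of such an audit, not anything that ships with Elan. It assumes that .eaf files are XML in which each TIER element carries TIER_ID, LINGUISTIC_TYPE_REF and (for dependent tiers) PARENT_REF attributes, and each LINGUISTIC_TYPE element carries a CONSTRAINTS attribute such as "Included_In" or "Symbolic_Association". The function names are made up for illustration, and the script only reads files; it does not attempt the conversion itself.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def tier_stereotypes(eaf_source):
    """Return {tier_id: (parent_id, stereotype)} for one .eaf document.

    eaf_source may be a file path or an open file object.
    Assumes the TIER / LINGUISTIC_TYPE layout described above.
    """
    root = ET.parse(eaf_source).getroot()
    # Map each linguistic type to its stereotype. Types used by
    # top-level tiers have no CONSTRAINTS attribute, so they map to None.
    constraints = {
        lt.get("LINGUISTIC_TYPE_ID"): lt.get("CONSTRAINTS")
        for lt in root.iter("LINGUISTIC_TYPE")
    }
    return {
        tier.get("TIER_ID"): (
            tier.get("PARENT_REF"),
            constraints.get(tier.get("LINGUISTIC_TYPE_REF")),
        )
        for tier in root.iter("TIER")
    }


def tiers_needing_conversion(eaf_source):
    """List dependent tiers whose stereotype is not Symbolic_Association."""
    return sorted(
        tier_id
        for tier_id, (parent, stereotype) in tier_stereotypes(eaf_source).items()
        if parent is not None and stereotype != "Symbolic_Association"
    )


def report(folder):
    """Print the affected tiers for every .eaf file in a folder."""
    for path in sorted(Path(folder).glob("*.eaf")):
        affected = tiers_needing_conversion(path)
        if affected:
            print(f"{path.name}: {', '.join(affected)}")
```

Calling report() on a (hypothetical) folder of transcripts would give a list of the files and tiers that need attention, which at least makes it possible to judge the size of the job before starting on it.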

I haven’t worried too much about the impenetrability of instructions in the past, but while I’m on that topic, I have to say that the instructions for the new transcription mode have some of the same issues. There are a lot of screenshots, and there’s a lot of detail, which is great, but the tier names in the example are extremely difficult to remember, and the tier types are also pretty unintuitive. For example, why have ‘po’ for ‘practical orthography’ but ‘tf’ for ‘free translation’? Why reverse some names but not others? In the past I have not worried too much about things like this, because I don’t have a problem learning new software and I usually don’t mind spending a bit of time figuring out what the program wants to say (or, more often, I figure it out without reading the manual in the first place). But now that I am spending more time teaching this software as part of classes, I’m increasingly finding that very complex naming conventions that are not easy to remember are really unhelpful in teaching. They increase the take-up time and the amount of time needed to explain things in class, because I have to explain the ontology as well as how to use the program. That would be ok if I taught a separate class on fieldwork software, but I can’t do that, and I don’t really want to.

Elan is not, of course, the only piece of software to have a confusing ontology and set of naming conventions (how long was it before Praat got an “open” file button??). But it’s something that could be improved, I think. (Of course, for current users who have made the effort to learn the existing ontology, changing it will merely annoy…)

Anyway, back to the review. One thing I noticed with this mode is that it makes navigation within the annotation more difficult, at least on a Mac. It is quite difficult to move between words within the annotation, because some of the shortcuts (e.g. option + right) are now defined to move the user between table cells. This makes proofing transcripts a bit difficult, as well as going back and correcting an earlier error. It’s also possible to make the program crash if one has many short annotations and navigates too fast between them. Stopping and starting with the Tab key also seems to be a little unreliable, and one of my files got stuck in loop mode. The automatic playback doesn’t always work either.

I would like it if the waveform window showed a little more than just the annotation, say half a second on each side. As it is, it is quite difficult to select the start of the annotation to play from the beginning (though I see now that one can use Shift-Tab to play from the beginning). I would also find it useful to have some idea about how far through the file I am. One can get a sense of this from the scroll bar on the right-hand side, but I find the display in the main annotation mode, which shows the density of annotation points and where one is in the file, very useful. I think it might also be useful if, when one stopped and started the playback with Tab, the playback started from .25 second (or something similar) before the current cursor position. Unless I’m very quick with the ‘stop’ button I often miss the first part of the next word. A buffer like this would also be very useful for segments created with the ‘segmentation’ mode, which have no buffer either side. This makes them hard to use as subtitles for movies, because the annotations don’t stay on the screen long enough.
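To make the buffer idea concrete, here is a small sketch of what padding a set of segments might look like. It is purely illustrative and not something Elan provides: the function name is invented, segments are assumed to be non-overlapping (start, end) pairs in milliseconds, and the quarter-second default simply follows the figure suggested above. Where two segments are closer together than twice the buffer, the gap between them is split down the middle so that the padded segments never overlap.

```python
def pad_segments(segments, pad_ms=250):
    """Widen each (start_ms, end_ms) segment by up to pad_ms on both sides.

    Assumes the input segments do not overlap. Neighbouring segments
    never extend past the midpoint of the gap between them, and no
    segment starts before 0.
    """
    ordered = sorted(segments)
    padded = []
    for i, (start, end) in enumerate(ordered):
        # Earliest allowed start: 0 for the first segment, otherwise
        # the midpoint of the gap to the previous segment.
        if i > 0:
            earliest = (ordered[i - 1][1] + start) // 2
        else:
            earliest = 0
        new_start = max(start - pad_ms, earliest)

        # Latest allowed end: the midpoint of the gap to the next
        # segment, or unconstrained for the last segment.
        new_end = end + pad_ms
        if i + 1 < len(ordered):
            latest = (end + ordered[i + 1][0]) // 2
            new_end = min(new_end, latest)

        padded.append((new_start, new_end))
    return padded
```

Note that the last segment can end up extending past the end of the media file, since the sketch knows nothing about the file's duration; a real version would need to clip against that too.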

Speaking of the silence recogniser, this is an area where tier copying is problematic. If I create annotations using the recogniser, I can’t then copy those annotations to a tier that has a translation-type tier as a dependent. I have to create a new tier, or reassociate the translation tier with the channel1 tier, and then rename it. This seems to defeat the purpose of using a template, since I basically have to set up the tiers from scratch for each file. [While I’m talking about help and the audio recognizer, why does it appear under ‘screen display’ in the help files? That’s not very intuitive. Also, there’s something wrong with the search function in the built-in help; it doesn’t find ‘recognizer’ or ‘silence’.] There’s an ‘add tier’ button but it doesn’t seem to do anything.

Finally, I think I would like to see keyboard shortcuts for navigating between the different modes. Currently, there are buttons at the top of the screen in annotation mode for navigating between ‘grid’, ‘text’, ‘subtitles’, ‘audio recogniser’, ‘lexicon’, ‘metadata’ and ‘controls’. Now, I don’t need to run the recogniser more than once per file (why isn’t that a ‘mode’?)

So, all in all, I’m impressed with the transcription mode. I wish the instructions about tier setups had been a little clearer 5 years ago when I set up my templates, but too late to change that now. There are also some issues with tiers which make the program not as easy to use as it might be. Elan is still very much the best transcription program for fieldwork and it could be even better with some more attention paid to usability.

3 responses to “Elan transcription mode”

  1. Thank you for this review! We’ll be considering the issues you raise. Couple of quick comments:

    1. I have no opinion about the naming of the different linguistic stereotypes, but there is a very simple reason that Transcription Mode (TM) can only use the Symbolic Association type: that is the only type in which there is a guaranteed one-to-one, time-aligned relationship between parent/child or sister annotations, and therefore the only one that can be displayed in a table without complications (both Included In and Symbolic Subdivision allow one-to-many relationships, so for those it is non-trivial to display the parent annotation in Column 1 and the possibly multiple children in Column 2).

    2. I totally agree that it is quite a PITA to change tier types at present, especially if there are existing annotations and the stereotypes aren’t compatible. Sometimes the only workaround is to copy the tier (Tier > Copy Tier), which allows you to change the type — but that takes way too many clicks especially if you want to convert multiple files. The user interface could be much simplified at this point and it would be great to be able to bulk edit files.

    3. The waveform does already show a little more than just the annotation on both sides (so that the user can check whether the boundaries are correct and if needed switch to annotation mode to edit). But perhaps that should be still a little more?

    4. The tier names in the manual could be simplified I suppose; they are just an example anyway. They’re taken from a template that we are using internally at the MPI. (Incidentally, the reason tf “free translation” has the order flipped is that there is also tl “literal translation” and tn “translation in national language” and we thought it nicer to have all translation types start with “t”.)

    5. We’ll forward the points on the silence recognizer to the AVATECH people, who are currently thinking about improving the user interface and adding more recognizers. I quite agree that it would be better as a mode. Conceptually it is more related to Segmentation than to Annotation, so it might find a place there.

  2. Thanks Mark!
    1) was more a general grumble at my own bad choice 6 years ago than querying the choice of that type. I can see why that would be needed here.

    Regarding 2), do you think that’s something that’s likely to happen in the next year, say? If so, I would hold off changing lots of files.

  3. Regarding 2), be sure to file a request with Han. The more users make clear that they would appreciate a better interface for (bulk) editing tier types, the higher the issue will be placed on the priority list I suppose. Since transcription mode is a good reason for many users (myself included) to update their old files, I imagine this is a widely shared need.
