Given that I’ve now been using TshwaneLex for a few weeks, I thought I’d post an update on how it’s going. In general, it’s going pretty well.
- If you’re going to change the DTD (and you’ll probably need to, if you have custom information or a trilingual dictionary), it’s important to plan in advance and ideally document what you’ve done. It’s easy to introduce inconsistencies, and they are a pain to fix later on (it’s possible, but it’s easier not to have to). For example, I ended up with the scientific name field in two places (I reimported the data to the correct field instance, checked it was OK, then deleted the unwanted field).
- The field structure enforcement is worth the time it took to import the data. In general, sorting out the data has gone pretty well.
- I’ve set up a few versions of the dictionary, including a wordlist, a full version, and a Toolbox export (which puts the backslash codes back before the items). This has been easier than modifying CCT exports by hand for Toolbox, although the difference isn’t all that big.
- Editing is taking longer. The navigation is a little more time-consuming, since adding fields is a hierarchical process, but compared to the amount of time it’s taking to edit the damn dictionary because of inconsistent entry in the first place, it’s probably worth the tradeoff.
- I really like being able to work on the English – Yan-nhangu section of the dictionary at the same time. This takes some doing in Toolbox (well, it requires the creation of two lexica…). I wish I could work on the Dhuwal section too, but since I know that language much less well than English or Yan-nhangu, the processing and addition of information won’t be as intensive. I suspect, since there’s a high degree of syntactic and semantic isomorphism between Yan-nhangu and Dhuwal, I’ll be able to reverse the Dhuwal and Yan-nhangu glosses, change the audience and layout but keep most of the structure. That’s not true for the English part of the dictionary.
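For anyone who hasn’t seen a Toolbox file, here is a rough sketch of what “putting the backslash codes back before the items” amounts to. It is not how TshwaneLex does its export; it’s a small Python illustration that assumes each entry is a plain dictionary. The field names, the `MARKERS` mapping, and the sample entry are all made up for the example, and the marker codes follow the usual MDF-style conventions (`\lx`, `\ps`, `\ge`, `\sc`), which may not match your own Toolbox setup.

```python
# Hypothetical mapping from field names to Toolbox-style backslash codes.
# The codes follow common MDF conventions; adjust them to your own project.
MARKERS = {
    "lexeme": "lx",
    "part_of_speech": "ps",
    "gloss_english": "ge",
    "scientific_name": "sc",
}


def to_toolbox(entry):
    """Render one entry (a dict) as lines of the form: backslash code, space, value."""
    lines = []
    for field, marker in MARKERS.items():
        value = entry.get(field)
        if value:  # skip fields that are missing or empty
            lines.append(f"\\{marker} {value}")
    return "\n".join(lines)


def export(entries):
    """Join the rendered entries, separating records with a blank line."""
    return "\n\n".join(to_toolbox(e) for e in entries) + "\n"


if __name__ == "__main__":
    # Placeholder data, not a real dictionary entry.
    sample = {"lexeme": "example", "part_of_speech": "n", "gloss_english": "sample gloss"}
    print(export([sample]))
```

Driving the output from a single mapping means each field can only ever get one code, which is the same kind of consistency that the field structure enforcement gives you inside the dictionary itself.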