For the curious, here is a map of the languages in the full database, color-coded by number of items. As you can see, there’s considerable variation, but there are also a good number of languages with substantial holdings.


Counts of sources in Australian lexical database, as at August 19, 2015


Pama-Nyungan language locations

As noted in a previous post, I’ve started to put some of the results of my Pama-Nyungan prehistory grant on my lab web site. One of the recent updates is a language map. The data are not new; this map was released in about 2011 (though with updates since). It is released through a WordPress plugin on the site, which allows easy embedding of maps into sites. I highly recommend it for its ease of use, except that it doesn’t seem to render in Chrome on a Mac (at least, not on my Mac).

Comments on language locations, names, etc, on the map are very welcome. Please use the comment form on the map’s page.

Release of the Australian lexical database

I will be releasing a tranche of data from the 775,000-word lexical database in mid-August. It will most probably be available as a series of downloads. In order to download the data, you will need to register and agree to some terms and conditions. More about that once the data are released.

In the meantime, I will be doing a series of posts about features of the dataset and some of its uses. I hope this will encourage others to contribute data, or to allow us to make data readily available.

‘Grammar Boot Camp’ at Yale

I will be holding a summer ‘grammar boot camp’ next year (2015), from June 1 to June 26. The idea is to have up to four advanced undergraduate students work intensively on existing high-quality archival field notes and recordings with the aim of producing a publishable sketch grammar. Students will receive a stipend and travel expenses to come to Yale.

This project is funded by the National Science Foundation’s Research Experiences for Undergraduates program; as such, applicants are limited to US citizens or permanent residents. Students who have graduated in Spring 2015 will be eligible to apply. The targeted cohort is undergraduates who will have just finished either their junior or senior year.

Applications will be accepted towards the end of 2014 and applicants will be notified about the result in mid-February. Students will need to show some evidence of prior research experience (e.g. through an RA-ship or by having a senior thesis in progress) and some familiarity with language documentation procedures (e.g. through having taken a field methods class or equivalent, such as attendance at a CoLang summer school). Applicants will need to show attention to detail and ability to focus on a project for a sustained period. The application will require a letter from the student and two letters of support from faculty.

The materials to be worked on will be from an Australian Aboriginal language from Western Australia and will include both print materials and audio files. It is probable that the ‘print’ materials will already be digitized and in Toolbox.

Students will meet twice a day as a group with me to discuss analyses and writing. They will spend the rest of the time working with the materials in the department. They will receive regular detailed feedback on the analysis and writing. Familiarity with Australian languages is not required but I would expect that successful applicants would do some reading of grammars of related languages (which would be provided) prior to the start of the boot camp.

More formal application information will be sent out later, but for now I just wanted to let everyone know about the opportunity so potential students can keep it in mind when choosing courses and making plans for the coming year.

Please forward to anyone you think would be interested and feel free to contact me with any questions.

How many languages? (2)

I’ve updated the list of how many languages were spoken in Australia at European settlement. Thanks to Barry Alpher, Greg Dickson, Aidan Wilson and JC Verstraete for comments.

How many languages were spoken in Australia?

For years, I’ve been using the figure of approximately 250 Aboriginal languages spoken at the time of European settlement, of which roughly 150 were Pama-Nyungan. I recently had the chance to clean up my list of standard language names, which means that I finally got a fairly accurate estimate of how many languages there actually were. This includes some “languages” that we would probably treat as mutually intelligible varieties if we were being very strict, but on the “Swedish, Danish, and Norwegian are separate languages” model, I am comfortable treating languages like Dhuwal and Dhuwala as distinct. Some of the decisions are a bit arbitrary, though.

Here are the figures:

  • 363 languages in Australia, 364 if we include Meryam Mir, which is a Papuan language spoken in Australian territory. The number goes up by 7 if we include Tasmanian languages, but my database only includes the mainland.
  • 275 of those languages are Pama-Nyungan.
  • Within Pama-Nyungan, I am working with 30 primary subgroups and 5 isolates.

You are free to use the list for your own (non-commercial) purposes, and I would be very happy to hear about corrections, additions, subtractions, etc. If you want a list of languages, this is, if I say so myself, a far better list to use than the Ethnologue’s. Edited: you now need to contact me for permission to use the list. Sorry about that.

Australian Language Polygons and new Centroid files

I’ve finished a *draft* google earth (.kmz) file with locations of Australian languages, organised by family and subgroup.

Some things to note:

  • You may use these files for education and research purposes only.
  • NO commercial use under any circumstances without my written permission.
  • NO republication under any circumstances without my written permission.
  • You may quote from these files. Please use the following citation: Bowern, C. (2011). Centroid Coordinates for Australian Languages v2.0. Google Earth .kmz file, available from
  • These files represent my compilation of many available sources, but are known to be deficient in a number of areas. Some sources are irreconcilable. This work is unsuitable for use as evidence in Native Title (land) claims.
  • Please do not repost or circulate these files. Send interested people to this page. I will be updating the files from time to time.
  • Please let me know of errors! The easiest way to do this is to change the polygon or centroid point for the language(s) you are correcting, and send me that item as a kml file.
  • If you use derivatives of this file (e.g. you calculate language areas from it, convert it to ArcGIS, etc), that’s fine, but please send me a copy of the derivative file.
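
For anyone preparing a correction or a derivative file, here is a minimal Python sketch of the two steps involved: reading centroid points out of a .kmz file, and writing a single corrected point back out as .kml. It relies only on general facts about the format (a .kmz is a zip archive containing a .kml document, and KML coordinates are written longitude first, then latitude). The function names `read_centroids` and `correction_kml` are made up for this illustration, the internal layout of the actual centroid file is an assumption, and the language name and coordinates in the usage example are placeholders, not real data.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def local_name(tag):
    """Strip the XML namespace so files using different KML versions are handled alike."""
    return tag.rsplit("}", 1)[-1]


def read_centroids(kmz_file):
    """Return {placemark name: (longitude, latitude)} for every placemark that has a Point."""
    with zipfile.ZipFile(kmz_file) as archive:
        # Take the first .kml entry found inside the archive.
        kml_name = next(n for n in archive.namelist() if n.lower().endswith(".kml"))
        root = ET.fromstring(archive.read(kml_name))

    centroids = {}
    for element in root.iter():
        if local_name(element.tag) != "Placemark":
            continue
        name, coords = None, None
        for child in element.iter():
            tag = local_name(child.tag)
            if tag == "name" and name is None:
                name = (child.text or "").strip()
            elif tag == "Point":
                for sub in child.iter():
                    if local_name(sub.tag) == "coordinates":
                        coords = (sub.text or "").strip()
        if name and coords:
            # KML order is longitude,latitude[,altitude].
            lon, lat = coords.split(",")[:2]
            centroids[name] = (float(lon), float(lat))
    return centroids


def correction_kml(name, lon, lat):
    """Build a one-placemark KML document holding a corrected centroid."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    placemark = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


# Usage with placeholder data: build a tiny in-memory .kmz and read it back.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("doc.kml", correction_kml("Language A", 130.5, -20.25))
buffer.seek(0)
print(read_centroids(buffer))
```

Polygon placemarks are skipped by `read_centroids` because only Point geometry is inspected; computing areas from polygons would additionally need a map projection, which is outside the scope of this sketch.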