Google Drive is dangerous

Following yet another box.com sync failure, I decided to migrate my lab’s digital files to Google Drive. I picked it because Yale offers unlimited storage and because many of my students use the web version of Google Docs to draft materials. We were also having problems with students not syncing their files with Box regularly.

So, about a week ago I copied a lot of folders from Box to Drive, and removed the materials from Box once it was clear that Google Drive had synced. All the files are showing on the Google Drive web site.

However, Google Drive offline did not sync more than 3 folders deep, even though it recognized the directory structure. The folders are simply empty. Moreover, it seems to have had a lot of trouble with files which aren’t .docx, .xlsx, .pdf or .mp4. Given that all the materials are on the web drive, there’s been no data loss. I just need to resync or at worst re-download the folders, right?
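
For anyone wanting to check their own local copy for the same symptom, here is a minimal sketch (not a Google Drive tool, just a walk over a local folder using Python's standard library) that lists directories which exist but contain nothing, along with how many levels deep they sit. The function name `find_empty_dirs` and the throwaway demo tree are my own illustrations; point it at whatever folder your sync client writes to.

```python
import os
import tempfile


def find_empty_dirs(root):
    """Return sorted (depth, relative_path) pairs for empty directories under root.

    Depth counts folder levels below root, so a direct child of root is depth 1.
    """
    root = os.path.abspath(root)
    empty = []
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath == root:
            continue  # skip the top-level folder itself
        if not dirnames and not filenames:
            rel = os.path.relpath(dirpath, root)
            depth = len(rel.split(os.sep))
            empty.append((depth, rel))
    return sorted(empty)


if __name__ == "__main__":
    # Build a small throwaway tree to demonstrate; replace tmp with a real folder.
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, "a", "b", "c", "d"))  # empty, 4 levels down
        os.makedirs(os.path.join(tmp, "a", "notes"))
        with open(os.path.join(tmp, "a", "notes", "readme.txt"), "w") as fh:
            fh.write("synced")
        for depth, rel in find_empty_dirs(tmp):
            print(depth, rel)
```

An empty folder is not proof of a failed sync (some folders really are empty), so the output is only a list of places to compare against the web version.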

Not so simple. There’s no way to force Google Drive to resync. Signing out and signing in again is supposed to “encourage” it to sync again, but if the local copy of the file structure is corrupt, that won’t help. So, we download the folder off the web version, right? Again, not so simple. There’s a 2 GB limit on downloads, and when the limit is reached, Google Drive produces an error but goes ahead with the download anyway.

I am obsessive about backing up in multiple locations as well as web backups, so I haven’t lost any data. It’s just a bit of a pain to re-integrate the recently modified files.

And it’s safe to say that we will *not* be using Google Drive for actual backing up. We will probably continue to use it for sharing work in progress and writing the articles associated with the projects, but for actual data curatorship, we’ll be going with Dropbox.

Conference talk on grammar boot camps

I run a grammar boot camp every year, where a small group of students write a grammar of a language in a month. Last year it was Ngalia, and this year (starting in a few weeks) it’ll be Cundalee Wangka and Kuwarra. I also ran a year-long grammar group to pilot the idea in 2013, using materials from Tjupan. All four languages are varieties of the Wati subgroup of Pama-Nyungan and all the books are based on fieldwork conducted by Sue Hanson.

At the recent Wanala Conference run by the Goldfields Language Centre, Anaí Navarro, Matthew Tyler and I did a video presentation about the boot camp, its aims, methods, and results. Here’s a link to the video: https://drive.google.com/a/yale.edu/file/d/0ByIoQcheKNw2RGx4amVjNjhLOUk/view. Be warned that it’s 190 MB and 22 minutes long.

Talk slides

This week I’ve been giving three talks in the Department of Linguistics at UC Berkeley. It’s been a very stimulating week, with lots of good feedback, brainstorming for new directions, and problem troubleshooting. I’ve also met with many of the graduate students (including two who were my students as undergraduates and who worked on the data that led to some of the results presented in the talks) to hear about their work.

I’m posting slides for two of the talks here. On Monday, I gave an overview of the Pama-Nyungan project and talked about how the tree was created, what it implies to take an ‘evolutionary’ view of language (in this framework), and some off-shoots of the project (MondayPhylogenetics powerpoint). On Wednesday, I talked about one further extension, using the tree to investigate the evolution of colo(u)r terminology. (WednesdayColor powerpoint.)

The other talk, on Tuesday, was on using my Bardi corpus to study life-span changes and variation. At the end of the talk when I was chatting, one of the sociolinguists expressed surprise that I hadn’t anonymized the identities of the Bardi speakers. I hadn’t thought about it. As a fieldworker working on Bardi, I did talk about this general point with the people I worked with, and all were keen to be acknowledged for their work on the language, and recognized among the key custodians of their culture. However, with the work on aging and variation, I am talking less about Bardi as a language and more about the properties of the speech of individual speakers, and the more I thought about it, the less comfortable I was with putting that up online without talking to Bardi people about it first. It’s not just a matter of anonymizing the slides: because there are so few speakers, anyone who has any of my other Bardi work will be able to easily work out who I’m talking about. There will be a paper (soon I hope) on using forced alignment in field research, so some of the results will be in that paper, probably now along with a discussion of the ethics of same.

 

Giving Directions in Bardi

I was recently quoted in a National Geographic article talking about research on absolute frames of reference. I mentioned some data from Bardi but the only description is brief, and it’s in my 2012 [unfortunately incredibly expensive] reference grammar. Here’s a summary of how to talk about directions in Bardi.

Bardi people use several different ways to talk about directions and relative location. This in itself is not unusual. Here are the systems I have data for:

  • left and right
  • compass points
  • deixis, relative proximity
  • tidal-based directions
  • directions using place names

Left and Right

Bardi does have words for ‘left’ and ‘right’: they are aarlgoodoo and joorroonggoo respectively. They aren’t much used, however. In story-telling (my main source of direction terms), the only common use is when talking about the direction of boomerang throwing, where it refers to the direction of curvature of the boomerang’s path. Joorroonggoo also means ‘straight’, and that is its more common meaning.

Joorroongg-ondarr morr arrjamb ngandankal.
The road was straight, so I walked on it. (i.e., it was a straight path [to where I was headed], so I went down it.)

Compass Points

Bardi has a system of directions roughly equivalent to English compass points and based on the direction of prevailing winds. Ardi is ‘north’ (or, more properly, a bit east of north), baarnarr is ‘east’, alang is ‘south’, and goolarr is ‘west’.

Barnoorarra nyalab jarri Ardiyooloon injoonoo, jamb biila injoonoo goolarr.
He’s been to Ardiyooloon on the eastern side, and he went again to the west.

These compass points are mostly used for general directions, not for directions to specific places (for that, a sequence of place names is used, as I’ll describe below). They aren’t used for smaller frames of reference (so, you wouldn’t say something like ‘the dog is on the north side of the house’, as speakers of some languages without relative frames of reference do). The following example is typical.

Aalga ardi wirr iyarrmin.
The sun rises in the northeast.

Deixis

For examples like describing the relative placement of a dog and a house, Bardi speakers typically use deictic markers like ‘here’, ‘there’, or ‘this side’ and ‘the other side’.

Ginyinggon roowil innyana barda nyoonoo, nyanbooroonony daab innyana biinyba.
Then he started walking to the other side of the marsh. (literally, then he walked there, on the other side he climbed up [through] the marsh).

Tidal marking

Bardi has two adverbs, joordarrarr ‘with the tide’ and arrinar ‘against the tide’.

Joodarrarr angarrgalalij Bawoordoongan.
We went with the tide to Bawoordoo (near Swan Point).

They are also used in giving directions. Given that Bardi country is in an area with very swift currents and a 10 metre inter-tidal range at king tides, it is not surprising that Bardi people know a great deal about water navigation, tidal movements, and how to sail in tricky waters. A lot of Bardi food traditionally also depended on the tides (such as good times for reefing, fishing and hunting).

Directions by Place Names

Finally, a very common way of giving directions involves providing a chain of place names between the speaker’s current location and the goal. Bardi country is incredibly rich in named places; the dictionary has well over 500 names recorded, including 100 names on Sunday Island (Iwany) alone; Iwany is just over a mile across and nearly 2 miles from north to south, so it’s not exactly huge. This way of giving directions is, of course, not easy to follow if you don’t know the places. In traditional Bardi society, however, everyone would have known them, and so talking about directions in that way is a way of reinforcing the names and their sequence.

Tasmanian language data

The CHIRILA database contains materials from the Aboriginal languages of Tasmania. The Excel spreadsheets contain all the records from Plomley’s (1976) Tasmanian language data, and additional spreadsheets contain explanatory data about the speakers represented in the text, the regions where data were recorded, and who the recorders were. This is the data used in Bowern (2012).

A word of warning is warranted here. This is not easy data to use; there’s a steep learning curve for understanding the original transcription conventions, Plomley’s groupings, and the abbreviations.

See http://www.pamanyungan.net/2016/02/tasmanian-language-data/ for downloads.

Introducing CHIRILA

I am very pleased to announce that the first phase of CHIRILA (Contemporary and Historical Resources for the Indigenous Languages of Australia) has been released. This release represents approximately 180,000 words from 155 different Australian languages. It is a subset of the full database (of approx 780,000 items); eventually I hope to be able to release most of the data. For now, the first phase covers the material for which we have explicit permission, or which is already in the public domain.
The material is hosted at pamanyungan.net/chirila; please see the web site for more information about the contents of the database, how to download data, what formats are available, and the like. We do not provide a web interface to the data; you download it and use Excel or a database program to read the files.
We hope the data will be useful to researchers, community members, and others with an interest in Australia’s Indigenous language heritage.
pamanyungan.net/chirila also includes access to the preprint of a paper describing the database (both the online and full versions).
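
If you would rather read a downloaded file with a script than with a spreadsheet or database program, something like the following sketch may be a starting point. It assumes a tab-delimited text export; the column names (`language`, `form`, `gloss`) and the sample rows are placeholders of my own, not the actual CHIRILA schema or data, so check the documentation on the web site for the real field names and formats.

```python
import csv
import io
from collections import Counter

# Placeholder rows standing in for a downloaded file. The column names and
# values are hypothetical and are NOT taken from the CHIRILA database.
SAMPLE = (
    "language\tform\tgloss\n"
    "LanguageA\tform1\twater\n"
    "LanguageA\tform2\tfire\n"
    "LanguageB\tform3\twater\n"
)


def count_by_language(handle, delimiter="\t"):
    """Count the rows for each language in a delimited word list."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    return Counter(row["language"] for row in reader)


if __name__ == "__main__":
    # For a real file, replace io.StringIO(SAMPLE) with
    # open(path, newline="", encoding="utf-8").
    counts = count_by_language(io.StringIO(SAMPLE))
    for language, n in sorted(counts.items()):
        print(language, n)
```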

Second Annual Summer Grammar Bootcamp!

I will be holding a summer ‘grammar boot camp’ from July 5 to July 29, 2016. The idea is to have up to four advanced undergraduate students work intensively on existing high-quality archival field notes and recordings with the aim of producing a publishable sketch grammar. Students will receive a stipend and travel expenses to come to Yale. This follows from a very successful first boot camp in 2015.

This project is funded by the National Science Foundation’s Research Experiences for Undergraduates program; as such, applicants are limited to US citizens or permanent residents. Students who have graduated in Spring 2016 will be eligible to apply. That is, the targeted cohort is undergraduates who will have just finished either their junior or senior year.

The materials to be worked on will be from an Australian Aboriginal language from Western Australia and will include both print materials and audio files. It is probable that the ‘print’ materials will already be digitized and in Toolbox.

Students will meet once a day as a group with me to discuss analyses and writing. They will spend the rest of the time working with the materials in the Linguistics department. They will receive regular detailed feedback on the analysis and writing. Familiarity with Australian languages is not required but I would expect that successful applicants would do some reading of grammars of related languages prior to the start of the boot camp.

Applications for the boot camp are now open. The deadline for applications is January 22, 2016, and applicants will be notified of the result in mid-February.

To apply, please send the following materials electronically:

  • a letter of application, describing your experience in linguistics, including research experience, your future plans, and why you’d like to join the boot camp
  • a writing sample, such as a linguistics term paper
  • course transcript (this can be an unofficial transcript)

Please send materials as file attachments to bootcamp@pamanyungan.net, cc’ed to claire.bowern@yale.edu. Applications will be acknowledged within 2 days – if you don’t get an acknowledgment, please let me know.

Please also arrange for one or two letters of recommendation/support from faculty to be sent to the same email addresses, also by January 22.

Students will need to show some evidence of prior research experience (e.g. through an RA-ship or by having a senior thesis in progress) and some familiarity with language documentation procedures (e.g. through having taken a field methods class or equivalent, such as having attended CoLang or an LSA Institute class). Applicants will need to show attention to detail and ability to focus on a project for a sustained period. Students will need to be able to travel to New Haven for the entire period of the boot camp and should expect to work solely on this project during that time, including some evenings and weekends.