Author Archives: Steve Cassidy

Alveo Services Restored

I’m pleased to report that the Alveo server is now fully restored and all services should be working again as normal.  AAF login is working again and password reset emails are now being delivered.

There is some work still in progress. In particular the Galaxy server will be updated soon with some more tools for manipulating speech data.  We have been building tools to support workflows involving forced-alignment with MAUS and formant tracking with the Emu wrassp toolkit.  These are now mostly working and we will deploy them as soon as we can.  The use of Galaxy for speech and language analysis is a new development and we are still working out the best way to build tools and chain them together.  When we have some tools available we’ll invite you to experiment and provide feedback so that we can hopefully build something that is generally useful to the community.

 

Server Status Update

An update on the new server deployment.  The Alveo repository is now re-installed on new infrastructure at NCI Canberra.   All collections are re-ingested and should be available as before but there are a couple of unresolved issues that we are still working on.

  • AAF logins are not yet working, so you will need to log in with a username/password if you have one
  • we’re not able to send mail from the server, so you will not be able to get password reminders or create new accounts

Unfortunately, in combination these problems might block many users who previously used AAF login to access Alveo. We are working on both issues and hope to have them resolved next week.

The ingest of the full Austalk collection was interrupted at some point and so not all of the collection is present.   We will be re-ingesting this collection this weekend (19-20 Nov) so hopefully it will be fully available next week.

One new collection is now available: MAVA, a collection of audio-visual read speech from a single speaker, collected by Vincent Aubanel from Western Sydney University.

I will post further updates as things change.

Alveo Server Outage

As of this morning (1st November) the Alveo server is offline. We are currently moving the server from its previous home at Intersect in Sydney to the facilities of NCI in Canberra.   We had hoped to have a seamless transition between the two services but unfortunately the new server is not quite ready.

We will bring Alveo back online as soon as possible.  All user accounts and collections should be maintained.

One major addition will be that for the first time we will have the full Austalk collection on Alveo.  We’ve been working on finalising this collection for some time and this is the first opportunity we’ve had to get the entire collection ingested.  When the server returns you should see over 850,000 items in the Austalk collection.

Uploading Data to Alveo

When we set out to build Alveo the aim was always that it should be a repository for new collections contributed by researchers; however, the initial impetus was to get a number of older collections ingested and build the platform capabilities.  New collections were added to Alveo via a back-end process that only the developers could run.

We have since worked on adding the hooks into the API to allow new collections, items and documents to be added to the Alveo repository.  This extended API has now been deployed on the main system and we have extended the pyalveo library to allow scripts to be written that add new data.   I recently used this facility to add the first contributed collection to Alveo: a collection of children’s speech data.  This blog post describes the script that I wrote to do this by way of a bit of a tutorial on the process.

Report from SocioPhonAus 2016 Brisbane

I was invited to give a presentation on Alveo and Austalk at the First Workshop on Sociophonetic Variability in the English Varieties of Australia, held at Griffith University in Brisbane in June.   The workshop, organised by Gerry Docherty and Janet Fletcher, was supported by the Centre of Excellence for the Dynamics of Language and was attended by phoneticians from around the country, with a keynote given by Prof. Jonathan Harrington, who flew in from Munich.

Accessing Austalk in Alveo

Austalk is a large collection of spoken Australian English collected in the last few years at sites around Australia. When the collection is complete it will have close to 1000 speakers, each with a range of recordings from isolated words to interview and map task recordings.  Alveo contains most of the data and will have the complete corpus when collection and data processing are complete.