Author Archives: Steve Cassidy

Accessing AusTalk in Alveo

AusTalk is a large collection of spoken Australian English collected in the last few years at sites around Australia. When the collection is complete it will have close to 1000 speakers each with a range of recordings from isolated words to interview and map task recordings.  Alveo contains most of the data and will have the complete corpus when collection and data processing is complete.
Continue reading

Alveo Hackfest

Preceding the official launch of Alveo on July 1 we will be holding a Hackfest for a hands-on day with Alveo.   We hope the outcome of the day will be some exciting ideas and maybe even the start of some interesting research outcomes using data from the Alveo repository.

Continue reading

LREC2014: The Alveo Virtual Laboratory: A Web Based Repository API

The Alveo Virtual Laboratory is an eResearch project funded
under the Australian Government NeCTAR program to build a platform for collaborative eResearch around
data representing human communication and the tools that researchers use in their analysis. The human
communication science field is broadly defined to encompass the study of language from various
perspectives but also includes research on music and various other forms of human expression.
This paper outlines the core architecture of the Alveo and in
particular, highlights the web based API that provides access to data and
tools to authenticated users.

Creative Commons License
This work by Steve Cassidy, Dominique Estival, Tim Jones, Peter Sefton, Denis Burnham and Jared Berghold is licensed under a Creative Commons Attribution 4.0 International License.

Training a Speech Recogniser with HCS vLab

I just received a report from Matt Atcheson, one of our HDR testers at UWA, with the results of some work he’s done on evaluating the HTK integration with the HCS vLab.  Matt used my template Python interface to download audio files from the vLab and feed them to the HTK training algorithms to train a digit string recogniser.   He was then able to test the recogniser on unknown data also downloaded from the vLab.

The results were interesting:

Using the full set of digit recordings that I could find (about 940 of them), setting aside 10% for testing, and with a grammar that constrains transcripts to exactly four digits, I get  about 99% word accuracy, and about 95% sentence accuracy.

====================== HTK Results Analysis =======================
  Date: Tue Jan 28 21:08:50 2014
  Ref : >ntu/hcsvlab_api_testing_matt/digitrec/data/testing_files/testref1.mlf
  Rec : >buntu/hcsvlab_api_testing_matt/digitrec/data/testing_files/recout.mlf
———————— Overall Results ————————–
SENT: %Correct=94.74 [H=90, S=5, N=95]
WORD: %Corr=98.95, Acc=98.42 [H=564, D=3, S=3, I=3, N=570]
Matt also gave us some good feedback based on his experiments.  If there are other testers interested in trying to repeat this experiment or explore a bit on their own, Matt’s code is available on BitBucket.
To run his experiments, Matt made use of a virtual machine on the Nectar Research Cloud.  Any Australian researcher can login to the cloud and get a free allocation of virtual machines.  We’ve made a VM image (called ‘HCSvLab Tools’, listed in the Public list of snapshots on your dashboard) that has HTK, DeMoLib and INDRI pre-installed; as a user, you can create your own instance of this image and start working with these tools.

AeRO UX Review: HCS vLab

[This User Experience review of the HCS VLab code is posted with the permission of the Author and AeRO, the group who commissioned the review]

Reviewer:     Sam Wolski,
              eResearch Services, Griffith University

OS:           OSX 10.8.2

Browser:      Chrome 29.0.1547.65
Test Case(s): Supplied ‘HCS vLab Testing August’ document.

Preliminary Comments:

The HCS vLab is easily one of the best interfaces I’ve come across in Australian research projects and eResearch applications. The Bootstrap framework is a great development platform and the workflows and interfaces of the HCS vLab have been integrated well to form a beautifully clean and usable application. The following feedback is intended to provide a list of small improvements to the application. Continue reading