Author Archives: Steve Cassidy

Alveo Hackfest

Ahead of the official launch of Alveo on July 1, we will be holding a Hackfest: a hands-on day with Alveo. We hope the day will produce some exciting ideas and maybe even the start of some interesting research using data from the Alveo repository.


LREC2014: The Alveo Virtual Laboratory: A Web Based Repository API

The Alveo Virtual Laboratory is an eResearch project funded
under the Australian Government NeCTAR program to build a platform for collaborative eResearch around
data representing human communication and the tools that researchers use in their analysis. The human
communication science field is broadly defined to encompass the study of language from various
perspectives but also includes research on music and various other forms of human expression.
This paper outlines the core architecture of Alveo and, in
particular, highlights the web-based API that provides access to data and
tools for authenticated users.

This work by Steve Cassidy, Dominique Estival, Tim Jones, Peter Sefton, Denis Burnham and Jared Berghold is licensed under a Creative Commons Attribution 4.0 International License.

Training a Speech Recogniser with HCS vLab

I just received a report from Matt Atcheson, one of our HDR testers at UWA, with the results of some work he’s done on evaluating the HTK integration with the HCS vLab.  Matt used my template Python interface to download audio files from the vLab and feed them to the HTK training algorithms to train a digit string recogniser.   He was then able to test the recogniser on unknown data also downloaded from the vLab.

The results were interesting:

Using the full set of digit recordings that I could find (about 940 of them), setting aside 10% for testing, and with a grammar that constrains transcripts to exactly four digits, I get  about 99% word accuracy, and about 95% sentence accuracy.

====================== HTK Results Analysis =======================
  Date: Tue Jan 28 21:08:50 2014
  Ref : >ntu/hcsvlab_api_testing_matt/digitrec/data/testing_files/testref1.mlf
  Rec : >buntu/hcsvlab_api_testing_matt/digitrec/data/testing_files/recout.mlf
------------------------ Overall Results --------------------------
SENT: %Correct=94.74 [H=90, S=5, N=95]
WORD: %Corr=98.95, Acc=98.42 [H=564, D=3, S=3, I=3, N=570]

Matt also gave us some good feedback based on his experiments.  If there are other testers interested in trying to repeat this experiment or explore a bit on their own, Matt’s code is available on BitBucket.

To run his experiments, Matt made use of a virtual machine on the Nectar Research Cloud.  Any Australian researcher can log in to the cloud and get a free allocation of virtual machines.  We’ve made a VM image (called ‘HCSvLab Tools’, listed in the Public list of snapshots on your dashboard) that has HTK, DeMoLib and INDRI pre-installed; as a user, you can create your own instance of this image and start working with these tools.
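As a quick check on the HTK report above, the percentages can be recomputed from the bracketed counts. HTK's HResults tool reports %Corr as H/N and Acc as (H - I)/N, where H is hits, D deletions, S substitutions, I insertions and N the number of reference labels. The sketch below is only an illustration of that arithmetic; the function and variable names are my own and are not part of HTK or of Matt's code.

```python
def htk_scores(hits, deletions, substitutions, insertions, total):
    """Return (%Corr, Acc) using the HResults definitions."""
    # Every reference label is either a hit, a deletion or a substitution.
    if hits + deletions + substitutions != total:
        raise ValueError("H + D + S must equal N")
    percent_correct = 100.0 * hits / total
    # Accuracy also penalises insertions, so it can be lower than %Corr.
    accuracy = 100.0 * (hits - insertions) / total
    return percent_correct, accuracy


# Counts copied from the WORD and SENT lines of the report.
word_corr, word_acc = htk_scores(564, 3, 3, 3, 570)
sent_corr = 100.0 * 90 / 95

print(f"WORD: %Corr={word_corr:.2f}, Acc={word_acc:.2f}")
print(f"SENT: %Correct={sent_corr:.2f}")
```

The recomputed values agree with the report: three inserted words are what pull word accuracy slightly below the percentage correct.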

AeRO UX Review: HCS vLab

[This User Experience review of the HCS vLab code is posted with the permission of the author and AeRO, the group that commissioned the review]

Reviewer:     Sam Wolski,
              eResearch Services, Griffith University

OS:           OSX 10.8.2

Browser:      Chrome 29.0.1547.65
Test Case(s): Supplied ‘HCS vLab Testing August’ document.

Preliminary Comments:

The HCS vLab is easily one of the best interfaces I’ve come across in Australian research projects and eResearch applications. The Bootstrap framework is a great development platform and the workflows and interfaces of the HCS vLab have been integrated well to form a beautifully clean and usable application. The following feedback is intended to provide a list of small improvements to the application.

Q & A after demo of the HCS vLab at the Annual meeting of the Australian Linguistics Society (ALS) 2013

This page presents the answers to questions raised during the presentation on the HCS vLab at the Annual meeting of the Australian Linguistics Society (ALS 2013) in Melbourne on Friday 04/10/2013.

1) Is it possible to search by audio type and data types, e.g. sentences, words? For instance, in the Mitchell and Delbridge data, that information is in the file names for the original data.

Answer: Because item names are derived from the filenames, searching on the item name would let you search by audio type and data type. This kind of search is not currently possible, but general metadata search functionality is being built into the system. Someone who knows more about each of the data sources could also help us improve the ingestion of metadata for that source.

2) Is it possible to use the search box function on the main page to search the metadata fields (e.g. location of recording, or origin of speaker), not just the Item text contents?

Answer: We are currently developing this functionality.

3) Is there funding available to support researchers interested in contributing legacy data to the HCS vLab? For instance, people working on Australian languages might submit their data to PARADISEC and the data could enter the HCS vLab indirectly that way.

Answer: There is currently no funding available, but we will set up a process for taking data, cleaning it and ingesting it, and this process will be documented with the final release of the HCS vLab. Submitting data to PARADISEC was indeed the path envisaged for Australian languages data, but it means we need to put in place a way to regularly update the PARADISEC collection ingested in the HCS vLab. This may be put in place as part of Phase II (i.e. after 01/07/14).

4) Is it possible to change the name of an Item List in the Discovery Interface, not just in Galaxy?

Answer: We will add support for renaming item lists.

5) There were browser issues with viewing EOPAS in earlier versions of the HCS vLab.

Answer: The issues should be resolved in the new version of the HCS vLab.

6) Will Praat be available in the HCS vLab?

Answer: Praat is not part of the set of tools slated for Phase I of the HCS vLab project, but we agree it would be good if we could find a way to include it. We are keeping a list of tools people have asked for, which we will consider for inclusion in Phase II (from July 2014). For now, users can download data files and use them in Praat. If users need to add the annotation files, we could add support for converting annotations to Praat format, or write a widget converting JSON-LD (our current format) to Praat format.

7) Could ultrasound and EEG (any electronic data, really) be put into the HCS vLab and then be available for analysis there?

Answer: This is something we would like to have, and it should already be possible: there shouldn’t be anything special about these files that would prevent them from being added.

8) Will ELAN be included in the tools? Many linguists use ELAN, e.g. with video for sign language research.

Answer: We already have some ELAN annotations in the EOPAS datasets, so to some extent we are supporting it.

9) For linguists who work with historical sources (e.g. manuscripts and colonial letters), can we have a PDF scan of the original source rather than a typed-up version as ‘Primary Data’?

Answer: We agree that the typed versions of the files shouldn’t be considered “Primary Data”, but they are listed as “Original” by the collection creators. There are some PDF files in PARADISEC and it would be possible for researchers to add PDF scans to any of the AusNC collections.

10) Is there a vLab FAQ page where we can add the questions and their answers?

Answer: There is now! We will add to it as more questions come in.

11) Is there an HCS vLab mailing list?

Answer: Sorry, not yet. But watch this space.
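The answer to the Praat question above mentions converting JSON-LD annotations to Praat format. As a rough illustration of what such a converter might involve, here is a minimal sketch that writes a single-tier Praat TextGrid in the long text format. The JSON layout and the key names ("annotations", "start", "end", "label") are assumptions made for the example and may not match what the vLab actually serves, and `to_textgrid` is a hypothetical helper, not part of any vLab tool.

```python
import json


def to_textgrid(annotations, tier_name="annotations"):
    """Render start/end/label annotations as a one-tier Praat TextGrid.

    `annotations` is a list of dicts with the assumed keys "start" and
    "end" (in seconds) and "label"; real vLab data may use other names.
    """
    if not annotations:
        raise ValueError("need at least one annotation")
    spans = sorted((float(a["start"]), float(a["end"]), str(a["label"]))
                   for a in annotations)
    # Praat interval tiers must be contiguous, so gaps become empty intervals.
    intervals, cursor = [], 0.0
    for start, end, label in spans:
        if start < cursor or end <= start:
            raise ValueError("annotations must not overlap and need end > start")
        if start > cursor:
            intervals.append((cursor, start, ""))
        intervals.append((start, end, label))
        cursor = end
    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        "",
        "xmin = 0",
        f"xmax = {cursor}",
        "tiers? <exists>",
        "size = 1",
        "item []:",
        "    item [1]:",
        '        class = "IntervalTier"',
        f'        name = "{tier_name}"',
        "        xmin = 0",
        f"        xmax = {cursor}",
        f"        intervals: size = {len(intervals)}",
    ]
    for i, (start, end, label) in enumerate(intervals, 1):
        text = label.replace('"', '""')  # Praat escapes quotes by doubling them
        lines += [
            f"        intervals [{i}]:",
            f"            xmin = {start}",
            f"            xmax = {end}",
            f'            text = "{text}"',
        ]
    return "\n".join(lines) + "\n"


# Hypothetical annotation document, invented for this example.
sample = json.loads("""
{"annotations": [
    {"start": 0.25, "end": 0.6, "label": "one"},
    {"start": 0.6, "end": 1.1, "label": "two"}
]}
""")
textgrid = to_textgrid(sample["annotations"])
print(textgrid)
```

The sketch pads the silence before the first annotation with an empty interval, since Praat expects an interval tier to cover the whole time range without gaps. Point annotations and multiple tiers are left out to keep the example short.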

Updated Demo Screencast

We’ve moved on a little since the first demo video. This screencast shows some of the new features as of the end of July, including the AVOZES corpus, inline text and audio previews of items, and the use of item lists. The demo also includes examples of using a simple concordance and frequency search based on an item list.
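For readers unfamiliar with the terms, a frequency search counts how often each word occurs across the texts in an item list, and a concordance shows each occurrence of a keyword with a few words of context on either side. The sketch below illustrates both ideas on two made-up sentences. It is not the vLab's implementation, and the function names and the simple tokenizer are assumptions chosen for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def frequency(texts):
    """Count word occurrences across all texts in the list."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def concordance(texts, keyword, width=2):
    """Return (left context, keyword, right context) for every hit."""
    keyword = keyword.lower()
    hits = []
    for text in texts:
        tokens = tokenize(text)
        for i, token in enumerate(tokens):
            if token == keyword:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                hits.append((left, token, right))
    return hits


# A stand-in for the text of the items in an item list.
item_list = ["The cat sat on the mat.", "A dog chased the cat away."]

counts = frequency(item_list)
hits = concordance(item_list, "cat", width=2)

print(counts.most_common(3))
for left, word, right in hits:
    print(f"{left:>12} [{word}] {right}")
```

A real corpus tool would need a more careful tokenizer (for example, for curly apostrophes or non-English text), but the structure of the two searches stays the same.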