Hosted by the Big Data Processing and Mining Group at UWA, 13 researchers, including 6 members of the Alveo Steering Committee, met for 2 days to discuss further improvements to the platform and how they will use Alveo in their own research. Several researchers and post-graduate students from UWA and Curtin University participated in the workshop and presented their own projects.
The first day (Monday 27 June) started with a general introduction session. After a few remarks by Prof Denis Burnham (Project Director), Dr Dominique Estival (Alveo Project Manager) gave a short presentation of the history and current status of the project and A/Prof Steve Cassidy (Project Owner) gave a technical overview of the platform and the tools. Michelle Barker, Nectar Deputy Director for Research Software Infrastructure, then joined via video-conference to give an overview of the Virtual Lab program and an outlook on the Research Infrastructure Roadmap. The other participants then presented their own research projects, some of which already use Alveo or AusTalk, and Steve Cassidy gave a more in-depth demonstration of Alveo and the Galaxy tools.
Most of the afternoon was dedicated to discussing how Alveo would be used in those projects and the specific needs of each researcher, e.g. better data upload facility, specific workflows for speech processing, backend access to Galaxy, visualisation of results, linguistics-based search facilities through corpora, annotation levels and version control of annotations, adding dictionaries and ontologies, and inter-operability. The day ended with a session to canvass the opportunities for Alveo, including international collaborations, e.g. the Language Application Grid (www.lappsgrid.org) and Camomile (camomile.sourceforge.net), the current joint project with FAIMS, and setting up student projects at Melbourne or UWA. The discussion over an excellent dinner on the waterfront led to further suggestions, which were taken up the next day.
The second day (Tuesday 28 June) started with brainstorming about decentralising the work on Alveo. The main suggestions were to: organise the Alveo Developers Network (including publicising GitHub and the Galaxy developers network); support research projects using Alveo; use Alveo in teaching; propose student projects; set up a Shared Task (ALTA 2017); pursue Australian and international collaborations. Michael Haugh (UQ) presented the AusNC project and how Alveo will allow consolidating its holdings, improving data access and allowing more sophisticated analysis. Steve Cassidy presented the Trove Names project, Named Entity Recognition on the Australian National Library Trove dataset, with results fed back into HuNI (another Nectar VL). In the final session, the aims of each project were further defined and funds were allocated.
List of projects:
Researcher | Project aims |
---|---|
Hywel Stoakes | Wrapping Speech Tools in Galaxy to get end-to-end workflow, from speech to labelled data; An Iterative Implementation of MAUS: A model for Australian Languages |
Karin Verspoor | User Data Upload facility |
Tom Anderson | Pre-processing in Galaxy, Speech dictionary |
Trent Lewis | Unsupervised Analysis and Mapping of Speech Characteristics |
Wei Liu | Prototype Workflow in Galaxy for NER + Visualisation |
Roberto Togneri | Wrapping Speech Tools in Galaxy |
Michael Haugh | Consolidate AusNC/Alveo holdings; Layered annotations through Corpus Workbench |
Denis Burnham | IDS corpus in Alveo and Key-word spotter tool in Galaxy |
Julien Epps | Audio-visual analysis of emotional speech |
Jane Simpson | Building a Corpus of Varieties of Kriol |
Felicity Cox | Creaky Voice in Australian English |