Alveo Hackfest

Preceding the official launch of Alveo on July 1, we will be holding a Hackfest: a hands-on day with Alveo. We hope the day will produce some exciting ideas and maybe even the start of some interesting research using data from the Alveo repository.

The Hackfest will be held at the Female Orphan School, a historic building that is part of the UWS Parramatta campus. It will run from 9.30am to 6pm with breaks for refreshments and lunch. The day will be loosely structured but will begin with a welcome from Prof. Denis Burnham and a tutorial overview of Alveo presented by Dominique Estival and Steve Cassidy. The rest of the day will be turned over to the participants. Support will be available and we will try to pair up programmers and non-programmers as needed to see what outcomes we can generate.

While participants are encouraged to bring their own problems and ideas to the Hackfest, some possible projects for the day might be:

  • Word frequency – compute a word frequency table from an item list or contrast on a given metadata facet
  • Word cloud – display a word cloud given an item list, same sort of contrasts as frequency lists
  • Collocations – display common collocations from an item list, compute collocations of a given word
  • Look at the occurrence of keywords or phrases in COOEE texts over time – some kind of timeline display to show occurrences or frequency, e.g. like Google NGrams
  • Look at occurrences after POS tagging the text (e.g. ‘like’ used as a verb)
  • A search engine over a fixed ‘web’ (e.g. ClueWeb, but we could demo on a smaller collection) to use as a benchmark source in evaluating question answering systems that use the web as a resource
  • With an item list from Mitchell & Delbridge, query for vowels, calculate formants and plot the vowel space; use this to compare vowel spaces for different groups of speakers
  • Pass an Austalk item list through MAUS, collect the text grid files, then do the above
  • Use a SPARQL query to find speakers with some particular properties, then find their hVd words and make an item list that could feed into one of the above analyses
  • Given an item list, calculate pitch track (or other track) and plot each item separately
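
As a rough illustration of the first and third ideas above, here is a minimal Python sketch of a word frequency table contrasted on a metadata facet, plus a simple count of adjacent word pairs as a crude stand-in for collocations. It does not use the Alveo API: the `ITEMS` records, the `genre` facet and the function names are all made up for illustration, and on the day the texts and metadata would come from an item list in the Alveo repository.

```python
import re
from collections import Counter, defaultdict

# Hypothetical item list: these records and the "genre" facet are invented
# for illustration; real texts and metadata would come from Alveo.
ITEMS = [
    {"text": "The cat sat on the mat", "genre": "fiction"},
    {"text": "The dog sat on the log", "genre": "fiction"},
    {"text": "The court sat on Monday", "genre": "news"},
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(items, facet=None):
    """Return {facet_value: Counter}; a single 'all' key when facet is None."""
    tables = defaultdict(Counter)
    for item in items:
        key = item[facet] if facet else "all"
        tables[key].update(tokenize(item["text"]))
    return dict(tables)


def bigram_counts(items):
    """Count adjacent word pairs within each item (raw counts, no scoring)."""
    counts = Counter()
    for item in items:
        tokens = tokenize(item["text"])
        counts.update(zip(tokens, tokens[1:]))
    return counts


if __name__ == "__main__":
    # One frequency table per value of the chosen facet.
    for genre, table in word_frequencies(ITEMS, facet="genre").items():
        print(genre, table.most_common(3))
    # The most frequent adjacent word pairs across the whole item list.
    print(bigram_counts(ITEMS).most_common(2))
```

Raw pair counts favour frequent function words, so a real collocation tool would probably rank pairs with an association measure rather than by count alone; the same tables could also feed a word cloud display.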

All of these things should be possible but the key outcome of the day will be taking ideas from the participants and exploring what we can do to realise them.

If you have questions about the Hackfest please contact Steve Cassidy (