Research projects

Generalisation

A long-standing theoretical interest of our group is generalisation: that is, how information learned from a "training set" of samples is transferred to novel inputs. Investigating generalisation in Deep Neural Networks (DNNs) is particularly relevant in light of recent progress in the deep learning field, and is an important part of our current work. In 2018/19, MuST students are studying various theories of generalisation in the context of the problem domains that interest us, investigating how factors such as learning algorithms and network structures affect generalisation. They are using popular deep learning tools (such as PyTorch and TensorFlow) to explore specific questions on new and existing data sets.

The CAIR Project

MuST is a node of the Centre for Artificial Intelligence Research (CAIR), a South African research network that conducts foundational, directed and applied research into various aspects of Artificial Intelligence.
  

The Speech Transcription Platform Project

The speech transcription platform project, supported by the Department of Arts and Culture, entailed the development of a Web-based platform that enables users with varying levels of technical expertise to transcribe speech in the South African languages to text quickly and easily, with the assistance of the latest speech recognition technology.
  

The Babel Project

The Babel project was an international collaborative project aimed at solving the spoken term detection task in previously unstudied languages. MuST was part of the BabelOn consortium, alongside BBN (USA), BUT (Czech Republic), LIMSI (France), MIT (USA) and Johns Hopkins University (USA).
 

Google Text To Speech Project

MuST collaborated with Google to create new voices for four South African languages. Different female speakers with similar voice profiles were selected as voice contributors. This allowed for voices to be mixed in such a way that the resulting voice would not sound like any specific individual. The datasets were released for further use.
  

The SADE Project

 
The SADE project, supported by the Department of Arts and Culture, the Technology Innovation Agency and commercial partners, developed a fully South African directory inquiries system, specifically built to deal with South African accents and names.
 

The V-BAT Project

The V-BAT project was carried out in collaboration with the Web Foundation and One World South Asia, and was funded by the Rockefeller Foundation. It investigated the applicability of speech technology in helplines designed to provide small-scale farmers in India with relevant, reliable and up-to-date information.
  

The VOICES Project

The EU-funded VOICES project developed speech technology, use cases and business models involving voice technology in two application domains (health and agriculture) in West Africa. MuST researchers were particularly involved in developing and assessing speech recognition and speech synthesis in two of Mali’s languages, namely Bambara and Bomu.

The Lwazi Project

The Lwazi project, funded by the South African government, was a large-scale project to develop speech technologies, and their applications, for South Africa’s eleven official languages. This project was carried out under the leadership of the CSIR Meraka Institute and produced a wide range of open-source tools, public-domain resources and generally accessible applications. MuST and the CSIR Meraka Institute also collaborated with the National Centre for Human Language Technologies (NCHLT) to develop extensive, high-quality speech resources for public use. Many of these resources are now available for public download from the Resource Management Agency (RMA).