Current research projects

1. Generalisation in Deep Learning 

A long-standing theoretical interest in our group is generalisation: how information learned from a training set of samples transfers to novel inputs. Investigations of generalisation in Deep Neural Networks (DNNs) are particularly relevant in light of recent progress in deep learning, and form an important part of our current work. MUST students are currently studying the learning process in different types of deep networks, investigating how factors such as learning algorithms and network architectures affect generalisation. We typically use popular deep learning tools (such as PyTorch and TensorFlow) to explore specific questions on new and existing data sets.
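The train/test gap at the centre of these questions can be illustrated with a small, self-contained sketch (plain NumPy rather than PyTorch or TensorFlow, with a hypothetical polynomial "model class" standing in for a network): richer models drive the training error down, while the gap to unseen data can grow.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical regression task: noisy samples of an underlying function.
X_train = rng.uniform(-1, 1, 50)
y_train = np.sin(3 * X_train) + 0.1 * rng.standard_normal(50)
X_test = rng.uniform(-1, 1, 500)
y_test = np.sin(3 * X_test) + 0.1 * rng.standard_normal(500)

def mse(coeffs, X, y):
    # Mean squared error of a fitted polynomial on a data set.
    return float(np.mean((np.polyval(coeffs, X) - y) ** 2))

results = {}
for degree in (1, 5, 12):
    # Least-squares polynomial fit: a stand-in for model classes of growing capacity.
    coeffs = np.polyfit(X_train, y_train, degree)
    train_err, test_err = mse(coeffs, X_train, y_train), mse(coeffs, X_test, y_test)
    results[degree] = (train_err, test_err)
    print(f"degree {degree:2d}: train {train_err:.4f}  test {test_err:.4f}  "
          f"gap {test_err - train_err:+.4f}")
```

The same measurement, applied to real networks, data sets and training algorithms, is the basic experimental unit in these studies.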


2. Speech and language processing

MUST collaborates with Saigen to solve real-world speech analytics problems through the application of deep learning techniques. Examples include using DNNs to improve automatic speech recognition and speaker diarisation (identification of individual speakers) on heavily compressed or poor quality audio files, and the study of DNN-based word embeddings for more accurate language modelling.

Speech processing tools are not only applicable to human speech and language: with Prof Jaco Versfeld from Stellenbosch University we are investigating the use of end-to-end DNN systems for whale call detection and classification. Audio data collected through passive acoustic monitors in False Bay sheds light on the movement of local whale species, information that supports marine monitoring and conservation. Our role is to develop tools to contribute to this process.
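As a toy illustration of the detection step (not our actual end-to-end DNN pipeline), the sketch below injects a synthetic 60 Hz "call" into background noise and flags frames with unusually high energy in a chosen frequency band; the sample rate, band and threshold are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
sr = 1000  # hypothetical sample rate, Hz

# Synthetic recording: background noise with a short 60 Hz "call" injected.
t = np.arange(2 * sr) / sr
audio = 0.1 * rng.standard_normal(t.size)
audio += np.sin(2 * np.pi * 60 * t) * ((t > 0.8) & (t < 1.2))

def band_energy_per_frame(signal, frame_len, lo_hz, hi_hz, sr):
    # Energy of each non-overlapping frame within a frequency band,
    # computed from the magnitude spectrum of the frame.
    freqs = np.fft.rfftfreq(frame_len, d=1 / sr)
    band = (freqs >= lo_hz) & (freqs <= hi_hz)
    frames = signal[: signal.size // frame_len * frame_len].reshape(-1, frame_len)
    spec = np.abs(np.fft.rfft(frames, axis=1))
    return np.sum(spec[:, band] ** 2, axis=1)

energies = band_energy_per_frame(audio, 100, 40, 80, sr)
detections = energies > 5 * np.median(energies)  # frames flagged as possible calls
print("frames flagged:", int(detections.sum()))
```

An end-to-end DNN replaces the hand-chosen band and threshold by learning the detector directly from labelled spectrograms.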



3. CAIR Deep Learning

MUST hosts the CAIR Deep Learning group, a node of the Centre for Artificial Intelligence Research (CAIR), a distributed South African research network spanning eight universities. CAIR researchers conduct foundational, directed and applied research into various aspects of Artificial Intelligence. The centre is a DSI initiative hosted by the CSIR.

4. NITheCS

Machine learning in support of Theoretical and Computational Sciences is a research programme of the National Institute for Theoretical and Computational Sciences (NITheCS). Collaborators include researchers from diverse fields at UCT, UKZN, UJ, SU, and NWU. 

Coordinated by MUST, the programme has two streams: 
- Machine learning research: development of new, specialised ML techniques.
- Machine learning as a tool: applying ML for scientific modelling applications.
The programme currently focuses on knowledge discovery in time series data, and on creating a forum for cross-cutting projects in other focus areas that rely on machine learning expertise.

5. Space weather prediction

Together with the Space Science directorate of the South African National Space Agency (SANSA), we investigate the applicability of DNNs to modelling various space weather phenomena, such as predicting geomagnetic disturbances from solar wind parameters, or predicting the eruption of solar flares from images of the sun. An important goal of this work is ‘knowledge inference’: studying how the developed models can be interpreted in order to shed light on the underlying phenomena.
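A minimal sketch of the modelling setup, using synthetic stand-ins for the solar wind inputs and a hypothetical disturbance index (none of this is SANSA data): an ordinary least-squares baseline of the kind a DNN would aim to improve on.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-ins for solar wind inputs: speed, density, Bz (IMF north-south).
n = 2000
speed = rng.uniform(300, 800, n)   # km/s
density = rng.uniform(1, 20, n)    # cm^-3
bz = rng.uniform(-15, 10, n)       # nT

# Hypothetical target: a disturbance index driven mainly by southward Bz and speed.
target = 0.01 * speed * np.clip(-bz, 0, None) + 0.5 * density + rng.standard_normal(n)

# Feature matrix with a bias column and a simple interaction feature.
X = np.column_stack([np.ones(n), speed, density, bz, speed * np.clip(-bz, 0, None)])

# Ordinary least squares as the baseline the DNN would be compared against.
w, *_ = np.linalg.lstsq(X, target, rcond=None)
rmse = float(np.sqrt(np.mean((X @ w - target) ** 2)))
print(f"baseline RMSE: {rmse:.3f}")
```

The DNN's advantage over such baselines lies in learning the nonlinear interactions (here hand-specified) directly from data.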



6. Industry applications of deep learning

DNNs have wide application in industry. As part of our industry programme, we investigate the ability of DNN-based tools to solve very specific tasks in close collaboration with external domain partners. In these cases, the industry partner contributes the domain knowledge, and MUST researchers tailor deep learning models to address the specific challenges posed by the task.

Jonker Sailplanes

-  Sailplane cross-country performance optimisation

In collaboration with Jonker Sailplanes, we are developing a framework for optimising sailplane cross-country performance with DNNs. Traditional sailplane performance optimisation is slow and computationally expensive, as it requires integrating multiple simulation packages and evaluating functions at multiple span-wise stations (sections along the wing) under multiple flight conditions. We are replacing the computationally expensive simulation packages with lower-fidelity DNN-based models in order to reduce the time spent on the preliminary sailplane design and optimisation phase. Dr Johan Bosman, an aerodynamics engineer at Jonker Sailplanes and senior lecturer at the School for Mechanical Engineering of the Faculty of Engineering, co-supervises the study.
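The surrogate idea can be sketched as follows, assuming a hypothetical one-dimensional "solver" and a tiny hand-rolled network in place of the real simulation packages and DNN models: a modest budget of expensive runs trains a cheap approximation that can then be queried freely inside a design loop.

```python
import numpy as np

rng = np.random.default_rng(3)

def expensive_simulation(x):
    # Stand-in for a costly aerodynamic solver: a smooth 1-D response curve.
    return np.sin(2 * x) + 0.3 * x ** 2

# A modest budget of "simulation runs" forms the training set.
X = rng.uniform(-2, 2, (200, 1))
y = expensive_simulation(X)

# Tiny one-hidden-layer network trained by hand-written full-batch gradient descent.
W1 = 0.5 * rng.standard_normal((1, 32)); b1 = np.zeros(32)
W2 = 0.5 * rng.standard_normal((32, 1)); b2 = np.zeros(1)
lr = 0.1

for _ in range(8000):
    h = np.tanh(X @ W1 + b1)       # forward pass
    err = h @ W2 + b2 - y          # prediction error
    gW2 = h.T @ err / len(X)       # backpropagation by hand
    gb2 = err.mean(0)
    gh = (err @ W2.T) * (1 - h ** 2)
    gW1 = X.T @ gh / len(X)
    gb1 = gh.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

# The cheap surrogate now stands in for the solver on a dense evaluation grid.
x_grid = np.linspace(-2, 2, 101)[:, None]
surrogate = np.tanh(x_grid @ W1 + b1) @ W2 + b2
rmse = float(np.sqrt(np.mean((surrogate - expensive_simulation(x_grid)) ** 2)))
print(f"surrogate RMSE on a dense grid: {rmse:.3f}")
```

In the real project the inputs are multi-dimensional design and flight-condition parameters, but the train-once, query-many-times pattern is the same.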


-  Traffic flow prediction

Intelligent transportation systems have attracted increasing attention in recent years, partly because a large part of the working population spends several hours a day in congested traffic. Transport systems furthermore contribute approximately 30% of total global emissions. Together with Prof Alwyn Hoffman and the NWU Intelligent Systems Group, we develop models that predict future traffic behaviour, as part of the SANRAL Research Panel activities. Such predictions will enable more effective measures to reduce congestion and guide the design of a more streamlined road network.
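As an illustration of the prediction task (entirely synthetic data, not SANRAL measurements), the sketch below fits a linear lag-plus-seasonal baseline to a simulated daily flow profile; DNN forecasters are benchmarked against baselines of this kind.

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic flow data: a daily profile with two rush-hour peaks, per 15-min slot.
slots = np.arange(96)  # 96 fifteen-minute intervals per day
profile = np.exp(-((slots - 32) ** 2) / 30) + np.exp(-((slots - 68) ** 2) / 30)
flow = (profile + 0.05 * rng.standard_normal((30, 96))).ravel()  # 30 days

# Linear baseline: predict the next slot from the k previous slots plus the
# same slot one day earlier (a simple seasonal feature).
k = 4
rows = [np.concatenate(([1.0], flow[t - k:t], [flow[t - 96]]))
        for t in range(96 + k, flow.size)]
A = np.array(rows)
b = flow[96 + k:]
w, *_ = np.linalg.lstsq(A, b, rcond=None)
rmse = float(np.sqrt(np.mean((A @ w - b) ** 2)))
print(f"in-sample RMSE: {rmse:.3f}")
```

Recurrent or graph-based DNNs aim to beat such baselines by capturing nonlinear dynamics and interactions between road segments.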


-  Channel state estimation

Together with Prof Albert Helberg and the NWU Telenet Group, we investigate the capabilities of deep learning methods in the telecommunications domain. Specifically, we consider the applicability of generative adversarial networks to channel estimation and equalisation over wireless channels such as WiFi. We hope to extend the application and usefulness of deep learning methods to additional telecommunication tasks.
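For context, the classical least-squares pilot-based estimator that learned approaches are typically compared against can be written in a few lines (synthetic single-tap channel; all parameters are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical flat-fading channel: received = h * sent + noise.
h_true = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
pilots = np.exp(2j * np.pi * rng.integers(0, 4, 64) / 4)  # 64 known QPSK pilot symbols
noise = 0.05 * (rng.standard_normal(64) + 1j * rng.standard_normal(64))
received = h_true * pilots + noise

# Classical least-squares channel estimate from the known pilots.
h_est = np.vdot(pilots, received) / np.vdot(pilots, pilots)

# Equalisation: undo the channel by dividing by the estimate.
equalised = received / h_est
print(f"channel estimation error: {abs(h_est - h_true):.4f}")
```

Learned estimators become attractive when the channel is frequency-selective, time-varying or poorly modelled, where this simple closed-form estimate degrades.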

Completed projects

The Babel Project

The Babel project was an international collaborative project aimed at solving the spoken term detection task in previously unstudied languages. MUST was part of the BabelOn consortium, consisting of BBN (USA), BUT (Czech Republic), LIMSI (France), MIT (USA) and Johns Hopkins University (USA).

The Speech Transcription Platform Project

The Speech Transcription Platform project, supported by the Department of Arts and Culture, entailed the development of a web-based platform that would enable users with varying degrees of sophistication to quickly and easily transform speech in the South African languages to text, with the assistance of the latest speech recognition technology.

The Google Text-to-Speech Project

MUST collaborated with Google to create new voices for four South African languages.  Different female speakers with similar voice profiles were selected as voice contributors. This allowed for voices to be mixed in such a way that the resulting voice would not sound like any specific individual. The datasets were released for further use.


The SADE Project

The SADE project, supported by the Department of Arts and Culture, the Technology Innovation Agency and commercial partners, developed a fully South African directory inquiries system, specifically built to deal with South African accents and names.

The V-BAT Project

The V-BAT project was carried out in collaboration with the Web Foundation and One World South Asia and funded by the Rockefeller Foundation. It investigated the applicability of speech technology in helplines designed to assist small farmers in India with relevant, reliable and up-to-date information.

The VOICES Project

The EU-funded VOICES project developed speech technology, use cases and business models involving voice technology in two application domains (health and agriculture) in West Africa. MUST researchers were particularly involved in developing and assessing speech recognition and speech synthesis in two of Mali’s languages, namely Bambara and Bomu.

The Lwazi Project

The Lwazi project, funded by the South African government, was a large-scale project to develop speech technologies, and their applications, for South Africa’s eleven official languages. This project was carried out under the leadership of the CSIR Meraka Institute and produced a wide range of open-source tools, public-domain resources and generally accessible applications. MUST and the CSIR Meraka Institute also collaborated on resource development for the National Centre for Human Language Technologies (NCHLT) in developing extensive, high-quality speech resources for public use. Many of these resources are now available for public download from the Resource Management Agency (RMA).