Title

Keynote 7: Speech over simple phones: the internet of the orals

Abstract/Description

The Internet is a gateway to information and connectivity for literate, affluent, and tech-enabled people. Unfortunately, with 4.4 billion individuals offline, more than half of the world remains deprived of this facility. Dubbed "oral", such populations include people who are low-literate, low-income, tech-naïve, visually impaired, linguistically and socially marginalized, geographically remote, or native speakers of unwritten languages. An attribute common to 73% of these people is that they have access to some kind of mobile phone. Recent years have seen a growth in speech interfaces available over simple mobile phones, in the form of spoken dialog systems and Interactive Voice Response (IVR) systems, to provide information and connectivity to such populations. However, speech over phones comes with challenges of its own, including user training, motivation, spread, large-scale content moderation, and inclusiveness towards marginalized populations. Another major challenge is that modern speech and natural language processing techniques mostly exclude the languages of developing regions. My talk will focus on recent advances that employ entertainment to overcome these obstacles, using it as a viral conduit to spread development-related information to under-served populations. This enables speech-based telephone services to serve as vehicles for large-scale dissemination of development-related information, for pulling in information through real-time surveys, for performing randomized controlled trials and demographic studies, and for measuring the knowledge impact of information campaigns at scale and in real time. Such services are also being used to provide voice-based social connectivity to the target populations, giving them a voice and a social identity, and to rapidly gather large spontaneous speech corpora for the local languages of developing regions. Such data enables the development of speech recognition, spoken-term detection, speaker ID, and noise classification systems (among others) for such languages.

Location

Lecture Hall A (Aman Tower, 12th floor)

Session Theme

Keynote Session V

Session Type

Keynote Speech

Start Date

17-11-2019 11:20 AM

End Date

17-11-2019 12:00 PM
