Giving computers a voice

(Illustration: Ray Oranges)

From Alexa and Siri to translation programs and computer-generated news, anything seems possible these days. The Media Technology Center is searching for applications that could lend a hand with day-to-day editorial work.

Every time you talk to Siri on your phone and ask a question or give a command, you are communicating with artificial intelligence. The only problem is that this intelligence has its limits. In fact, compared to human intelligence, Siri could even be described as fairly stupid, says Ryan Cotterell, a professor who has worked at ETH Zurich since February 2020. Appointed through the ETH media technology initiative as a Professor of Computer Science, Cotterell brings together linguistics, automated language processing and artificial intelligence. "The only reason Siri works is because people typically use very simple questions and commands when they speak to their phone," he says.

Cotterell insists that we shouldn’t expect the same from AI as we do from human intelligence. None of us have any trouble learning our native language, he says, and English speakers can intuitively spot grammatical mistakes in an English sentence. Yet computer programs still struggle to identify whether an English sentence is grammatically correct or not - and that’s because a language processing program works very differently to the human brain. "No translator has ever had to learn the sheer number of words we need to train a translation program," he says.

The Swiss German challenge

Modern translation programs learn using big data, honing their abilities with millions of pairs of sentences. Coming up with multiple alternatives for translating an individual sentence is a lot harder, however. Human translators can do it easily, but translation programs typically offer just one solution. Cotterell hopes to change that: "We want users to have multiple options rather than just being presented with one result. That would allow users to choose the best-fit sentence for each specific context." Developing a viable algorithm for this purpose is no easy task, he cautions.

A further challenge is creating translation programs and voice assistants for languages that are only used by relatively small numbers of people. "It’s very hard to develop a good system for languages that are low on data," says Cotterell. Hence his enthusiasm for a voice assistant program that speaks Swiss dialects, which was developed by the Media Technology Center (MTC) at ETH Zurich.

This is a truly remarkable achievement, not only because there are so many regional variants of Swiss dialect, but also because these languages lack a standardised form of spelling. The MTC’s voice assistant has been fluent in a Bernese dialect called "Bärndütsch" since 2019, and further dialects are now in the pipeline. To develop their Swiss German assistant, researchers partnered with Swiss Radio and Television (SRF). The benefit of technologies that translate standard German into Swiss German or read local news and weather in specific dialects is their ability to provide regional authenticity - even when automatically converting text to speech.

A computer-generated media experience

The only caveat is that the same methods could also be used to generate filter bubbles and fake news. Earlier this summer, news headlines were dominated by cutting-edge language-processing AI from the Californian company OpenAI. Known as GPT-3, this massive language model overshadows everything that has come before. "The dimensions are so huge that it would be impossible for universities to build or even test it," says Cotterell. One of the reasons the system attracted so much attention was the potential risk of AI-generated fake news. Given just a few sample news items, GPT-3 can generate plausible news stories in English. It looks like Ryan Cotterell and his fellow researchers at the Media Technology Center still have plenty of work ahead of them.

Support for innovative Swiss Media Centre

Ryan Cotterell serves as an Academia Expert at ETH Zurich’s Media Technology Center (MTC). Support for the professorship and the Center comes from the media companies Ringier, TX Group (formerly known as Tamedia), SRG SSR and NZZ, the Swiss media association VSM and other partners.

www.ethz-foundation.ch/en/media-technology

Martina Märki