Library Automation and Digital Archive
LONTAR
Fakultas Ilmu Komputer
Universitas Indonesia

Call Number SEM-372
Collection Type Proceedings/Seminar Article Index
Title Indonesian Audio-Visual Speech Corpus for Multimedia Automatic Speech Recognition, pp. 381-386
Author Muhammad Rizky Aulia Rahman Maulana, Mohamad Ivan Fanany
Publisher ICACSIS 2017 International Conference on Advanced Computer Science and Information System.
Subject
Location Perpustakaan Fakultas Ilmu Komputer (Faculty of Computer Science Library)
Call Number Collection ID Status
SEM-372 AVAILABLE
No reviews for this collection: 47384
Abstract: Advancement of automatic speech recognition (ASR) relies heavily on the availability of data, even more so for deep learning ASR systems, which are at the forefront of the field. Corpora built to accommodate this need range from single-modal corpora, with several exceptions for visual speech decoding, to multimodal corpora, which provide both modalities. Speech is inherently multimodal in the first place. Despite this importance, no such corpus has been built for the Indonesian language, resulting in little to no development of visual-only or multimodal ASR systems. This research attempts to solve that problem by constructing AVID, an Indonesian audio-visual speech corpus for multimodal ASR. The corpus consists of 10 speakers each speaking 1,040 sentences with a simple structure, resulting in 10,400 videos of spoken sentences. To the best of our knowledge, AVID is the first audio-visual speech corpus for the Indonesian language designed for multimodal ASR. AVID was heavily tested and contains low overall errors in both modality tests, which indicates the high quality and suitability of the corpus for building multimodal ASR systems.