Library Automation and Digital Archive
LONTAR
Fakultas Ilmu Komputer
Universitas Indonesia


Call Number SEM-372
Collection Type Article Index, Proceedings/Seminar
Title Sentence-level Indonesian Lip Reading with Spatiotemporal CNN and Gated RNN, pp. 375-380
Author Muhammad Rizky Aulia Rahman Maulana, Mohamad Ivan Fanany
Publisher ICACSIS 2017 International Conference on Advanced Computer Science and Information Systems
Subject
Location
Location: Perpustakaan Fakultas Ilmu Komputer (Faculty of Computer Science Library)
Call Number    Collection ID    Status
SEM-372                         AVAILABLE
No reviews for this collection item: 47380
Abstract- It is widely known that visual cues play an important role in speech, especially in disambiguating confusable phonemes or as a means of "hearing" visually. Interpreting speech through the visual signal alone is called lip reading. Lip reading has several potential applications, either as a complementary modality to speech recognition or as purely visual speech recognition. The latter gives rise to silent interfaces, which themselves have numerous practical applications. Despite the overwhelming potential of such systems, research on lip reading for the Indonesian language has been extremely limited, with settings still very distant from the real world. This research is an attempt to build a lip reading model that supports variable-length sentences as input. We build the model using deep learning, specifically a spatiotemporal Convolutional Neural Network (CNN) and a Gated Recurrent Unit (GRU), which form the spatiotemporal feature extractor and the character-level sentence decoder, respectively. During the process, we also investigate whether knowledge of lip reading in another language affects the acquisition of a different language. To the best of our knowledge, our model is the first sentence-level Indonesian lip reading model that supports variable-length input. Our model achieved superhuman performance on all metrics, with almost 2x better word accuracy.
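
The abstract reports word accuracy without defining it. As a minimal sketch, assuming word accuracy is computed as one minus the word error rate (a common convention, not confirmed by this record), the metric could be evaluated as below. The function names and the sample sentences are hypothetical and do not come from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Assumed definition: 1 - WER, where WER = word edits / reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    wer = edit_distance(ref_words, hyp_words) / len(ref_words)
    return 1.0 - wer


# Hypothetical example: one substituted word out of four.
score = word_accuracy("saya makan nasi goreng", "saya minum nasi goreng")
```

Under this definition the score can fall below zero when the hypothesis contains many inserted words, so it is not strictly a proportion.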