Library Automation and Digital Archive
LONTAR
Fakultas Ilmu Komputer
Universitas Indonesia

Call Number SEM-369
Collection Type Proceedings/Seminar Article Index
Title Named Entity Recognition in Bengali (pp. 9-12)
Author Asif Ekbal, Department of Computer Science and Engineering, Jadavpur University, Kolkata, India
Publisher Satellite Workshop on Language, Artificial Intelligence and Computer Science for Natural Language Processing Applications (LAICS-NLP), October 19, 2006
Subject
Location
Location: Perpustakaan Fakultas Ilmu Komputer (Faculty of Computer Science Library)
Call Number Collection ID Status
SEM-369 AVAILABLE
No reviews for this collection item: 45912
A tagged Bengali news corpus, developed from the web, has been used in this work for the recognition of named entities (NEs) in the Bengali language. A supervised learning method has been adopted to develop two different models of a named entity recognition (NER) system, one (Model A) without using any linguistic features and the other (Model B) incorporating linguistic features. The different tags in the news corpus help to identify the seed data. The training corpus is initially tagged against the different seed data, and a lexical contextual seed pattern is generated for each tag. The entire training corpus is shallow parsed to identify the occurrences of these initial seed patterns. In a position where the context or a part of a seed pattern matches, the system predicts the boundary of a named entity, and further patterns are generated through bootstrapping. Patterns that occur in the entire training corpus above a certain threshold frequency are considered the final set of patterns learnt from the training corpus. The test corpus is shallow parsed to identify the occurrences of these patterns and estimate the named entities. The models have been tested with two news documents (gold standard test sets), and their results have been compared in terms of evaluation parameters.
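
The bootstrapping loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes single-token entities, treats a pattern as the pair (previous word, next word), requires a full pattern match where the abstract also allows partial matches, and uses a toy English corpus. The function names, the `min_freq` threshold and the `max_rounds` limit are all hypothetical.

```python
from collections import Counter

def contexts(tokens):
    """Yield (word, (left_word, right_word)) for every interior token."""
    for i in range(1, len(tokens) - 1):
        yield tokens[i], (tokens[i - 1], tokens[i + 1])

def count_patterns(sentences, entities):
    """Count the contexts in which the currently known entities occur."""
    return Counter(
        ctx for sent in sentences for word, ctx in contexts(sent) if word in entities
    )

def learn_patterns(sentences, seeds, min_freq=2, max_rounds=5):
    """Bootstrap context patterns from seed entities (assumed simplification)."""
    entities = set(seeds)
    for _ in range(max_rounds):
        patterns = set(count_patterns(sentences, entities))
        # Any word appearing in a known context is predicted to be an entity.
        found = {w for sent in sentences for w, ctx in contexts(sent) if ctx in patterns}
        if found <= entities:  # nothing new was learnt, so stop
            break
        entities |= found
    # Keep only patterns that reach the frequency threshold in the whole corpus.
    counts = count_patterns(sentences, entities)
    return {p for p, n in counts.items() if n >= min_freq}

def tag(tokens, patterns):
    """Return the tokens of a test sentence whose context matches a learnt pattern."""
    return [w for w, ctx in contexts(tokens) if ctx in patterns]

# Toy training corpus (invented for illustration only).
train = [
    ["rain", "in", "Kolkata", "today"],
    ["rain", "in", "Dhaka", "today"],
    ["he", "visited", "Dhaka", "yesterday"],
    ["she", "visited", "Delhi", "yesterday"],
    ["minister", "of", "Delhi", "said"],
]
patterns = learn_patterns(train, seeds={"Kolkata"})
print(tag(["flood", "in", "Chennai", "today"], patterns))
```

Starting from the single seed "Kolkata", the loop first learns the context ("in", "today"), which exposes "Dhaka"; "Dhaka" in turn yields ("visited", "yesterday"), which exposes "Delhi". The context ("of", "said") occurs only once and is dropped by the threshold.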