Call Number | SEM-219 |
Collection Type | Index of Proceedings/Seminar Articles |
Title | IMPLEMENTATION OF EFFECTIVE FOCUSED WEB CRAWLER |
Author | Ivan M. Siregar, Tjong Wan Sen, Ramero Forester Carlo |
Publisher | Seminar Nasional Informatika 2012 (SNIF) |
Subject | web crawler |
Location |
Call Number | Collection ID | Status |
---|---|---|
SEM-219 | Available |
A web crawler, also called a web spider, is the search-engine component that traverses the World Wide Web and downloads web pages related to the topics being searched. While crawling, however, it is difficult to handle irrelevant pages and to predict which links will lead to high-quality pages. Because accurate information is now essential to daily life, developing an effective web crawler or search engine is a significant responsibility. This paper implements an effective focused crawling method in a web crawler so that the crawler returns more accurate results for a particular topic. The method checks the similarity of each web page to the topic keywords using a similarity function, and it calculates the priority of each extracted link from the link's metadata. The method is also designed to traverse through irrelevant pages found during the crawl, which improves coverage of the specific topic. Finally, it reduces crawling time because it follows links selectively toward relevant pages rather than exploring all regions of the web.
Keywords: web crawler
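
The abstract names a similarity function and a metadata-based link priority but does not define either, so the following is only a minimal sketch of how such a focused crawler could be organised. It assumes cosine similarity over term frequencies, an equal-weight blend of link-metadata similarity and parent-page relevance, and a best-first frontier; the function names, the 0.5 weights, the threshold, and the in-memory `TOY_WEB` are all hypothetical and are not taken from the paper.

```python
import heapq
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text, topic_keywords):
    """Cosine similarity between term-frequency vectors (assumed similarity function)."""
    a = Counter(tokenize(text))
    b = Counter(tokenize(" ".join(topic_keywords)))
    dot = sum(a[t] * b[t] for t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def link_priority(anchor_text, url, parent_relevance, topic_keywords):
    """Hypothetical priority: equal-weight blend of link metadata score and parent relevance."""
    meta_score = cosine_similarity(anchor_text + " " + url, topic_keywords)
    return 0.5 * meta_score + 0.5 * parent_relevance


def focused_crawl(fetch, seed_urls, topic_keywords, threshold=0.1, max_pages=50):
    """Best-first crawl. fetch(url) must return (page_text, [(link_url, anchor_text), ...])."""
    # heapq is a min-heap, so priorities are stored negated to pop the best link first.
    frontier = [(-1.0, url) for url in seed_urls]
    heapq.heapify(frontier)
    seen = set(seed_urls)
    relevant = []
    visited = 0
    while frontier and visited < max_pages:
        _, url = heapq.heappop(frontier)
        text, links = fetch(url)
        visited += 1
        relevance = cosine_similarity(text, topic_keywords)
        if relevance >= threshold:
            relevant.append((url, relevance))
        # Links on irrelevant pages are still queued, only with a lower priority,
        # so the crawl can pass through them to reach relevant pages.
        for link_url, anchor in links:
            if link_url not in seen:
                seen.add(link_url)
                priority = link_priority(anchor, link_url, relevance, topic_keywords)
                heapq.heappush(frontier, (-priority, link_url))
    return relevant


# Toy in-memory "web" standing in for real HTTP fetches (illustrative data only).
TOY_WEB = {
    "http://example.test/a": ("web crawler index search",
                              [("http://example.test/b", "cooking recipes"),
                               ("http://example.test/c", "focused web crawler")]),
    "http://example.test/b": ("pasta and soup recipes",
                              [("http://example.test/d", "crawler frontier")]),
    "http://example.test/c": ("focused crawler ranks links", []),
    "http://example.test/d": ("web crawler frontier queue", []),
}

result = focused_crawl(TOY_WEB.__getitem__, ["http://example.test/a"], ["web", "crawler"])
relevant_urls = [url for url, _ in result]
```

In this toy run, page `b` is off-topic and is not reported, yet its outgoing link is still followed at low priority, which is how page `d` is reached. A real crawler would replace `TOY_WEB.__getitem__` with an HTTP fetcher and HTML link extractor.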