Data mining is a core process in KDD (Knowledge Discovery in Databases) that extracts knowledge from large databases. One data mining method is classification, which searches for a model that can distinguish between class labels. The Bayesian network is one technique that can be used to build a classification model. A Bayesian network consists of two components: a DAG structure that describes the causal relationships among data attributes, and a table of conditional probabilities for each attribute given its preceding attributes. Many algorithms have been developed to construct the Bayesian network structure, either from complete databases or from incomplete databases (those containing missing values). The CB* algorithm, which combines two approaches, dependency analysis and search-and-scoring, is one algorithm for constructing a Bayesian network structure from incomplete databases. The algorithm consists of two phases: phase one produces a node ordering, and phase two constructs the DAG structure of the Bayesian network. The objective of this research is to evaluate the CB* algorithm in terms of its functionality: its ability to generate a node ordering that yields a structure Markov equivalent to the original structure, and its ability to construct a Bayesian network from incomplete databases. The amount of missing values has no influence on the resulting Bayesian network structure. The research also analyzes the capability of the algorithm to construct a Bayesian network structure without prior information. Based on the experiments, the algorithm works well for these tasks. The main finding of this research is that, no matter how many missing values occur in the database, the CB* algorithm can still produce a Bayesian network structure.
Keywords: Data Mining, Classification, Bayesian Network, Missing Value, Node Ordering
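
As an illustration of the Markov equivalence criterion mentioned in the abstract, the following is a minimal sketch, not the CB* algorithm itself. It relies on the standard characterization that two DAGs over the same nodes are Markov equivalent when they share the same skeleton and the same v-structures. The function names and the edge-list representation are assumptions made for this example, and nodes without any edges are not represented.

```python
from itertools import combinations

# Hypothetical representation: a DAG is a list of (parent, child) pairs.

def skeleton(edges):
    """Return the undirected version of the graph as a set of node pairs."""
    return {frozenset(edge) for edge in edges}

def v_structures(edges):
    """Return all colliders a -> c <- b where a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pars in parents.items():
        for a, b in combinations(sorted(pars), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found

def markov_equivalent(edges1, edges2):
    """Same skeleton and same v-structures imply Markov equivalence."""
    return (skeleton(edges1) == skeleton(edges2)
            and v_structures(edges1) == v_structures(edges2))

# Illustrative graphs (not taken from the experiments):
chain = [("A", "B"), ("B", "C")]           # A -> B -> C
reversed_chain = [("B", "A"), ("C", "B")]  # A <- B <- C
collider = [("A", "B"), ("C", "B")]        # A -> B <- C

print(markov_equivalent(chain, reversed_chain))
print(markov_equivalent(chain, collider))
```

The two chains encode the same conditional independencies and differ only in edge direction, so a learned structure matching either one would count as recovering the original. The collider introduces a v-structure and therefore falls in a different equivalence class.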