Owing to the spread of digital libraries and digital cameras, and to individuals' increasing access to the WWW, the number of digital images now in existence poses a great challenge. Easy access to such collections requires an index that facilitates random access to individual images and navigation among them. Because these images are not annotated or associated with descriptions, existing systems represent them by their extracted low-level features. In this paper, we demonstrate two image mining tasks, namely image classification and image clustering, which are preliminary steps in facilitating navigation and indexing. Both tasks are based on extracting the color distribution of each image and representing that distribution as a time series. To make the representation more effective and efficient for the data mining tasks, we represent the time series using a new representation called SAX (Symbolic Aggregate approXimation) (14). The SAX-based representation is effective because it reduces dimensionality and provides a lower bound on the distance measure. Our experiments demonstrate the feasibility of the approach.
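As a rough illustration of the pipeline described above, the following Python sketch turns an image's color distribution into a series and then into a SAX word. It is a minimal sketch under stated assumptions, not the implementation used in the paper: the function names (`color_histogram`, `znormalize`, `paa`, `sax`), the choice of concatenated per-channel RGB histograms, the bin count, and the alphabet are all hypothetical. The SAX steps themselves (z-normalization, piecewise aggregate approximation, and discretization with equiprobable Gaussian breakpoints) follow the standard description of SAX.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def color_histogram(pixels, bins=16):
    """Concatenate per-channel histograms of 8-bit RGB pixels into one series.

    Assumption: the paper's exact color-distribution extraction is not given
    here, so this simply counts pixel values per channel in equal-width bins.
    """
    series = [0] * (3 * bins)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            series[channel * bins + value * bins // 256] += 1
    return series


def znormalize(series):
    """Rescale a series to zero mean and unit standard deviation."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0] * len(series)
    return [(x - mu) / sigma for x in series]


def paa(series, segments):
    """Piecewise Aggregate Approximation: the mean of each equal-width segment."""
    if len(series) % segments != 0:
        raise ValueError("series length must be a multiple of segments")
    width = len(series) // segments
    return [mean(series[i * width:(i + 1) * width]) for i in range(segments)]


def sax(series, segments, alphabet="abcd"):
    """Map a series to a SAX word of length `segments` over `alphabet`."""
    size = len(alphabet)
    # Breakpoints split the standard normal curve into equiprobable regions.
    breakpoints = [NormalDist().inv_cdf(k / size) for k in range(1, size)]
    reduced = paa(znormalize(series), segments)
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in reduced)
```

For example, `sax([0, 0, 0, 0, 10, 10, 10, 10], 2, "ab")` returns `"ab"`: the first half of the series falls below the single breakpoint at zero and the second half above it. A 48-value histogram reduced to an 8-symbol word in this way shows the dimensionality reduction the abstract refers to.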