A modern digital library should provide effective, integrated access to disparate information sources. The extraction of semi-structured or structured information (for instance, in XML format) from free text is therefore one of the great challenges in its realization. The work presented here is part of a wider initiative aimed at the design and development of tools and techniques for an Indonesian digital library. In this paper, we present the blueprint of our research on information extraction for the Indonesian language. We report the first results of our experiments on named entity recognition and co-reference resolution.