Projection Techniques for Document Maps

This thesis compares vector space projection techniques for creating two-dimensional maps from collections of text documents. I describe the process that leads from raw text to similarity maps and present a working prototype implementation.

September 2005
University of Osnabrück
Supervisors: Dr. Petra Ludewig, Dr. habil. Helmar Gust

Document mapping is a recently developed sub-discipline of interactive information visualization. The core idea is to combine traditional cartographic techniques with today’s possibilities for automated text analysis in an interactive interface. The hypothesis is that presenting text documents in this way lets users quickly perceive how similar their contents are. Document maps can therefore be a valuable addition to traditional browsing and search methods.

Numerous techniques for creating these maps have been developed, such as cluster visualization and document networks. This work focuses on methods that compute a coordinate configuration in two-dimensional space so that inter-document similarities are expressed through spatial proximity. Some of these techniques have been implemented in the ASADO system, which is described at the end of this work.
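
To make the idea of a coordinate configuration concrete, the following is a minimal sketch of one well-known projection technique, classical multidimensional scaling, applied to a toy term-count matrix. It is an illustration only and not necessarily one of the methods implemented in ASADO; the function names, the use of cosine distance, and the example data are all assumptions made for this sketch.

```python
import numpy as np

def cosine_distance_matrix(X):
    """Pairwise cosine distances (1 - cosine similarity) between rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0          # avoid division by zero for empty documents
    U = X / norms
    S = np.clip(U @ U.T, -1.0, 1.0)  # guard against rounding slightly outside [-1, 1]
    return 1.0 - S

def classical_mds(D, dims=2):
    """Classical MDS: embed n items in `dims` dimensions from a distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # double-centered squared distances
    eigvals, eigvecs = np.linalg.eigh(B)      # B is symmetric
    order = np.argsort(eigvals)[::-1][:dims]  # indices of the largest eigenvalues
    lam = np.clip(eigvals[order], 0.0, None)  # drop negative eigenvalues (non-Euclidean input)
    return eigvecs[:, order] * np.sqrt(lam)

# Hypothetical toy data: rows are documents, columns are term counts.
# Documents 0 and 1 share vocabulary, as do documents 2 and 3.
docs = np.array([
    [2, 1, 0, 0],
    [3, 1, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 2, 3],
], dtype=float)

coords = classical_mds(cosine_distance_matrix(docs))
print(coords)  # one (x, y) map position per document
```

In the resulting configuration, documents with similar term profiles should end up close together on the map, while documents without shared terms are placed far apart.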