ABSTRACT

The efficiency and effectiveness of document retrieval, whether driven by a topic or by a user query, can be improved by clustering similar documents, by applying dimensionality reduction techniques, and by incorporating parallel strategies. This paper explores unsupervised learning with neural network-based clustering algorithms, several feature reduction methods, and the application of NoW (Network of Workstations) architectures, a kind of low-cost parallel architecture. Our experiments on six different corpora show that both the computational cost and the lexicon size can be reduced without degrading overall retrieval performance.