Introduction

doi:10.1201/9781315373966-7

ABSTRACT

Big, high-dimensional data with complex structure have been emerging steadily and rapidly over the last few years. Data analysis faces the challenge of discovering meaningful patterns in this ocean of data. In this book we focus on a relatively new data analysis method: biclustering (Cheng and Church, 2000). Cluster analysis aims to group the variables in a data matrix that share a global pattern in the data; biclustering, in contrast, is designed to detect local patterns in the data.

For example, let us assume that the data matrix consists of supermarket clients (the columns) and products (the rows). The matrix has an entry equal to one at the intersection of a client column and a product row if this client bought this product in a single visit; otherwise the entry is zero. In cluster analysis, we aim to find a group of clients that behave in the same way across all products. Using biclustering methods, we aim to find a group of clients that behave in the same way across a subset of products. Thus, our focus shifts from a global pattern in the data matrix (similar behaviour across all rows) to a local pattern (similar behaviour across a subset of rows).
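The supermarket example can be made concrete with a small sketch. The matrix values, the product and client labels, and the helper `behave_alike` are all hypothetical and chosen only for illustration. The code does not search for biclusters; it only checks whether a given group of clients agrees on a given set of products, which is enough to show the difference between a global and a local pattern.

```python
# Hypothetical 0/1 purchase matrix: rows are products, columns are clients.
products = ["bread", "milk", "eggs", "wine", "soap"]
clients = ["A", "B", "C", "D"]
purchases = [
    [1, 1, 1, 0],  # bread
    [1, 1, 1, 1],  # milk
    [0, 1, 0, 1],  # eggs
    [1, 0, 0, 0],  # wine
    [0, 0, 1, 1],  # soap
]


def behave_alike(matrix, client_cols, product_rows):
    """Return True if the chosen clients have identical entries
    in every one of the chosen product rows."""
    return all(
        len({matrix[r][c] for c in client_cols}) == 1
        for r in product_rows
    )


group = [0, 1, 2]  # clients A, B and C

# Global pattern: do the clients agree across ALL products?
global_match = behave_alike(purchases, group, range(len(products)))

# Local pattern: do they agree across a SUBSET of products (bread, milk)?
local_match = behave_alike(purchases, group, [0, 1])

print("same behaviour across all products:", global_match)
print("same behaviour across bread and milk:", local_match)
```

In this toy matrix, clients A, B and C do not form a cluster in the global sense, because they disagree on eggs, wine and soap, yet together with the products bread and milk they form a local pattern of the kind biclustering is designed to detect.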