ABSTRACT

Over the last few years, cloud computing has become a popular option for data analysis for several reasons. Modern technologies can produce, capture and store large amounts of data. However, neither data analysis frameworks nor computational power have developed at the same speed, so not all of these data can be analysed. There is currently increasing interest in using High Performance Computing (HPC) tools (e.g. computer clusters, parallelization of data analysis programs, etc.) to handle big data structures. These HPC tools can be run on a physically available computer cluster. However, powerful hardware is expensive to buy and maintain. In general, there are two scenarios: the analysis requires either a single machine with a certain amount of storage, memory, etc., or a cluster of multiple machines with certain specifications. Cloud computing is another option, in which publicly available computational resources are used to perform the data analysis. In this chapter, we illustrate the usage

Figure 22.1 Setting up the biclust package in Amazon Web Services.