ABSTRACT

Chapter 2 discussed the biological concepts that microarrays exploit. These biological mechanisms were explained in broad terms, along with a detailed description of their technological manifestations, to give an understanding of the basics of complementary deoxyribonucleic acid (cDNA) microarray systems. As discussed in Section 2.5, our proposed Copasetic Microarray Analysis (CMA) framework attempts to operate in a truly blind manner. This means that, excluding the failsafe measures of the Image Layout (IL) and Image Structure (IS) components, the framework operates without manual assistance. The framework should perform the analysis tasks such that its results are comparable to those of the GenePix type process discussed in Section 2.4. However, due to the high signal variability seen across the test dataset, working directly with the raw microarray imagery to identify gene spots can be counterproductive at this early stage. Therefore, rather than use the raw image, it is suggested that producing views of the image data that emphasize certain frequencies or regions of interest would be not only advantageous, but also more effective in terms of the overall goal.

This multi-view analysis process is managed by applying the Image Transformation Engine (ITE) component to the microarray image data. In the current implementation, the ITE component generates an image designed to enhance the positions of the gene spots. Although this positional information is partial in some areas of the newly generated image surface, it is sufficient for identification purposes at this stage.

As was highlighted in Chapter 2, the CMA framework's processes are designed such that the need for manual intervention is minimized. This design constraint, however, creates a non-trivial search problem. Each microarray image contains the gene spots for its particular experimental run. However, due to the nature of the biological process and the requirement that no prior domain knowledge be used, the gene spot positions must be determined algorithmically. For the gene spot addressing stage to return appropriate results, the raw microarray image should first be pre-processed to remove as much artifact noise as possible. The noise removal process should function such that the gene spots in the image are clarified while the background noise and other artifacts are either left as they are or reduced in severity. Once this so-called enhancement process is complete, later stages of the framework can use the new surface information to begin building up a more accurate picture of the gene spot locations.

The aim of the chapter is to explain how the proposed enhancement process functions. In the context of the framework as highlighted in Section 2.5, this process of gene spot enhancement is encapsulated in the ITE component, as seen in Figure 3.1.
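The enhancement goal described above, clarifying gene spots while leaving or reducing the background, can be illustrated with a small, generic example. The sketch below is not the ITE component's actual method, which this section does not specify. It assumes a simple local-median background subtraction as a stand-in, and the names `local_median`, `enhance_spots` and `radius`, as well as the toy image, are hypothetical.

```python
from statistics import median


def local_median(image, row, col, radius):
    """Median of the square window centred on (row, col), clipped at the image edges."""
    rows, cols = len(image), len(image[0])
    window = [
        image[r][c]
        for r in range(max(0, row - radius), min(rows, row + radius + 1))
        for c in range(max(0, col - radius), min(cols, col + radius + 1))
    ]
    return median(window)


def enhance_spots(image, radius=2):
    """Subtract a local-median background estimate and clip negatives to zero.

    Assumption: this is a generic stand-in for an enhancement step,
    not the ITE transformation itself.
    """
    return [
        [
            max(0.0, image[r][c] - local_median(image, r, c, radius))
            for c in range(len(image[0]))
        ]
        for r in range(len(image))
    ]


# Toy 5x5 "image": a gently sloping background with one bright spot at the centre.
toy = [[10.0 + c for c in range(5)] for _ in range(5)]
toy[2][2] += 100.0

enhanced = enhance_spots(toy, radius=1)
```

In this toy case the sloping background is largely flattened while the bright centre pixel keeps its contrast, which is the kind of behaviour the enhancement stage is meant to achieve. A real microarray image would need a window larger than a gene spot so that the median tracks the background rather than the spot itself.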