chapter  4
◾ Genome-Associated Data

Prediction of operon structure has traditionally been based on observing smaller-than-normal intergene gaps. Cotranscription also implies only one ribosomal binding site (Shine-Dalgarno pattern from Chapter 2), associated with the 5′-most gene, and often a stem-loop structure to terminate transcription at the 3′ end. Because these signals are very short patterns, they have relatively high false-positive and false-negative rates, so other techniques are needed to improve the accuracy of operon prediction. Predictions can be measured against data such as coexpression analysis (e.g., microarrays) or RNA evidence. The most accurate extant operon prediction software (~94%), which is built on STRING database scores (Taboada et al., 2010), uses intergenic distance plus functional clustering of genes based on protein interaction data.
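The traditional intergene-gap heuristic can be illustrated with a minimal sketch. It assumes a list of genes with known coordinates and strand, and groups neighboring same-strand genes whose gap falls at or below a cutoff. The `Gene` class, the `predict_operons` function, the example coordinates, and the 50 bp cutoff are all hypothetical choices made for illustration; they are not taken from Taboada et al. (2010) or any published tool, and a real predictor would combine this signal with the other evidence described above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Gene:
    name: str
    start: int   # 1-based inclusive start coordinate
    end: int     # inclusive end coordinate
    strand: str  # "+" or "-"


def predict_operons(genes: List[Gene], max_gap: int = 50) -> List[List[str]]:
    """Group adjacent same-strand genes whose intergenic gap is <= max_gap.

    max_gap (in base pairs) is an assumed, illustrative cutoff.
    Overlapping genes give a negative gap and are therefore grouped.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    operons: List[List[str]] = []
    prev: Optional[Gene] = None
    for gene in ordered:
        same_strand = prev is not None and gene.strand == prev.strand
        # Number of bases strictly between the two genes.
        if same_strand and gene.start - prev.end - 1 <= max_gap:
            operons[-1].append(gene.name)   # extend the current operon
        else:
            operons.append([gene.name])     # start a new transcription unit
        prev = gene
    return operons


# Hypothetical gene coordinates, invented for this example.
genes = [
    Gene("geneA", 100, 1000, "+"),
    Gene("geneB", 1012, 1900, "+"),   # 11 bp gap after geneA
    Gene("geneC", 1895, 2600, "+"),   # overlaps geneB
    Gene("geneD", 3000, 3900, "+"),   # 399 bp gap after geneC
    Gene("geneE", 3920, 4500, "-"),   # short gap, but opposite strand
]
operons = predict_operons(genes)
print(operons)
```

With these made-up coordinates, geneA, geneB, and geneC are grouped into one predicted operon, while geneD (large gap) and geneE (opposite strand) each stand alone. Raising `max_gap` merges more genes and lowering it splits them, which is one way to see why a gap cutoff alone trades false positives against false negatives.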