ABSTRACT

Since the genetic contribution to variance in the risk of breast cancer cannot be explained by mutations in known susceptibility genes, a number of investigators reasoned that common variants could be responsible (the common disease-common variant hypothesis). The development of technologies that rapidly genotype large numbers of individuals at large numbers of loci gave rise to the era of genome-wide association studies (GWAS), which were intended to test this hypothesis and define the responsible variants. GWAS are essentially case-control studies that compare the frequencies of hundreds of thousands to millions of different genetic variants in affected and unaffected individuals (5). The approach has proven to be enormously successful, and as of the second quarter of 2011 there were 1449 published associations for 237 traits, including breast cancer (https://www.genome.gov/gwastudies/; accessed April 14, 2012).

Nonetheless, there are significant limitations to GWAS. First, the number of comparisons requires a high level of stringency in reporting statistical significance, to avoid false discovery. For discovery studies, a P value of 10^-7 or lower is usually required (6), and even at this level a number of loci that are significant in the first stage of a study do not replicate in subsequent validation phases. There are two implications of this. The first is that a number of loci that do influence risk may fail to rise to the level of significance required, and thus be missed. The second is that the loci that are identified may have associated odds ratios that are at the higher end of their actual effect sizes. This phenomenon is known as “the winner’s curse,” and has implications for the use of published odds ratios in risk prediction models. Second, GWAS are not able to identify associations with rare variants (minor allele frequencies less than 5%), even with large numbers of cases and controls. This has implications for the problem of “missing heritability,” to be discussed later. Third, it appears that the variants discovered through GWAS are not, in most cases, the causal variants that are actually responsible for the increase in risk. In fact, the exploration of mechanisms by which common
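
As a rough illustration of the case-control comparison and the stringency requirement described above, the following Python sketch computes an allelic odds ratio with an approximate confidence interval, together with a Bonferroni-style significance threshold. The allele counts are invented for illustration, the function names are hypothetical, and Bonferroni correction is only one common way of arriving at a threshold of this order; the source does not say how its threshold was derived.

```python
import math


def allelic_odds_ratio(case_alt, case_ref, control_alt, control_ref):
    """Odds of carrying the alternate allele in cases relative to controls."""
    return (case_alt * control_ref) / (case_ref * control_alt)


def odds_ratio_ci(case_alt, case_ref, control_alt, control_ref, z=1.96):
    """Approximate 95% confidence interval from the standard error of log(OR)."""
    log_or = math.log(
        allelic_odds_ratio(case_alt, case_ref, control_alt, control_ref)
    )
    se = math.sqrt(
        1 / case_alt + 1 / case_ref + 1 / control_alt + 1 / control_ref
    )
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


def bonferroni_threshold(alpha, n_tests):
    """Per-variant P value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# Hypothetical allele counts for a single variant (not taken from any study).
counts = dict(case_alt=1200, case_ref=2800, control_alt=1000, control_ref=3000)

odds_ratio = allelic_odds_ratio(**counts)
ci_low, ci_high = odds_ratio_ci(**counts)
# Assumed example: 500,000 variants tested at a family-wise alpha of 0.05.
threshold = bonferroni_threshold(0.05, 500_000)

print(f"OR = {odds_ratio:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
print(f"Per-variant P value threshold = {threshold:.1e}")
```

Under these assumed numbers, testing 500,000 variants at a family-wise error rate of 0.05 gives a per-variant threshold on the order of 10^-7, which is why a variant with a modest true odds ratio can easily fail to reach significance.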

and European-Americans from the United States. Two SNPs (rs13387042, 2q35 and rs3803662, 16q12) showed significant associations in all analyses. This study was particularly significant as it was the first to note that these variants were associated with estrogen receptor (ER) positive breast cancer, but not ER-negative cancer, suggesting that the influence of common variants on breast cancer risk could be subtype specific. A follow-up analysis from the Breast Cancer Association Consortium, however, found that rs13387042 was associated with increased risks of both ER-positive and ER-negative breast cancer (12). Stacey et al. also noted a signal from an SNP at 5p12 that did not reach genome-wide significance in their replication phases, but that was also somewhat correlated in GWAS from other groups. In a follow-up analysis of 5028 cases and 32,090 controls of European ancestry, the relevance of 5p12 SNPs rs4415084 and rs10941679 was demonstrated more definitively (13). Again, the association was confined to ER-positive breast cancer. In fact, a follow-up study by the Breast Cancer Association Consortium suggested that the association was largely with progesterone receptor positive cancers, with a stronger association with lower grade tumors (14). In this analysis, the group also evaluated the subtype specificity of rs1219648 in FGFR2 and found that the risk associated with this SNP was also confined to ER-positive cancer.