ABSTRACT

Studies of twins have been used to establish the importance of both genetic and environmental factors in the development of human traits and disease. A key requirement for twin studies is an accurate and efficient method for distinguishing identical (monozygotic, MZ) twin pairs from fraternal (dizygotic, DZ) pairs. This chapter compares three methods for combining self-report items used to classify pair zygosity: a clinical algorithm, logistic regression, and classification trees fit with the R program RPART (Therneau & Atkinson, 1997). These methods were applied to 16 self-report items on physical similarity, childhood environmental similarity, height, and weight, obtained from 554 adult twin pairs from the Virginia Adult Twin Study of Psychiatric and Substance Use Disorders (Kendler & Prescott, 2006). The classification accuracy of each method was evaluated against zygosity obtained by genotyping. All three methods obtained high levels of accuracy, ranging from 90% to 93%. The three items contributing most to accurate classification were: the twins’ report of how often as children they were mistaken for each other by strangers; whether they were “as alike as two peas in a pod”; and their opinion about whether they were identical. Items on environmental treatment, height, and weight contributed little to the solutions, although the classification tree solution specific to females did use height and weight differences between twins. Overall, the results indicate that the items commonly used in twin studies to assign zygosity give accurate assignment for the majority of twin pairs assessed as adults. If replicated, the algorithms based on classification trees may provide some additional accuracy for pairs that cannot otherwise be classified because of inconsistent responses to these items.