In 1998, we reported the findings from a three-year collaboration to implement and evaluate the GenScope™ software and associated curriculum in 20 secondary science classrooms and 4 comparison classrooms (Hickey, Kindfield, Wolfe, & Heidenberg, 1998). This paper provides a more comprehensive summary and interpretation of that research and presents a follow-up study that addressed four issues left unresolved in the prior research. While GenScope was shown to be as effective as or more effective than conventional curricula for enhancing genetics reasoning skills, these issues may have diminished the relative gains for the GenScope classrooms and clouded interpretation of the results. The first issue concerned the challenges students faced when independently completing the GenScope activities in the school computer lab (a problem that was exacerbated by the demise of Macintosh computers in secondary schools). The second, related issue concerned the difficulty of identifying "fair" comparison classrooms in within-teacher contrasts because of "carryover" from the GenScope curriculum into comparison classrooms. The third issue concerned the need for further refinement and organization of the GenScope curriculum. The fourth issue concerned the impact of one key aspect of that curriculum, a set of formative assessments known as Dragon Investigations, designed to use the familiar GenScope dragons to scaffold the kind of reasoning assessed in our NewWorm assessment. While the Dragon Investigations were shown to increase performance substantially, we had yet to show that they were supporting genuine gains in domain reasoning rather than compromising the evidential validity of our summative assessment.