ABSTRACT

Another difference between the access patterns of clinical data and research data lies in the inherent unpredictability of how data are discovered and used in research. Unlike in clinical operations, where access is very well defined, research data often have to be browsed in novel ways to devise a hypothesis. Such browsing requires executing novel queries, ideally with on-demand integration of different data modalities. For example, a researcher developing new algorithms to help craft tighter dose margins will want to explore the planning computed tomography (CT), the radiotherapy (RT) dose and structures, patient diagnosis, and outcome data. To test their algorithms, the researcher may want to retrieve RT structures, and possibly the associated planning CT, for a particular diagnosis, dose, and outcome. Most existing systems will have a hard time executing such a query, because some of the queryable attributes are in the electronic medical records (diagnosis and outcome) and an RT PACS (dose), whereas the objects of interest are in an RT PACS (structure) and a radiology PACS (planning CT). From these examples, it becomes evident that existing clinical systems fall short when it comes to research use cases. Next, we explore some specific research data management systems, the benefits of a service-oriented architecture in facilitating data federation, and finally information security.
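
To make the dose-margin example concrete, the cross-system query can be sketched as a join over three independent stores. This is only an illustrative sketch: the in-memory records, the field names (`patient_id`, `dose_gy`, `structure_uid`, `planning_ct_uid`), and the function `federated_query` are all hypothetical stand-ins, not the interface of any real EMR or PACS.

```python
# Hypothetical stand-ins for three separate clinical systems.
# Real systems would each be reached through their own interface.
EMR = [
    {"patient_id": "P1", "diagnosis": "lung", "outcome": "local control"},
    {"patient_id": "P2", "diagnosis": "lung", "outcome": "recurrence"},
    {"patient_id": "P3", "diagnosis": "prostate", "outcome": "local control"},
]

# Assumed layout: the RT PACS holds the dose, the structure set, and a
# reference to the planning CT used for that plan.
RT_PACS = {
    "P1": {"dose_gy": 60.0, "structure_uid": "RS.1", "planning_ct_uid": "CT.1"},
    "P2": {"dose_gy": 45.0, "structure_uid": "RS.2", "planning_ct_uid": "CT.2"},
    "P3": {"dose_gy": 78.0, "structure_uid": "RS.3", "planning_ct_uid": "CT.3"},
}

# The radiology PACS holds the planning CT objects themselves.
RADIOLOGY_PACS = {
    "CT.1": "planning CT object for P1",
    "CT.2": "planning CT object for P2",
    "CT.3": "planning CT object for P3",
}


def federated_query(diagnosis, outcome, min_dose_gy):
    """Return RT structures and planning CTs for patients matching
    attributes that live in different systems."""
    results = []
    for record in EMR:
        # Step 1: filter on attributes held by the EMR.
        if record["diagnosis"] != diagnosis or record["outcome"] != outcome:
            continue
        # Step 2: filter on the dose attribute held by the RT PACS.
        rt = RT_PACS.get(record["patient_id"])
        if rt is None or rt["dose_gy"] < min_dose_gy:
            continue
        # Step 3: fetch the objects of interest from the RT and radiology PACS.
        results.append({
            "patient_id": record["patient_id"],
            "structure_uid": rt["structure_uid"],
            "planning_ct": RADIOLOGY_PACS.get(rt["planning_ct_uid"]),
        })
    return results


matches = federated_query("lung", "local control", min_dose_gy=50.0)
```

Even in this toy form, answering one research question takes three lookups against three stores keyed in different ways, which is the kind of integration that systems built around fixed clinical workflows are not designed to perform on demand.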