ABSTRACT

Data integration in large enterprises is a crucial but also costly, long-lasting, and challenging problem. While business-critical information is often already gathered in integrated information systems such as ERP, CRM, and SCM systems, the integration of these systems themselves, as well as their integration with the abundance of other information sources, is still a major challenge. Large companies often operate hundreds or even thousands of different information systems and databases. This is especially true for large OEMs. For example, it is estimated that approximately 5000 different information systems are deployed at Volkswagen. At Daimler, even after a decade of consolidation efforts, the number of independent IT systems still reaches 3000.

CONTENTS

Introduction
Challenges in Data Integration for Large Enterprises
Linked Data Paradigm for Integrating Enterprise Data
Runtime Complexity
    Preliminaries
    The HR3 Algorithm
        Indexing Scheme
        Approach
    Evaluation
        Experimental Setup
        Results
Discrepancy
    Preliminaries
    CaRLA
        Rule Generation
        Rule Merging and Filtering
        Rule Falsification
    Extension to Active Learning
    Evaluation
        Experimental Setup
        Results and Discussion
Conclusion
References