ABSTRACT

Data-intensive computing is a class of parallel computing applications that use a data-parallel approach to process large volumes of data (terabytes or petabytes in size). The advent of big data and data-intensive software systems presents tremendous opportunities for businesses and society in areas such as science, medicine, health care, and finance. Researchers, practitioners, and entrepreneurs are trying to make the most of the data available to them. However, building data-intensive systems is challenging, and software engineering techniques for building them are still emerging. Focusing on knowledge management during software development is one way of enhancing traditional software engineering work practices. Software developers need to be aware that the creation of organizational knowledge requires the social construction and sharing of knowledge by individual stakeholders. In this chapter, we explore the application of established software engineering process models and standard practices, enhanced with knowledge management techniques, to the development of data-intensive systems.