ABSTRACT

Big data refer to a wide variety of valuable data of different veracities that are generated or collected at a high velocity, with volumes beyond the ability of commonly used software to manage, query, and process within a tolerable elapsed time. On the one hand, big data analytics incorporates various techniques from a broad range of fields, including cloud computing, data mining, machine learning, mathematics, and statistics. For instance, data mining discovers implicit, previously unknown, and potentially useful information and/or knowledge from data. On the other hand, uncertain big data management represents an active and well-recognized research area where a substantial number of proposals converge. This is due to several factors, chiefly the emergence of big data trends and the rapid growth of cloud computing paradigms. Within this wide research context, a leading role is played by the problem of extracting useful knowledge from big data, with the uncertain big data setting being a critical case to consider. In our research, we focus specifically on two well-known, distinct, first-class data-mining problems over uncertain big data, namely: (i) frequent itemset mining from uncertain big data and (ii) constrained mining from uncertain big data. We recognize that these subproblems converge into a general problem that we call complex mining from uncertain big data, for which a plethora of real-life applications and systems can be found.
Motivated by these research challenges, we provide in this chapter the following contributions: (i) a comprehensive overview of state-of-the-art literature on the research problem of complex mining from uncertain big data, (ii) an effective and efficient algorithm for supporting tree-based constrained mining of uncertain big data in distributed environments, and (iii) another effective and efficient algorithm for supporting MapReduce-based constrained mining of uncertain big transactional data in cloud environments.