ABSTRACT

The field of big-data analytics is littered with myths and evidence-free lore. The reasons for these myths are simple: the technologies are still emerging, common definitions are lacking, and validated best practices are not yet available. In reality, for data in a lake to be usable for analytics, it needs to be organized, queryable, and well understood and trusted by users and domain experts alike. Storage in distributed computing environments and big-data appliances are becoming cheaper for on-premises environments, while storage and compute costs are also falling drastically in the cloud. Non-functional requirements are as important in big-data management and analytics programs as they are in other areas of software systems engineering. Big data gets many headlines, and interest in it is at a record high, both within enterprises and in the popular press. The promise of big data is real, and large majorities of businesses and organizations see big data as critically important.