ABSTRACT

This chapter reviews the data analytics pipeline, from crowdsourcing techniques using Amazon Mechanical Turk, to data visualization tools for building, understanding, and fine-tuning analytics models and for performing data exploration, to parallelized data analytics models, namely MapReduce and Spark. The chapter also explores the emerging field of graph analytics and its applications in fraud detection, graph mining, and approximate matching. With regard to applications, it covers various security aspects of the big data domain and possible preventive or vaccination measures against malicious actors. State-of-the-art approaches to attacks on convolutional neural networks, as well as graph-mining-based fraud and malware detection, are described. Finally, it covers social network modeling and social media and, by extension, text analytics, which is already a major part of modern big data analytics. The authors describe modern text analytics approaches, from word embeddings to part-of-speech tagging to sentiment analysis. The chapter recognizes that there is significant emerging work in crowdsourced big data and in neural networks as applied to pattern detection, speech and language synthesis, and object recognition. As better neural network architectures are discovered, and as better visualization tools for convolutional and recurrent neural networks, among others, are developed, applications of such deep learning models are expected to increase significantly.