Chapter 10
Big Data Classification
By Hanghang Tong
Pages 11

We are in the age of “big data.” Big data has the potential to revolutionize many scientific disciplines, ranging from astronomy, biology, and education to economics and social science [16]. From an algorithmic point of view, the main challenges that big data brings to classification can be summarized by three characteristics, often referred to as the “three Vs”: volume, variety, and velocity. The first characteristic is volume: data is being generated at an unprecedented scale. For example, it is estimated [16] that enterprises and users stored more than 13 exabytes of new data in 2010. The second characteristic is variety: real data is often heterogeneous, composed of multiple types and/or coming from different sources. The third characteristic is velocity: the data is not only large and complex, but also generated at a very high rate.