Categories: Big Data

While the origins of the term are elusive, and even debated, #bigdata is one of those concepts that many people know about, yet it defies a simple definition. At the heart of big data, as the term suggests, is an extremely large volume of data. This data is often drawn from diverse sources and comes in different types, and it is crunched with advanced analytic techniques in the hope of picking out patterns that lead to useful conclusions.

Big data is also commonly described by the three Vs: Volume, Variety and Velocity. Volume refers to the size of the data, variety indicates that the datasets are heterogeneous, and velocity is the speed at which the data is generated and analyzed, often with the goal of achieving real-time analysis.

The datasets involved are indeed seriously large – we’re talking terabytes to zettabytes (1ZB is equivalent to 1,000,000,000TB, or roughly 909,494,701TiB in binary units, for the curious). In addition to their size, the datasets can contain different types of data: structured, semi-structured and unstructured, and they can be drawn from multiple sources.
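
For readers who want to check the arithmetic, here is a minimal Python sketch of the conversion. It assumes a zettabyte is 10^21 bytes (the SI definition) and compares a decimal terabyte (10^12 bytes) with a binary tebibyte (2^40 bytes). The function names are made up for illustration.

```python
# Byte counts for the units involved.
ZB = 10**21   # zettabyte (decimal, SI)
TB = 10**12   # terabyte (decimal, SI)
TIB = 2**40   # tebibyte (binary, IEC)

def zb_to_tb(zb):
    """Convert zettabytes to decimal terabytes."""
    return zb * ZB // TB

def zb_to_tib(zb):
    """Convert zettabytes to binary tebibytes."""
    return zb * ZB / TIB

print(zb_to_tb(1))        # 1000000000
print(int(zb_to_tib(1)))  # 909494701
```

The gap between the two results is simply the difference between counting in powers of 1,000 and powers of 1,024.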

This raises the question of where all this data comes from. It is generated in all sorts of places, including the web, social media, networks, log files, video files, sensors, and mobile devices.

Mobile devices are particularly important, as most of us keep our phones with us and switched on 24/7, and they carry an array of sensors, including GPS, cameras, a microphone, and a motion sensor. Furthermore, the majority of smartphone use is not voice communication but other activities, including email, games, web browsing, and social apps – which ultimately translates to 90% of use being in mobile apps. This mobile data, generated at a breakneck pace, is a large driver of big data.

Data mining
But data without analysis is hardly worth much, and analysis is the other part of the big data process. This analysis is referred to as data mining, and it searches for patterns and anomalies within these large datasets. The patterns it finds yield information that is used for a variety of purposes, such as improving marketing campaigns, increasing sales or cutting costs. The big data and data mining approach not only has the power to transform entire industries; it has already done so.
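
To make "searching for anomalies" a little more concrete, here is a toy Python sketch. It flags values that sit unusually far from the average, using a simple standard-deviation rule. This is only one basic technique, chosen for illustration, and real data mining pipelines are far more involved. The sales figures, the function name and the threshold of 2 are all invented for the example.

```python
from statistics import mean, stdev

# Hypothetical daily sales counts; the values are invented for illustration.
daily_sales = [102, 98, 105, 99, 101, 97, 240, 103, 100, 96]

def find_anomalies(values, threshold=2.0):
    """Return (index, value) pairs more than `threshold` standard
    deviations away from the mean of `values`."""
    mu = mean(values)
    sigma = stdev(values)
    return [(i, v) for i, v in enumerate(values)
            if abs(v - mu) > threshold * sigma]

print(find_anomalies(daily_sales))  # [(6, 240)]
```

Here the seventh day stands out from the rest, which is the kind of signal an analyst might then investigate: a successful campaign, a data entry error, or something else entirely.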