Saturday, January 12, 2013

Big Data Characteristics

Traditionally, big data describes data that is too large for existing systems to process. Its main characteristics are usually summed up as the 3Vs: variety, velocity and volume.

Volume, the original characteristic, describes the size of the data relative to the available processing capability. Today a large volume may be a few terabytes. Overcoming the volume issue requires technologies that store vast amounts of data in a scalable fashion and provide distributed approaches to querying or finding that data.
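To make the idea of distributed storage and querying a little more concrete, here is a minimal sketch in Python. It is only an illustration on a single machine: the function names (`partition`, `local_count`, `distributed_count`) and the sample records are made up for this post, and a real system would place each shard on a different node. The data is split into shards by hashing a key, each shard answers the query on its own, and the partial answers are merged.

```python
import zlib
from collections import Counter


def partition(records, num_shards):
    """Spread (key, value) records across shards by hashing the key."""
    shards = [[] for _ in range(num_shards)]
    for key, value in records:
        # crc32 gives a stable hash, so the same key always lands on the same shard.
        index = zlib.crc32(key.encode("utf-8")) % num_shards
        shards[index].append((key, value))
    return shards


def local_count(shard):
    """Query one shard: count how many records it holds per key."""
    return Counter(key for key, _ in shard)


def distributed_count(shards):
    """Run the query on every shard, then merge the partial results."""
    total = Counter()
    for shard in shards:
        total.update(local_count(shard))
    return total


# Hypothetical sample data: (event type, event id).
records = [
    ("click", 1), ("view", 2), ("click", 3),
    ("purchase", 4), ("view", 5), ("click", 6),
]
shards = partition(records, num_shards=3)
totals = distributed_count(shards)
```

Because every shard is queried independently, more shards can be added as the data grows, which is the scalable part of the approach.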

Velocity describes the frequency at which data is generated, captured, and shared. The growth in sensor data from devices and in web-based clickstream analysis now creates requirements for more real-time use cases. The velocity of large data streams powers the ability to parse text, detect sentiment, and identify new patterns.

Variety reflects a proliferation of data types from social, machine-to-machine, and mobile sources, which adds new data types to traditional transactional data. Data no longer fits into neat, easy-to-consume structures. New types include content, geo-spatial, hardware data points, location-based, log data, machine data, metrics, mobile, physical data points, process, RFID, search, sentiment, streaming data, social, text, and web. The addition of unstructured data such as speech, text, and language increasingly complicates the ability to categorize data.

In my opinion, volume carries the least weight and variety the most. When designing a Big Data solution, you need to prioritize the three in this order: variety, velocity, then volume.
