In simple terms, big data means very large amounts of data. Although it is one of the most “hyped” terms in the market today, there is no consensus on how to define it. The term is often used synonymously with related concepts such as Business Intelligence (BI) and data mining. It is true that all three are about analyzing data, in many cases with advanced analytics. Big data differs from the other two, however, in that the data volumes, the number of transactions, and the number of data sources are so large and complex that special methods and technologies are required to draw insight from the data (for instance, traditional data warehouse solutions may fall short when dealing with big data). This also forms the basis for the most widely used definition of big data, the three Vs: Volume, Velocity, and Variety.
Volume: Large amounts of data, with dataset sizes ranging from terabytes to zettabytes.
Velocity: Large amounts of data from transactions with a high refresh rate, resulting in data streams that arrive at great speed, often with very little time to act on them. There is a shift from batch processing to real-time streaming.
Variety: Data come from different sources. First, data can come from both internal and external sources. More importantly, data can come in various formats, such as transaction and log data from various applications, structured data such as database tables, semi-structured data such as XML, and unstructured data such as text, images, video streams, audio recordings, and more. There is a shift from purely structured data to increasingly unstructured data, or a combination of the two.
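To make the idea of variety a little more concrete, here is a minimal Python sketch of how the three kinds of data might be read. The sample records, the field names (id, amount) and the helper functions from_csv, from_xml and from_text are made up for illustration; they are assumptions, not part of any particular big data tool.

```python
import csv
import io
import xml.etree.ElementTree as ET


def from_csv(text):
    """Structured: rows with a fixed header, as in a database table."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def from_xml(text):
    """Semi-structured: tagged fields, where elements may vary per record."""
    root = ET.fromstring(text)
    return [
        {child.tag: child.text for child in order}
        for order in root.findall("order")
    ]


def from_text(text):
    """Unstructured: free text with no schema; here we only count words."""
    return {"word_count": len(text.split())}


# Hypothetical sample inputs, one per kind of data.
csv_data = "id,amount\n1,9.50\n2,12.00\n"
xml_data = "<orders><order><id>3</id><amount>7.25</amount></order></orders>"
note = "Customer called about a late delivery"

structured = from_csv(csv_data)
semi_structured = from_xml(xml_data)
unstructured = from_text(note)

print(structured)
print(semi_structured)
print(unstructured)
```

Each source needs its own reader before the data can be analyzed together, and the unstructured note yields far less ready-made information than the table or the XML. That extra effort is a large part of why variety is treated as a defining property of big data.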