Beyond the boundaries

Challenges of Big Data Analysis

Due to the technological advancements of the 21st century, the amount of data is growing day by day. Data size is no longer measured only in gigabytes or terabytes, but also in petabytes, exabytes and zettabytes.
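To put these units in perspective, here is a minimal Python sketch that converts between them. It assumes decimal (SI) units, where each step is a factor of 1000; the function names `bytes_in` and `human_readable` are hypothetical and chosen only for illustration.

```python
# Decimal (SI) storage units; each one is 1000 times the previous one.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]


def bytes_in(unit: str) -> int:
    """Return the number of bytes in one of the given unit."""
    return 1000 ** UNITS.index(unit)


def human_readable(num_bytes: int) -> str:
    """Format a byte count with the largest unit that keeps the value >= 1."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value drops below 1000,
        # or at the last unit we know about.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000


if __name__ == "__main__":
    print(bytes_in("ZB") // bytes_in("TB"), "terabytes in one zettabyte")
    print(human_readable(2_500_000_000_000_000))
```

One zettabyte is a billion terabytes, which is why tools built for gigabyte-scale data do not simply carry over.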

Big data is commonly characterized by the 3Vs: Volume, Velocity and Variety.

Volume, as the name implies, refers to the large quantities of data involved. Velocity refers to the rate at which data is generated, stored, processed and retrieved when needed. Variety refers to the different types of data. Sources such as sensors, smartphones and social media each produce data in a different form.

Big data runs into several kinds of problems over its life cycle.

Heterogeneity and Incompleteness

Natural information made up of different types of data carries more depth than structured, homogeneous data. Analyzing unstructured data is problematic, however, because it takes more time and money than analyzing structured data. As a first step in data analysis, therefore, data must be structured carefully. Although heterogeneous data is more valuable, most computer programs need greater structure in order to work with it. Even then, efficient representation, analysis and interpretation of semi-structured data require further work.
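As an illustration of what structuring can mean in practice, the following Python sketch maps records from different sources onto one common schema and marks missing fields explicitly. The field names, the `ALIASES` mapping and the `normalize` function are all hypothetical; real sources would need their own mappings.

```python
from typing import Any, Dict, Optional

# Hypothetical common schema that every record is mapped onto.
SCHEMA = ("origin", "value", "timestamp")

# Hypothetical mapping from source-specific field names to schema fields.
ALIASES = {
    "device_id": "origin",    # sensor data
    "username": "origin",     # social media data
    "reading": "value",
    "message": "value",
    "ts": "timestamp",
    "posted_at": "timestamp",
}


def normalize(record: Dict[str, Any]) -> Dict[str, Optional[Any]]:
    """Map a heterogeneous record onto SCHEMA; missing fields become None."""
    row: Dict[str, Optional[Any]] = {field: None for field in SCHEMA}
    for key, val in record.items():
        field = ALIASES.get(key, key)
        if field in row:  # fields outside the schema are dropped
            row[field] = val
    return row


if __name__ == "__main__":
    sensor = {"device_id": "s1", "reading": 21.5}  # incomplete: no timestamp
    post = {"username": "ann", "message": "hello", "posted_at": "2020-01-01"}
    for rec in (sensor, post):
        print(normalize(rec))
```

Keeping missing values as explicit `None` entries, rather than silently dropping the record, leaves the decision about how to handle incompleteness to the later analysis step.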


As a data set grows, the speed at which it can be processed gradually drops. Obtaining processed results is a challenging task when there is so much data to examine. Velocity in big data analysis, however, concerns not only the size of the data set but also the rate at which data is acquired.

Scale and Complexity

Managing large and rapidly growing data has become the main challenge in handling big data. The challenge is becoming ever harder to solve because data volume is scaling faster than computing resources. At the same time, cloud computing has become a popular way to store big data. It makes it possible to store and work with data and to return the required data in a fixed amount of time. As data sets grow every day, storage methods and techniques must keep pace. In cloud computing, once a certain storage limit is reached, access to more storage space commonly comes with high maintenance costs. For these reasons, the scale and complexity of big data have become a major challenge.

Privacy and Security

Big data increases the number of sources that feed data sets. However, it is difficult to establish whether this data can be trusted. The trustworthiness of a data set should be verified, and technologies should be introduced to identify maliciously inserted data. Security violations in big data can take many forms, such as data modification and manipulation, unauthorized release of information and denial of resources. Where security is weak, unauthorized users may attack and access confidential files. The security of big data can be strengthened using authorization, validation, encryption and audit trails. Even with these measures in place, unauthorized users may still break into and attack systems, which is now considered a major challenge in big data analytics.
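One way to detect the kind of data modification described above is to attach a keyed hash to each record and check it before use. The sketch below uses HMAC-SHA256 from the Python standard library. This is an illustrative choice, not a technique named in this article, and the function names and the hard-coded key are hypothetical (a real system would load the key from a secure store).

```python
import hashlib
import hmac


def sign(record: bytes, key: bytes) -> str:
    """Return an HMAC-SHA256 tag for the record, as a hex string."""
    return hmac.new(key, record, hashlib.sha256).hexdigest()


def is_untampered(record: bytes, tag: str, key: bytes) -> bool:
    """Check that the record still matches the tag it was stored with."""
    # compare_digest does a constant-time comparison to avoid timing leaks.
    return hmac.compare_digest(sign(record, key), tag)


if __name__ == "__main__":
    key = b"demo-key-for-illustration-only"  # assumption: not a real secret
    record = b'{"origin": "s1", "value": 21.5}'
    tag = sign(record, key)

    print(is_untampered(record, tag, key))                               # unchanged record
    print(is_untampered(b'{"origin": "s1", "value": 99.9}', tag, key))   # modified record
```

A check like this only shows that a record changed after it was signed. It does not say whether the original source was trustworthy, so it complements rather than replaces authorization and audit trails.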

Most of the common challenges we see, from data acquisition to result interpretation, arise from the reasons above. New methods should therefore be introduced to mitigate these challenges before it is too late to capture the value of big data.

Shavindi Pathirana


