Wrestling With Big Data


Data, Data Everywhere, But Not a Byte to Think

No doubt we are dealing with thousands of petabytes of data every day. But one thing to consider is whether that data is dead or alive; that is, whether it is just a record or whether it also gives us some power to think about the future.
Yes, data can be dead or alive, depending on how we deal with it. To bring the data to life, we have to wrestle with it using analytics. This wrestling yields business intelligence about behavior and trends, and that intelligence empowers us to make decisions about our dealings. This is data analytics. When we talk about it in the context of big data, it is called big data analytics.

Big Data analytics

What is Structured Data?

Data that resides in fixed fields within a file or record is called structured data. This includes data contained in relational databases and spreadsheets.


The term structured data generally refers to data that has a defined length and format. Examples of structured data include numbers, dates, and groups of words and numbers called strings. Most experts agree that this kind of data accounts for about 20 percent of the data that is out there. Structured data is the data you’re probably used to dealing with. It’s usually stored in a database.
Structured Data

Sources of structured big data

Although this might seem like business as usual, in reality, structured data is taking on a new importance and role in the world of big data. The evolution of technology leads to newer sources of structured data being generated, often in real time and in huge volumes. The sources of data are categorized into two types:
  • Computer- or machine-generated: Machine-generated data normally refers to data that is created by a machine without human interaction.
  • Human-generated: This is data that humans, in interaction with computers, supply.
Some experts argue that a third type exists that is a hybrid between machine and human. Here, though, we focus on the first two types.
Machine-generated structured data can include the following:
  • Sensor data: Examples include radio frequency ID tags, medical devices, and Global Positioning System data. Companies are interested in this type of data for supply chain management and inventory control.
  • Web log data: When servers, applications, networks, and so on operate, they capture all kinds of data about their activity. This can amount to huge volumes of data that can be useful, for example, to deal with service-level agreements or to predict security breaches.
  • Point-of-sale data: When the cashier swipes the bar code of any product that you are purchasing, all the data associated with the product is generated.
  • Financial data: Lots of financial systems are now programmatic; they are operated based on predefined rules that automate processes. Stock-trading data is a good example of this. It contains structured data such as the company symbol and dollar value. Some of this data is machine generated, and some is human generated.
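To make the web log case a bit more concrete, here is a minimal sketch of turning one log line into structured fields. The sample line, the field names, and the `parse_log_line` helper are all made up for this illustration, and the pattern assumes the widely used Common Log Format; real servers may log differently.

```python
import re

# Pattern for the Common Log Format (an assumption; real logs may differ).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Convert the numeric fields so they can be aggregated later.
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


# Hypothetical log line, invented for this sketch.
sample = '192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /products/42 HTTP/1.1" 200 2326'
record = parse_log_line(sample)
print(record["host"], record["path"], record["status"], record["size"])
```

Once every line is a record with the same fields, the log can be loaded into a table and queried like any other structured data.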
Examples of structured human-generated data might include the following:
  • Input data: This is any piece of data that a human might input into a computer, such as name, age, income, non-free-form survey responses, and so on. This data can be useful to understand basic customer behavior.
  • Click-stream data: Data is generated every time you click a link on a website. This data can be analyzed to determine customer behavior and buying patterns.
  • Gaming-related data: Every move you make in a game can be recorded. This can be useful in understanding how end users move through a gaming portfolio.
When taken together with millions of other users submitting the same information, the size is astronomical. Additionally, much of this data has a real-time component to it that can be useful for understanding patterns that have the potential of predicting outcomes.
The bottom line is that this kind of information can be powerful and can be utilized for many purposes.
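As a small, hedged illustration of how click-stream data might be analyzed, the sketch below counts clicks per page and per user. The events are invented for this example; a real click stream would carry many more fields (timestamps, sessions, referrers) and far more rows.

```python
from collections import Counter

# Hypothetical click-stream events: (user_id, page) pairs invented for this sketch.
clicks = [
    ("u1", "/home"),
    ("u1", "/products/42"),
    ("u2", "/home"),
    ("u1", "/cart"),
    ("u2", "/products/42"),
    ("u3", "/products/42"),
]

# Count how often each page was clicked across all users.
page_counts = Counter(page for _, page in clicks)

# Count how many clicks each user made.
user_counts = Counter(user for user, _ in clicks)

most_clicked_page, hits = page_counts.most_common(1)[0]
print(most_clicked_page, hits)
print(dict(user_counts))
```

The same counting idea scales up to millions of users, although at that size it would normally run on a distributed system rather than in a single script.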

The role of relational databases in big data

Data persistence refers to how a database retains versions of itself when modified. The great granddaddy of persistent data stores is the relational database management system. In its infancy, the computing industry used what are now considered primitive techniques for data persistence.
The relational model was invented by Edgar Codd, an IBM scientist, in the 1970s and was used by IBM, Oracle, Microsoft, and others. It is still in wide usage today and plays an important role in the evolution of big data. Understanding the relational database is important because other types of databases are used with big data.
In a relational model, the data is stored in a table. This database would contain a schema — that is, a structural representation of what is in the database. For example, in a relational database, the schema defines the tables, the fields in the tables, and the relationships between the two.
The data is stored in columns, one for each specific attribute, and in rows, one for each record. The first table stores product information; the second stores demographic information. Each has various attributes. Each table can be updated with new data, and data can be deleted, read, and updated. This is often accomplished in a relational model using a structured query language (SQL).
[Figure: structured data]
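Here is a minimal sketch of creating, reading, updating, and deleting rows with SQL, using Python's built-in sqlite3 module. The table name, the column names other than CustomerID, and the sample rows are assumptions made up for this illustration, not the actual tables from the figure.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")

# The schema defines the table and its columns (names here are hypothetical).
conn.execute(
    "CREATE TABLE product (CustomerID INTEGER, Product TEXT, Price REAL)"
)

# Create: add new rows.
conn.executemany(
    "INSERT INTO product (CustomerID, Product, Price) VALUES (?, ?, ?)",
    [(1, "Laptop", 900.0), (2, "Phone", 500.0), (3, "Tablet", 300.0)],
)

# Update: change a value in an existing row.
conn.execute("UPDATE product SET Price = ? WHERE Product = ?", (450.0, "Phone"))

# Delete: remove a row.
conn.execute("DELETE FROM product WHERE Product = ?", ("Tablet",))

# Read: query what remains.
rows = conn.execute(
    "SELECT CustomerID, Product, Price FROM product ORDER BY CustomerID"
).fetchall()
print(rows)
conn.close()
```

SQLite is used only because it ships with Python; the same statements would look much the same on any relational database.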

Another aspect of the relational model using SQL is that tables can be queried using a common key. The common key in the tables is CustomerID.
You can submit a query, for example, to determine the gender of customers who purchased a specific product. It might look something like this:
Select CustomerID, State, Gender, Product from "demographic table", "product t
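As a hedged sketch of this kind of query, the example below joins two small tables on the common key CustomerID and asks which customers bought a given product. The table names `demographic` and `product`, the sample rows, and the product value are assumptions invented for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables that share the CustomerID key.
conn.execute(
    "CREATE TABLE demographic (CustomerID INTEGER PRIMARY KEY, State TEXT, Gender TEXT)"
)
conn.execute("CREATE TABLE product (CustomerID INTEGER, Product TEXT)")

conn.executemany(
    "INSERT INTO demographic VALUES (?, ?, ?)",
    [(1, "NY", "F"), (2, "CA", "M"), (3, "TX", "F")],
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [(1, "Laptop"), (2, "Phone"), (3, "Laptop")],
)

# Join the two tables on the common key and filter by product.
query = """
    SELECT d.CustomerID, d.State, d.Gender, p.Product
    FROM demographic AS d
    JOIN product AS p ON p.CustomerID = d.CustomerID
    WHERE p.Product = ?
    ORDER BY d.CustomerID
"""
buyers = conn.execute(query, ("Laptop",)).fetchall()
print(buyers)
conn.close()
```

The join condition on CustomerID is what lets a question about products be answered with attributes from the demographic table.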


What is Big Data Analytics?

Big Data Analytics

Big data analytics is a step-by-step process of analyzing large amounts of data of different types to uncover hidden correlations, unknown patterns, and other useful information.

The process is illustrated in the picture below.

[Figure: Big Data Analytic Process]


The information gained from these analytics can provide competitive advantages over rival organizations and can result in business benefits such as increased revenue and more effective marketing.

The goal is to help companies make better business decisions by enabling data scientists and other data experts to analyze huge volumes of data, as well as data sources left untapped by conventional business intelligence (BI). These sources include web server logs, Internet click-stream data, social media activity reports, mobile phone records, and data captured by sensors. Some people relate big data and big data analytics exclusively to unstructured data, but firms such as Gartner Inc. and Forrester Research Inc. also consider transactions and other structured collections of data to be valid forms of big data.

Big data analytics is done using tools and software normally used as part of advanced analytics disciplines, such as data mining and predictive analytics. Traditional data warehouses are not a good fit for unstructured big data, and they also cannot meet the processing demands of big data.

What is Big Data?


Big Data

The expression Big Data describes a collection of data sets so large and complex that it is difficult to capture, store, search, process, and analyze them using conventional database management tools and traditional database management systems.


The question here is: where does big data come from? What makes up big data?

Basically, the data comes from everywhere, for example:
  • Cell phone GPS signals
  • Purchase transaction records
  • GPS trails
  • Traffic records
  • Government document scanning
  • Microphones
  • Software logs
  • Cameras
  • Sensors used to gather weather information
  • Blog posts and social media sites
  • Digital videos and pictures
  • Data from search engines
  • etc.
All of these together make the data big, that is, "Big Data".


  • Big data includes both types of data:
              1- Structured Data
              2- Unstructured Data
  • Looking at the current trend, we create 2.5 quintillion bytes of data every day. More than 90% of this data has been created in the last two years alone. Big data is normally measured in petabytes and exabytes.
  • Big data requires massively parallel software running on tens, hundreds, or even thousands of distributed servers. It is difficult to handle with relational database management systems, desktop statistics, and visualization packages. Big data analytics is truly a massive task.
  • Big data gives you the opportunity to find insight in new and emerging types of data and content, making your business more agile and answering business queries that were previously out of reach. It reveals hidden facts.
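To put the daily figure into the units mentioned above, here is a small arithmetic sketch. It assumes decimal (SI) units, where a petabyte is 10^15 bytes and an exabyte is 10^18 bytes, and it assumes the daily rate stays constant over a year, which is a simplification.

```python
# Decimal (SI) storage units: each step up is a factor of 1,000.
PETABYTE = 10**15
EXABYTE = 10**18

# 2.5 quintillion bytes, the daily figure quoted above.
daily_bytes = 25 * 10**17

daily_exabytes = daily_bytes / EXABYTE
daily_petabytes = daily_bytes / PETABYTE

# Simplifying assumption: the same amount is created every day of the year.
yearly_exabytes = daily_exabytes * 365

print(daily_exabytes, daily_petabytes, yearly_exabytes)
```

So 2.5 quintillion bytes a day works out to 2.5 exabytes, or 2,500 petabytes, per day.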



Thanks for reading about big data on my blog. Don't forget to share this content with others! Next I will write "What is Big Data Analytics?"

Big Data Analytics

Welcome to BigDataAnalytics. I will soon write about my experience with big data. Hopefully I will learn and teach about Big Data Analytics here.