Hadoop: The Big Data Boss

Hadoop is a software framework for building a cluster of servers that stores and processes huge amounts of data, a task commonly known as Big Data processing.

Big Data is commonly described by 3 “V”s: Volume, Velocity and Variety.

Volume : The amount of data generated

Velocity : The speed at which data is generated and has to be analysed

Variety : The range of data types

Big Data can be used to build better products and services, and to better predict the future of your business. Companies like Facebook, Twitter, LinkedIn, eBay and Google generate petabytes [PB] of data.

A Hadoop architecture can handle multiple data types, including:

Structured Data
Unstructured Data
Semi-Structured Data

A large percentage of the data in the industry is unstructured (text, images, videos, etc.). This is where Hadoop comes in: it processes this unstructured data for further analysis.
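To make the three data types concrete, here is a small sketch in plain Python. It does not use Hadoop at all, and the sample values (user IDs, tags, the review text) are made up for illustration. It only shows how much structure each kind of data gives you to work with.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields.
structured = "user_id,country\n1,IN\n2,US\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: self-describing, but fields can vary from record to record.
semi_structured = '[{"user_id": 1, "tags": ["a", "b"]}, {"user_id": 2}]'
records = json.loads(semi_structured)

# Unstructured: free text with no schema; the simplest thing we can do is tokenize it.
unstructured = "Loved the new phone, battery could be better"
tokens = unstructured.lower().replace(",", "").split()

print(rows[0]["country"])      # column lookup works on every row
print(records[1].get("tags"))  # this record has no "tags" field
print(len(tokens))             # number of words found in the text
```

The structured rows can be queried by column name, the semi-structured records need a check for missing fields, and the unstructured text has to be given some structure (here, a list of words) before it can be analysed.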

Apache Hadoop framework consists of the following modules:

Hadoop Common
Hadoop Distributed File System – HDFS
Hadoop YARN – Resource management tool
Hadoop MapReduce
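The MapReduce module is easier to picture with a tiny example. The sketch below is plain Python running on one machine, not the Hadoop API, and the function names (map_phase, shuffle, reduce_phase, word_count) are our own, chosen for illustration. It counts words the way a MapReduce job would: map each line to (word, 1) pairs, group the pairs by word, then reduce each group to a total.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all emitted values by key, the step that sits between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data big cluster", "data data"])
print(counts)
```

On a real cluster the map and reduce steps run in parallel on many servers, with the input lines read from HDFS and the resources handed out by YARN. The logic of each step stays the same as in this single-machine sketch.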

Apache Hadoop works with many data access technologies, such as Apache HBase, Apache Hive and Apache Pig, for interacting with your data.

History of Hadoop:

In 1999, most of us used search engines such as AltaVista, Go.com and Netscape, but after the year 2000 the majority of users moved to Google.com. Google became the most popular way for people all over the world to find web pages of interest, and it also developed the idea of selling search terms. Everyone wondered about the secret behind its technology. Google finally released white papers describing some of it.

Interesting fact behind Hadoop name:

Doug Cutting’s son had a yellow toy elephant named Hadoop. So he gave the new software the same name, which is also easy to pronounce.

Revolutionary recap:

2002 :- Doug Cutting and Mike Cafarella start Nutch, an open-source web search project meant to crawl and index billions of pages

2003 :- Google publishes its white paper on GFS [Google File System]

2004 :- Google publishes its white paper on MapReduce

2006 :- Hadoop, with its file system [HDFS] and MapReduce, is split out of Nutch, and Yahoo backs its development

2007 :- Yahoo runs Hadoop on a 1000 node cluster

2008 :- Hadoop becomes a top-level Apache project, released as open-source software

2011 :- Hadoop version 1.0 is released

2013 :- Hadoop version 2.0 is released

2015 :- Hadoop version 2.7 is released

So the real brains behind Hadoop are Google’s GFS and MapReduce papers, which gave the first light to this great project.

At first, Yahoo faced some issues running Hadoop on a 1000 node cluster. By 2008, Hadoop was running successfully on a 4000 node cluster.

Basic Architectural Diagram of Hadoop:
