BIG DATA: AN APPROACH ON APACHE HADOOP

Authors

  • Rodrigo Ramos Nogueira, Universidade Estadual de Ponta Grossa
  • Ezequiel Gueiber, Universidade Estadual de Ponta Grossa

Keywords:

Massive Data, Database, HBase.

Abstract

Every eighteen months the volume of data in the world doubles, making it increasingly difficult to store and query information derived from multiple data sources. This paper introduces Apache Hadoop as a solution to problems involving Big Data. Big Data is the term used for the great mass of data that comes from different sources of information, is stored in various locations, and is updated all the time; these traits correspond to the three V's (Volume, Velocity, Variety). Big Data is not a technology but a concept in which voluminous and complex databases, which may be structured, semi-structured, or unstructured, communicate with one another, yet operations on them do not always complete within an acceptable waiting time, which makes some tasks impossible with traditional storage technologies. The objective of this paper is to present a solution for storing, distributing, and mining Big Data from different sources, handling a large volume of information in an agile way. The research was carried out with Apache Hadoop, an open source technology developed in Java that runs on the Linux operating system. The main contribution of this research is to present an efficient and free solution for Big Data applications in a distributed environment, describing its benefits and its ease of use.

Published

2016-01-18

Issue

Section

Papers