Data Analytics - Pentaho , Tableau , Qlikview: Pig

Pig Introduction:

Apache Pig is a platform for analyzing large data sets. It consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating those programs.

At present, Pig's infrastructure layer consists of a compiler that produces sequences of MapReduce programs, for which large-scale parallel implementations already exist.

Pig's language layer currently consists of a textual language called Pig Latin, which has the following key properties:

Ease of programming:
It is trivial to achieve parallel execution of simple, "embarrassingly parallel" data analysis tasks. Complex tasks made up of multiple interrelated data transformations are explicitly encoded as data flow sequences, making them easy to write, understand, and maintain.

Optimization Opportunities:
The way in which tasks are encoded permits the system to optimize execution automatically, allowing the user to focus on semantics rather than efficiency.

Extensibility:
Users can create their own custom functions to do special-purpose processing of data.
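The "data flow sequence" idea described under ease of programming can be illustrated with a small word count. The following is only a sketch under stated assumptions: the file names (words.txt, wordcount.pig) and the relation names (lines, words, grouped, counts) are made up for illustration, and the script is run only if a pig command is available on the machine.

```shell
# Sketch: write a tiny input file and a Pig Latin script, then run it in local mode.
# All file and relation names below are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two lines of sample input.
printf 'pig hive pig\nhive pig tez\n' > words.txt

# Each Pig Latin statement names one step of the data flow.
cat > wordcount.pig <<'EOF'
lines   = LOAD 'words.txt' AS (line:chararray);
words   = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
grouped = GROUP words BY word;
counts  = FOREACH grouped GENERATE group AS word, COUNT(words) AS n;
DUMP counts;
EOF

# -x local reads from the local file system instead of HDFS.
if command -v pig >/dev/null 2>&1; then
    pig -x local -f wordcount.pig
else
    echo "pig not found; script written to $workdir/wordcount.pig"
fi
```

Each relation is built from the one before it, so the script reads top to bottom as load, split into words, group, count.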

Developed at Yahoo and now maintained by the Apache Software Foundation.
Written in Java.
Processes data using a procedural language called Pig Latin.
Scripts are written as a step-by-step process.
Can handle any kind of data: structured, semi-structured, and unstructured.
Works directly on data stored in HDFS.
Current version is 0.14 (as of this blog update).
Has two main execution modes: MapReduce mode and local mode.

Command to open the Pig terminal (Grunt shell) in local mode:
$ pig -x local
After running this command you are placed at the grunt> prompt, where you can execute Pig commands.

Difference between Pig and Hive when executing statements:

                            Hive                       Pig
Interactive shell           hive> statement;           grunt> statement;
From the Linux/Unix shell:
  Inline statement          $ hive -e 'statement'      $ pig -e 'statement'
  Script file               $ hive -f <file>.hql       $ pig -f test.pig

Different modes in Pig:

Local
MapReduce
Tez
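The mode is selected with the -x flag when Pig is started, as in the pig -x local command shown earlier. The sketch below only builds and prints the command for a chosen mode instead of running it, since each command opens an interactive Grunt shell. PIG_MODE is a hypothetical variable name used only here, and Tez mode assumes a Pig installation with Tez support.

```shell
# Sketch: pick an execution mode and print the matching pig command.
# PIG_MODE is a made-up variable for this example; it defaults to local.
mode="${PIG_MODE:-local}"

case "$mode" in
    local|mapreduce|tez)
        cmd="pig -x $mode"
        ;;
    *)
        echo "unknown mode: $mode" >&2
        cmd=""
        ;;
esac

# Print the command; run it by hand to open the Grunt shell in that mode.
echo "$cmd"
```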

