STEPS TO INTEGRATE PENTAHO WITH HADOOP
2. In this folder there is a directory named "pentaho-big-data-plugin".
3. Navigate to the "pentaho-big-data-plugin" directory and look at the configuration files located there. A file named "plugin.properties" exists in this folder.
4. Open the file. At the very beginning there is a field named "active.hadoop.configuration", which is set to hadoop-20 by default. It looks something like the image below.
5. In the "pentaho-big-data-plugin" directory there is another folder named "hadoop-configurations". Navigate into it and look at the folder structure.
6. In the "plugin.properties" file, look again at the value of the field "active.hadoop.configuration". By default it is set to hadoop-20. To switch to Cloudera, use the name of one of the folders named "cdh*" under "hadoop-configurations", and change the value in "plugin.properties" accordingly.
7. After the change, the file looks like the image below. Save the file and restart the server for the change to take effect.
8. Now you can connect to the Hadoop ecosystem from Pentaho Data Integration.
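
If you prefer to script the file edit rather than do it by hand, the steps above can be sketched roughly as follows. This is only an illustration, not part of Pentaho: the helper names (read_active_config, list_configs, set_active_config) are made up for this example, and the folder name "cdh-example" is a hypothetical placeholder standing in for whichever "cdh*" folder your installation actually contains. The demo at the bottom runs against a throwaway mock directory, so it does not touch a real installation.

```python
import re
import tempfile
from pathlib import Path
from typing import List, Optional

# Field name taken from the steps above.
KEY = "active.hadoop.configuration"


def read_active_config(plugin_dir: Path) -> Optional[str]:
    """Return the value of active.hadoop.configuration, or None if it is absent."""
    # Java .properties files are traditionally ISO-8859-1 (latin-1) encoded.
    text = (plugin_dir / "plugin.properties").read_text(encoding="latin-1")
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#") or "=" not in stripped:
            continue  # skip comments and lines without a key=value pair
        name, value = stripped.split("=", 1)
        if name.strip() == KEY:
            return value.strip()
    return None


def list_configs(plugin_dir: Path) -> List[str]:
    """Return the sorted names of the folders under hadoop-configurations."""
    config_root = plugin_dir / "hadoop-configurations"
    return sorted(p.name for p in config_root.iterdir() if p.is_dir())


def set_active_config(plugin_dir: Path, new_value: str) -> None:
    """Point active.hadoop.configuration at an existing configuration folder."""
    if new_value not in list_configs(plugin_dir):
        raise ValueError(f"no folder named {new_value!r} in hadoop-configurations")
    path = plugin_dir / "plugin.properties"
    text = path.read_text(encoding="latin-1")
    # Replace only the value; keep the key and the rest of the file unchanged.
    pattern = re.compile(
        r"^([ \t]*" + re.escape(KEY) + r"[ \t]*=[ \t]*).*$", re.MULTILINE
    )
    new_text, count = pattern.subn(lambda m: m.group(1) + new_value, text)
    if count == 0:
        raise KeyError(f"{KEY} not found in {path}")
    path.write_text(new_text, encoding="latin-1")


if __name__ == "__main__":
    # Demo on a mock directory; "cdh-example" is a hypothetical folder name.
    with tempfile.TemporaryDirectory() as tmp:
        plugin_dir = Path(tmp) / "pentaho-big-data-plugin"
        (plugin_dir / "hadoop-configurations" / "hadoop-20").mkdir(parents=True)
        (plugin_dir / "hadoop-configurations" / "cdh-example").mkdir()
        (plugin_dir / "plugin.properties").write_text(
            KEY + "=hadoop-20\n", encoding="latin-1"
        )
        print("before:", read_active_config(plugin_dir))
        print("available:", list_configs(plugin_dir))
        set_active_config(plugin_dir, "cdh-example")
        print("after:", read_active_config(plugin_dir))
```

To use it on a real installation, you would pass the path of your own "pentaho-big-data-plugin" directory and a folder name returned by list_configs. The server still has to be restarted afterwards, as in step 7.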