I have read the book “Getting Started With Greenplum For Big Data Analytics”
from PacktPub.com (http://bit.ly/HYOwrW).
It is a fabulous read for Big Data enthusiasts and for anyone who wants to
integrate a data warehouse with Big Data, using Greenplum as the warehouse and
an external ETL tool, Informatica (PowerExchange), to load data into Greenplum.
The practical use of Greenplum's loading options (gpload with its insert, update,
and merge modes, as well as INSERT and COPY) is clearly explained. The book also
gives a complete overview of the “Unified Analytics Platform” and clearly
discusses the physical architecture of Greenplum.
I would recommend this book to anyone who is eager to understand Big Data
analytics and to integrate a data warehouse with Big Data.
By the end of the book, the reader should be able to understand:
- What is Big Data?
- What are the Hadoop, Hive, Pig, and Sqoop components?
- How to query data stored in HDFS and load it into Greenplum (data communication between Hadoop and Greenplum).
- What is Chorus, and how is it used to integrate multidimensional data visualization from Tableau? Chorus can pull data from HDFS as well as from the Greenplum database to build dashboards in Tableau.
Greenplum Database Management System:
- How to start and stop the Greenplum database instance.
- How to monitor the database workload using the Greenplum Command Center GUI.
- Performance monitoring.
- Parallel Data loading.
- Query monitoring on Greenplum.
- Compute/Storage/Database/Network Architecture for the overall Greenplum setup.
- Hardware/system configuration for a multi-node database.
- Various functions used in queries, and how to get the query execution plan using the EXPLAIN and ANALYZE commands.
- Parallel data flow using the Dynamic Pipeline in Greenplum.
- Greenplum table distribution and partitioning (column-oriented storage, hash distribution, and random distribution).
- Pushdown optimization using ODBC.
- How to use Weka for knowledge analysis, data mining, and machine learning.
- How Weka is used for data processing, regression, clustering, classification, and data visualization.
- MADlib: the Magnetic, Agile, and Deep library of scalable, parallel, advanced in-database functions.
- How MADlib's in-database functions are helpful.
- How to do in-database analytics using MADlib.
- What is R programming?
- How is R used for statistical data analysis and exploration?
- How are slicing/dicing, data modeling, and data visualization possible using R?
- The use of R in predictive analysis.
- What is text analysis?
- What are the main challenges in text analytics?
- What techniques are involved in text analytics?
- What is predictive analytics, how can it be achieved, and where is it used?
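
To make the hash versus random distribution item from the list above a little more concrete, here is a toy Python sketch. It is not Greenplum's actual hashing or the book's code: the segment count, the `customer_N` keys, the helper names, and the use of CRC32 are all assumptions chosen only to illustrate the idea of placing rows on segments.

```python
import random
import zlib
from collections import Counter

NUM_SEGMENTS = 4  # hypothetical number of segments, for illustration only


def hash_segment(key: str, num_segments: int = NUM_SEGMENTS) -> int:
    """Pick a segment from a stable hash of the distribution key.

    CRC32 is a stand-in here, not the hash Greenplum really uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_segments


def random_segment(rng: random.Random, num_segments: int = NUM_SEGMENTS) -> int:
    """Pick a segment without looking at the row at all."""
    return rng.randrange(num_segments)


# 1000 made-up rows that share only 10 distinct key values.
rows = [f"customer_{i % 10}" for i in range(1000)]
rng = random.Random(0)  # seeded so the sketch is repeatable

# Count how many rows each segment would receive under each policy.
hash_layout = Counter(hash_segment(key) for key in rows)
random_layout = Counter(random_segment(rng) for _ in rows)

print("hash distribution:  ", dict(hash_layout))
print("random distribution:", dict(random_layout))
```

In this sketch, rows with the same key always land on the same segment under the hash policy, while the random policy spreads rows around without regard to the key. With only a few distinct key values, the hash layout can also come out uneven, which is the usual argument for picking a distribution key with many distinct values.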