Classification of Digital Data, Structured and Unstructured Data - Introduction to Big Data: Characteristics - Evolution - Definition - Challenges with Big Data - Other Characteristics of Data - Why Big Data - Traditional Business Intelligence versus Big Data - Data Warehouse and Hadoop Environment. Big Data Analytics: Classification of Analytics - Challenges - Why Big Data Analytics Is Important - Data Science - Data Scientist - Terminologies Used in Big Data Environments - Basically Available, Soft State, Eventual Consistency (BASE) - Top Analytics Tools
NoSQL, Comparison of SQL and NoSQL, Hadoop - RDBMS versus Hadoop - Distributed Computing Challenges - Hadoop Overview - Hadoop Distributed File System - Processing Data with Hadoop - Managing Resources and Applications with Hadoop YARN - Interacting with Hadoop Ecosystem
MongoDB: Why MongoDB - Terms Used in RDBMS and MongoDB - Data Types - MongoDB Query Language. Cassandra: Features - CQL Data Types - CQLSH - Keyspaces - CRUD Operations - Collections
MapReduce: Mapper - Reducer - Combiner - Partitioner - Searching - Sorting - Compression. Hive: Introduction - Architecture - Data Types - File Formats - Hive Query Language Statements - Partitions - Bucketing - Views - Sub-Query - Joins - Aggregations - Group By and Having - RCFile Implementation - Hive User-Defined Function - Serialization and Deserialization - Hive Analytic Functions
Pig: Introduction - Anatomy - Features - Philosophy - Use Case for Pig - Pig Latin Overview - Pig Primitive Data Types - Running Pig - Execution Modes of Pig - HDFS Commands - Relational Operators - Eval Function - Complex Data Types - Piggy Bank - User-Defined Functions - Parameter Substitution - Diagnostic Operator - Word Count Example using Pig - Pig at Yahoo! - Pig versus Hive - JasperReport using Jaspersoft
Reference Books:
1. Judith Hurwitz, Alan Nugent, Fern Halper, Marcia Kaufman, “Big Data For Dummies”, John Wiley & Sons, Inc., 2013
2. Tom White, “Hadoop: The Definitive Guide”, O’Reilly Publications, Fourth Edition, 2015
3. Dirk deRoos, Paul C. Zikopoulos, Roman B. Melnyk, Bruce Brown, Rafael Coss, “Hadoop For Dummies”, Wiley Publications, 2014
4. Robert D. Schneider, “Hadoop For Dummies”, John Wiley & Sons, Inc., 2012
5. Paul Zikopoulos, “Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data”, McGraw Hill, 2012
Textbook:
1. Seema Acharya, Subhashini Chellappan, “Big Data and Analytics”, Wiley Publications, First Edition, 2015