COURSE OBJECTIVES:
The term big data is used frequently alongside technologies such as IoT, machine learning, and artificial intelligence, in connection with the generation of huge amounts of data. Algorithms and tools explore the generated data for patterns and insights, turning it into a decision support model. The course aims to provide practical experience in big data techniques and technologies that can be applied to different forms of real-time data.
COURSE DURATION: 4 Weeks
COURSE OUTCOMES:
By the end of this course, the learners will be able to:
• Explain big data platforms such as Hadoop and Spark, including the data sharing, services, and repositories used to manage big data
• Use the scripting technologies Hive and Pig for big data processing
• Use NoSQL database models for different big data applications
• Use Python packages and libraries for big data applications
COURSE CONTENTS:
Module 1: Big Data Analytics Platforms - Hadoop, Spark: Introduction - Big Data Technologies - Introduction to Hadoop - Hadoop Architecture - Design of HDFS - MapReduce - Hadoop Ecosystem - Spark - Architecture - Spark Streaming
Module 2: Scripting Technologies - Pig, Hive: Introduction - Pig - Execution types - Running Pig programs - Pig Latin Structure - Statement - Expression - Function - Hive Shell - HiveQL queries - Services - Tables
Module 3: NoSQL Data Models: Introduction - Aggregate data models - Distribution models - Key-value data model - Document data model - Columnar data model - Graph data model - Case study
Module 4: Python for Big Data Applications: Big Data Technology: Time series data - NLP - Chatbot - Classification models - Weather forecasting - Sensor data analysis
COURSE INSTRUCTORS:
1. Dr. R. Suganya, rsuganya@tce.edu
2. Dr. A.M. Abirami, abiramiam@tce.edu