Today, we are surrounded by data, and existing tools have become inadequate for processing such large data sets. Hadoop, and large-scale distributed data processing in general, is rapidly becoming an important skill set for many programmers. Hadoop is an open-source framework for writing and running distributed applications that process large amounts of data. Distributed computing is a wide and varied field, but the key distinctions of Hadoop are that it is accessible, robust, scalable, and simple.
This course introduces Hadoop in terms of distributed systems and data processing systems. You will learn the basics of distributed file systems (DFS) and the Hadoop architecture. The course will give you an overview of the MapReduce programming model using simple word-counting examples.
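To give a feel for the MapReduce model before the course material, here is a minimal sketch of word counting in plain Python. It is not Hadoop code; it only imitates the three phases the framework runs for you: a map phase that emits (word, 1) pairs, a shuffle that groups pairs by key, and a reduce phase that sums each group. All function names here are illustrative, not part of any Hadoop API.

```python
from itertools import groupby
from operator import itemgetter

def map_phase(line):
    # Emit a (word, 1) pair for every word in the input line.
    for word in line.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    # Group pairs by key, as Hadoop does between the map and reduce phases.
    for key, group in groupby(sorted(pairs, key=itemgetter(0)), key=itemgetter(0)):
        yield key, [count for _, count in group]

def reduce_phase(key, counts):
    # Sum the counts emitted for each word.
    return key, sum(counts)

lines = ["the quick brown fox", "the lazy dog"]
mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped))
print(counts["the"])  # 2
```

In a real Hadoop job, the map and reduce functions run on many machines in parallel and the shuffle moves data across the cluster; the logic per word, however, is exactly this simple.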