Introduction to Big Data

This course is part of the Big Data Specialization

Instructors: Ilkay Altintas +1 more

Skills you'll gain

  •   Distributed Computing
  •   Unstructured Data
  •   Data Science
  •   Real Time Data
  •   Scalability
  •   Big Data
  •   Data Storage
  •   Data Analysis
  •   Apache Hadoop
  •   Data Processing

There are 6 modules in this course

    At the end of this course, you will be able to:

      •   Describe the Big Data landscape, including examples of real-world big data problems and the three key sources of Big Data: people, organizations, and sensors.
      •   Explain the V’s of Big Data (volume, velocity, variety, veracity, valence, and value) and why each impacts data collection, monitoring, storage, analysis, and reporting.
      •   Get value out of Big Data by using a 5-step process to structure your analysis.
      •   Identify what are and what are not big data problems, and recast big data problems as data science questions.
      •   Explain the architectural components and programming models used for scalable big data analysis.
      •   Summarize the features and value of core Hadoop stack components, including the YARN resource and job management system, the HDFS file system, and the MapReduce programming model.
      •   Install and run a program using Hadoop!

    This course is for those new to data science. No prior programming experience is needed, although the ability to install applications and use a virtual machine is necessary to complete the hands-on assignments.

    Hardware Requirements:

      •   (A) Quad-core processor (VT-x or AMD-V support recommended), 64-bit
      •   (B) 8 GB RAM
      •   (C) 20 GB free disk space

    How to find your hardware information:

      •   (Windows): Open System by clicking the Start button, right-clicking Computer, and then clicking Properties.
      •   (Mac): Open Overview by clicking on the Apple menu and clicking “About This Mac.”

    Most computers with 8 GB RAM purchased in the last 3 years will meet the minimum requirements. You will need a high-speed internet connection because you will be downloading files up to 4 GB in size.

    Software Requirements: This course relies on several open-source software tools, including Apache Hadoop. All required software can be downloaded and installed free of charge. Software requirements include: Windows 7+, Mac OS X 10.10+, Ubuntu 14.04+, or CentOS 6+; and VirtualBox 5+.
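    The hardware minimums above can also be checked roughly from a script. The following is a minimal sketch, not part of the course materials: the function names `check_requirements` and `detect_hardware` are hypothetical, the core count reported by Python is the number of logical (not physical) cores, and installed RAM is passed in by hand because the Python standard library has no portable way to read it.

```python
import os
import shutil

# Minimums taken from the hardware requirements listed above.
MIN_CORES = 4
MIN_RAM_GB = 8
MIN_DISK_FREE_GB = 20


def check_requirements(cores, ram_gb, disk_free_gb):
    """Return a dict mapping each requirement to True (met) or False."""
    return {
        "cores": cores >= MIN_CORES,
        "ram": ram_gb >= MIN_RAM_GB,
        "disk": disk_free_gb >= MIN_DISK_FREE_GB,
    }


def detect_hardware(path="."):
    """Detect logical core count and free disk space (in GB) at `path`.

    Assumption: logical cores are used as a stand-in for physical cores,
    so a dual-core CPU with hyper-threading would report 4.
    """
    cores = os.cpu_count() or 0
    disk_free_gb = shutil.disk_usage(path).free / 1e9
    return cores, disk_free_gb


if __name__ == "__main__":
    cores, disk_free_gb = detect_hardware()
    # Assumption: RAM is entered manually from "About This Mac" or
    # the Windows System properties dialog.
    ram_gb = float(input("Installed RAM in GB: "))
    for name, ok in check_requirements(cores, ram_gb, disk_free_gb).items():
        print(f"{name}: {'OK' if ok else 'below minimum'}")
```

    The script does not check VT-x or AMD-V support, which still has to be confirmed in the system's firmware settings or vendor documentation.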

    Big Data: Why and Where

    Characteristics of Big Data and Dimensions of Scalability

    Data Science: Getting Value out of Big Data

    Foundations for Big Data Systems and Programming

    Systems: Getting Started with Hadoop

    ©2025  ementorhub.com. All rights reserved