What is the CCC Big Data Foundation certification?

Big Data Training and Certification

The Big Data Foundation certification is designed to provide candidates with a well-rounded understanding of big data. It covers the potential data sources that can be used for solving real business problems and an overview of data mining and the tools used in it.

This is a foundation-level course with practical, hands-on exercises that give candidates experience with two of the most popular technologies in big data processing: Hadoop and MongoDB. Candidates practice installing both technologies in lab exercises. The course exposes candidates to real-life big data technologies with the purpose of obtaining results from real datasets, including data from major social media platforms.

After completing the course, candidates will have fundamental big data knowledge and a working development environment containing Hadoop and MongoDB that they installed themselves. This practical knowledge can serve as a starting point for further work in big data.

Who is this certification for?

This course is best suited to IT professionals who possess intermediate to advanced programming, system administration, or relational database skills and are looking to move into the area of big data. These include:

  • Software Engineers
  • Application Developers
  • IT Architects
  • System Administrators
  • Data Analysts and Scientists


Syllabus – Big Data Foundation

Module 1. Big Data Fundamentals

1.1 Big Data – History, Overview, and Characteristics

  • History
  • Big Data Definition
  • Big Data Benefits
  • Big Data Characteristics – Volume, Velocity & Variety

1.2 Big Data Technologies – Overview

1.3 Big Data Success Stories

1.4 Big Data – Privacy and Ethics

  • Privacy – Compliance
  • Privacy – Challenges
  • Privacy – Approach
  • Ethics

1.5 Big Data Projects

  • Who Should Be Involved?
  • What Is Involved?

Module 2. Big Data Sources

2.1 Enterprise Data Sources

  • Enterprise Systems
  • Oracle
  • SAP
  • Microsoft
  • Data Warehouses
  • Unstructured Data – Introduction
  • Unstructured Data – Metadata

2.2 Social Media Data Sources

  • Introduction
  • Facebook – Introduction
  • Facebook – Public Feed API
  • Facebook – Keyword Insights API
  • Facebook – Graph API
  • Twitter – Introduction
  • Twitter – Streaming APIs
  • Twitter – REST APIs
  • Other Social Media

2.3 Public Data Sources

  • Introduction
  • Weather
  • Economics
  • Finance
  • Regulatory Bodies

Module 3. Data Mining – Concepts and Tools

3.1 Data Mining – Introduction

  • Introduction
  • Types of Data Mining – Overview
  • Types of Data Mining – Classification
  • Types of Data Mining – Association
  • Types of Data Mining – Clustering

3.2 Data Mining – Tools

  • Introduction
  • Weka
  • Modules of Weka Applications
  • KNIME – Example
  • R Language

Module 4. Big Data Technologies – Hadoop

4.1 Hadoop Fundamentals

  • Introduction
  • Main Components of Hadoop
  • Additional Components of Hadoop

4.2 Install and Configure

  • Download
  • How to Install and Configure

4.3 MapReduce

  • Introduction
  • How Does It Work?

4.4 Data Processing with Hadoop

  • Introduction
  • Twitter Sentiment Analysis – Overview
  • Twitter Sentiment Analysis – Algorithm
  • Network Log Analysis – Overview
  • Network Log Analysis – Algorithm

Module 5. Big Data Technologies – MongoDB

5.1 MongoDB Fundamentals

  • Introduction
  • Replication
  • Sharding
  • Sharding and Replication
  • MongoDB Ecosystem – Languages and Drivers
  • MongoDB Ecosystem – Hadoop Integration
  • MongoDB Ecosystem – Tools

5.2 Install and Configure

  • Download
  • How to Install and Configure

5.3 Document Databases – Introduction

  • Documents
  • Document Design Considerations
  • Fields

5.4 Data Modelling with Document Databases

  • Introduction
  • Twitter Sentiment Analysis
  • Twitter Sentiment Analysis – Algorithm
  • Network Log Analysis
  • Network Log Analysis – Algorithm


Exam Details

Big Data Foundation Certification Exam

  • Exam Type: Multiple Choice
  • No. of Questions: 40
  • Duration: 60 minutes
  • Additional Time Provisions: 15 minutes additional time for candidates who speak English as a second language
  • Prerequisite: There are no required prerequisites. We recommend that participants possess intermediate to advanced programming, system administration, or relational database skills to understand the concepts in this certification.
  • Supervised (Proctored): Yes (Web/Live)
  • Open Book: No
  • Pass Score: 65%
  • Delivery: Online


Cloud Credential Council

The Cloud Credential Council (CCC) is an international member-based organization mandated to drive cloud readiness through effective competence development. The CCC has established critical cloud certifications for key IT roles in order to cultivate cloud-ready IT professionals. The certification scheme was developed after several years of research investment covering more than 20 roles, led by industry experts in conjunction with the leading technology vendors in cloud computing.