Q1. Hadoop is a framework that works with a variety of related tools. Which of the following are its common cohorts?
MapReduce, Hummer, and Iguana
MapReduce, Heron, and Trumpet
MapReduce, MySQL, and Google Apps
MapReduce, Hive, and HBase
Q2. What is the term for the process of storing data across multiple servers to ensure redundancy and fault tolerance?
Data Redundancy
Data Sharding
Data Partitioning
Data Replication
Q3. Which of these options is Hadoop named after?
The toy elephant of creator Doug Cutting's son
Creator Doug Cutting's favourite circus act
A sound Cutting's laptop made during Hadoop development
Cutting's high school best friend
Q4. All the options given accurately describe Hadoop except one. Which one is it?
Java-based
Real-time
Open-source
Distributed computing approach
Q5. Which type of database is optimized for handling transactional workloads and providing high availability?
NoSQL
OLTP
NewSQL
OLAP
Q6. Data of what byte size is known as Big Data?
Tera
Meta
Giga
Peta
Q7. How many V's are there in Big Data?
5
3
4
2
Q8. Which technology is commonly used for real-time stream processing in Big Data systems?
Kafka
Hadoop
Flink
Spark
Q9. Which of the following is not a component of the Hadoop ecosystem?
MapReduce
HDFS
YARN
Spark
Q10. Which of the following is not a characteristic of a data warehouse?
Integrated data
Historical data
Optimized for analytics
Real-time processing
Q11. Which technology is commonly used for real-time data analytics and visualization?
QlikView
Tableau
Databricks
Power BI
Q12. What is the term for the process of analyzing large and complex datasets to uncover patterns, trends, and insights?
Data Warehousing
Data Analysis
Data Visualization
Data Mining
Q13. Which of the following is not a characteristic of Big Data?
Variety
Velocity
Volume
Velocity
Q14. Which of these Hadoop-based projects is used by Facebook to tackle Big Data?
Project Data
Project Big
Prism
Project Prism
Q15. Which of the following is not a layer of the Big Data stack?
Processing Layer
Application Layer
Presentation Layer
Storage Layer
Q16. What is the term for a large volume of data that cannot be processed using traditional database techniques?
Big Data
Huge Data
Massive Data
Mega Data
Q17. Which of these has the world's largest Hadoop cluster?
All of these
Facebook
Apple
Datamatics
Q18. Which of the following is not a challenge associated with Big Data?
Scalability
Privacy
Security
Data Consistency
Q19. Which technology is commonly used for distributed data storage in Big Data systems?
HDFS
SQL
MongoDB
Cassandra
Q20. Which technology framework is commonly used for distributed storage and processing of Big Data?
Flink
Spark
Hadoop
Kafka
Q21. Which of the following is not a key feature of Apache Spark?
In-memory Computing
Real-time Processing
MapReduce Support
Batch Processing
Q22. Which of the following is a programming model used to develop Hadoop-based applications that can process massive amounts of data?
Oozie
MapReduce
Mahout
None of the above
Q23. What is the term for the process of cleaning and transforming raw data into a usable format for analysis?
Data Staging
Data Preparation
Data Scrubbing
Data Cleansing
Q24. Which technology is commonly used for distributed messaging in Big Data systems?
Kafka
Flink
Spark
Hadoop
Q25. Which type of data refers to data that is generated in real-time or near real-time?
Structured Data
Semi-Structured Data
Unstructured Data
Streaming Data
Q26. What is the term for the process of integrating data from multiple sources to create a unified view?
Data Normalization
Data Aggregation
Data Fusion
Data Integration
Q27. What type of data is a bank's transaction data?
Both structured and unstructured data
Structured data
Unstructured data
None of the above
Q28. Which of the following is not a data type commonly encountered in Big Data?
JSON
CSV
Binary
XML
Q29. What is the term for a collection of data that is too large to be processed using traditional database techniques?
Data Reservoir
Data Lake
Data Stream
Data Pond
Q30. In how many forms can Big Data be found?
1
2
4
3