Q1. Which programming model is used to develop Hadoop-based applications that can process massive amounts of data?
MapReduce
Mahout
Oozie
None of the above
Q2. Data of what size, in bytes, is known as Big Data?
Tera
Peta
Giga
Meta
Q3. Which of these has the world's largest Hadoop cluster?
Datamatics
Facebook
All of the above
Apple
Q4. What type of data is a bank's transaction data?
None of the above
Unstructured data
Structured data
Both structured and unstructured data
Q5. Which of the following is not a data type commonly encountered in Big Data?
XML
Binary
CSV
JSON
Q6. Which type of database is optimized for handling transactional workloads and providing high availability?
NoSQL
NewSQL
OLTP
OLAP
Q7. Which of the following is not a challenge associated with Big Data?
Security
Scalability
Data Consistency
Privacy
Q8. Which technology is commonly used for real-time data analytics and visualization?
Power BI
Tableau
Databricks
QlikView
Q9. Hadoop is a framework that is used with several related tools. Which of these are its common companions?
MapReduce, Hummer, and Iguana
MapReduce, Heron, and Trumpet
MapReduce, MySQL, and Google Apps
MapReduce, Hive, and HBase
Q10. Which of the following is not a characteristic of a data warehouse?
Real-time processing
Integrated data
Historical data
Optimized for analytics
Q11. Which of these Hadoop-based projects is used by Facebook to tackle Big Data?
Project Prism
Prism
Project Big
Project Data
Q12. Which of the following is not a characteristic of Big Data?
Velocity
Volume
Velocity
Variety
Q13. What is the term for a large volume of data that cannot be processed using traditional database techniques?
Huge Data
Massive Data
Big Data
Mega Data
Q14. Which technology is commonly used for distributed messaging in Big Data systems?
Kafka
Hadoop
Flink
Spark
Q15. What is the term for a collection of data that is too large to be processed using traditional database techniques?
Data Reservoir
Data Stream
Data Pond
Data Lake
Q16. Which of the following is not a key feature of Apache Spark?
Batch Processing
Real-time Processing
MapReduce Support
In-memory Computing
Q17. Which term refers to data that is generated in real time or near real time?
Streaming Data
Structured Data
Semi-Structured Data
Unstructured Data
Q18. Which technology is commonly used for real-time stream processing in Big Data systems?
Spark
Hadoop
Flink
Kafka
Q19. What is the term for the process of analyzing large and complex datasets to uncover patterns, trends, and insights?
Data Visualization
Data Warehousing
Data Analysis
Data Mining
Q20. What is the term for the process of integrating data from multiple sources to create a unified view?
Data Normalization
Data Fusion
Data Aggregation
Data Integration
Q21. Which of the following is not a layer of the Big Data stack?
Storage Layer
Presentation Layer
Processing Layer
Application Layer
Q22. All of the following options accurately describe Hadoop except one. Which one is it?
Java-based
Real-time
Open-source
Distributed computing approach
Q23. What is the term for the process of storing data across multiple servers to ensure redundancy and fault tolerance?
Data Partitioning
Data Sharding
Data Replication
Data Redundancy
Q24. What is Hadoop named after?
Cutting's high school best friend
The toy elephant of creator Doug Cutting's son
Creator Doug Cutting's favourite circus act
A sound Cutting's laptop made during Hadoop development
Q25. Which technology framework is commonly used for distributed storage and processing of Big Data?
Kafka
Flink
Hadoop
Spark
Q26. How many V's are there in Big Data?
4
5
2
3
Q27. Which of the following is not a component of the Hadoop ecosystem?
MapReduce
YARN
HDFS
Spark
Q28. Which technology is commonly used for distributed data storage in Big Data systems?
Cassandra
HDFS
SQL
MongoDB
Q29. What is the term for the process of cleaning and transforming raw data into a usable format for analysis?
Data Preparation
Data Staging
Data Cleansing
Data Scrubbing
Q30. In how many forms can Big Data be found?
2
1
3
4