Q1. What type of data is a bank's transaction data?
Both structured and unstructured data
None of the above
Unstructured data
Structured data
Q2. Which of these options is Hadoop named after?
Creator Doug Cutting's favourite circus act
Cutting's high school best friend
The toy elephant of creator Doug Cutting's son
A sound Cutting's laptop made during Hadoop development
Q3. Which of the following is not a component of the Hadoop ecosystem?
HDFS
Spark
MapReduce
YARN
Q4. All the options given accurately describe Hadoop except one. Which one is it?
Open-source
Distributed computing approach
Java-based
Real-time
Q5. What is the term for the process of integrating data from multiple sources to create a unified view?
Data Fusion
Data Aggregation
Data Normalization
Data Integration
Q6. What is the term for the process of storing data across multiple servers to ensure redundancy and fault tolerance?
Data Redundancy
Data Partitioning
Data Sharding
Data Replication
Q7. Which of these has the world's largest Hadoop cluster?
Apple
Facebook
Datamatics
All of the above
Q8. Which technology is commonly used for distributed messaging in Big Data systems?
Spark
Kafka
Hadoop
Flink
Q9. What is the term for the process of analyzing large and complex datasets to uncover patterns, trends, and insights?
Data Analysis
Data Warehousing
Data Mining
Data Visualization
Q10. Which of the following is not a key feature of Apache Spark?
Real-time Processing
MapReduce Support
In-memory Computing
Batch Processing
Q11. Which of the following is not a characteristic of Big Data?
Volume
Velocity
Velocity
Variety
Q12. What is the term for a collection of data that is too large to be processed using traditional database techniques?
Data Pond
Data Lake
Data Stream
Data Reservoir
Q13. How many V's are there in Big Data?
3
2
5
4
Q14. Which of the following is a programming model used to develop Hadoop-based applications that can process massive amounts of data?
MapReduce
Mahout
Oozie
None of the above
Q15. Which of the following is not a characteristic of a data warehouse?
Historical data
Optimized for analytics
Integrated data
Real-time processing
Q16. What is the term for the process of cleaning and transforming raw data into a usable format for analysis?
Data Staging
Data Preparation
Data Scrubbing
Data Cleansing
Q17. Which type of data refers to data that is generated in real-time or near real-time?
Structured Data
Streaming Data
Semi-Structured Data
Unstructured Data
Q18. Which of the following is not a data type commonly encountered in Big Data?
CSV
Binary
JSON
XML
Q19. Data of what size, in bytes, is known as Big Data?
Meta
Giga
Peta
Tera
Q20. In how many forms can Big Data be found?
4
1
3
2
Q21. Which technology is commonly used for real-time stream processing in Big Data systems?
Flink
Spark
Kafka
Hadoop
Q22. Which of the following is not a challenge associated with Big Data?
Security
Scalability
Privacy
Data Consistency
Q23. Which of the following is not a layer of the Big Data stack?
Processing Layer
Application Layer
Presentation Layer
Storage Layer
Q24. Which type of database is optimized for handling transactional workloads and providing high availability?
OLAP
NoSQL
NewSQL
OLTP
Q25. Which technology is commonly used for real-time data analytics and visualization?
Databricks
Power BI
QlikView
Tableau
Q26. Hadoop is a framework used with a variety of related tools. Which of the following are its common cohorts?
MapReduce, Hummer, and Iguana
MapReduce, Heron, and Trumpet
MapReduce, Hive, and HBase
MapReduce, MySQL, and Google Apps
Q27. Which technology framework is commonly used for distributed storage and processing of Big Data?
Kafka
Hadoop
Spark
Flink
Q28. Which of these Hadoop-based projects does Facebook use to tackle Big Data?
Prism
Project Prism
Project Big
Project Data
Q29. What is the term for a large volume of data that cannot be processed using traditional database techniques?
Big Data
Mega Data
Massive Data
Huge Data
Q30. Which technology is commonly used for distributed data storage in Big Data systems?
MongoDB
Cassandra
HDFS
SQL