Q1. Which technology is commonly used for real-time stream processing in Big Data systems?
Spark
Hadoop
Flink
Kafka
Q2. What is the term for a large volume of data that cannot be processed using traditional database techniques?
Huge Data
Big Data
Mega Data
Massive Data
Q3. Which of the following is not a data type commonly encountered in Big Data?
CSV
JSON
Binary
XML
Q4. Which of these has the world's largest Hadoop cluster?
All of the above
Apple
Facebook
Datamatics
Q5. Which technology framework is commonly used for distributed storage and processing of Big Data?
Kafka
Hadoop
Flink
Spark
Q6. What is the term for a collection of data that is too large to be processed using traditional database techniques?
Data Lake
Data Reservoir
Data Pond
Data Stream
Q7. All the options given accurately describe Hadoop except one. Which one is it?
Java-based
Open-source
Distributed computing approach
Real-time
Q8. Which of these options is Hadoop named after?
The toy elephant of creator Doug Cutting's son
A sound Cutting's laptop made during Hadoop development
Creator Doug Cutting's favourite circus act
Cutting's high school best friend
Q9. Which of these Hadoop-based projects does Facebook use to tackle Big Data?
Project Big
Prism
Project Prism
Project Data
Q10. Which of the following is not a characteristic of a data warehouse?
Real-time processing
Optimized for analytics
Historical data
Integrated data
Q11. How many V's are there in Big Data?
2
5
3
4
Q12. What is the term for the process of storing data across multiple servers to ensure redundancy and fault tolerance?
Data Partitioning
Data Replication
Data Sharding
Data Redundancy
Q13. Which of the following is not a characteristic of Big Data?
Velocity
Variety
Volume
Velocity
Q14. Which technology is commonly used for distributed data storage in Big Data systems?
MongoDB
HDFS
SQL
Cassandra
Q15. Which type of data refers to data that is generated in real-time or near real-time?
Unstructured Data
Structured Data
Streaming Data
Semi-Structured Data
Q16. Which technology is commonly used for distributed messaging in Big Data systems?
Flink
Spark
Hadoop
Kafka
Q17. In how many forms can Big Data be found?
3
4
2
1
Q18. Which of the following is not a key feature of Apache Spark?
MapReduce Support
Batch Processing
In-memory Computing
Real-time Processing
Q19. Which type of database is optimized for handling transactional workloads and providing high availability?
NoSQL
OLTP
NewSQL
OLAP
Q20. Which of the following is not a challenge associated with Big Data?
Data Consistency
Scalability
Privacy
Security
Q21. What is the term for the process of integrating data from multiple sources to create a unified view?
Data Fusion
Data Normalization
Data Integration
Data Aggregation
Q22. Hadoop is a framework that is used alongside several related tools. Which of these are its common companions?
MapReduce, Hive, and HBase
MapReduce, Heron, and Trumpet
MapReduce, MySQL, and Google Apps
MapReduce, Hummer, and Iguana
Q23. What is the term for the process of analyzing large and complex datasets to uncover patterns, trends, and insights?
Data Analysis
Data Mining
Data Visualization
Data Warehousing
Q24. Which of the following is not a layer of the Big Data stack?
Presentation Layer
Storage Layer
Application Layer
Processing Layer
Q25. What type of data is a bank's transaction data?
Unstructured data
Both 1 and 2
Structured data
None of the above
Q26. Which of these is a programming model used to develop Hadoop-based applications that can process massive amounts of data?
Mahout
None of the above
MapReduce
Oozie
Q27. What is the term for the process of cleaning and transforming raw data into a usable format for analysis?
Data Scrubbing
Data Preparation
Data Cleansing
Data Staging
Q28. Which technology is commonly used for real-time data analytics and visualization?
Tableau
Databricks
QlikView
Power BI
Q29. Data of what size, in bytes, is known as Big Data?
Giga
Peta
Meta
Tera
Q30. Which of the following is not a component of the Hadoop ecosystem?
MapReduce
YARN
HDFS
Spark