Q1. In how many forms can Big Data be found?
1
2
3
4
Q2. What is the term for a collection of data that is too large to be processed using traditional database techniques?
Data Lake
Data Reservoir
Data Pond
Data Stream
Q3. Which of the following is not a challenge associated with Big Data?
Scalability
Security
Data Consistency
Privacy
Q4. How many V's are there in Big Data?
5
4
2
3
Q5. Which of the following is not a key feature of Apache Spark?
MapReduce Support
In-memory Computing
Real-time Processing
Batch Processing
Q6. Which of the following is not a characteristic of a data warehouse?
Real-time processing
Historical data
Optimized for analytics
Integrated data
Q7. Which of the following is not a characteristic of Big Data?
Variety
Volume
Velocity
Velocity
Q8. What is the term for the process of analyzing large and complex datasets to uncover patterns, trends, and insights?
Data Analysis
Data Warehousing
Data Mining
Data Visualization
Q9. Data of what size, in bytes, is known as Big Data?
Giga
Tera
Peta
Meta
Q10. Which technology is commonly used for real-time stream processing in Big Data systems?
Flink
Hadoop
Kafka
Spark
Q11. Which of the following is not a data type commonly encountered in Big Data?
XML
Binary
CSV
JSON
Q12. Which of the following does not accurately describe Hadoop?
Open-source
Java-based
Real-time
Distributed computing approach
Q13. What is the term for the process of storing data across multiple servers to ensure redundancy and fault tolerance?
Data Partitioning
Data Replication
Data Sharding
Data Redundancy
Q14. What type of data is a bank's transaction data?
Structured data
Unstructured data
None of the above
Both 1 and 2
Q15. Which technology is commonly used for real-time data analytics and visualization?
Power BI
Tableau
Databricks
QlikView
Q16. Which technology is commonly used for distributed messaging in Big Data systems?
Flink
Hadoop
Kafka
Spark
Q17. Which type of data refers to data that is generated in real-time or near real-time?
Structured Data
Unstructured Data
Streaming Data
Semi-Structured Data
Q18. Which technology framework is commonly used for distributed storage and processing of Big Data?
Hadoop
Spark
Kafka
Flink
Q19. Which of these has the world's largest Hadoop cluster?
All of the options listed
Datamatics
Facebook
Apple
Q20. Hadoop is a framework used with several related tools. Which of the following are its common companions?
MapReduce, Heron, and Trumpet
MapReduce, MySQL, and Google Apps
MapReduce, Hummer, and Iguana
MapReduce, Hive, and HBase
Q21. Which programming model is used to develop Hadoop-based applications that can process massive amounts of data?
MapReduce
Mahout
Oozie
None of the above
Q22. Which type of database is optimized for handling transactional workloads and providing high availability?
OLTP
NoSQL
OLAP
NewSQL
Q23. What is the term for the process of cleaning and transforming raw data into a usable format for analysis?
Data Cleansing
Data Preparation
Data Scrubbing
Data Staging
Q24. What is the term for the process of integrating data from multiple sources to create a unified view?
Data Fusion
Data Aggregation
Data Integration
Data Normalization
Q25. Which technology is commonly used for distributed data storage in Big Data systems?
SQL
MongoDB
HDFS
Cassandra
Q26. Which of these Hadoop-based projects does Facebook use to tackle Big Data?
Project Big
Prism
Project Data
Project Prism
Q27. What is the term for a large volume of data that cannot be processed using traditional database techniques?
Huge Data
Mega Data
Massive Data
Big Data
Q28. Which of the following is not a component of the Hadoop ecosystem?
HDFS
YARN
Spark
MapReduce
Q29. What is Hadoop named after?
Creator Doug Cutting's favourite circus act
Cutting's high school best friend
The toy elephant of creator Doug Cutting's son
A sound Cutting's laptop made during Hadoop development
Q30. Which of the following is not a layer of the Big Data stack?
Storage Layer
Application Layer
Presentation Layer
Processing Layer