Which statement about Apache Spark is true?

A. It runs on Hadoop clusters with RAM drives configured on each DataNode.
B. It is much faster than MapReduce for complex applications on disk.
C. It supports HDFS, MS-SQL, and Oracle.
D. It features APIs for C++ and .NET.

by User Blondelg (8.3k points)

1 Answer

4 votes

Final answer:

The true statement about Apache Spark is that it supports HDFS, MS-SQL, and Oracle. Spark reads HDFS through the Hadoop file system API and reaches relational databases such as MS-SQL and Oracle through its JDBC data source. Spark itself does not provide APIs for C++ or .NET.

Step-by-step explanation:

The question asks about the characteristics of Apache Spark, an open-source distributed computing engine built for fast, large-scale data processing. Let's evaluate each statement in turn.

  • It runs on Hadoop clusters with RAM drives configured on each DataNode: This statement is not true as a requirement. Apache Spark can run on Hadoop clusters (for example under YARN), but it does not need RAM drives on the DataNodes; Spark manages its own in-memory caching.
  • It is much faster than MapReduce for complex applications on disk: This statement is generally true. Spark executes a job as a DAG of stages and keeps intermediate results in memory where it can, so multi-step workloads such as streaming, interactive queries, and machine learning avoid much of the disk I/O that MapReduce incurs between jobs. How much faster depends on the workload.
  • It supports HDFS, MS-SQL, and Oracle: This statement is true. Spark can read from and write to HDFS through the Hadoop file system API, and to relational databases such as MS-SQL and Oracle through its JDBC data source, provided the vendor's JDBC driver is on the classpath.
  • It features APIs for C++ and .NET: This statement is false. Spark itself provides APIs in Java, Scala, Python, and R, but not in C++ or .NET.
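
To make the data source point concrete, here is a minimal PySpark-style sketch of how reads from HDFS, MS-SQL, and Oracle might look. The helper names, hosts, tables, and credentials are all hypothetical. The only Spark calls used are `spark.read.text` and the `jdbc` format with the standard `url`, `dbtable`, `user`, `password`, and `driver` options. It assumes you already have a SparkSession and the vendor JDBC driver jars available.

```python
from typing import Dict


def sqlserver_options(host: str, database: str, table: str,
                      user: str, password: str, port: int = 1433) -> Dict[str, str]:
    """Build Spark JDBC options for MS-SQL (hypothetical helper)."""
    return {
        "url": f"jdbc:sqlserver://{host}:{port};databaseName={database}",
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }


def oracle_options(host: str, service: str, table: str,
                   user: str, password: str, port: int = 1521) -> Dict[str, str]:
    """Build Spark JDBC options for Oracle (hypothetical helper)."""
    return {
        "url": f"jdbc:oracle:thin:@//{host}:{port}/{service}",
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": "oracle.jdbc.OracleDriver",
    }


def read_jdbc(spark, options: Dict[str, str]):
    """Load a table as a DataFrame through Spark's JDBC data source."""
    return spark.read.format("jdbc").options(**options).load()


def read_hdfs_text(spark, path: str):
    """Load a text file from HDFS, e.g. an hdfs://... path (assumed to exist)."""
    return spark.read.text(path)
```

With a running session, something like `read_jdbc(spark, sqlserver_options("db.example.com", "sales", "dbo.orders", "reader", "secret"))` would return a DataFrame, as long as the Microsoft JDBC driver jar is on Spark's classpath. The option builders are kept separate from the Spark call so the connection settings can be checked without a cluster.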

Therefore, the statement about Apache Spark that is true is: It supports HDFS, MS-SQL, and Oracle.

by User James Beith (8.9k points)