108k views
5 votes
Why would a firm choose Hadoop for a big data project instead of using conventional relational databases?

by Amuniz (7.1k points)

1 Answer

4 votes

Final answer:

Firms choose Hadoop for big data projects because it is designed to store and analyze large volumes of unstructured data, scales out economically on clusters of commodity hardware, and offers high-throughput data processing. Conventional relational databases, optimized for structured data, may not handle the scale and complexity of big data as efficiently as Hadoop, especially in high-velocity data environments like the Sloan Digital Sky Survey.

Step-by-step explanation:

A firm might choose Hadoop for a big data project over conventional relational databases for several reasons. Hadoop is specifically designed to store and analyze huge amounts of unstructured data in a distributed computing environment, which lets it handle the volume, velocity, and variety of big data more efficiently than traditional relational databases, which are optimized for structured data and may not scale as well economically or technically.
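
For example, here is a minimal sketch of loading a raw, unstructured file into HDFS through Hadoop's standard Java FileSystem API. The destination path and payload are hypothetical; the point is that HDFS accepts arbitrary bytes, with no schema required up front:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteSketch {
    public static void main(String[] args) throws Exception {
        // Reads fs.defaultFS and other cluster settings from core-site.xml
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical destination path; HDFS splits the file into blocks
        // and replicates them across DataNodes automatically
        try (FSDataOutputStream out = fs.create(new Path("/data/sdss/scan-001.dat"))) {
            out.writeBytes("raw instrument readings...");
        }
    }
}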

For instance, in a scientific setting like the Sloan Digital Sky Survey (SDSS), where data rates can reach 8 megabytes per second and totals exceed 15 terabytes, conventional databases might struggle with storage and fast retrieval. Hadoop, on the other hand, employs the Hadoop Distributed File System (HDFS), which grows with the data, and the MapReduce programming model, which processes large datasets in parallel across a cluster of servers.
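
To make the MapReduce part concrete, the classes below follow the canonical word-count pattern from the Hadoop MapReduce tutorial, written against Hadoop's standard org.apache.hadoop.mapreduce API. The class names are illustrative: each mapper processes one HDFS block in parallel, and the framework groups the emitted keys before the reducers aggregate them:

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class TokenCount {

    // Map phase: runs in parallel, one task per HDFS block, emitting (token, 1)
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce phase: receives all values for one token and emits the total
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }
}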

Additionally, Hadoop's ability to process data in parallel across a cluster delivers high throughput, making it well suited to applications that must process large datasets quickly. This is critical for projects like the SDSS, where timely analysis is key to scientific discovery. Conventional relational databases typically do not offer the same flexibility or cost-effectiveness for such tasks.
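
As a sketch of how such a job is configured and submitted to the cluster (the input and output paths are hypothetical, and the classes refer to the word-count sketch above):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TokenCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "token count");
        job.setJarByClass(TokenCount.class);
        job.setMapperClass(TokenCount.TokenizerMapper.class);
        // Reusing the reducer as a combiner pre-aggregates on each node
        job.setCombinerClass(TokenCount.IntSumReducer.class);
        job.setReducerClass(TokenCount.IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path("/data/sdss"));     // hypothetical input dir
        FileOutputFormat.setOutputPath(job, new Path("/out/tokens"));  // hypothetical output dir
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Setting the reducer as a combiner pre-aggregates counts on each mapper's node, which cuts shuffle traffic across the network and is one reason throughput holds up as the cluster and the data grow.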

by Achuth (8.1k points)