Question 124
You are designing a cloud-native historical data processing system to meet the following conditions:
✑ The data being analyzed is in CSV, Avro, and PDF formats and will be accessed by multiple analysis tools including Dataproc, BigQuery, and Compute
Engine.
✑ A batch pipeline moves daily data.
✑ Performance is not a factor in the solution.
✑ The solution design should maximize availability.
How should you design data storage for this solution?
- A. Create a Dataproc cluster with high availability. Store the data in HDFS, and perform analysis as needed.
- B. Store the data in BigQuery. Access the data using the BigQuery Connector on Dataproc and Compute Engine.
- C. Store the data in a regional Cloud Storage bucket. Access the bucket directly using Dataproc, BigQuery, and Compute Engine.
- D. Store the data in a multi-regional Cloud Storage bucket. Access the data directly using Dataproc, BigQuery, and Compute Engine.