Question 138
You have several Spark jobs that run on a Cloud Dataproc cluster on a schedule. Some of the jobs run in sequence, and some of the jobs run concurrently. You need to automate this process. What should you do?
- A. Create a Cloud Dataproc Workflow Template
- B. Create an initialization action to execute the jobs
- C. Create a Directed Acyclic Graph in Cloud Composer
- D. Create a Bash script that uses the Cloud SDK to create a cluster, execute jobs, and then tear down the cluster
A. Create a Cloud Dataproc Workflow Template. A Workflow Template defines a directed acyclic graph of jobs: steps with no dependencies run concurrently, and steps with declared dependencies run in sequence, which covers both requirements. A Cloud Composer DAG (C) would also work but is overkill for orchestrating Dataproc jobs alone; an initialization action (B) runs setup scripts at cluster creation, not scheduled jobs; and a Bash script (D) is a brittle, unmanaged solution.
https://cloud.google.com/dataproc/docs/concepts/workflows/use-workflows
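The answer above can be sketched with the gcloud CLI. The template name, region, bucket, JAR, and class names below are illustrative assumptions, not values from the question:

```shell
# Create a workflow template (name and region are placeholders).
gcloud dataproc workflow-templates create spark-pipeline --region=us-central1

# Use a managed cluster: it is created for the run and deleted afterwards.
gcloud dataproc workflow-templates set-managed-cluster spark-pipeline \
  --region=us-central1 \
  --cluster-name=spark-pipeline-cluster \
  --num-workers=2

# Two steps with no dependencies -- these run concurrently.
gcloud dataproc workflow-templates add-job spark \
  --workflow-template=spark-pipeline --region=us-central1 \
  --step-id=prepare-a \
  --class=com.example.PrepareA --jars=gs://my-bucket/jobs.jar
gcloud dataproc workflow-templates add-job spark \
  --workflow-template=spark-pipeline --region=us-central1 \
  --step-id=prepare-b \
  --class=com.example.PrepareB --jars=gs://my-bucket/jobs.jar

# --start-after makes this step wait for both prepare steps (sequential).
gcloud dataproc workflow-templates add-job spark \
  --workflow-template=spark-pipeline --region=us-central1 \
  --step-id=aggregate --start-after=prepare-a,prepare-b \
  --class=com.example.Aggregate --jars=gs://my-bucket/jobs.jar

# Run the whole DAG; to run it on a schedule, trigger this from
# Cloud Scheduler or a cron job.
gcloud dataproc workflow-templates instantiate spark-pipeline --region=us-central1
```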