Question 138

You have several Spark jobs that run on a Cloud Dataproc cluster on a schedule. Some of the jobs run in sequence, and some of the jobs run concurrently. You need to automate this process. What should you do?

  • A. Create a Cloud Dataproc Workflow Template
  • B. Create an initialization action to execute the jobs
  • C. Create a Directed Acyclic Graph in Cloud Composer
  • D. Create a Bash script that uses the Cloud SDK to create a cluster, execute jobs, and then tear down the cluster

A. Create a Cloud Dataproc Workflow Template. A Workflow Template defines a graph of jobs on a Dataproc cluster: steps that list prerequisite steps run in sequence, and steps with no dependency between them run concurrently, which is exactly what this scenario needs. A Cloud Composer DAG (C) could also orchestrate the jobs but is overkill for a single-cluster pipeline; initialization actions (B) run setup scripts at cluster creation time, not scheduled jobs; and a hand-rolled Bash script (D) is brittle and adds no managed orchestration.

https://cloud.google.com/dataproc/docs/concepts/workflows/use-workflows
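For reference, here is a minimal sketch of how such a template could be defined with the google-cloud-dataproc Python client. The project ID, region, bucket paths, step IDs, and the near-empty cluster config are all placeholder assumptions, not values from the question:

```python
# Sketch: a Dataproc Workflow Template with one sequential step followed by
# two concurrent steps. All names and URIs below are hypothetical.
from google.cloud import dataproc_v1

project_id = "my-project"   # placeholder
region = "us-central1"      # placeholder

client = dataproc_v1.WorkflowTemplateServiceClient(
    client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
)

template = {
    "id": "spark-pipeline",
    "placement": {
        # A managed cluster is created for each run and deleted afterwards.
        "managed_cluster": {
            "cluster_name": "ephemeral-cluster",
            "config": {},  # assumed defaults; customize machine types etc.
        }
    },
    "jobs": [
        # "prepare" runs first; "stage-a" and "stage-b" both depend on it,
        # so they start together and run concurrently once it finishes.
        {"step_id": "prepare",
         "spark_job": {"main_jar_file_uri": "gs://my-bucket/prepare.jar"}},
        {"step_id": "stage-a",
         "prerequisite_step_ids": ["prepare"],
         "spark_job": {"main_jar_file_uri": "gs://my-bucket/stage_a.jar"}},
        {"step_id": "stage-b",
         "prerequisite_step_ids": ["prepare"],
         "spark_job": {"main_jar_file_uri": "gs://my-bucket/stage_b.jar"}},
    ],
}

parent = f"projects/{project_id}/regions/{region}"
client.create_workflow_template(parent=parent, template=template)

# Each scheduled run (e.g., triggered by Cloud Scheduler) instantiates the
# template: the cluster is created, the job graph runs, the cluster is torn down.
operation = client.instantiate_workflow_template(
    name=f"{parent}/workflowTemplates/spark-pipeline"
)
operation.result()  # block until the workflow completes
```

The same template could equally be written as YAML and managed with `gcloud dataproc workflow-templates`; the `prerequisite_step_ids` field is what encodes the sequential-vs-concurrent structure in either form.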