How To List All The Dataflow Jobs Using The Python API
My use case involves fetching the job IDs of all streaming Dataflow jobs in my project and cancelling them, then updating the sources for my Dataflow job and re-running it. I am trying to ac
Solution 1:
You can call the Dataflow REST API directly, like this:
import google.auth
from google.auth.transport.requests import AuthorizedSession

base_url = 'https://dataflow.googleapis.com/v1b3/projects/'
# Application Default Credentials; the detected project is ignored and set explicitly below.
credentials, _ = google.auth.default(scopes=['https://www.googleapis.com/auth/cloud-platform'])
project_id = 'PROJECT_ID'  # replace with your project ID
location = 'europe-west1'  # region where the jobs run
authed_session = AuthorizedSession(credentials)
# List the jobs in the given project and region.
response = authed_session.request('GET', f'{base_url}{project_id}/locations/{location}/jobs')
print(response.json())
This requires the google-auth package to be installed, along with requests, which AuthorizedSession is built on.
You can also add the query parameter ?filter=ACTIVE to return only the active jobs, which should cover your running streaming jobs.
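As a rough sketch of how the ?filter=ACTIVE suggestion could be combined with the question's goal of collecting streaming job IDs: the helper names below (jobs_list_url, streaming_job_ids, list_active_streaming_job_ids) are made up for illustration. The sketch assumes the list response carries a jobs array and an optional nextPageToken, and that each job has id and type fields, with streaming jobs marked as JOB_TYPE_STREAMING.

```python
from urllib.parse import urlencode

BASE_URL = 'https://dataflow.googleapis.com/v1b3/projects/'


def jobs_list_url(project_id, location, job_filter='ACTIVE', page_token=None):
    """Build the jobs.list URL with the filter and an optional page token."""
    params = {'filter': job_filter}
    if page_token:
        params['pageToken'] = page_token
    return f'{BASE_URL}{project_id}/locations/{location}/jobs?{urlencode(params)}'


def streaming_job_ids(payload):
    """Pick the IDs of streaming jobs out of one page of the list response."""
    return [
        job['id']
        for job in payload.get('jobs', [])
        if job.get('type') == 'JOB_TYPE_STREAMING'
    ]


def list_active_streaming_job_ids(authed_session, project_id, location):
    """Walk every result page and collect the IDs of active streaming jobs.

    authed_session is expected to behave like the AuthorizedSession above.
    """
    job_ids = []
    page_token = None
    while True:
        url = jobs_list_url(project_id, location, page_token=page_token)
        response = authed_session.request('GET', url)
        response.raise_for_status()
        payload = response.json()
        job_ids.extend(streaming_job_ids(payload))
        page_token = payload.get('nextPageToken')
        if not page_token:
            return job_ids
```

With the authed_session from the snippet above, list_active_streaming_job_ids(authed_session, project_id, location) would return a plain list of job IDs. Following the page token matters once a project has more jobs than fit in a single response.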
Solution 2:
In addition to using the REST API directly, you can use the generated Python bindings for the API in google-api-python-client. For simple calls it doesn't add much value, but when passing in many parameters it can be easier to work with than a raw HTTP library.
With that library, the jobs list call would look like this:
from googleapiclient.discovery import build
import google.auth

credentials, project_id = google.auth.default(scopes=['https://www.googleapis.com/auth/cloud-platform'])
df_service = build('dataflow', 'v1b3', credentials=credentials)
# The generated client takes the API's own parameter names, so it is projectId, not project_id.
response = df_service.projects().locations().jobs().list(
    projectId=project_id,
    location='<region>').execute()
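For the cancel step mentioned in the question, one possible approach with the same client is sketched below. The helper names iter_jobs and cancel_streaming_jobs are hypothetical. The sketch assumes keyword arguments follow the API's camelCase parameter names (projectId, jobId), that jobs can be paged with the client's list_next helper, and that cancellation is requested by updating a job with requestedState set to JOB_STATE_CANCELLED. Check these against the Dataflow API reference before relying on it.

```python
def iter_jobs(df_service, project_id, location, job_filter='ACTIVE'):
    """Yield every job dict, following result pages with list_next()."""
    jobs_api = df_service.projects().locations().jobs()
    request = jobs_api.list(
        projectId=project_id, location=location, filter=job_filter)
    while request is not None:
        response = request.execute()
        yield from response.get('jobs', [])
        # list_next() returns None once there are no more pages.
        request = jobs_api.list_next(
            previous_request=request, previous_response=response)


def cancel_streaming_jobs(df_service, project_id, location):
    """Request cancellation of each active streaming job; return their IDs."""
    jobs_api = df_service.projects().locations().jobs()
    cancelled = []
    for job in iter_jobs(df_service, project_id, location):
        if job.get('type') != 'JOB_TYPE_STREAMING':
            continue  # leave batch jobs alone
        jobs_api.update(
            projectId=project_id,
            location=location,
            jobId=job['id'],
            body={'requestedState': 'JOB_STATE_CANCELLED'},
        ).execute()
        cancelled.append(job['id'])
    return cancelled


# Usage, assuming df_service and project_id from the snippet above:
# cancelled_ids = cancel_streaming_jobs(df_service, project_id, '<region>')
```

Both helpers take the service object as an argument, so they can be tried against a stub before being pointed at a real project.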