
BigQuery: Too Many Table DML Insert Operations For This Table

I'm trying to import more than 200M records from different computers (n=20) into my BigQuery table via the Python client. Each computer runs a job (with multiple rows) every 10 seconds from …

Solution 1:

There are 4 major ways to insert data into BigQuery tables.

  1. Batch load a set of data records.
  2. Stream individual records or batches of records.
  3. Use queries to generate new data and append or overwrite the results to a table.
  4. Use a third-party application or service.

I think you are using the 3rd option, a DML INSERT. It is not designed for large-scale, high-frequency data loading use cases.
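
For reference, that pattern typically looks something like the sketch below (the table name and values are placeholders, not your actual code). Each such statement runs as a query job, and BigQuery enforces per-table limits on queued DML operations, which is what the error in the title points at.

from google.cloud import bigquery

client = bigquery.Client()

# Placeholder table and values, for illustration only.
table_id = "your-project.your_dataset.your_table"

query = f"""
    INSERT INTO `{table_id}` (full_name, age)
    VALUES ('Phred Phlyntstone', 32),
           ('Wylma Phlyntstone', 29)
"""

# Every call like this is a separate DML job and counts against the
# per-table DML limits, so 20 machines doing this every 10 seconds
# quickly run into the quota error.
query_job = client.query(query)
query_job.result()  # Wait for the job to finish.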

In your use case, it seems the 2nd option, streaming data, could be a good fit.

Example

from google.cloud import bigquery

# Construct a BigQuery client object.
client = bigquery.Client()

# TODO(developer): Set table_id to the ID of the table to append to.
# table_id = "your-project.your_dataset.your_table"

rows_to_insert = [
    {"full_name": "Phred Phlyntstone", "age": 32},
    {"full_name": "Wylma Phlyntstone", "age": 29},
]

errors = client.insert_rows_json(table_id, rows_to_insert)  # Make an API request.
if errors == []:
    print("New rows have been added.")
else:
    print("Encountered errors while inserting rows: {}".format(errors))

You can find more details here: https://cloud.google.com/bigquery/streaming-data-into-bigquery
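
For the volume you describe (200M+ records), the 1st option, batch loading, is also worth a look: instead of inserting rows statement by statement, write them to files (newline-delimited JSON, CSV, Avro, Parquet) and load the files with a load job. A minimal sketch, assuming a local newline-delimited JSON file and the same two-column schema as the example above:

from google.cloud import bigquery

client = bigquery.Client()

# Placeholder IDs/paths, adjust to your project.
table_id = "your-project.your_dataset.your_table"
file_path = "rows.ndjson"

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    schema=[
        bigquery.SchemaField("full_name", "STRING"),
        bigquery.SchemaField("age", "INTEGER"),
    ],
)

with open(file_path, "rb") as source_file:
    load_job = client.load_table_from_file(
        source_file, table_id, job_config=job_config
    )

load_job.result()  # Wait for the load job to complete.
print("Loaded {} rows.".format(load_job.output_rows))

Load jobs are subject to their own quotas, separate from the DML limits, and they scale well for imports of this size.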
