BigQuery: Too Many Table DML Insert Operations For This Table
I'm trying to import more than 200M records from different computers (n=20) into my BigQuery table via the Python client. Each computer runs a job (with multiple rows) every 10 seconds, and I'm hitting the "Too many table DML insert operations for this table" error.
Solution 1:
There are 4 major ways to insert data into BigQuery tables.
- Batch load a set of data records (see the sketch after this list).
- Stream individual records or batches of records.
- Use queries to generate new data and append or overwrite the results to a table.
- Use a third-party application or service.
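For reference, a minimal batch-load sketch with the Python client might look like the following; the table_id and the JSON rows are assumptions for illustration, and schema auto-detection is relied on if the table does not already exist.
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical table ID for illustration.
table_id = "your-project.your_dataset.your_table"

rows = [
    {"full_name": "Phred Phlyntstone", "age": 32},
    {"full_name": "Wylma Phlyntstone", "age": 29},
]

# Batch load the rows as a single load job rather than a DML statement.
job = client.load_table_from_json(rows, table_id)
job.result()  # Wait for the load job to complete.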
I think you are using the 3rd option, which is a DML INSERT. It's not designed for large-scale, high-frequency data loading use cases.
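For context, a DML-based insert (presumably close to what you are doing now) runs each batch as a query job, and each of those statements counts against the per-table DML quota. This is only a sketch; the table and column names are assumptions.
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical table and columns for illustration.
query = """
    INSERT INTO `your-project.your_dataset.your_table` (full_name, age)
    VALUES ('Phred Phlyntstone', 32), ('Wylma Phlyntstone', 29)
"""

# Each call runs a DML statement as a query job and counts
# against the table's DML insert limits.
job = client.query(query)
job.result()  # Wait for the DML statement to finish.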
In your use case, it seems the 2nd option, streaming data, could be a good fit.
Example
from google.cloud import bigquery

# Construct a BigQuery client object.
client = bigquery.Client()

# TODO(developer): Set table_id to the ID of table to append to.
# table_id = "your-project.your_dataset.your_table"

rows_to_insert = [
    {u"full_name": u"Phred Phlyntstone", u"age": 32},
    {u"full_name": u"Wylma Phlyntstone", u"age": 29},
]

errors = client.insert_rows_json(table_id, rows_to_insert)  # Make an API request.
if errors == []:
    print("New rows have been added.")
else:
    print("Encountered errors while inserting rows: {}".format(errors))
You can find more details here: https://cloud.google.com/bigquery/streaming-data-into-bigquery