Psycopg2: Copy CSV Data To Table With Extra Column Values
Solution 1:
Looks like there are a couple of ways to do this by first adding the columns I need to the data, and then uploading the updated data.
Using the petl package:
import psycopg2
from petl import fromcsv, addfield, todb
csv_file = '/path/to/data.csv'
table = fromcsv(csv_file)
table = addfield(table, 'colf', 'Some value')
table = addfield(table, 'colg', 'Another value')
conn = psycopg2.connect('dbname=test user=user')
todb(table, conn, 'my_table')
This works okay on small data, but it's incredibly slow on large data. The psycopg2 copy_from and copy_expert commands seem to go much faster since they make use of PostgreSQL's bulk COPY. I was able to copy my data using copy_from by first converting my CSV file into a pandas DataFrame:
import psycopg2
import pandas as pd
from io import StringIO
csv_file = '/path/to/file'
df = pd.read_csv(csv_file)
df['colf'] = 'My value'
df['colg'] = 'Foobar'
In order to make use of the psycopg2 copy_from and copy_expert commands, I need to convert the DataFrame into a file-like object with read() and readline() methods, which I can do using StringIO:
buf = StringIO()
df.to_csv(buf, header=False, index=False)
buf.seek(0)
Note that you need to seek back to the start of the buffer (position 0), because pandas.to_csv leaves the stream position at the end of what it wrote. See this SO answer for an explanation.
Then I can copy that buffer object:
conn = psycopg2.connect('dbname=test user=user')
cur = conn.cursor()
cur.copy_from(buf, 'my_table', sep=',')
conn.commit()
cur.close()
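For completeness, copy_expert can be used in place of the copy_from call above. It lets you write the COPY statement yourself, which is handy when the buffer's columns are only a subset of the table or in a different order (copy_from also accepts a columns argument for the same purpose). A minimal sketch, with hypothetical column names:
conn = psycopg2.connect('dbname=test user=user')
cur = conn.cursor()
# cola and colb stand in for the original CSV columns; colf and colg are the
# two added ones. Adjust the list to match the buffer's column order.
sql = "COPY my_table (cola, colb, colf, colg) FROM STDIN WITH (FORMAT csv)"
cur.copy_expert(sql, buf)
conn.commit()
cur.close()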