
Read Csv With Extra Commas And No Quotechar With Pandas?

Data:

from io import StringIO
import pandas as pd

s = '''ID,Level,QID,Text,ResponseID,responseText,date_key
375280046,S,D3M,Which is your favorite?,D5M0,option 1,2012-08-08 00:00:

Solution 1:

Of course, as I was writing the question, I figured it out. Rather than delete it, I'll leave it here for my future self, for when I forget how to do this.

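It turns out that the sep argument of pandas' read_csv, which defaults to ',', can also be a regular expression. Separators longer than one character (other than '\s+') are interpreted as regular expressions and are parsed with the Python engine.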

The solution was to add sep=r',(?!\s)' to read_csv like so:

df = pd.read_csv(StringIO(s), sep=r',(?!\s)')

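The (?!\s) part is a negative lookahead: the pattern matches a comma only when it is not immediately followed by whitespace. The commas inside the question text are each followed by a space, so they are not treated as delimiters.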
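To see what the lookahead does on its own, here is a small sketch using Python's re module on a single line. The sample line is an assumption modeled on the rows shown in the result, not a verbatim copy of the original input.

```python
import re

# Hypothetical line modeled on the rows in the result; not the original input.
line = "375280046,S,D3M,How often? (at home, at work, other),D3M0,Work,2010-03-31 00:00:00"

# Match only commas that are NOT immediately followed by whitespace.
pattern = re.compile(r",(?!\s)")

fields = pattern.split(line)   # split on the "real" delimiters only
naive = line.split(",")        # plain split also breaks the question text apart

print(len(fields), len(naive))
print(fields[3])
```

The regex split keeps the question text in one field, while the plain split produces extra fields. The trick only works while every real delimiter is followed by a non-whitespace character and every comma inside the text is followed by whitespace.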

Result:

          ID Level  QID                                  Text ResponseID  \
0  375280046     S  D3M               Which is your favorite?       D5M0   
1  375280046     S  D3M  How often? (at home, at work, other)       D3M0   
2  375280046     M  A78             Do you prefer a, b, or c?       A78C   

  responseText             date_key  
0     option 1  2012-08-08 00:00:00  
1         Work  2010-03-31 00:00:00  
2            a  2010-03-31 00:00:00  
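For reference, a self-contained sketch that should reproduce a frame of this shape. The sample rows are rebuilt from the result above, which is an assumption because the original input is cut off. Passing engine="python" is also my addition: pandas uses the Python engine for regex separators anyway, and naming it explicitly avoids the ParserWarning about the fallback.

```python
from io import StringIO

import pandas as pd

# Sample rows rebuilt from the result shown above (an assumption; the
# original input string is truncated).
s = """ID,Level,QID,Text,ResponseID,responseText,date_key
375280046,S,D3M,Which is your favorite?,D5M0,option 1,2012-08-08 00:00:00
375280046,S,D3M,How often? (at home, at work, other),D3M0,Work,2010-03-31 00:00:00
375280046,M,A78,Do you prefer a, b, or c?,A78C,a,2010-03-31 00:00:00
"""

# Regex separator: a comma not followed by whitespace.
# engine="python" is stated explicitly because the C engine does not
# support regex separators.
df = pd.read_csv(StringIO(s), sep=r",(?!\s)", engine="python")

print(df.shape)
print(df["Text"].tolist())
```

If a real delimiter in your data can be followed by a space, or free text can contain a comma with no space after it, this separator will misparse those rows, so it is worth checking the column count after loading.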
