
How To Remove Extra Commas From Data In Python

I have a CSV file from which I am trying to load data into a SQL table with 2 columns. The data is separated by commas, which identify the next field

Solution 1:

If all your data is in the same format, i.e. Number Address "000" , "000 abc street, Unit 000", you can split the string into a list, remove the comma from the relevant item, and join the list back into a string. For example, using the data you gave:

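ori_addr = "Number Address \"12345\" , \"123 abc street, Unit 345\""
# Split on whitespace; for this exact format, item 6 is 'street,'.
addr = ori_addr.split()
# Drop the comma that belongs to the address rather than separating fields.
addr[6] = addr[6].replace(",", "")
# Rebuild the string from the edited list.
together_addr = " ".join(addr)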

together_addr is now equal to: Number Address "12345" , "123 abc street Unit 345" (note that there is no comma between "street" and "Unit").
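The fixed index 6 only works when every row has exactly the same number of words before the comma. As a hedged alternative, assuming every field is wrapped in double quotes and contains no escaped quotes, a regular expression can strip commas only inside the quoted parts. The helper name strip_quoted_commas is hypothetical and not part of the original answer.

```python
import re

# Matches one double-quoted field (assumes no escaped quotes inside it).
QUOTED_FIELD = re.compile(r'"[^"]*"')

def strip_quoted_commas(line):
    """Remove commas inside double-quoted fields, keeping separator commas."""
    return QUOTED_FIELD.sub(lambda m: m.group(0).replace(",", ""), line)

ori_addr = 'Number Address "12345" , "123 abc street, Unit 345"'
together_addr = strip_quoted_commas(ori_addr)
print(together_addr)
```

Because the comma between the two quoted fields sits outside any quotes, it is left in place as the field separator.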

Solution 2:

Edits:

  • Following the user's comments, added a failing address to this test. This address loads into the database without issue.
  • Added code to store CSV addresses into MySQL.

Answer:

The code below performs the following actions:

  • MySQL database engine (connection) created.
  • Address data (number, address) read from CSV file.
  • Non-field-separating commas removed from source data, and extra whitespace stripped.
  • Edited data fed into a DataFrame.
  • DataFrame used to store data into MySQL.
    import csv
    import pandas as pd
    from sqlalchemy import create_engine

    # Set database credentials.
    creds = {'usr': 'admin',
             'pwd': '1tsaSecr3t',
             'hst': '127.0.0.1',
             'prt': 3306,
             'dbn': 'playground'}
    # MySQL connection string.
    connstr = 'mysql+mysqlconnector://{usr}:{pwd}@{hst}:{prt}/{dbn}'
    # Create sqlalchemy engine for MySQL connection.
    engine = create_engine(connstr.format(**creds))

    # Read addresses from CSV file.
    with open('comma_test.csv', newline='') as f:
        text = list(csv.reader(f, skipinitialspace=True))

    # Replace all commas which are not used as field separators.
    # Remove additional whitespace.
    for idx, row in enumerate(text):
        text[idx] = [i.strip().replace(',', '') for i in row]

    # Store data into a DataFrame.
    df = pd.DataFrame(data=text, columns=['number', 'address'])
    # Write DataFrame to MySQL using the engine (connection) created above.
    df.to_sql(name='commatest', con=engine, if_exists='append', index=False)

Source File (comma_test.csv):

"12345" , "123 abc street, Unit 345"
"10101" , "111 abc street, Unit 111"
"20202" , "222 abc street, Unit 222"
"30303" , "333 abc street, Unit 333"
"40404" , "444 abc street, Unit 444"
"50505" , "abc DR, UNIT# 123 UNIT 123"

Unedited Data:

['12345 ', '123 abc street, Unit 345']
['10101 ', '111 abc street, Unit 111']
['20202 ', '222 abc street, Unit 222']
['30303 ', '333 abc street, Unit 333']
['40404 ', '444 abc street, Unit 444']
['50505 ', 'abc DR, UNIT# 123 UNIT 123']

Edited Data:

['12345', '123 abc street Unit 345']
['10101', '111 abc street Unit 111']
['20202', '222 abc street Unit 222']
['30303', '333 abc street Unit 333']
['40404', '444 abc street Unit 444']
['50505', 'abc DR UNIT# 123 UNIT 123']
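
The unedited and edited rows above can be reproduced without a database. The sketch below is a minimal illustration, assuming two sample rows copied from comma_test.csv and read from an in-memory string instead of a file; the helper name clean_rows is hypothetical and not part of the original answer.

```python
import csv
import io

# Two sample rows copied from comma_test.csv.
sample = (
    '"12345" , "123 abc street, Unit 345"\n'
    '"50505" , "abc DR, UNIT# 123 UNIT 123"\n'
)

def clean_rows(rows):
    """Strip surrounding whitespace and drop commas inside each field."""
    return [[field.strip().replace(",", "") for field in row] for row in rows]

# csv.reader keeps the comma inside the quoted address as part of the field.
unedited = list(csv.reader(io.StringIO(sample), skipinitialspace=True))
edited = clean_rows(unedited)

for row in edited:
    print(row)
```

The quoting in the source file is what lets csv.reader tell separator commas from commas inside an address, so the cleaning step only has to touch field contents.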

Queried from MySQL:

number  address
12345   123 abc street Unit 345
10101   111 abc street Unit 111
20202   222 abc street Unit 222
30303   333 abc street Unit 333
40404   444 abc street Unit 444
50505   abc DR UNIT# 123 UNIT 123
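
To check the store-and-query round trip without a MySQL server, a rough sketch can use Python's built-in sqlite3 module as a stand-in. This is not the MySQL and SQLAlchemy setup from the answer; the in-memory database and the explicit CREATE TABLE statement are assumptions made only for illustration.

```python
import sqlite3

# Cleaned rows as shown under "Edited Data".
rows = [
    ("12345", "123 abc street Unit 345"),
    ("50505", "abc DR UNIT# 123 UNIT 123"),
]

# In-memory SQLite database used purely as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE commatest (number TEXT, address TEXT)")
conn.executemany("INSERT INTO commatest (number, address) VALUES (?, ?)", rows)
conn.commit()

# Read the rows back to confirm each address landed in a single column.
stored = conn.execute(
    "SELECT number, address FROM commatest ORDER BY number"
).fetchall()
for number, address in stored:
    print(number, address, sep="\t")
conn.close()
```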

Acknowledgement:

This is a long-winded approach. However, each step has been broken down intentionally to clearly show the steps involved.
