
Python 2.7: How To Split A Column Into Multiple Columns Based On Special Strings Like This?

I'm new to programming and Python, so I would appreciate your advice! I have a dataframe like this. In the 'info' column, there are 7 different categories: activities, locations

Solution 1:

It looks like the content of the info column is JSON-formatted, so you can parse that into a dict object easily:

>>> import json
>>> s = '''{"activities": ["Tour"], "locations": ["Tokyo"], "groups": []}'''
>>> j = json.loads(s)
>>> j
{u'activities': [u'Tour'], u'locations': [u'Tokyo'], u'groups': []}

Once you have the data as a dict, you can do whatever you like with it.
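
As a rough illustration of what "whatever you like" could mean for the question's goal, here is a minimal sketch that parses several such strings and regroups the values by key, one list per would-be column. The sample rows and the helper name split_info are made up for the example; only json.loads comes from the answer above.

```python
import json

# Hypothetical sample rows standing in for the values of the 'info' column
rows = [
    '{"activities": ["Tour"], "locations": ["Tokyo"], "groups": []}',
    '{"activities": ["Hike"], "locations": ["Kyoto"], "groups": ["Family"]}',
]

def split_info(raw_rows):
    """Parse each JSON string and collect the values per key (hypothetical helper)."""
    parsed = [json.loads(r) for r in raw_rows]
    # Union of keys across all rows, so a key missing from one row still gets a column
    keys = sorted(set(k for d in parsed for k in d))
    # Rows that lack a key get None in that column
    return dict((k, [d.get(k) for d in parsed]) for k in keys)

columns = split_info(rows)
```

Each entry of columns then holds one list per key, in row order, which is the shape a dataframe column needs.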


Solution 2:

Here is one way to do it, assuming every value in the info column is a string that parses as a Python literal:

import ast
import pandas as pd

# df is the initial DataFrame; parse each 'info' string into a dict
parsed = [ast.literal_eval(s) for s in df['info']]

# Build a DataFrame from the dicts, reusing df's index so the rows stay aligned
df_info = pd.DataFrame(parsed, index=df.index)

# Add the decoded columns to the initial DataFrame
df_new = pd.concat([df, df_info], axis=1)

# Remove the original 'info' column
del df_new['info']
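
One caveat with ast.literal_eval: it reads Python literals, so it rejects JSON-only tokens such as true, false and null, while json.loads rejects single-quoted strings. If you are not sure which form your column uses, a small fallback parser like the sketch below may help. The name parse_info is hypothetical and not part of the answer above.

```python
import ast
import json

def parse_info(text):
    """Try JSON first, then a Python literal; return None if neither parses."""
    try:
        return json.loads(text)
    except ValueError:  # on Python 3, json.JSONDecodeError is a ValueError subclass
        pass
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return None

print(parse_info('{"a": true, "b": null}'))
print(parse_info("{'a': [1, 2]}"))
```

You could pass parse_info in place of ast.literal_eval in the list comprehension, at the cost of having to handle the None rows afterwards.
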

Solution 3:

You can use the json library to do that.

1) Import the json library

import json

2) Convert every value in that column to a string, apply json.loads to each one, and store the result in a variable

jsonO = df['info'].map(str).apply(json.loads)

3) jsonO is now a pandas Series holding one dict per row. For each key you need, create a column in your final dataframe

df['Activities'] = jsonO.apply(lambda x: x['activities'])

Here, for one key, the value from each row's dict is copied into the new column of your final dataframe df. The key lookup is case-sensitive, so it must match the JSON exactly (the sample data uses lowercase 'activities').

4) Repeat step 3 for every key you're interested in
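
Steps 2 to 4 can be folded into a loop. The following is only a sketch under assumptions: the dataframe and the list named wanted are invented for the example, and the two keys are the ones the question names. It uses dict.get instead of x[...] so that a row lacking a key yields None rather than raising KeyError.

```python
import json
import pandas as pd

# Hypothetical stand-in for the asker's dataframe
df = pd.DataFrame({
    'id': [1, 2],
    'info': [
        '{"activities": ["Tour"], "locations": ["Tokyo"]}',
        '{"activities": ["Hike"], "locations": ["Kyoto"]}',
    ],
})

# Step 2: one dict per row
parsed = df['info'].apply(json.loads)

# Steps 3 and 4: one new column per key of interest
wanted = ['activities', 'locations']
for key in wanted:
    # k=key binds the current key to the lambda on each iteration
    df[key] = parsed.apply(lambda d, k=key: d.get(k))
```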

