
Taking Two And More Data Frames And Extracting Data On Unique Keys In Python

First, I have two data frames. One holds a person's name and the pages he liked in columns, so the number of columns differs from person to person. Here is the example. 1st

Solution 1:

Thank you for your code. Now it is clearer.

I tried to optimize your loops, and I think you can instead use isin with any to build a mask for boolean indexing. I also simplified the code in concat:

import pandas as pd

## add a category column to df1, aligned on the index
df1['category'] = df2['categories']

## list of the pages present in the metadata (first column of df3)
meta_list = df3.iloc[:, 0].tolist()

## True for rows of df1 that contain at least one page from the metadata
mask = df1.isin(meta_list).any(axis=1)
new_df1 = df1[mask]
new_df2 = df1[~mask]

## merge new_df1 and new_df2 on page_name and category respectively
mdf1 = pd.merge(new_df1, df3, how='left', on='page_name')
mdf2 = pd.merge(new_df2, df4, how='left', on='category')
## concatenate the two data frames mdf1 and mdf2
finaldf = pd.concat([mdf1, mdf2])
## finally group by user and sum the numeric tag columns for each user
finaldf1 = finaldf.groupby('user', as_index=False).sum(numeric_only=True)
print(finaldf1)
          user  tag1  tag2  tag3
0  Roshan ghai   0.0   1.0   1.0
1    mank nion   1.0   1.0   2.0
2   pop rajuel   2.0   0.0   1.0
3   random guy   2.0   1.0   1.0
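
If you want to try the idea without the original data, here is a minimal self-contained sketch of the same steps: build a mask with isin and any, split the rows, merge each part on its own key, concatenate, and sum per user. All frames, column names, and values below (likes, page_tags, category_tags, and so on) are made up for illustration and are not the asker's real data. It also assumes one page per user to keep it short.

```python
import pandas as pd

# Hypothetical toy data, invented for this sketch.
likes = pd.DataFrame({
    "user": ["ann", "bob", "cat"],
    "page_name": ["nike", "bbc", "cnn"],
    "category": ["sport", "news", "news"],
})
page_tags = pd.DataFrame({
    "page_name": ["nike", "cnn"],
    "tag1": [1.0, 0.0],
    "tag2": [0.0, 2.0],
})
category_tags = pd.DataFrame({
    "category": ["news", "sport"],
    "tag1": [0.5, 1.0],
    "tag2": [0.5, 0.0],
})

# Pages for which page-level tags exist.
known_pages = page_tags["page_name"].tolist()

# isin marks every matching cell; any(axis=1) collapses that to one flag per row.
mask = likes.isin(known_pages).any(axis=1)

# Rows with a known page get page-level tags, the rest fall back to category tags.
by_page = pd.merge(likes[mask], page_tags, how="left", on="page_name")
by_category = pd.merge(likes[~mask], category_tags, how="left", on="category")

# Stack both parts and sum the tag columns per user.
combined = pd.concat([by_page, by_category], ignore_index=True)
totals = combined.groupby("user", as_index=False)[["tag1", "tag2"]].sum()
print(totals)
```

Selecting the tag columns explicitly before sum keeps the text columns (page_name, category) out of the aggregation, which is one way to avoid surprises with non-numeric columns.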
