Check Element-wise For Existence Of String
I'm looking for a way to check whether one string can be found in another string. str.contains only takes a fixed string pattern as argument, I'd rather like to have an element-wis
Solution 1:
Use list comprehension with zip:
df['short_in_long'] = [b in a for a, b in zip(df['long'], df['short'])]
print(df)
            long  short  short_in_long
0       sometext   some           True
1  someothertext  other           True
2   evenmoretext  stuff          False
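To reproduce this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the printed output above, so its construction is an assumption rather than the asker's original code.

```python
import pandas as pd

# Sample frame reconstructed from the printed output (assumed, not from the question)
df = pd.DataFrame({
    'long': ['sometext', 'someothertext', 'evenmoretext'],
    'short': ['some', 'other', 'stuff'],
})

# zip pairs each 'long' value with the 'short' value from the same row
df['short_in_long'] = [b in a for a, b in zip(df['long'], df['short'])]
print(df['short_in_long'].tolist())  # [True, True, False]
```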
Solution 2:
This is a prime use case for a list comprehension:
# df['short_in_long'] = [y in x for x, y in df[['long', 'short']].values.tolist()]
df['short_in_long'] = [y in x for x, y in df[['long', 'short']].values]
df
            long  short  short_in_long
0       sometext   some           True
1  someothertext  other           True
2   evenmoretext  stuff          False
List comprehensions are usually faster than pandas string methods here because they carry less overhead. See "For loops with pandas - When should I care?".
If your data contains NaNs, you can call a function with error handling:
def try_check(haystack, needle):
    try:
        return needle in haystack
    except TypeError:
        # NaN is a float, so the membership test raises TypeError
        return False
df['short_in_long'] = [try_check(x, y) for x, y in df[['long', 'short']].values]
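As a rough illustration of the NaN case, the sketch below runs the same helper on a frame with a missing value in each column. The sample data is hypothetical, and the helper is repeated only so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

def try_check(haystack, needle):
    try:
        return needle in haystack
    except TypeError:
        # Raised when either side is NaN (a float) instead of a string
        return False

# Hypothetical data: row 1 has a missing 'long', row 2 a missing 'short'
df = pd.DataFrame({
    'long': ['sometext', np.nan, 'evenmoretext'],
    'short': ['some', 'other', np.nan],
})

df['short_in_long'] = [try_check(x, y) for x, y in df[['long', 'short']].values]
print(df['short_in_long'].tolist())  # [True, False, False]
```

Rows with a missing value on either side come out as False instead of raising.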
Solution 3:
Check with NumPy; it works row-wise :-) . Note that np.char is the public name for what older code reaches through np.core.char.
np.char.find(df['long'].values.astype(str), df['short'].values.astype(str)) != -1
Out[302]: array([ True,  True, False])
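A self-contained sketch of the NumPy approach, assuming the same sample frame as in the output shown for the other solutions (the construction of df is my assumption):

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the printed output (assumed)
df = pd.DataFrame({
    'long': ['sometext', 'someothertext', 'evenmoretext'],
    'short': ['some', 'other', 'stuff'],
})

# Convert both columns to fixed-width NumPy string arrays
long_arr = df['long'].to_numpy().astype(str)
short_arr = df['short'].to_numpy().astype(str)

# np.char.find returns the lowest match index per element, or -1 if absent
df['short_in_long'] = np.char.find(long_arr, short_arr) != -1
print(df['short_in_long'].tolist())  # [True, True, False]
```

One caveat: astype(str) turns a missing value into the literal string 'nan', so NaNs are not skipped automatically with this approach.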
Solution 4:
Also,
df['short_in_long'] = df['long'].str.contains('|'.join(df['short'].values))
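Update: I misinterpreted the problem. The version above checks each long string against all short strings at once, not just the one in the same row. Here is the corrected version:
df['short_in_long'] = df[['long', 'short']].apply(lambda x: x['short'] in x['long'], axis=1)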
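To see why the joined-pattern attempt is not element-wise, here is a small sketch on hypothetical data chosen so that the two approaches disagree. The frame and the variable names any_match and row_match are mine, not from the answer.

```python
import pandas as pd

# Hypothetical data: each short string occurs in the OTHER row's long string
df = pd.DataFrame({
    'long': ['sometext', 'stufftext'],
    'short': ['stuff', 'some'],
})

# Joined pattern 'stuff|some': True if ANY short string occurs in the long string
any_match = df['long'].str.contains('|'.join(df['short']))

# Row-wise check: only the short string from the same row counts
row_match = df.apply(lambda row: row['short'] in row['long'], axis=1)

print(any_match.tolist())  # [True, True]
print(row_match.tolist())  # [False, False]
```

The joined pattern is also interpreted as a regular expression, so short strings containing characters such as '.' or '(' would need escaping there, while the row-wise `in` test compares plain substrings.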