Pandas and apply function to match a string

xxxvincxxx

I have a DataFrame column containing various links, some of which contain the string "search".

I want to write a function that, when applied to the column, returns a column containing "Search" or "Other".

I wrote a function like this:

search = 'search'

def page_type(x):
    if x.str.contains(search):
        return 'Search'
    else:
        return 'Other'

df['link'].apply(page_type)

but it raises this error:

AttributeError: 'unicode' object has no attribute 'str'

I guess I'm missing something when calling the str.contains().

jezrael

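apply calls your function once per element, so x is a plain string and has no .str accessor (that accessor exists only on a Series). Inside the function, use the in operator instead. Alternatively, skip apply and use str.contains on the whole column with numpy.where: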

import numpy as np
import pandas as pd

df = pd.DataFrame({'link':['search','homepage d','login dd', 'profile t', 'ff']})

print (df)
         link
0      search
1  homepage d
2    login dd
3   profile t
4          ff
search = 'search'
profile = 'profile'
homepage = 'homepage'
login = "login"

def page_type(x):
    if search in x:
        return 'Search'
    elif profile in x:
        return 'Profile'
    elif homepage in x:
        return 'Homepage'
    elif login in x:
        return 'Login'
    else:
        return 'Other'  

df['link_new'] = df['link'].apply(page_type)

df['link_type'] = np.where(df.link.str.contains(search),'Search', 
                  np.where(df.link.str.contains(profile),'Profile', 
                  np.where(df.link.str.contains(homepage), 'Homepage', 
                  np.where(df.link.str.contains(login),'Login','Other')))) 


print (df)
         link  link_new link_type
0      search    Search    Search
1  homepage d  Homepage  Homepage
2    login dd     Login     Login
3   profile t   Profile   Profile
4          ff     Other     Other
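If the nested np.where calls get hard to read, numpy.select is one possible alternative. This is a different technique from the one above, shown only as a minimal sketch: the rules list and its ordering are my own assumption, chosen to mirror the if/elif chain in page_type.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'link': ['search', 'homepage d', 'login dd', 'profile t', 'ff']})

# Hypothetical (substring, label) pairs in priority order; the first match wins,
# just like the if/elif chain in page_type.
rules = [('search', 'Search'), ('profile', 'Profile'),
         ('homepage', 'Homepage'), ('login', 'Login')]

# One boolean mask per rule; regex=False treats each pattern as a literal substring.
conditions = [df['link'].str.contains(pattern, regex=False) for pattern, _ in rules]
labels = [label for _, label in rules]

# Rows that match no condition get the default label.
df['link_type'] = np.select(conditions, labels, default='Other')
print(df)
```

Adding a new page type then only means adding one pair to the list.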

Timings (in this test, apply with plain in checks is faster than the nested np.where calls):

#[5000 rows x 1 columns]
df = pd.DataFrame({'link':['search','homepage d','login dd', 'profile t', 'ff']})
df = pd.concat([df]*1000).reset_index(drop=True)

In [346]: %timeit df['link'].apply(page_type)
1000 loops, best of 3: 1.72 ms per loop

In [347]: %timeit np.where(df.link.str.contains(search),'Search', np.where(df.link.str.contains(profile),'Profile', np.where(df.link.str.contains(homepage), 'Homepage', np.where(df.link.str.contains(login),'Login','Other'))))
100 loops, best of 3: 11.7 ms per loop
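Coming back to the original two-category question, here is a small sketch of both approaches side by side. The example URLs are made up, and the None row is my own addition to show one way of handling missing links. Treat it as an illustration, not the only way to do it.

```python
import numpy as np
import pandas as pd

# Hypothetical links, including one missing value.
df = pd.DataFrame({'link': ['http://example.com/search?q=a',
                            'http://example.com/home',
                            None]})

def page_type(x):
    # x is a single cell value here, not a Series, so use the plain `in` operator.
    if isinstance(x, str) and 'search' in x:
        return 'Search'
    return 'Other'

df['by_apply'] = df['link'].apply(page_type)

# Vectorized equivalent: .str works on the whole Series.
# na=False makes missing links count as "no match" instead of producing NaN.
mask = df['link'].str.contains('search', regex=False, na=False)
df['by_where'] = np.where(mask, 'Search', 'Other')

print(df)
```

Both columns should agree. Which one is faster depends on the data, so it is worth timing on your own DataFrame.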
