What's a better way to use df.apply on multiple columns?


Celso Pereira Neto

After struggling with a CSV file's encoding, I decided to commit the encoding heresy of manually replacing some characters.

This is how the dataframe looks:

import pandas as pd

# Scalars are broadcast to the length of the list in column 'b'
df = pd.DataFrame({'a': 'bÃ‰d encoded',
                   'b': ['foo', 'bar'] * 3,
                   'c': 'bÃ‰d encoded too'})


              a    b                 c
0  bÃ‰d encoded  foo  bÃ‰d encoded too
1  bÃ‰d encoded  bar  bÃ‰d encoded too
2  bÃ‰d encoded  foo  bÃ‰d encoded too
3  bÃ‰d encoded  bar  bÃ‰d encoded too
4  bÃ‰d encoded  foo  bÃ‰d encoded too
5  bÃ‰d encoded  bar  bÃ‰d encoded too

If column 'a' were my only problem, this function would be enough:

def force_good_e(row):
    col = row['a']
    if 'Ã‰' in col:
        col = col.replace('Ã‰','a') 
    return col

df['a'] = df.apply(force_good_e, axis=1)

But then I would need another function for column 'c'.

I got an improvement with this:

def force_good_es(row, column):
    col = row[column]
    if 'Ã‰' in col:
        col = col.replace('Ã‰','a') 
    return col


df['a'] = df.apply(lambda x: force_good_es(x,'a'), axis=1)
df['c'] = df.apply(lambda x: force_good_es(x,'c'), axis=1)

But it got me wondering, is there a better way to do this?

That is, one that eliminates the need to write one line of

df[n] = df.apply(lambda x: force_good_es(x,n), axis=1)

for each column n that needs to be fixed.

Abhi

You could use str.replace, which works on a whole column at once, so no row-wise apply is needed:

df['a'] = df['a'].str.replace('Ã‰','a')
df['c'] = df['c'].str.replace('Ã‰','a')
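
To avoid writing one line per column, the same str.replace call can be applied to a list of columns in one go. This is only a sketch: the name cols_to_fix is made up for illustration, and regex=False is passed on the assumption that the bad characters should be treated as literal text rather than as a pattern.

```python
import pandas as pd

df = pd.DataFrame({'a': 'bÃ‰d encoded',
                   'b': ['foo', 'bar'] * 3,
                   'c': 'bÃ‰d encoded too'})

# Hypothetical list of the columns that need fixing.
cols_to_fix = ['a', 'c']

# With the default axis=0, DataFrame.apply hands each selected column to the
# lambda as a Series, so the vectorized .str.replace runs once per column
# instead of once per row.
df[cols_to_fix] = df[cols_to_fix].apply(
    lambda s: s.str.replace('Ã‰', 'a', regex=False)
)

print(df)
```

Column 'b' is not in the list, so it is left untouched.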

Or, as @wen mentioned in the comments, use DataFrame.replace with regex=True, which fixes every column at once:

df = df.replace({'Ã‰':'a'},regex=True)
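
Two details of that one-liner are worth spelling out. Without regex=True, DataFrame.replace only matches whole cell values, so nothing would change here. And if only some columns should be touched, the same call can be limited to a subset. A minimal sketch, where cols_to_fix, unchanged and fixed are illustrative names, not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({'a': 'bÃ‰d encoded',
                   'b': ['foo', 'bar'] * 3,
                   'c': 'bÃ‰d encoded too'})

# Without regex=True, replace() looks for cells that are exactly 'Ã‰',
# so the substrings inside longer values are left alone.
unchanged = df.replace({'Ã‰': 'a'})

# Hypothetical list of the columns that need fixing.
cols_to_fix = ['a', 'c']

# With regex=True the key is treated as a pattern and matched inside each
# string; assigning back to the subset leaves the other columns untouched.
fixed = df.copy()
fixed[cols_to_fix] = fixed[cols_to_fix].replace({'Ã‰': 'a'}, regex=True)

print(fixed)
```

Because the key is treated as a regular expression, characters such as . or ( in the text to be replaced would need escaping, for example with re.escape.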
