Concatenating DataFrame columns in Python?

Posted by debugcn in Dev

帕拉斯加

I have a DataFrame, created with the following code:

# importing pandas as pd 
import pandas as pd 

# Create the dataframe 
df = pd.DataFrame({'Category':['A', 'B', 'C', 'D'], 
                   'Event':['Music Theater', 'Poetry Music', 'Theatre Comedy', 'Comedy Theatre'], 
                   'Cost':[10000, 5000, 15000, 2000]}) 

# Print the dataframe 
print(df)

I want to generate a list that combines all three columns, with spaces replaced by "_" and any trailing spaces removed:

[A_Music_Theater_10000, B_Poetry_Music_5000, C_Theatre_Comedy_15000, D_Comedy_Theatre_2000]

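I want to do this as efficiently as possible, because run time is an issue for me, so I am looking to avoid loops. Can someone tell me how to achieve this in the most efficient way?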

耶斯列尔

The most general solution is to convert all values to strings, use join, and finally replace:

df['new'] = df.astype(str).apply('_'.join, axis=1).str.replace(' ', '_')
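To make the chain above easier to follow, here is a minimal sketch that splits it into its three steps on the sample data from the question. The intermediate names as_text, joined and result are only illustrative and do not appear in the original answer.

```python
import pandas as pd

df = pd.DataFrame({'Category': ['A', 'B', 'C', 'D'],
                   'Event': ['Music Theater', 'Poetry Music', 'Theatre Comedy', 'Comedy Theatre'],
                   'Cost': [10000, 5000, 15000, 2000]})

# Step 1: turn every value into a string, because '_'.join only accepts strings
# and the Cost column holds integers
as_text = df.astype(str)

# Step 2: join the values of each row with '_' (spaces inside Event are still there)
joined = as_text.apply('_'.join, axis=1)

# Step 3: replace the remaining spaces with '_'
result = joined.str.replace(' ', '_')

print(joined.tolist())
print(result.tolist())
```

The conversion has to come first: without astype(str), joining a row would fail on the integer in Cost.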

If you need to select only some of the columns:

cols = ['Category','Event','Cost']
df['new'] = df[cols].astype(str).apply('_'.join, axis=1).str.replace(' ', '_')

Or process each column separately, applying replace where necessary and converting the numeric column to strings:

df['new'] = (df['Category'] + '_' + 
             df['Event'].str.replace(' ', '_') + '_' + 
             df['Cost'].astype(str))
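Since run time is the concern in the question, it may be worth measuring the row-wise and the column-wise variants on your own data rather than guessing. The following is only a rough sketch: the helper names row_wise and column_wise, the 10,000-row frame built by repeating the sample rows, and the choice of 5 repetitions are all assumptions made for illustration, and the timings will depend on your machine and pandas version.

```python
import timeit

import pandas as pd

df = pd.DataFrame({'Category': ['A', 'B', 'C', 'D'],
                   'Event': ['Music Theater', 'Poetry Music', 'Theatre Comedy', 'Comedy Theatre'],
                   'Cost': [10000, 5000, 15000, 2000]})

# Hypothetical larger frame: the four sample rows repeated 2,500 times (10,000 rows)
big = pd.concat([df] * 2500, ignore_index=True)


def row_wise(frame):
    # apply with axis=1 calls '_'.join once for every row
    return frame.astype(str).apply('_'.join, axis=1).str.replace(' ', '_')


def column_wise(frame):
    # operates on whole columns, with no per-row function call
    return (frame['Category'] + '_' +
            frame['Event'].str.replace(' ', '_') + '_' +
            frame['Cost'].astype(str))


# Sanity check: both variants should produce the same values
same = row_wise(big).tolist() == column_wise(big).tolist()
print(same)

# Time 5 runs of each variant on the larger frame
for func in (row_wise, column_wise):
    seconds = timeit.timeit(lambda: func(big), number=5)
    print(f'{func.__name__}: {seconds:.3f} s for 5 runs')
```

Note that apply with axis=1 still invokes a Python function per row, so it does not avoid looping in the strict sense, which is one reason to measure before choosing.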

Or convert to strings, add _ to every value and sum across each row; the trailing _ then has to be removed with rstrip:

df['new'] = df.astype(str).add('_').sum(axis=1).str.replace(' ', '_').str.rstrip('_')

print(df) 
  Category           Event   Cost                     new
0        A   Music Theater  10000   A_Music_Theater_10000
1        B    Poetry Music   5000     B_Poetry_Music_5000
2        C  Theatre Comedy  15000  C_Theatre_Comedy_15000
3        D  Comedy Theatre   2000   D_Comedy_Theatre_2000
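The question asks for a list rather than a new column. A minimal sketch of that last step, assuming the three columns from the question, is to call tolist() on the resulting Series; the variable name combined is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Category': ['A', 'B', 'C', 'D'],
                   'Event': ['Music Theater', 'Poetry Music', 'Theatre Comedy', 'Comedy Theatre'],
                   'Cost': [10000, 5000, 15000, 2000]})

cols = ['Category', 'Event', 'Cost']

# Build the combined strings as a Series, then convert it to a plain Python list
combined = (df[cols].astype(str)
                    .apply('_'.join, axis=1)
                    .str.replace(' ', '_')
                    .tolist())

print(combined)
```

Selecting the columns explicitly with cols also keeps a previously added new column out of the join if the frame has already been modified.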
