I have a DataFrame named output:
RAW_ENTITY_NAME  ENTITY_TYPE  ENTITY_NAME  IS_MAIN
01-03-2017       TNRMATDT     01 03 2017   1
04-02-2017       TNRSTRTDT    04 02 2017   1
documents        TNRTYPE      SIGHT        1
documents        TNRDOCSBY    NOT FOUND    1
accept           TNRDTL       accept       1
23               TNRDAYS      23           1
print(output.dtypes)
RAW_ENTITY_NAME object
ENTITY_TYPE object
ENTITY_NAME object
IS_MAIN object
Note: the combination ENTITY_TYPE = TNRTYPE, ENTITY_NAME = SIGHT and IS_MAIN = 1 occurs only once in the DataFrame.
If ENTITY_TYPE is TNRTYPE, ENTITY_NAME = SIGHT and IS_MAIN = 1, I want to update some values.
temp = output.loc[(output['IS_MAIN'] == 1) & (output['ENTITY_TYPE'] == 'TNRTYPE'), 'ENTITY_NAME']
temp = temp.reset_index(drop=True)
temp = temp[0]
if (temp == 'SIGHT'):
    output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE'] == 'TNRDOCSBY'), 'ENTITY_NAME'] = 'PAYMENT'
    output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE'].isin(['TNRDTL'])),
               ['ENTITY_NAME', 'RAW_ENTITY_NAME']] = 'NOT APPLICABLE'
    output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE'].isin(['TNRDAYS'])),
               ['ENTITY_NAME']] = '0'
    output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE'].isin(['TNRDAYS'])),
               ['RAW_ENTITY_NAME']] = ''
    output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE']=='TNRSTRTDT'),
               ['ENTITY_NAME', 'RAW_ENTITY_NAME']] = ''
    output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE']=='TNRMATDT'),
               ['ENTITY_NAME', 'RAW_ENTITY_NAME']] = ''
The final output is:
RAW_ENTITY_NAME  ENTITY_TYPE  ENTITY_NAME     IS_MAIN
01-03-2017       TNRMATDT     01 03 2017      1
04-02-2017       TNRSTRTDT    04 02 2017      1
documents        TNRTYPE      SIGHT           1
documents        TNRDOCSBY    PAYMENT         1
NOT APPLICABLE   TNRDTL       NOT APPLICABLE  1
                 TNRDAYS      0               1
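As you can see, everything is updated except the first two rows, i.e. the rows where ENTITY_TYPE is TNRMATDT or TNRSTRTDT.
I would like to know why the code below does not give the desired result.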
output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE']=='TNRSTRTDT'),
['ENTITY_NAME', 'RAW_ENTITY_NAME']] = ''
output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE']=='TNRMATDT'),
['ENTITY_NAME', 'RAW_ENTITY_NAME']] = ''
I would be glad if someone could spot the mistake I made or suggest a workaround.
Thanks a lot.
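One thing that may be worth checking first: `dtypes` reports `object` for IS_MAIN, which does not say whether the values are the string '1' or the integer 1, and the code above compares against both (`== 1` when building temp, `== '1'` in the updates). A minimal sketch, using a made-up two-row frame named `demo` rather than the real data, of how the actual element types could be inspected:

```python
import pandas as pd

# Hypothetical frame: an object column that mixes the string '1' and the integer 1.
demo = pd.DataFrame({'IS_MAIN': ['1', 1]}, dtype=object)

# dtypes only reports "object"; it does not reveal the element types.
print(demo.dtypes)

# Count the Python type of each value in the column.
type_counts = demo['IS_MAIN'].map(type).value_counts()
print(type_counts)

# Each comparison matches only the rows holding a value of its own type.
str_mask = demo['IS_MAIN'] == '1'
int_mask = demo['IS_MAIN'] == 1
print(str_mask.tolist(), int_mask.tolist())
```

If the same check on output['IS_MAIN'] shows a single type, the comparisons can be written consistently against that type; whether this is what caused the problem here cannot be told from the post alone.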
For me your solution works fine. I tried to rewrite it for better readability and to avoid repeating the same conditions:
temp = output.loc[(output['IS_MAIN'] == '1') &
(output['ENTITY_TYPE'] == 'TNRTYPE'), 'ENTITY_NAME']
#if values in IS_MAIN are integers
#temp = output.loc[(output['IS_MAIN'] == 1) &
# (output['ENTITY_TYPE'] == 'TNRTYPE'), 'ENTITY_NAME']
#more general alternative, also works when no row matches the condition
#if (next(iter(temp), 'not match') == 'SIGHT'):
if (temp.iat[0] == 'SIGHT'):
    m1 = output['IS_MAIN'] == '1'
    #if values in IS_MAIN are integers
    #m1 = output['IS_MAIN'] == 1
    m2 = output['ENTITY_TYPE'] == 'TNRDOCSBY'
    m3 = output['ENTITY_TYPE'] == 'TNRDTL'
    m4 = output['ENTITY_TYPE'] == 'TNRDAYS'
    m5 = output['ENTITY_TYPE'].isin(['TNRMATDT','TNRSTRTDT'])
    output.loc[m1 & m2, 'ENTITY_NAME'] = 'PAYMENT'
    output.loc[m1 & m3, ['ENTITY_NAME', 'RAW_ENTITY_NAME']] = 'NOT APPLICABLE'
    output.loc[m1 & m4, ['ENTITY_NAME']] = '0'
    output.loc[m1 & m4, ['RAW_ENTITY_NAME']] = ''
    output.loc[m1 & m5, ['ENTITY_NAME', 'RAW_ENTITY_NAME']] = ''
print(output)
  RAW_ENTITY_NAME ENTITY_TYPE     ENTITY_NAME IS_MAIN
0                    TNRMATDT                       1
1                   TNRSTRTDT                       1
2       documents     TNRTYPE           SIGHT       1
3       documents   TNRDOCSBY         PAYMENT       1
4  NOT APPLICABLE      TNRDTL  NOT APPLICABLE       1
5                     TNRDAYS               0       1
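The approach above can also be put into a self-contained sketch that anyone can run. It rebuilds the frame from the question (IS_MAIN stored as strings is an assumption), compares IS_MAIN as text so that both '1' and 1 match, strips stray whitespace as an extra guard (also an assumption, nothing in the post confirms it is needed), and uses `next(iter(temp), None)` so an empty selection does not raise. The function name `apply_sight_rules` is made up for illustration.

```python
import pandas as pd

# Frame rebuilt from the question; IS_MAIN as strings is an assumption.
output = pd.DataFrame({
    'RAW_ENTITY_NAME': ['01-03-2017', '04-02-2017', 'documents', 'documents', 'accept', '23'],
    'ENTITY_TYPE': ['TNRMATDT', 'TNRSTRTDT', 'TNRTYPE', 'TNRDOCSBY', 'TNRDTL', 'TNRDAYS'],
    'ENTITY_NAME': ['01 03 2017', '04 02 2017', 'SIGHT', 'NOT FOUND', 'accept', '23'],
    'IS_MAIN': ['1', '1', '1', '1', '1', '1'],
})


def apply_sight_rules(df):
    """Hypothetical helper: return a copy of df with the SIGHT rules applied."""
    df = df.copy()
    # Compare as text so both '1' and 1 count as the main row.
    is_main = df['IS_MAIN'].astype(str).str.strip() == '1'
    etype = df['ENTITY_TYPE'].astype(str).str.strip()

    temp = df.loc[is_main & (etype == 'TNRTYPE'), 'ENTITY_NAME']
    # next(iter(...), None) yields None instead of raising when nothing matches.
    if next(iter(temp), None) != 'SIGHT':
        return df

    df.loc[is_main & (etype == 'TNRDOCSBY'), 'ENTITY_NAME'] = 'PAYMENT'
    df.loc[is_main & (etype == 'TNRDTL'), ['ENTITY_NAME', 'RAW_ENTITY_NAME']] = 'NOT APPLICABLE'
    df.loc[is_main & (etype == 'TNRDAYS'), 'ENTITY_NAME'] = '0'
    df.loc[is_main & (etype == 'TNRDAYS'), 'RAW_ENTITY_NAME'] = ''
    df.loc[is_main & etype.isin(['TNRMATDT', 'TNRSTRTDT']),
           ['ENTITY_NAME', 'RAW_ENTITY_NAME']] = ''
    return df


result = apply_sight_rules(output)
print(result)
```

Working on a copy keeps the original frame untouched, which makes it easy to compare before and after when checking which rows were actually updated.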