Here is a simple dataframe:
import pandas as pd
import numpy as np
dates = pd.date_range('20130101', periods=14)
data = pd.DataFrame({'a':[1,0,0,1,0,0,0,1,1,0,0,1,0,0],'b':[0,0,1,0,0,1,0,0,0,0,1,0,1,0]},index=dates)
Now I'd like to add column 'c', with the following conditions all together.
if a = 1, c = 1
if b = 1, c = 0
if a = 0 and b = 0, c = c.shift(1)
Constraint: a = 1 and b = 1 never occur at the same time.
This looks like a simple question, but I find it very hard to solve.
Any good ideas?
IIUC you need:
data['c'] = np.where(data.a == 1, 1,
                     np.where(data.b == 1, 0, np.nan))
print (data)
a b c
2013-01-01 1 0 1.0
2013-01-02 0 0 NaN
2013-01-03 0 1 0.0
2013-01-04 1 0 1.0
2013-01-05 0 0 NaN
2013-01-06 0 1 0.0
2013-01-07 0 0 NaN
2013-01-08 1 0 1.0
2013-01-09 1 0 1.0
2013-01-10 0 0 NaN
2013-01-11 0 1 0.0
2013-01-12 1 0 1.0
2013-01-13 0 1 0.0
2013-01-14 0 0 NaN
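As an aside, the same intermediate column can probably be written with np.select, which some find easier to read than nested np.where calls. This is only a sketch of an alternative, not part of the answer above; the names conditions, choices, c_select and c_where are made up for illustration.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20130101', periods=14)
data = pd.DataFrame({'a': [1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0],
                     'b': [0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0]},
                    index=dates)

# Conditions are checked in order; rows matching neither get the default (NaN).
conditions = [data['a'] == 1, data['b'] == 1]
choices = [1.0, 0.0]
c_select = pd.Series(np.select(conditions, choices, default=np.nan),
                     index=data.index)

# The nested np.where version from the answer, kept in a separate Series
# so the two results can be compared.
c_where = pd.Series(np.where(data['a'] == 1, 1,
                             np.where(data['b'] == 1, 0, np.nan)),
                    index=data.index)

# Series.equals treats NaN values in the same positions as equal.
print(c_select.equals(c_where))
```

Both versions leave NaN on the rows where a = 0 and b = 0, so the fill step that follows works the same way with either.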
Then I am not sure whether you need bfill or ffill. With bfill:
data['c'] = data['c'].bfill()
print (data)
a b c
2013-01-01 1 0 1.0
2013-01-02 0 0 0.0
2013-01-03 0 1 0.0
2013-01-04 1 0 1.0
2013-01-05 0 0 0.0
2013-01-06 0 1 0.0
2013-01-07 0 0 1.0
2013-01-08 1 0 1.0
2013-01-09 1 0 1.0
2013-01-10 0 0 0.0
2013-01-11 0 1 0.0
2013-01-12 1 0 1.0
2013-01-13 0 1 0.0
2013-01-14 0 0 NaN
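With ffill, applied to the NaN column from the first step rather than to the bfill result above. This is the one that matches c = c.shift(1), since it carries the previous row's value forward:
data['c'] = data['c'].ffill()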
print (data)
a b c
2013-01-01 1 0 1.0
2013-01-02 0 0 1.0
2013-01-03 0 1 0.0
2013-01-04 1 0 1.0
2013-01-05 0 0 1.0
2013-01-06 0 1 0.0
2013-01-07 0 0 0.0
2013-01-08 1 0 1.0
2013-01-09 1 0 1.0
2013-01-10 0 0 1.0
2013-01-11 0 1 0.0
2013-01-12 1 0 1.0
2013-01-13 0 1 0.0
2013-01-14 0 0 0.0
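To check which fill matches the three rules in the question, one option is to compare the vectorized result with a plain loop that applies the rules row by row. The sketch below assumes that c is undefined (NaN) before the first row where a = 1 or b = 1; the names c_fast, c_loop and prev are made up for illustration.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20130101', periods=14)
data = pd.DataFrame({'a': [1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0],
                     'b': [0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0]},
                    index=dates)

# Vectorized version: set c where it is known, then carry it forward.
c_fast = pd.Series(np.where(data['a'] == 1, 1,
                            np.where(data['b'] == 1, 0, np.nan)),
                   index=data.index).ffill()

# Reference loop: apply the rules from the question one row at a time.
prev = np.nan  # assumption: no value exists before the first a == 1 or b == 1
values = []
for a, b in zip(data['a'], data['b']):
    if a == 1:
        prev = 1.0
    elif b == 1:
        prev = 0.0
    # a == 0 and b == 0: keep the previous value of c
    values.append(prev)
c_loop = pd.Series(values, index=data.index)

print(c_fast.equals(c_loop))
```

On this data the loop agrees with the ffill result, which suggests ffill rather than bfill is what the rule c = c.shift(1) describes.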