Let df1
, df2
, and df3
are pandas.DataFrame
's having the same structure but different numerical values. I want to perform:
res=if df1>1.0: (df2-df3)/(df1-1) else df3
res
should have the same structure as df1
, df2
, and df3
have.
numpy.where()
generates result as a flat array.
Edit 1:
res
should have the same indices as df1
, df2
, and df3
have.
For example, I can access df2
as df2["instanceA"]["parameter1"]["paramter2"]
. I want to access the new calculated DataFrame/Series res
as res["instanceA"]["parameter1"]["paramter2"]
.
Actually numpy.where
should work fine there. Output here is 4x2 (same as df1, df2, df3).
df1 = pd.DataFrame( np.random.randn(4,2), columns=list('xy') )
df2 = pd.DataFrame( np.random.randn(4,2), columns=list('xy') )
df3 = pd.DataFrame( np.random.randn(4,2), columns=list('xy') )
res = df3.copy()
res[:] = np.where( df1 > 1, (df2-df3)/(df1-1), df3 )
x y
0 -0.671787 -0.445276
1 -0.609351 -0.881987
2 0.324390 1.222632
3 -0.138606 0.955993
Note that this should work on both series and dataframes. The [:]
is slicing syntax that preserves the index and columns. Without that res
will come out as an array rather than series or dataframe.
Alternatively, for a series you could write as @Kadir does in his answer:
res = pd.Series(np.where( df1>1, (df2-df3)/(df1-1), df3 ), index=df1.index)
Or similarly for a dataframe you could write:
res = pd.DataFrame(np.where( df1>1, (df2-df3)/(df1-1), df3 ), index=df1.index,
columns=df1.columns)
Collected from the Internet
Please contact [email protected] to delete if infringement.
Comments