I want to normalize a pandas dataframe with all columns together
In [8]: df
Out[8]:
x y
0 1 2
1 2 3
2 3 4
I currently do
df_nor = (df-df.min())/(df.max()-df.min())
which normalizes each column separately:
In [10]: df_nor
Out[10]:
x y
0 0.0 0.0
1 0.5 0.5
2 1.0 1.0
How can I get columns x and y normalized together, like this:
In [10]: df_nor
Out[10]:
x y
0 0.000 0.333
1 0.333 0.666
2 0.666 1.000
Thanks!
Since it's NumPy tagged, here's one using the underlying array data -
In [54]: a = df.values # get underlying array
In [55]: pd.DataFrame((a-a.min())/(a.max()-a.min()), columns=df.columns)
Out[55]:
x y
0 0.000000 0.333333
1 0.333333 0.666667
2 0.666667 1.000000
Alternatively, staying closer to pandas, we could do -
In [79]: (df-df.values.min())/(df.values.max()-df.values.min())
Out[79]:
x y
0 0.000000 0.333333
1 0.333333 0.666667
2 0.666667 1.000000
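For reuse, the same idea can be wrapped in a small helper. This is only a sketch: the name `minmax_all` is made up for illustration, it assumes every column is numeric with no missing values, and it raises an error when all values are equal, because the denominator would be zero.

```python
import pandas as pd

def minmax_all(df):
    """Scale all values with the global min and max taken over every column."""
    values = df.to_numpy()              # underlying array, same data as df.values
    lo, hi = values.min(), values.max()
    if hi == lo:
        # Assumption: a constant frame is treated as an error rather than returning NaN
        raise ValueError("all values are equal; cannot normalize")
    return (df - lo) / (hi - lo)        # keeps the original index and columns

df = pd.DataFrame({"x": [1, 2, 3], "y": [2, 3, 4]})
df_nor = minmax_all(df)
print(df_nor)
```

Because the arithmetic is done on `df` itself, the result keeps the original index. Rebuilding with `pd.DataFrame(..., columns=df.columns)` gives a fresh default index instead, which only matters when the index is not the default 0..n-1.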