I have a pandas DataFrame holding a dense rank matrix and want to select all the cells that equal 2, then transform them into a results DataFrame like the one below. I am currently looping through every column and row with a plain for-loop. Is there a better way?
df looks like:
    A  B  C  ...  (2000 columns)
AA  1  3  2
BB  2  1  3
CC  2  2  1
...
(2000 rows)
results_df should look like:
Col1 Col2
0 A BB
1 A CC
2 B CC
3 C AA
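Here is one method: use np.nonzero to get the row and column positions of the matching cells, then map those positions back to labels.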
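import numpy as np
import pandas

# Row and column positions of every cell equal to 2
rows, cols = np.nonzero((df == 2).values)
results_df = pandas.DataFrame({
    'Col1': [df.columns[c] for c in cols],
    'Col2': [df.index[r] for r in rows]
}).sort_values('Col1', kind='mergesort').reset_index(drop=True)  # stable sort; DataFrame.sort() was removed in pandas 0.20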
For example:
In [88]: df
Out[88]:
A B C
AA 1 3 2
BB 2 1 3
CC 2 2 1
In [89]: pandas.DataFrame({'Col1':[df.columns[c] for c in cols], 'Col2':[df.index[r] for r in rows]}).sort_values('Col1', kind='mergesort').reset_index(drop=True)
Out[89]:
Col1 Col2
0 A BB
1 A CC
2 B CC
3 C AA
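A possible variant of the same idea, shown here as a self-contained sketch rather than as part of the original answer: transposing the boolean mask before calling np.nonzero makes the positions come out column by column, so no sort step is needed. The helper name `cells_equal_to` is hypothetical and only used for illustration.

```python
import numpy as np
import pandas as pd


def cells_equal_to(df, value):
    """Return column/row labels of the cells where df == value.

    Hypothetical helper, not part of pandas.
    """
    # np.nonzero scans in row-major order. On the transposed mask that means
    # column by column of the original frame, so the output is already
    # grouped by column and ordered by row position within each column.
    col_pos, row_pos = np.nonzero((df == value).values.T)
    return pd.DataFrame({
        "Col1": df.columns.values[col_pos],
        "Col2": df.index.values[row_pos],
    })


# Small frame matching the example above
df = pd.DataFrame(
    [[1, 3, 2], [2, 1, 3], [2, 2, 1]],
    index=["AA", "BB", "CC"],
    columns=["A", "B", "C"],
)
results_df = cells_equal_to(df, 2)
print(results_df)
```

On the 3x3 example this gives the same four rows as Out[89]. Label lookup is done with array indexing instead of list comprehensions, which should matter more as the frame approaches 2000 x 2000.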