I have the following pandas.DataFrame:
                                                 time
offset   ts                      op
0.000000 2015-10-27 18:31:40.318 BuildIndex   282.604
                                 Compress     253.649
                                 Decompress     2.953
                                 Deserialize    0.063
                                 InsertIndex    1.343
4.960683 2015-10-27 18:36:37.959 BuildIndex   312.249
                                 Compress     280.747
                                 Decompress     2.844
                                 Deserialize    0.110
                                 InsertIndex    0.907
Now I need to update the DataFrame (in place is OK): within each (offset, ts) group, subtract the time for op == 'Compress' from the time for op == 'BuildIndex'.
What is the most elegant way to do it in pandas?
I'd use xs (cross-section) to do this:
In [11]: df1.xs("Compress", level="op")
Out[11]:
                                     time
offset   ts
0.000000 2015-10-27 18:31:40.318  253.649
4.960683 2015-10-27 18:36:37.959  280.747

In [12]: df1.xs("BuildIndex", level="op")
Out[12]:
                                     time
offset   ts
0.000000 2015-10-27 18:31:40.318  282.604
4.960683 2015-10-27 18:36:37.959  312.249

In [13]: df1.xs("BuildIndex", level="op") - df1.xs("Compress", level="op")
Out[13]:
                                    time
offset   ts
0.000000 2015-10-27 18:31:40.318  28.955
4.960683 2015-10-27 18:36:37.959  31.502
The subtraction aligns on the remaining index labels (here offset and ts), so there is no need for a groupby.
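Note that the expression in In [13] returns a new frame and leaves df1 unchanged. If the BuildIndex rows of df1 should actually be overwritten, one possible way is sketched below. It assumes every (offset, ts) group contains exactly one BuildIndex row and one Compress row, as in the question; the frame is rebuilt by hand from the posted values, and the names diff and mask are only illustrative.

```python
import pandas as pd

# Rebuild the example frame from the question (values copied from the post).
ops = ["BuildIndex", "Compress", "Decompress", "Deserialize", "InsertIndex"]
groups = [
    (0.000000, pd.Timestamp("2015-10-27 18:31:40.318")),
    (4.960683, pd.Timestamp("2015-10-27 18:36:37.959")),
]
index = pd.MultiIndex.from_tuples(
    [(offset, ts, op) for offset, ts in groups for op in ops],
    names=["offset", "ts", "op"],
)
df1 = pd.DataFrame(
    {"time": [282.604, 253.649, 2.953, 0.063, 1.343,
              312.249, 280.747, 2.844, 0.110, 0.907]},
    index=index,
)

# Per-group difference, indexed by (offset, ts) only.
diff = df1.xs("BuildIndex", level="op") - df1.xs("Compress", level="op")

# Boolean mask selecting the BuildIndex row of every group.
mask = df1.index.get_level_values("op") == "BuildIndex"

# Assumption: one BuildIndex and one Compress row per group, so diff has
# one value per masked row, in the same group order as df1.
df1.loc[mask, "time"] = diff["time"].to_numpy()

print(df1)
```

The assignment is positional (a plain NumPy array is written into the masked rows), which is why the one-row-per-group assumption matters. If groups can be missing either op, a label-based approach would be safer.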