Calculate difference of adjacent rows (decimal numbers) in a data frame for each group defined in a different column


ShrutiTurner

I have a data frame with three columns of interest, 'time', 'peak' and 'cycle'. I want to calculate the time elapsed between each row for a given cycle.

   time  peak  cycle
0     1     1      1
1     2     0      1
2   3.5     0      1
3   3.8     1      2
4     5     0      2
5   6.2     0      2
6     7     0      2

I want to add a fourth column, so the data frame would look like this when complete:

   time  peak  cycle  time_elapsed
0     1     1      1             0
1     2     0      1             1
2   3.5     0      1           1.5
3   3.8     1      2             0
4     5     0      2           1.2
5   6.2     0      2           1.2
6     7     0      2           0.8

The cycle number is calculated based on the peak information, so I don't think I need to refer to both columns.

data['time_elapsed'] = data['time'] - data['time'].shift()

Applying the above code I get:

   time  peak  cycle  time_elapsed
0     1     1      1             0
1     2     0      1             1
2   3.5     0      1           1.5
3   3.8     1      2           0.3
4     5     0      2           1.2
5   6.2     0      2           1.2
6     7     0      2           0.8

Is there a way to "reset" the calculation every time the value in 'peak' is 1? Any tips or advice would be appreciated!

jezrael

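Subtract the first value of each group from every row in that group. GroupBy.transform with 'first' (GroupBy.first) broadcasts each group's first value back to a Series aligned with the original rows. Note that this gives the time elapsed since the start of each cycle, and the sample output below uses integer times 1 to 7 rather than the decimal values in the question: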

df['time_elapsed'] = df['time'].sub(df.groupby('cycle')['time'].transform('first'))
print(df)
   time  peak  cycle  time_elapsed
0     1     1      1             0
1     2     0      1             1
2     3     0      1             2
3     4     1      2             0
4     5     0      2             1
5     6     0      2             2
6     7     0      2             3
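The expected output in the question shows the difference between adjacent rows rather than the time since the cycle started. A minimal sketch of that variant, assuming the question's sample data and using GroupBy.diff (a different technique from the transform('first') line above), could look like this. The column name since_cycle_start is only illustrative:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "time":  [1, 2, 3.5, 3.8, 5, 6.2, 7],
    "peak":  [1, 0, 0, 1, 0, 0, 0],
    "cycle": [1, 1, 1, 2, 2, 2, 2],
})

# diff() within each cycle: the first row of a cycle has no predecessor,
# so it comes back as NaN and is filled with 0 here.
df["time_elapsed"] = df.groupby("cycle")["time"].diff().fillna(0)

# For comparison: time since the first row of the cycle
# (illustrative column name, not from the original post).
df["since_cycle_start"] = df["time"] - df.groupby("cycle")["time"].transform("first")

# Rounded only for display, since float subtraction leaves tiny residues.
print(df.round(2))
```

The two columns agree only for the first two rows of each cycle. After that, since_cycle_start keeps accumulating while time_elapsed reports each step on its own.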

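To reset at every peak, build an additional grouping key with Series.cumsum on the peak column and group by it together with cycle. This works only if the peak column contains nothing but 0 and 1: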

s = df['peak'].cumsum()
df['time_elapsed'] = df['time'].sub(df.groupby(['cycle', s])['time'].transform('first'))
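To see why the cumulative sum works as a reset key, here is a small sketch on the question's data, assuming peak holds only 0 and 1. The name run_id is made up for illustration, and the sketch groups by that key alone and takes adjacent differences with GroupBy.diff, so it is a variation on the line above rather than the same code:

```python
import pandas as pd

# 'time' and 'peak' copied from the question; 'cycle' is left out on purpose
# to show that the key can be derived from 'peak' alone.
df = pd.DataFrame({
    "time": [1, 2, 3.5, 3.8, 5, 6.2, 7],
    "peak": [1, 0, 0, 1, 0, 0, 0],
})

# Every 1 in 'peak' bumps the running total, so all rows up to the next
# peak share the same label.
run_id = df["peak"].cumsum()

# Adjacent-row difference inside each run; the first row of a run is NaN
# and is filled with 0, which is the "reset".
df["time_elapsed"] = df.groupby(run_id)["time"].diff().fillna(0)

print(run_id.tolist())
print(df.round(2))
```

Because each peak starts a new label, the calculation restarts at rows 0 and 3 without referring to the cycle column at all.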
