How to Update Value in First N Rows by Group in a Multi-Index Pandas Dataframe?

Slee Published at Dev

slee

I am trying to update the first N rows of each group in a multi-index dataframe, but I had trouble finding a solution, so I thought I'd create a post for it.

The example code is as follows:

# Imports
import numpy as np
import pandas as pd

# Set Up Data Frame
dates = pd.date_range('1/1/2000', periods=8)
df = pd.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])
df['DATE'] = dates
df['CATEGORY'] = ['A','B','A','B','A','B','A','B']

# Set Index
df.set_index(['CATEGORY','DATE'],inplace=True)
df.sort_index(inplace=True)  # DataFrame.sort() no longer exists in current pandas

# Get First Two Rows of Each Category
df.groupby(level=0).apply(lambda x: x.iloc[0:2])

# Set The Value of Column 'C' Equal to Zero
# ???

I got as far as selecting the rows with "iloc", but I'm not sure how to then set column "C" to zero. It feels like I may be going about this the wrong way. Any help would be greatly appreciated. Thanks!

chrisb

How about this: first define a function that takes a dataframe and replaces the first x rows (across all columns) with a specified value.

def replace_first_x(group_df, x, value):
    group_df.iloc[:x, :] = value
    return group_df

Then, pass that into the groupby object with apply.

In [97]: df.groupby(level=0).apply(lambda df: replace_first_x(df, 2, 9999))
Out[97]: 
                               A            B            C            D
CATEGORY DATE                                                          
A        2000-01-01  9999.000000  9999.000000  9999.000000  9999.000000
         2000-01-03  9999.000000  9999.000000  9999.000000  9999.000000
         2000-01-05     1.590503     0.948911    -0.268071     0.622280
         2000-01-07    -0.493866     1.222231     0.125037     0.071064
B        2000-01-02  9999.000000  9999.000000  9999.000000  9999.000000
         2000-01-04  9999.000000  9999.000000  9999.000000  9999.000000
         2000-01-06     1.663430    -1.170716     2.044815    -2.081035
         2000-01-08     1.593104     0.108531    -1.381218    -0.517312
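
The output above overwrites every column, while the question only asked about column 'C'. Below is a minimal sketch of two ways to narrow it down, assuming a reasonably recent pandas. The helper name replace_first_x_in_column and the variables via_apply, via_mask and first_two are made up for illustration and are not part of the original answer. The first way keeps the apply pattern but touches one column; the second skips apply and builds a boolean mask with cumcount().

```python
import numpy as np
import pandas as pd

# Rebuild the question's frame (sort_index stands in for the old df.sort).
dates = pd.date_range('1/1/2000', periods=8)
df = pd.DataFrame(np.random.randn(8, 4), columns=['A', 'B', 'C', 'D'])
df['DATE'] = dates
df['CATEGORY'] = ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B']
df = df.set_index(['CATEGORY', 'DATE']).sort_index()


def replace_first_x_in_column(group_df, x, column, value):
    # Hypothetical variant of replace_first_x: change one column only,
    # and work on a copy so the frame passed in is left untouched.
    group_df = group_df.copy()
    group_df.iloc[:x, group_df.columns.get_loc(column)] = value
    return group_df


# Option 1: same apply pattern as the answer, restricted to column 'C'.
# group_keys=False keeps the original two-level index on the result.
via_apply = df.groupby(level=0, group_keys=False).apply(
    lambda g: replace_first_x_in_column(g, 2, 'C', 0)
)

# Option 2: no apply. cumcount() numbers the rows 0, 1, 2, ... within
# each group, so "< 2" marks the first two rows of every category.
via_mask = df.copy()
first_two = via_mask.groupby(level=0).cumcount() < 2
via_mask.loc[first_two, 'C'] = 0
```

Both options should give the same frame. The mask version avoids calling a Python function per group, which may matter when there are many groups.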


Edited at 2021-02-11
