Update rows in Pandas Dataframe based on the list values

Antonych Published at Java

Antonych

I have a DataFrame that looks like this:

product units_sold  week
 sku        5        W01
 sku        3        W02
 sku        2        W03
 sku        4        W04
 sku        6        W05
 sku        5        W36
 sku        3        W38
 sku        2        W39
 sku        4        W40

In the 'week', w37 is missing, and columns are in a wrong sequence:

I have a list with all needed rows and right sequence:

week_list = ['W36','W37','W38','W39','W40','W01','W02','W03','W04','W05']

desired output is:

   product units_sold  week
     sku        5        W36
     sku        0        W37
     sku        5        W38
     sku        2        W39
     sku        4        W40
     sku        6        W01
     sku        3        W02
     sku        2        W03
     sku        4        W04
     sku        6        W05

Where W37 is on place and 'unit_sold' is 0

Just add one column solution is not suited, as I have quite a big DataFrame, and there is probably can be other missing rows.

I tried with pd.sort_values and pd.categorical:

def sorter(column):
    reorder = week_list
    cat = pd.Categorical(column, categories=reorder, ordered=True)
    return pd.Series(cat)

df.sort_values(by="week", key=sorter)

this helped me to set the right sequence, but 'w37' is still missing, so the problem is not solved.

Is there any way I can update a DataFrame rows values based on this list?

Henry Yik

IIUC just use reindex:

print (df.set_index("week").reindex(week_list).fillna({"product":"sku", "units_sold": 0}))

     product  units_sold
week                    
W36      sku         5.0
W37      sku         0.0
W38      sku         3.0
W39      sku         2.0
W40      sku         4.0
W01      sku         5.0
W02      sku         3.0
W03      sku         2.0
W04      sku         4.0
W05      sku         6.0

Collected from the Internet

Please contact [email protected] to delete if infringement.