I have the following data frame:
df1_given = pd.DataFrame.from_dict({'col_0': [0, 3], 'col_1': [0.1, 2], 'col_2': [0.2, 0], 'col_3': [0.3, 2]})
The desired data frame is as follows:
df2_result = pd.DataFrame.from_dict({'col_0_0': [0, 0, 0, 0, 0, 0], 'col_0_1': [0, 0, 0, 0, 0, 0], 'col_0_2': [0, 0, 0, 0, 0, 0],
                                     'col_1_0': [0.1, 0.1, 0.1, 0.1, 0.1, 0.1], 'col_1_1': [0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
                                     'col_3_0': [0.3, 0.3, 0.3, 0.3, 0.3, 0.3], 'col_3_1': [0.3, 0.3, 0.3, 0.3, 0.3, 0.3]})
The problem is to repeat each column according to the corresponding value in the second row of df1_given, and to repeat the rows according to the total of the values in that row. Note that the actual data frame has a huge number of columns and large values in the second row. I tried using .repeat(), but it did not work:
df_tried = pd.DataFrame(df1_given.values.repeat(df1_given.col_0, axis=0), columns=df1_given.columns)
import pandas as pd
from collections import defaultdict

df = pd.DataFrame.from_dict({'col_0': [0, 3], 'col_1': [0.1, 2], 'col_2': [0.2, 0], 'col_3': [0.3, 2]})

my_dict = defaultdict(list)
cols = list(df.columns)
for i in range(len(cols)):
    # Row 1 holds the repeat count for each column; columns with a count of 0 are skipped.
    if df.iloc[1, i] > 0:
        for x in range(int(df.iloc[1, i])):
            # Row 0 holds the value to copy into each new column <col>_<x>.
            my_dict[str(cols[i]) + '_' + str(x)].append(df.iloc[0, i])
df_2 = pd.DataFrame(my_dict)
print(df_2)
I don't understand why you expect 6 rows instead of 1 in the result, but I think this solves your problem.
df_2
Out[57]:
   col_0_0  col_0_1  col_0_2  col_1_0  col_1_1  col_3_0  col_3_1
0        0        0        0      0.1      0.1      0.3      0.3
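Since the question mentions a huge number of columns and large counts, a vectorized variant may be worth trying. The following is only a sketch, not a tested replacement for the loop above: it assumes row 0 holds the values and row 1 holds non-negative integer counts, and `n_rows` is a hypothetical parameter set to 6 only because the desired frame in the question has 6 rows (the question's rule for the number of rows is not fully clear).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col_0': [0, 3], 'col_1': [0.1, 2], 'col_2': [0.2, 0], 'col_3': [0.3, 2]})

values = df.iloc[0].to_numpy()              # row 0: the values to propagate
counts = df.iloc[1].to_numpy().astype(int)  # row 1: number of copies per column

# Build labels such as col_0_0, col_0_1, ...; columns with a count of 0 drop out.
names = [f"{col}_{k}" for col, n in zip(df.columns, counts) for k in range(n)]

# Repeat each value by its count, then stack that single row n_rows times.
n_rows = 6  # assumption: taken from the desired frame in the question
row = np.repeat(values, counts)
df_wide = pd.DataFrame(np.tile(row, (n_rows, 1)), columns=names)

print(df_wide)
```

Note that taking a whole row with `df.iloc[0]` upcasts the mixed columns to float, so the `col_0_*` columns come out as 0.0 rather than 0. Setting `n_rows = 1` should give the same single-row layout as the loop version.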