I have a Python dictionary containing nested lists that I would like to turn into a pandas DataFrame:
a = {'A': [1,2,3], 'B':['a','b','c'],'C':[[1,2],[3,4],[5,6]]}
I would like the final DataFrame to look like this:
> A B C
> 1 a 1
> 1 a 2
> 2 b 3
> 2 b 4
> 3 c 5
> 3 c 6
When I pass the dictionary to the DataFrame constructor, I get this instead:
pd.DataFrame(a)
>   A B C
> 0 1 a [1, 2]
> 1 2 b [3, 4]
> 2 3 c [5, 6]
Is there any way to make the data long by the elements of C, so that each list element gets its own row?
This is what I came up with:
In [53]: df
Out[53]:
A B C
0 1 a [1, 2]
1 2 b [3, 4]
2 3 c [5, 6]
In [58]: s = df.C.apply(pd.Series).unstack().reset_index(level=0, drop=True)
In [59]: s.name = 'C2'
In [61]: df.drop('C', axis = 1).join(s)
Out[61]:
A B C2
0 1 a 1
0 1 a 2
1 2 b 3
1 2 b 4
2 3 c 5
2 3 c 6
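Applying Series to each list in C gives me a DataFrame with two columns, one per list position. To collapse them into a single column while keeping the original row index, I use unstack, which returns a Series indexed by both the list position and the original row index. reset_index(level=0, drop=True) then removes the first level of that index, which holds the position of each value within its original list in C, so only the original row index remains. Finally, I drop the old C column and join the new Series back onto the df by index.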
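To make the chain easier to follow, here is a rough self-contained sketch of the same steps with each intermediate result given its own name. The variable names (wide, stacked, result, exploded) are mine, not from the session above. The last line shows an alternative that is not part of the original approach: DataFrame.explode, which I am assuming is available (it was added in pandas 0.25).

```python
import pandas as pd

a = {'A': [1, 2, 3], 'B': ['a', 'b', 'c'], 'C': [[1, 2], [3, 4], [5, 6]]}
df = pd.DataFrame(a)

# Step 1: expand each list in C into separate columns (labelled 0 and 1).
wide = df['C'].apply(pd.Series)

# Step 2: unstack() on a DataFrame returns a Series whose MultiIndex is
# (column label, original row index).
stacked = wide.unstack()

# Step 3: drop the column-label level so only the original row index remains.
s = stacked.reset_index(level=0, drop=True)
s.name = 'C2'

# Step 4: join on the index; each original row is repeated once per list element.
result = df.drop('C', axis=1).join(s)
print(result)

# Alternative (assumes pandas >= 0.25): explode the list column directly.
exploded = df.explode('C')
print(exploded)
```

Note that explode keeps the column name C and leaves it with object dtype, so a cast such as astype(int) may be wanted afterwards.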