How can I access the columns after the stack function is applied on a dataframe?
For example, if I have a dataframe such as:
import numpy as np
import pandas as pd
df11 = pd.DataFrame(np.random.randn(5, 3), columns=['a', 'b', 'c'])
          a         b         c
0 -1.108734  0.458352 -1.567971
1  1.656508 -0.091190 -0.700334
2 -1.278772  0.034386  0.680842
3  1.133447  0.710459 -0.562747
4  0.563312 -0.346689 -0.883099
df11.stack() produces:
0 a -1.108734
b 0.458352
c -1.567971
1 a 1.656508
b -0.091190
c -0.700334
2 a -1.278772
b 0.034386
c 0.680842
3 a 1.133447
b 0.710459
c -0.562747
4 a 0.563312
b -0.346689
c -0.883099
However, these new columns don't have names, and I can't seem to find a way to access them.
That's because there aren't any columns; the old column labels are now a level of the MultiIndex on a Series. With s = df11.stack():
>>> s.index
MultiIndex(levels=[[0, 1, 2, 3, 4], [u'a', u'b', u'c']],
labels=[[0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4], [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]])
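Since the old row labels and column labels are now the two levels of the index, values are selected by index label rather than by column name. A minimal sketch, assuming the same shape of frame as in the question (the variable names row0, col_a and cell are only for illustration):

```python
import numpy as np
import pandas as pd

df11 = pd.DataFrame(np.random.randn(5, 3), columns=['a', 'b', 'c'])
s = df11.stack()

# Outer level (original row labels): the three values that came from row 0.
row0 = s.loc[0]              # Series indexed by 'a', 'b', 'c'

# Inner level (original column labels): every 'a' value across all rows.
col_a = s.xs('a', level=1)   # Series indexed by the original row labels

# A single cell, addressed by the (row label, column label) pair.
cell = s.loc[(2, 'b')]
```

Here s.xs('a', level=1) should hold the same numbers as df11['a'], just reached through the index instead of a column.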
There are lots of ways to get at what's inside, depending on what form you need it in:
>>> s.index.get_values()
array([(0L, 'a'), (0L, 'b'), (0L, 'c'), (1L, 'a'), (1L, 'b'), (1L, 'c'),
(2L, 'a'), (2L, 'b'), (2L, 'c'), (3L, 'a'), (3L, 'b'), (3L, 'c'),
(4L, 'a'), (4L, 'b'), (4L, 'c')], dtype=object)
>>> s.index.get_level_values(0)
Int64Index([0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4], dtype='int64')
>>> s.index.get_level_values(1)
Index([u'a', u'b', u'c', u'a', u'b', u'c', u'a', u'b', u'c', u'a', u'b', u'c', u'a', u'b', u'c'], dtype='object')
or even:
>>> s.reset_index()
level_0 level_1 0
0 0 a 1.419391
1 0 b 1.142944
2 0 c 0.413431
3 1 a 0.705091
4 1 b -1.846493
5 1 c -0.756824
[etc.]
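If the real complaint is that the levels have no names, they can be given names first, which also produces readable column names after reset_index(). The following is a sketch rather than the one right way: the names 'row', 'col' and 'value' are my own placeholders. As far as I know, Index.get_values() is no longer available in recent pandas releases, so the sketch uses to_numpy() for the array of tuples instead.

```python
import numpy as np
import pandas as pd

df11 = pd.DataFrame(np.random.randn(5, 3), columns=['a', 'b', 'c'])

# Name the two index levels so they can be referred to by name.
# 'row' and 'col' are placeholder names chosen for this example.
s = df11.stack().rename_axis(['row', 'col'])

# Level values can now be fetched by name instead of by position.
rows = s.index.get_level_values('row')
cols = s.index.get_level_values('col')

# Object array of (row, col) tuples, the modern stand-in for get_values().
pairs = s.index.to_numpy()

# Flat DataFrame with ordinary columns 'row', 'col' and 'value'.
long_df = s.reset_index(name='value')
```

After this, long_df['col'] is a normal column, so the usual DataFrame filtering such as long_df[long_df['col'] == 'a'] applies.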