How do I refer to the index of my Pandas dataframe?

orome Published at Dev

orome

I have a Pandas dataframe where I have designated some of the columns as indices:

planets_dataframe.set_index(['host','name'], inplace=True)

and would like to be able to refer to these indices in a variety of contexts. Using the name of an index works fine in queries

planets_dataframe.query('host == "PSR 1257 12"')

but results in an error if I try to use it to get a list of the values of an index, as I could when it was a column

planets_dataframe.name
#AttributeError: 'DataFrame' object has no attribute 'name'

or to use it to list results as I could when it was a "regular" column

planets_dataframe.query('30 > mass > 20 and discoveryyear > 2009')['name']
#KeyError: u'no item named name'

How do I refer to the "columns" of the dataframe that I'm using as indexes?

Before set_index:

planets_dataframe.columns
# Index([u'name', u'lastupdate', u'temperature', u'semimajoraxis', u'discoveryyear', u'calculated', u'period', u'age', u'mass', u'host', u'verification', u'transittime', u'eccentricity', u'radius', u'discoverymethod', u'inclination'], dtype='object')

After set_index:

planets_dataframe.columns
#Index([u'lastupdate', u'temperature', u'semimajoraxis', u'discoveryyear', u'calculated', u'period', u'age', u'mass', u'verification', u'transittime', u'eccentricity', u'radius', u'discoverymethod', u'inclination'], dtype='object')

BrenBarn

I think you have a slight misunderstanding of what indexes are. You don't just "designate" columns as indexes; that is, you don't just "tag" certain columns with info that says "this is an index". The index is a separate data structure that can hold data that aren't even present in the columns. If you do set_index, you move those columns into the index, so they no longer exist as regular columns. This is why you can no longer use them in the ways you mention: they aren't there anymore.
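
As a minimal sketch of the point above: after set_index the moved columns no longer appear in .columns but show up as index level names, and their values can still be read from the index by level name. The sample rows here are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
planets = pd.DataFrame({
    "host": ["PSR 1257 12", "PSR 1257 12", "Kepler-22"],
    "name": ["PSR 1257 12 b", "PSR 1257 12 c", "Kepler-22 b"],
    "mass": [0.00007, 0.013, 0.11],
})

indexed = planets.set_index(["host", "name"])

# 'host' and 'name' have moved out of the columns and into the index.
print(list(indexed.columns))
print(list(indexed.index.names))

# Read the values of one index level by its name.
names = indexed.index.get_level_values("name")
print(list(names))
```

This replaces the planets_dataframe.name access from the question: the data is still there, it just lives in the index rather than in a column.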

One thing you can do is, when using set_index, pass drop=False to tell it to keep the columns as columns in addition to putting them in the index (effectively copying them to the index rather than moving them), e.g., df.set_index('SomeColumn', drop=False). However, you should be aware that the index and column are still distinct, so for instance if you modify the column values this will not affect what's stored in the index.
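
A small sketch of the drop=False behaviour, assuming a toy frame with made-up values (the column name "value" is hypothetical). It replaces the column with new values and then checks that the index still holds the original labels.

```python
import pandas as pd

# Toy frame; 'SomeColumn' follows the example above, 'value' is made up.
df = pd.DataFrame({"SomeColumn": ["a", "b", "c"], "value": [1, 2, 3]})

# Keep 'SomeColumn' as a regular column and also use it as the index.
kept = df.set_index("SomeColumn", drop=False)

# Replace the column with transformed values; the index is a separate object.
kept["SomeColumn"] = kept["SomeColumn"].str.upper()

print(list(kept["SomeColumn"]))  # the column now holds the new values
print(list(kept.index))          # the index still holds the original labels
```

If the two need to stay in sync, the index has to be rebuilt (for example by calling set_index again) after the column changes.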

The upshot is that indexes aren't really columns of the DataFrame, so if you want to be able to use some data as both an index and a column, you need to duplicate it in both places. There is some discussion of this issue here.
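
If duplicating the data permanently is not wanted, one possible alternative (not part of the original answer) is to turn the index levels back into columns only when needed, using reset_index on a query result. The rows below are invented for illustration.

```python
import pandas as pd

# Hypothetical rows; column names follow the question.
planets = pd.DataFrame({
    "host": ["Star A", "Star A", "Star B"],
    "name": ["Star A b", "Star A c", "Star B b"],
    "mass": [25.0, 10.0, 22.0],
    "discoveryyear": [2011, 2012, 2008],
}).set_index(["host", "name"])

result = planets.query("30 > mass > 20 and discoveryyear > 2009")

# reset_index returns a new frame in which 'host' and 'name' are columns again.
names_from_column = result.reset_index()["name"]
print(list(names_from_column))
```

reset_index returns a new DataFrame here, so the indexed frame itself is left unchanged.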

Edited at 2021-02-8