Pandas to_datetime with multiindex

crayxt

how can I drop a level in multi-indexed columns when converting three columns to datetime? Below example only contains three columns while in my dateframe there are more columns, of course, and those other columns use two level names.

    >>> import pandas as pd
    >>> df = pd.DataFrame([[2010, 1, 2],[2011,1,3],[2012,2,3]])
    >>> df.columns = [['year', 'month', 'day'],['y', 'm', 'd']]
    >>> print(df)
       year month day
          y     m   d
    0  2010     1   2
    1  2011     1   3
    2  2012     2   3
    >>> pd.to_datetime(df[['year', 'month', 'day']])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/site-packages/pandas/core/tools/datetimes.py", line 512, in to_datetime
    result = _assemble_from_unit_mappings(arg, errors=errors)
  File "/usr/lib64/python2.7/site-packages/pandas/core/tools/datetimes.py", line 582, in _assemble_from_unit_mappings
    unit = {k: f(k) for k in arg.keys()}
  File "/usr/lib64/python2.7/site-packages/pandas/core/tools/datetimes.py", line 582, in <dictcomp>
    unit = {k: f(k) for k in arg.keys()}
  File "/usr/lib64/python2.7/site-packages/pandas/core/tools/datetimes.py", line 577, in f
    if value.lower() in _unit_map:
AttributeError: 'tuple' object has no attribute 'lower'

Edit: Add more columns to explain better:

>>> df = pd.DataFrame([[2010, 1, 2, 10, 2],[2011,1,3,11,3],[2012,2,3,12,2]])
>>> df.columns = [['year', 'month', 'day', 'temp', 'wind_speed'],['', '', '', 'degc','m/s']]
>>> print(df)
   year month day temp wind_speed
                  degc        m/s
0  2010     1   2   10          2
1  2011     1   3   11          3
2  2012     2   3   12          2

What I need is to combine first three columns to datetime index, leaving two last columns with data.

jezrael

Use droplevel for remove second level:

df.columns = df.columns.droplevel(1)
df = pd.to_datetime(df[['year', 'month', 'day']])
print (df)
0   2010-01-02
1   2011-01-03
2   2012-02-03
dtype: datetime64[ns]

If only 3 columns:

df.columns = df.columns.droplevel(1)
df = pd.to_datetime(df)
print (df)

0   2010-01-02
1   2011-01-03
2   2012-02-03
dtype: datetime64[ns]

If more columns:

df = pd.DataFrame([[2010, 1, 2,3],[2011,1,3,5],[2012,2,3,7]])
df.columns = [['year', 'month', 'day','a'],['y', 'm', 'd', 'b']]
print(df)
   year month day  a
      y     m   d  b
0  2010     1   2  3
1  2011     1   3  5
2  2012     2   3  7

#select datetime columns only
df1 = df[['year', 'month', 'day']]
df1.columns = df1.columns.droplevel(1)
print (df1)
   year  month  day
0  2010      1    2
1  2011      1    3
2  2012      2    3

#convert to Series
s1 = pd.to_datetime(df1)
#set new MultiIndex 
s1.name=('date','dat')
print (s1)
0   2010-01-02
1   2011-01-03
2   2012-02-03
Name: (date, dat), dtype: datetime64[ns]

#remove original columns and add new datetime Series
df = df.drop(['year', 'month', 'day'], axis=1, level=0).join(s1)
print (df)
   a       date
   b        dat
0  3 2010-01-02
1  5 2011-01-03
2  7 2012-02-03

Another solution with transpose, should be slowier in big DataFrame:

df1 = df[['year', 'month', 'day']]
s1 =  pd.to_datetime(df1.T.reset_index(drop=True, level=1).T).rename(('date', 'dat'))
print (s1)
0   2010-01-02
1   2011-01-03
2   2012-02-03
Name: (date, dat), dtype: datetime64[ns]

df1 = df.join(s1)
print (df1)
   year month day temp wind_speed       date
                  degc        m/s        dat
0  2010     1   2   10          2 2010-01-02
1  2011     1   3   11          3 2011-01-03
2  2012     2   3   12          2 2012-02-03

この記事はインターネットから収集されたものであり、転載の際にはソースを示してください。

侵害の場合は、連絡してください[email protected]

編集
0

コメントを追加

0

関連記事

分類Dev

Shift DateTime index within a Pandas MultiIndex

分類Dev

Pandas dataframe to_datetime() is converting date incorrectly

分類Dev

Pandas to_datetime ValueError:不明な文字列形式

分類Dev

Pandas Dataframe Multiindex Merge

分類Dev

Pandas Multiindex Groupby on Columns

分類Dev

Pandas Multiindex Groupby on Columns

分類Dev

Querying MultiIndex DataFrame in Pandas

分類Dev

Pandas Rearranging Multiindex Series

分類Dev

Pandas 0.15.2 MultiIndex vs. 0.14.1 (datetime.date vs. pandas.tslib.Timestamp)

分類Dev

pandas to_datetime()次に、DateTimeインデックスのconcat()

分類Dev

pandas to_datetime converts non-zero padded month and day into datetime

分類Dev

pandas to_datetime()が列を検出しない

分類Dev

Pandas to_datetime - setting the date when only times (HH:MM) are input

分類Dev

Slice MultiIndex pandas DataFrame by position

分類Dev

Pandas Groupby、MultiIndex、Multiple Columns

分類Dev

Convert pandas concat of dataframes to multiindex

分類Dev

Python pandas multiindex select values

分類Dev

pandas MultiIndex assign multiple columns

分類Dev

Pandas groupby using MultiIndex values

分類Dev

Divide Pandas dataframe with multiindex by another dataframe with smaller multiindex

分類Dev

Multiindex pandas groupby + aggregate, keep full index

分類Dev

Pandas Reindexing MultiIndex Relative to Arbitrary Level

分類Dev

Split columns into MultiIndex with missing columns in pandas

分類Dev

optimize pandas query on multiple columns / multiindex

分類Dev

difference between 2 columns in Pandas Dataframe with multiindex

分類Dev

pandas: conditional select using .loc with MultiIndex

分類Dev

Returning top n values for group/multiindex in Pandas

分類Dev

How to use pandas .at function for Series with multiindex

分類Dev

convert pandas multiindex series to Json python

Related 関連記事

  1. 1

    Shift DateTime index within a Pandas MultiIndex

  2. 2

    Pandas dataframe to_datetime() is converting date incorrectly

  3. 3

    Pandas to_datetime ValueError:不明な文字列形式

  4. 4

    Pandas Dataframe Multiindex Merge

  5. 5

    Pandas Multiindex Groupby on Columns

  6. 6

    Pandas Multiindex Groupby on Columns

  7. 7

    Querying MultiIndex DataFrame in Pandas

  8. 8

    Pandas Rearranging Multiindex Series

  9. 9

    Pandas 0.15.2 MultiIndex vs. 0.14.1 (datetime.date vs. pandas.tslib.Timestamp)

  10. 10

    pandas to_datetime()次に、DateTimeインデックスのconcat()

  11. 11

    pandas to_datetime converts non-zero padded month and day into datetime

  12. 12

    pandas to_datetime()が列を検出しない

  13. 13

    Pandas to_datetime - setting the date when only times (HH:MM) are input

  14. 14

    Slice MultiIndex pandas DataFrame by position

  15. 15

    Pandas Groupby、MultiIndex、Multiple Columns

  16. 16

    Convert pandas concat of dataframes to multiindex

  17. 17

    Python pandas multiindex select values

  18. 18

    pandas MultiIndex assign multiple columns

  19. 19

    Pandas groupby using MultiIndex values

  20. 20

    Divide Pandas dataframe with multiindex by another dataframe with smaller multiindex

  21. 21

    Multiindex pandas groupby + aggregate, keep full index

  22. 22

    Pandas Reindexing MultiIndex Relative to Arbitrary Level

  23. 23

    Split columns into MultiIndex with missing columns in pandas

  24. 24

    optimize pandas query on multiple columns / multiindex

  25. 25

    difference between 2 columns in Pandas Dataframe with multiindex

  26. 26

    pandas: conditional select using .loc with MultiIndex

  27. 27

    Returning top n values for group/multiindex in Pandas

  28. 28

    How to use pandas .at function for Series with multiindex

  29. 29

    convert pandas multiindex series to Json python

ホットタグ

アーカイブ