Currently I am using pandas to read a CSV file into a DataFrame, using the first column as the index. The first column is in ISO 8601 format, so according to the documentation for read_csv, it should be recognized as a datetime:
In [1]: import pandas as pd
In [2]: df = pd.read_csv('data.csv', index_col=0)
In [3]: print(df.head())
                        U     V     Z    Ubar    Udir
2014-11-01 00:00:00  0.73 -0.81  0.46  1.0904  317.97
2014-11-01 01:00:00  1.26 -1.50  0.32  1.9590  319.97
2014-11-01 02:00:00  1.50 -1.80  0.13  2.3431  320.19
2014-11-01 03:00:00  1.39 -1.65  0.03  2.1575  319.89
2014-11-01 04:00:00  0.94 -1.08 -0.03  1.4318  318.96
However, when querying the index dtype, it returns 'object':
In [4]: print(df.index.dtype)
object
I then have to manually convert it to datetime dtype:
In [5]: df.index = pd.to_datetime(df.index)
In [6]: print(df.index.dtype)
datetime64[ns]
Is there any way to have the index automatically set to datetime dtype when calling read_csv()?
The read_csv documentation describes the parse_dates parameter:
parse_dates : boolean or list of ints or names or list of lists or dict, default False
- boolean. If True -> try parsing the index.
- list of ints or names. e.g. If [1, 2, 3] -> try parsing columns 1, 2, 3 each as a separate date column.
- list of lists. e.g. If [[1, 3]] -> combine columns 1 and 3 and parse as a single date column.
- dict, e.g. {'foo' : [1, 3]} -> parse columns 1, 3 as date and call result 'foo'
Note: A fast-path exists for iso8601-formatted dates.
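To illustrate the list form described above, here is a minimal sketch. It assumes a hypothetical CSV whose timestamp column is named time; the sample text and the name csv_text are made up for the example and are not from the question.

```python
import io
import pandas as pd

# Hypothetical sample data; the "time" column name is an assumption for illustration.
csv_text = """time,U,V
2014-11-01 00:00:00,0.73,-0.81
2014-11-01 01:00:00,1.26,-1.50
"""

# Passing a list of column names asks read_csv to parse each one as a separate date column.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["time"])
print(df.dtypes)
```

Here only the time column should come back with a datetime dtype, while U and V stay numeric.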
Since you want to parse the index, you can use:
import pandas as pd
df = pd.read_csv('data.csv', index_col=0, parse_dates=True)
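To check this without the original file, here is a minimal sketch that reads a few rows from an in-memory string instead of data.csv and then inspects the index. The sample text and the name csv_text are assumptions modelled on the output shown in the question, including the guess that the first column has no header.

```python
import io
import pandas as pd

# Hypothetical stand-in for data.csv; the first (unnamed) column holds ISO 8601 timestamps.
csv_text = """,U,V,Z,Ubar,Udir
2014-11-01 00:00:00,0.73,-0.81,0.46,1.0904,317.97
2014-11-01 01:00:00,1.26,-1.50,0.32,1.9590,319.97
2014-11-01 02:00:00,1.50,-1.80,0.13,2.3431,320.19
"""

# index_col=0 makes the first column the index; parse_dates=True asks pandas to parse that index as dates.
df = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

print(type(df.index).__name__)
print(df.index.dtype)
```

If parsing succeeds, the index should be a DatetimeIndex rather than an object-dtype Index, so the separate pd.to_datetime step is no longer needed.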