A problem encountered with grouped diff on a pandas date column



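Example:

import datetime
import pandas as pd

df = pd.DataFrame()
df['A'] = [1, 1, 2]
df['B'] = [datetime.date(2018, 1, 2), datetime.date(2018, 1, 3), datetime.date(2018, 1, 3)]
# Per-group difference between consecutive dates
df['C'] = df.groupby('A').B.diff()
# Fails: C is not a datetime-like column
df['C'] = df.C.dt.days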

Error:

Traceback (most recent call last):
  File 'D:\python_virtualenv\common\lib\site-packages\pandas-0.20.3-py3.6-win-amd64.egg\pandas\core\series.py', line 2820, in _make_dt_accessor
    return maybe_to_datetimelike(self)
  File 'D:\python_virtualenv\common\lib\site-packages\pandas-0.20.3-py3.6-win-amd64.egg\pandas\core\indexes\accessors.py', line 84, in maybe_to_datetimelike
    'datetimelike index'.format(type(data)))
TypeError: cannot convert an object of type <class 'pandas.core.series.Series'> to a datetimelike index

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File 'D: / learning /pandas_test/pandas_learn_20190102.py', line 49, in <module>
    test2()
  File 'D: / learning /pandas_test/pandas_learn_20190102.py', line 32, in test2
    df['C'] = df.C.dt.days
  File 'D:\python_virtualenv\common\lib\site-packages\pandas-0.20.3-py3.6-win-amd64.egg\pandas\core\generic.py', line 3077, in __getattr__
    return object.__getattribute__(self, name)
  File 'D:\python_virtualenv\common\lib\site-packages\pandas-0.20.3-py3.6-win-amd64.egg\pandas\core\base.py', line 243, in __get__
    return self.construct_accessor(instance)
  File 'D:\python_virtualenv\common\lib\site-packages\pandas-0.20.3-py3.6-win-amd64.egg\pandas\core\series.py', line 2822, in _make_dt_accessor
    raise AttributeError('Can only use .dt accessor with datetimelike '
AttributeError: Can only use .dt accessor with datetimelike values

Cause:
The grouped diff produces the following result:

   A           B                C
0  1  2018-01-02              NaT
1  1  2018-01-03  1 days 00:00:00
2  2  2018-01-03              NaN

The column dtypes are:

A     int64
B    object
C    object
dtype: object

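The dtypes I expected were:

A              int64
B             object
C    timedelta64[ns]
dtype: object

Column C holds a mix of NaT, a Timedelta and NaN, so pandas stores it as object rather than timedelta64[ns], and the .dt accessor refuses to work on it.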
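The root of the object dtype is column B itself: a column built from plain datetime.date values is not stored as datetime64. The following is a minimal sketch that checks this, assuming a reasonably recent pandas; the variable names are illustrative and not from the original post.

```python
import datetime

import pandas as pd

# A list of plain datetime.date objects is stored as generic Python objects.
raw = pd.Series([datetime.date(2018, 1, 2), datetime.date(2018, 1, 3)])
print(raw.dtype)

# pd.to_datetime converts the same values into a datetime64 column.
converted = pd.to_datetime(raw)
print(converted.dtype)

# Only the converted column is recognised as datetime-like.
is_raw_datetime = pd.api.types.is_datetime64_any_dtype(raw)
is_converted_datetime = pd.api.types.is_datetime64_any_dtype(converted)
print(is_raw_datetime, is_converted_datetime)
```

Checking df.dtypes before calling .dt is a cheap way to catch this kind of problem early.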

Solution:
At first I tried to force the object column to timedelta with astype:

df['C'] = df.C.astype(pd.Timedelta)

This code does not raise an error, but the dtype of column C stays unchanged, so it has no effect.



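In the end, there are two approaches.
The first is to convert B to a datetime column before grouping:

import datetime
import pandas as pd

df = pd.DataFrame()
df['A'] = [1, 1, 2]
df['B'] = [datetime.date(2018, 1, 2), datetime.date(2018, 1, 3), datetime.date(2018, 1, 3)]
# Convert the date objects to datetime64 so that diff() yields timedelta64[ns]
df.B = pd.to_datetime(df.B)
df['C'] = df.groupby('A').B.diff()
df['C'] = df.C.dt.days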
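As a self-contained check of this first approach, the sketch below rebuilds the same frame, converts B up front, and confirms that the grouped diff is a timedelta column before taking .dt.days. The intermediate name gap is an illustrative choice, not part of the original code.

```python
import datetime

import pandas as pd

df = pd.DataFrame({
    'A': [1, 1, 2],
    'B': [
        datetime.date(2018, 1, 2),
        datetime.date(2018, 1, 3),
        datetime.date(2018, 1, 3),
    ],
})

# Convert B to datetime64 before grouping.
df['B'] = pd.to_datetime(df['B'])

# Per-group difference between consecutive rows; dtype kind 'm' means timedelta64.
gap = df.groupby('A')['B'].diff()
print(gap.dtype)

# The first row of each group has no predecessor, so its value is missing.
df['C'] = gap.dt.days
print(df)
```

Because the missing entries are kept, column C comes back as a float column rather than an integer one.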

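The second is to add a type conversion after the diff:

import datetime
import pandas as pd

df = pd.DataFrame()
df['A'] = [1, 1, 2]
df['B'] = [datetime.date(2018, 1, 2), datetime.date(2018, 1, 3), datetime.date(2018, 1, 3)]
df['C'] = df.groupby('A').B.diff()
# Convert the object column to timedelta64[ns] before using the .dt accessor
df['C'] = pd.to_timedelta(df.C, unit='d').dt.days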
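To see the conversion step of this second approach in isolation, the sketch below applies pd.to_timedelta to a hand-built object column that mimics the mixed NaT / Timedelta / NaN result shown earlier. The Series here is hypothetical stand-in data, since the exact output of diff() on an object column may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the object-dtype column C: NaT, a Timedelta and NaN.
mixed = pd.Series([pd.NaT, pd.Timedelta(days=1), np.nan], dtype=object)
print(mixed.dtype)

# pd.to_timedelta normalises the mixed values to a timedelta64 column,
# turning both NaT and NaN into missing values.
converted = pd.to_timedelta(mixed)
print(converted.dtype)

# The .dt accessor is now available.
days = converted.dt.days
print(days)
```

Unlike astype(pd.Timedelta), this actually changes the dtype, which is why .dt.days works afterwards.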