Pandas - groupby ValueError: Cannot subset columns with a tuple with more than one element. Use a list instead

2 answers, 1267 views, 2023-05-02

I updated pandas from 1.5.1 to 2.0.1. Since then, some code that previously worked fine has started raising an error.

df = df.groupby(df['date'].dt.date)['Lake', 'Canyon'].mean().reset_index()

Traceback (most recent call last):
  File "f:...\My_python_file.py", line 37, in <module>
    df = df.groupby(df['date'].dt.date)['Lake', 'Canyon'].mean().reset_index()
  File "C:\Users...\Local\Programs\Python\Python310\lib\site-packages\pandas\core\groupby\generic.py", line 1767, in __getitem__
    raise ValueError(
ValueError: Cannot subset columns with a tuple with more than one element. Use a list instead.

Shane S asked on 2023-05-02
2 answers
#1 Accepted
Votes: 6

If you select multiple columns after groupby without double brackets, pandas versions before 2.0.0 emit a FutureWarning:

FutureWarning: Indexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead

In pandas >= 2.0.0, the same code raises a ValueError:

ValueError: Cannot subset columns with a tuple with more than one element. Use a list instead.

For example:

# Pandas < 2.0.0
#             Missing [[ ... ]] --v              --v
>>> df.groupby(df['date'].dt.date)['Lake', 'Canyon'].mean().reset_index()
...
FutureWarning: Indexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead.
  df.groupby(df['date'].dt.date)['Lake', 'Canyon'].mean().reset_index()

# Pandas >= 2.0.0
>>> df.groupby(df['date'].dt.date)['Lake', 'Canyon'].mean().reset_index()
...
ValueError: Cannot subset columns with a tuple with more than one element. Use a list instead.
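The word "tuple" in the message comes from Python itself: comma-separated keys inside a single pair of brackets are packed into one tuple before the object sees them. A minimal sketch of that behaviour, using a hypothetical `ShowKey` class (not part of pandas) that just returns whatever key it receives:

```python
class ShowKey:
    """Hypothetical helper (not a pandas class): echoes the key passed to []."""

    def __getitem__(self, key):
        return key


probe = ShowKey()

# Single brackets with a comma: Python passes ONE tuple key.
tuple_key = probe['Lake', 'Canyon']

# Double brackets: the inner brackets build a list, which is passed as the key.
list_key = probe[['Lake', 'Canyon']]

print(type(tuple_key).__name__, tuple_key)
print(type(list_key).__name__, list_key)
```

This is why the groupby object receives a tuple in the failing call and a list in the working one.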

Use double brackets, [[col1, col2, ...]], to fix this:

>>> df.groupby(df['date'].dt.date)[['Lake', 'Canyon']].mean().reset_index()
         date  Lake  Canyon
0  2023-05-02   1.5     3.5

Minimal reproducible example:

import pandas as pd

df = pd.DataFrame({'date': ['2023-05-02 12:34:56', '2023-05-02 12:32:12'], 
                   'Lake': [1, 2], 'Canyon': [3, 4]})
df['date'] = pd.to_datetime(df['date'])
print(df)

# Output
                 date  Lake  Canyon
0 2023-05-02 12:34:56     1       3
1 2023-05-02 12:32:12     2       4
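
For convenience, here is the same sample data and the list-based selection combined into one script. This is only a sketch: the variable names `cols` and `daily` are introduced for illustration and do not appear in the question.

```python
import pandas as pd

# Sample data from the minimal reproducible example
df = pd.DataFrame({'date': ['2023-05-02 12:34:56', '2023-05-02 12:32:12'],
                   'Lake': [1, 2], 'Canyon': [3, 4]})
df['date'] = pd.to_datetime(df['date'])

# Keep the column names in a list; indexing with a list avoids the tuple key
cols = ['Lake', 'Canyon']

# Group by calendar date and average the selected columns
daily = df.groupby(df['date'].dt.date)[cols].mean().reset_index()
print(daily)
```

Holding the column names in a list variable makes the double brackets unnecessary at the call site, since `[cols]` already passes a list.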
Corralien answered on 2023-05-02
Corralien edited on 2023-05-02
#2
Votes: 1

The solution I found is to add an extra pair of [ ], like this:

df = df.groupby(df['date'].dt.date)[['Lake', 'Canyon']].mean().reset_index()
Shane S answered on 2023-05-02