Type errors when a DataFrame contains NaN

Question

Problem description

df
      A        B
0  a=10  b=20.10
1  a=20      NaN
2   NaN  b=30.10
3  a=40  b=40.10

I tried:

df['A'] = df['A'].str.extract(r'(\d+)').astype(int)
df['B'] = df['B'].str.extract(r'(\d+)').astype(float)

But I get the following error:

ValueError: cannot convert float NaN to integer

and:

AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas

How can I fix this?

Answer 1

Accepted answer

If some values in a column are missing (NaN) and the column is converted to numeric, the resulting dtype is always float. You cannot cast such a column to int, only to float, because NaN is itself a float.

import numpy as np

print(type(np.nan))
<class 'float'>

See the docs for how values are converted when at least one NaN is present:

integer -> cast to float64
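The promotion rule quoted above can be observed directly. The following is a minimal sketch, not part of the original answer: the Series name `ints` and the use of `reindex` to introduce a missing value are illustrative assumptions, and the dtype is pinned to `int64` only so the result does not depend on the platform.

```python
import pandas as pd

# An all-integer Series; dtype is pinned so the result is platform-independent.
ints = pd.Series([10, 20, 40], dtype="int64")

# Reindexing onto a label that has no value forces pandas to store NaN there,
# and an int64 column cannot hold NaN, so the whole Series is cast to float64.
with_nan = ints.reindex([0, 1, 2, 3])

print(ints.dtype)
print(with_nan.dtype)
print(with_nan)
```

The existing integers survive the cast unchanged in value; only their storage type changes.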

If you need int values, first replace NaN with some int, e.g. 0, using fillna; the cast then works:

df['A'] = df['A'].str.extract(r'(\d+)', expand=False)
df['B'] = df['B'].str.extract(r'(\d+)', expand=False)
print (df)
     A    B
0   10   20
1   20  NaN
2  NaN   30
3   40   40

df1 = df.fillna(0).astype(int)
print (df1)
    A   B
0  10  20
1  20   0
2   0  30
3  40  40

print (df1.dtypes)
A    int32
B    int32
dtype: object
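For reference, here is a self-contained sketch that rebuilds the question's frame and applies the same fillna idea end to end. It departs from the answer in two labeled ways, both of which are assumptions rather than the answerer's code: it uses `pd.to_numeric` for the conversion, and it uses the pattern `(\d+\.\d+)` for column B so that the fractional part the question asked for is kept. The fill value 0 is arbitrary, as in the answer.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question.
df = pd.DataFrame({
    "A": ["a=10", "a=20", np.nan, "a=40"],
    "B": ["b=20.10", np.nan, "b=30.10", "b=40.10"],
})

# expand=False returns a Series rather than a one-column DataFrame.
a = df["A"].str.extract(r"(\d+)", expand=False)
# Assumed pattern (not from the answer): digits, a dot, digits.
b = df["B"].str.extract(r"(\d+\.\d+)", expand=False)

# to_numeric yields float64 because of the NaN; filling it makes the int cast legal.
df["A"] = pd.to_numeric(a).fillna(0).astype("int64")
# A float column can hold NaN, so column B needs no fill.
df["B"] = pd.to_numeric(b)

print(df)
print(df.dtypes)
```

Filling with 0 makes a missing value indistinguishable from a real zero, so a sentinel that cannot occur in the data may be a safer choice where that distinction matters.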
