Sklearn custom transformers: the difference between using FunctionTransformer and subclassing TransformerMixin
Question
To do cross-validation (CV) properly, it is advisable to use pipelines so that the same transformations are applied to each fold. I can define custom transformations either by using sklearn.preprocessing.FunctionTransformer or by subclassing sklearn.base.TransformerMixin. Which one is the recommended approach, and why?
Accepted answer
It is entirely up to you. Both approaches achieve more or less the same result; only the way you write the code differs.
For example, with sklearn.preprocessing.FunctionTransformer you only need to define the function you want to apply and pass it to the transformer (as described in this blog post).
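A minimal sketch of the FunctionTransformer route follows. The blog post's own code is not reproduced here, so the choice of np.log1p and the sample array are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer

# Wrap an existing function in a transformer object.
# np.log1p is an illustrative choice, not taken from the original post.
log_transformer = FunctionTransformer(np.log1p)

# Made-up sample data, just to show the call.
X = np.array([[0.0, 1.0], [3.0, 7.0]])

# fit() learns nothing here; transform() simply calls np.log1p on X.
X_log = log_transformer.fit_transform(X)
```

The resulting object can be used as a pipeline step like any other transformer.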
On the other hand, when subclassing sklearn.base.TransformerMixin you have to define the whole class, including its fit and transform methods. So you will have to create a class like this (example code taken from this blog post):
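    import numpy as np
    from sklearn.base import TransformerMixin

    class FunctionFeaturizer(TransformerMixin):
        def __init__(self, *featurizers):
            # Functions to apply to the input in transform()
            self.featurizers = featurizers

        def fit(self, X, y=None):
            # Nothing to learn, so fit just returns the instance
            return self

        def transform(self, X):
            # Do transformations and return (column-wise stacking is an assumed placeholder)
            transformed_data = np.column_stack([f(X) for f in self.featurizers])
            return transformed_data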
So as you can see, subclassing TransformerMixin gives you more flexibility than FunctionTransformer when it comes to the transform method. You can apply multiple transformations, or a partial transformation that depends on the value, and so on. For example, you might want to take the log of the first 50 values and the inverse log of the next 50. You can easily define your own transform method to handle the data selectively. A subclass can also learn parameters in fit and reuse them in transform, whereas FunctionTransformer is stateless.
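The selective example above (log for the first 50 values, inverse log for the next 50) could be sketched roughly as follows. The class name SplitLogTransformer, the split parameter, and the sample data are all hypothetical, and np.exp is assumed to be what "inverse log" means.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class SplitLogTransformer(BaseEstimator, TransformerMixin):
    """Hypothetical transformer: log of the first `split` rows, exp of the rest."""

    def __init__(self, split=50):
        self.split = split

    def fit(self, X, y=None):
        # Nothing is learned from the data in this sketch.
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        out = np.empty_like(X)
        # Natural log for the first `split` rows.
        out[: self.split] = np.log(X[: self.split])
        # Inverse log (assumed to mean the exponential) for the remaining rows.
        out[self.split :] = np.exp(X[self.split :])
        return out

# Made-up data: 100 positive values in a single column.
X = np.linspace(1.0, 2.0, 100).reshape(-1, 1)
X_t = SplitLogTransformer(split=50).fit_transform(X)
```

Inheriting from BaseEstimator as well is a design choice in this sketch: it provides get_params and set_params, which scikit-learn needs in order to clone the transformer during cross-validation.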
If you just want to use a function directly as it is, use sklearn.preprocessing.FunctionTransformer. If you need more customization or more complex transformations, I would suggest subclassing sklearn.base.TransformerMixin.
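Either kind of transformer plugs into a pipeline in the same way, so the transformation is applied within each CV fold. Here is a minimal sketch with synthetic data. The model, the data, and the use of np.log1p are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer

# Synthetic data: the target is (almost) linear in log1p of the features.
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 10.0, size=(100, 2))
y = np.log1p(X).sum(axis=1) + rng.normal(scale=0.01, size=100)

# The transformer is a pipeline step, so it is applied within every fold.
pipe = make_pipeline(FunctionTransformer(np.log1p), LinearRegression())

# One score (R^2 for a regressor) per fold.
scores = cross_val_score(pipe, X, y, cv=5)
```

A custom TransformerMixin subclass could replace the FunctionTransformer step here without any other change to the pipeline.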
Take a look at the following links to get a better idea.