Paper Reading | Twin Neural Network Regression
@ARTICLE{SebastianKevin2022Twin,
title = {Twin neural network regression},
author = {Sebastian Johann Wetzel and Kevin Ryczko and Roger Gordon Melko and Isaac Tamblyn},
journal = {Applied AI Letters},
year = {2022},
volume = {3},
number = {4},
pages = {e78},
doi = {10.1002/ail2.78}
}
1. Abstract
We introduce twin neural network (TNN) regression. This method predicts differences between the target values of two different data points rather than the targets themselves. The solution of a traditional regression problem is then obtained by averaging over an ensemble of all predicted differences between the targets of an unseen data point and all training data points.
Whereas ensembles are normally costly to produce, TNN regression intrinsically creates an ensemble of predictions of twice the size of the training set while only training a single neural network.
Why this ensemble comes for free from training only a single network is worth keeping in mind as you read on.
Since ensembles have been shown to be more accurate than single models, this property naturally transfers to TNN regression.
We show that TNNs are able to compete or yield more accurate predictions for different data sets, compared to other state-of-the-art methods.
Furthermore, TNN regression is constrained by self-consistency conditions.
We find that the violation of these conditions provides an estimate for the prediction uncertainty.
Note: two keywords recur throughout the paper: *ensemble* and *self-consistency*.
2. Algorithm description
The figure conveys the key idea of the algorithm. A classical neural network directly predicts a target value, whereas TNNR predicts the difference between the targets of two data points. This turns the problem of predicting the value at an unknown point into predicting the differences between the unknown point and known points. Note that the twin neural network, also called a siamese neural network, comes from metric learning.
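A minimal sketch of TNNR inference under this idea (the stand-in difference network `F` and helper `tnnr_predict` below are my own hypothetical names, not the paper's code): given F(x_a, x_b) ≈ y_a − y_b, the prediction for an unseen x is the average of F(x, x_i) + y_i over all training anchors.

```python
import numpy as np

def tnnr_predict(F, x_new, X_train, y_train):
    """TNNR inference: average the predicted differences F(x_new, x_i),
    each anchored at a known training target y_i."""
    diffs = np.array([F(x_new, x_i) for x_i in X_train])  # predicted y_new - y_i
    return np.mean(diffs + y_train)                       # ensemble average

# Toy example: a perfect difference network for f(x) = 3x + 1.
f = lambda x: 3.0 * x + 1.0
F = lambda xa, xb: f(xa) - f(xb)   # stand-in for the trained twin network

X_train = np.array([0.0, 1.0, 2.0, 5.0])
y_train = f(X_train)
print(tnnr_predict(F, 4.0, X_train, y_train))  # 13.0
```

With a perfect F every anchor agrees exactly; with a real trained network the anchors disagree slightly, and the averaging is what produces the free ensemble.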
The loop in the figure also yields self-consistency. That is:

(y_3 − y_1) + (y_1 − y_2) + (y_2 − y_3) = 0

F(x_3, x_1) + F(x_1, x_2) + F(x_2, x_3) = 0    (1)

Equation (1) expresses the self-consistency condition.
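A quick numerical check of equation (1), using a toy stand-in for the twin network (the names `F_exact`, `F_noisy`, and `loop_violation` are mine): an exact difference function sums to zero around any loop, while a noisy one violates the identity.

```python
import numpy as np

f = lambda x: x ** 2                    # toy target function
F_exact = lambda xa, xb: f(xa) - f(xb)  # ideal, perfectly self-consistent twin net

def loop_violation(F, x1, x2, x3):
    """Sum around the loop x3 -> x1 -> x2 -> x3; zero for a self-consistent F."""
    return F(x3, x1) + F(x1, x2) + F(x2, x3)

print(loop_violation(F_exact, 1.0, 2.0, 3.0))           # 0.0

rng = np.random.default_rng(0)
F_noisy = lambda xa, xb: f(xa) - f(xb) + rng.normal(scale=0.1)
print(abs(loop_violation(F_noisy, 1.0, 2.0, 3.0)) > 0)  # True
```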
Algorithm details:
- The training objective is to minimize the mean squared error on the training set.
- Standard gradient-descent methods, Adadelta (and RMSprop), are employed to minimize the loss on a batch of 16 pairs at each iteration.
- All data is split into 90% training, 5% validation, and 5% test data. Each run is performed on a randomly chosen different split of the data.
- Training uses a generator that generates all possible pairs batchwise before reshuffling.
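The pair generator described above might be sketched as follows (a minimal, hypothetical version, not the authors' implementation): it shuffles the n² ordered index pairs once per pass and yields batches of twin inputs with y_i − y_j as the training target.

```python
import itertools
import random
import numpy as np

def pair_batches(X, y, batch_size=16, seed=0):
    """Yield all n^2 ordered pairs batchwise, reshuffled once per pass."""
    rng = random.Random(seed)
    pairs = list(itertools.product(range(len(X)), repeat=2))
    rng.shuffle(pairs)
    for k in range(0, len(pairs), batch_size):
        idx = pairs[k:k + batch_size]
        xa = np.array([X[i] for i, _ in idx])
        xb = np.array([X[j] for _, j in idx])
        dy = np.array([y[i] - y[j] for i, j in idx])  # difference target y_i - y_j
        yield (xa, xb), dy

X = np.arange(10.0)
y = 2.0 * X
n_pairs = sum(len(dy) for _, dy in pair_batches(X, y))
print(n_pairs)  # 100 = 10^2 pairs per epoch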
3. Experiments
I usually don't read the experiments section closely, but this paper contains one interesting point.
3.1 Prediction accuracy
The paper claims that TNNR's advantage is quadratically expanding the training set, yet in actual experiments TNNR gets worse on large training sets.
If the training set is very large, the number of pairs increases quadratically to a point where the TNN will in practice converge to a minimum before observing all possible pairs. At that point, the TNN begins to lose its advantages in terms of prediction accuracy.
My own take: the main cause is that the model has too few parameters, so as the training set grows, the network's limited capacity constrains what it can learn.
3.2 Prediction uncertainty estimation
The violation of self-consistency is used to model prediction uncertainty, though I could not fully follow the exposition in the experimental section.
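One way to read this idea (my own sketch of the principle, not the paper's exact procedure; `uncertainty` and the noisy stand-in `F` are hypothetical): form loops through the test point and random pairs of training points, and take the spread of the loop violations as an uncertainty estimate.

```python
import numpy as np

f = lambda x: np.sin(x)
rng = np.random.default_rng(1)
F = lambda xa, xb: f(xa) - f(xb) + rng.normal(scale=0.05)  # imperfect twin net

def uncertainty(F, x_new, X_train, n_loops=200, seed=2):
    """Std of loop violations F(x_new,xi) + F(xi,xj) + F(xj,x_new)
    over random training pairs, used as an uncertainty estimate."""
    pick = np.random.default_rng(seed)
    v = []
    for _ in range(n_loops):
        i, j = pick.choice(len(X_train), size=2, replace=False)
        v.append(F(x_new, X_train[i])
                 + F(X_train[i], X_train[j])
                 + F(X_train[j], x_new))
    return np.std(v)

X_train = np.linspace(0, 3, 20)
print(uncertainty(F, 1.5, X_train))  # strictly positive; grows with the noise scale
```

A perfectly self-consistent F would give zero spread, so larger violations signal a less trustworthy prediction at that point.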
This article is reposted from: 学新通技术网.