Artificial neural networks are remarkably adept at sensory processing, sequence learning and reinforcement learning, but are limited in their ability to represent variables and data structures and to store data over long timescales, owing to the lack of an external memory. Here we introduce a machine learning model called a differentiable neural computer (DNC), which consists of a neural network that can read from and write to an external memory matrix, analogous to the random-access memory (RAM) in a conventional computer. Like a conventional computer, it can use its memory to represent and manipulate complex data structures, but, like a neural network, it can learn to do so from data. When trained with supervised learning, we demonstrate that a DNC can successfully answer synthetic questions designed to emulate reasoning and inference problems in natural language. We show that it can learn tasks such as finding the shortest path between specified points and inferring missing links in randomly generated graphs, and then generalize these tasks to specific graphs such as transport networks and family trees. When trained with reinforcement learning, a DNC can complete a moving blocks puzzle in which changing goals are specified by sequences of symbols. Taken together, our results demonstrate that DNCs can solve complex, structured tasks that are inaccessible to neural networks without external read-write memory.
Differentiable Neural Computer (DNC)

Paper notes:
The DNC uses an N × W memory matrix M:
-------------------
The read vector r returned by a read weighting w^r over memory M is a weighted sum over the memory locations:

$\mathbf{r} = \sum_{i=1}^{N} M[i, \cdot]\, \mathbf{w}^{r}[i]$

where the '·' denotes all j = 1, …, W. Similarly, the write operation uses a write weighting w^w to first erase with an erase vector e, then add a write vector v:

$M[i, j] \leftarrow M[i, j]\,\bigl(1 - \mathbf{w}^{w}[i]\,\mathbf{e}[j]\bigr) + \mathbf{w}^{w}[i]\,\mathbf{v}[j]$
-------------------------
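To make the two equations above concrete, here is a minimal numpy sketch of a single erase-then-add write followed by a read; the sizes and the particular weighting values are illustrative, not from the paper.

```python
import numpy as np

# A minimal sketch of the DNC read/write equations quoted above.
# N memory locations, each a row of width W; all values are illustrative.
N, W = 4, 3
M = np.zeros((N, W))                   # memory matrix
w_r = np.array([0.1, 0.7, 0.2, 0.0])   # read weighting (non-negative, sums to <= 1)
w_w = np.array([0.0, 0.0, 0.9, 0.1])   # write weighting
e = np.array([1.0, 0.5, 0.0])          # erase vector, entries in [0, 1]
v = np.array([0.3, -0.2, 1.0])         # write vector

# Write: erase first, then add -- M[i,j] <- M[i,j](1 - w_w[i] e[j]) + w_w[i] v[j]
M = M * (1.0 - np.outer(w_w, e)) + np.outer(w_w, v)

# Read: weighted sum over locations -- r = sum_i M[i,:] w_r[i]
r = M.T @ w_r
print(r)
```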
The functional units that determine and
apply the weightings are called read and write heads. The operation of
the heads is illustrated in Fig. 1 and summarized below; see Methods
for a formal description.
Figure 1 is composed of four regions, a, b, c and d:
a is the controller neural network, b is the read and write heads, c is the memory matrix M, and d is the memory usage and temporal links.
The recurrent architecture inside a.
The interaction between b and c, driven by similarity scores (see the content-lookup sketch after these notes).
The write head in b carries a write key, a write vector and an erase vector.
Each read head in b carries a read key and a read mode (B, C, F: backward, content, forward).
In addition, the read vectors are fed back to region a.
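The similarity scores mentioned above come from the paper's content lookup C(M, k, β): cosine similarity between a key and every memory row, sharpened by a key strength β and normalized. A minimal sketch, with the function name my own:

```python
import numpy as np

def content_weighting(M, key, beta):
    """Content-based addressing C(M, k, beta): cosine similarity between
    the key and every memory row, sharpened by beta, then softmax-normalized."""
    eps = 1e-8  # avoid division by zero for empty rows
    sims = (M @ key) / (np.linalg.norm(M, axis=1) * np.linalg.norm(key) + eps)
    scores = beta * sims
    scores -= scores.max()             # numerical stability for the softmax
    w = np.exp(scores)
    return w / w.sum()
```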
L[i, j] is close to 1 if i was the next location written after j, and is close to 0 otherwise.
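A minimal sketch of how the temporal link matrix could be maintained, following the update equations in the paper's Methods; the precedence weighting p (the degree to which each location was the most recently written) is assumed, and the helper name is mine.

```python
import numpy as np

def update_links(L, p, w_w):
    """Update the N x N temporal link matrix L and precedence weighting p
    after a write with weighting w_w (sketch of the Methods equations)."""
    # L[i,j] <- (1 - w_w[i] - w_w[j]) L[i,j] + w_w[i] p[j]
    L = (1.0 - w_w[:, None] - w_w[None, :]) * L + np.outer(w_w, p)
    np.fill_diagonal(L, 0.0)           # a location is never linked to itself
    # Precedence: p <- (1 - sum(w_w)) p + w_w
    p = (1.0 - w_w.sum()) * p + w_w
    return L, p

# The F (forward) and B (backward) read modes then follow the links:
# f = L @ w_r_prev reads the location written next; b = L.T @ w_r_prev
# reads the location written before.
```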
As well as automatically increasing with each write to a location, usage can be decreased after each read. This allows the controller to reallocate memory that is no longer required.
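A hedged sketch of the usage and allocation mechanics described above, following the Methods equations: free gates emitted by the controller let reads lower usage, and the allocation weighting hands out the least-used locations first. Function names and shapes (R read heads, N locations) are illustrative.

```python
import numpy as np

def update_usage(u, w_w_prev, read_ws_prev, free_gates):
    """Usage update: the previous write raises usage; freed reads lower it.
    read_ws_prev has shape (R, N); free_gates has shape (R,), entries in [0, 1]."""
    # Retention psi: fraction of each location kept after the free gates fire
    psi = np.prod(1.0 - free_gates[:, None] * read_ws_prev, axis=0)
    return (u + w_w_prev - u * w_w_prev) * psi

def allocation_weighting(u):
    """Allocation weighting: prefer the least-used locations first."""
    phi = np.argsort(u)                # free list, ascending usage
    a = np.zeros_like(u)
    cum = 1.0
    for j in phi:
        a[j] = (1.0 - u[j]) * cum
        cum *= u[j]
    return a
```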
the bAbI dataset [26]
DNCs performed much better than both long short-term memory [27] (LSTM; at present the benchmark neural network for most sequence-processing tasks) and the neural Turing machine.