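Gated Transformer-XL (GTrXL) makes two main changes to Transformer-XL:
1. reordering of the layer normalization
2. addition of a new gating mechanism
The variant with only the first change is called TrXL-I ("The model using this Identity Map Reordering is termed TrXL-I").
The original version is called TrXL.
The structures of these 3 versions are shown in the figure: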
----------------------------
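As a rough sketch of how the three variants differ, the code below writes one sub-layer of each in NumPy. It is not the paper's implementation: `f` is a hypothetical stand-in for the attention or MLP submodule, the layer norm has no learned scale or shift, and the `GRUGate` class name, its weight initialization, and the default bias of 2.0 are assumptions made for illustration.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    # Normalize over the feature axis (no learned scale/shift in this sketch).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def relu(x):
    return np.maximum(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def trxl_sublayer(x, f):
    # TrXL: layer norm is applied after the residual sum.
    return layer_norm(x + f(x))


def trxl_i_sublayer(x, f):
    # TrXL-I: layer norm moves to the submodule input, so x reaches
    # the output through an untouched identity path.
    return x + relu(f(layer_norm(x)))


class GRUGate:
    """GRU-style gate that replaces the residual sum (illustrative only)."""

    def __init__(self, dim, bias=2.0, seed=0):
        rng = np.random.default_rng(seed)

        def init():
            return rng.normal(scale=0.1, size=(dim, dim))

        self.Wr, self.Ur = init(), init()
        self.Wz, self.Uz = init(), init()
        self.Wg, self.Ug = init(), init()
        # A positive bias keeps the update gate z small at initialization,
        # so the gate starts out close to the identity map on x.
        self.bias = bias

    def __call__(self, x, y):
        r = sigmoid(y @ self.Wr + x @ self.Ur)              # reset gate
        z = sigmoid(y @ self.Wz + x @ self.Uz - self.bias)  # update gate
        h = np.tanh(y @ self.Wg + (r * x) @ self.Ug)        # candidate
        return (1.0 - z) * x + z * h


def gtrxl_sublayer(x, f, gate):
    # GTrXL: same reordering as TrXL-I, with the sum replaced by a gate.
    return gate(x, relu(f(layer_norm(x))))
```

With a submodule that outputs zeros, `trxl_i_sublayer` returns its input unchanged, while `trxl_sublayer` still normalizes it. That difference is the identity map that the reordering is named after.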
TrXL-I clearly outperforms TrXL, as shown in the table and figure:
The paper's explanation of the 100-capped score:
We also include the 100-capped score where the per-level mean score is clipped at 100, providing a metric that is proportional to the percentage of levels that the agent is superhuman.
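My understanding: the 100-capped score applies $\min(100, x)$ to every per-level score $x$. The mean of the 100-capped scores is therefore always less than or equal to 100, and only when every experiment exceeds human level does the 100-c ... ...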
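Reading the quoted definition as clipping each per-level mean score at 100 from above, the metric can be sketched in a few lines. The function name `mean_scores` and the example numbers are made up for illustration and are not results from the paper.

```python
import numpy as np


def mean_scores(per_level_scores, cap=100.0):
    """Return (uncapped mean, capped mean) of per-level scores."""
    scores = np.asarray(per_level_scores, dtype=float)
    # Clip each level's score from above at `cap` before averaging.
    capped = np.minimum(scores, cap)
    return scores.mean(), capped.mean()


# Hypothetical per-level scores: one below human level, two above.
raw_mean, capped_mean = mean_scores([50.0, 150.0, 300.0])
```

In this made-up example the uncapped mean is pulled up by the single level scoring 300, while the capped mean stays below 100 because one level is still below human level.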