PaddlePaddle 2.0 example tutorial: automatic poem writing with BERT #968
base: develop
Conversation
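Removed the redundant code at lines 1312-1315, set verbose to 1 for training and evaluation, and regenerated the outputs.
Removed the original location and added a folder for the pretrained word embeddings.
Revised two descriptions (Line9, 190) and updated the last-modified date (Line15).
rewrite some descriptions
Added a detailed description of the inputs; replaced the dataset with the official PaddlePaddle dataset; added an explanation of automatic poem writing; divided the content into numbered sections.
1. The alias form import paddlenlp as ppnlp is not recommended; use import paddlenlp directly.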
update the newest url for paddlenlp; update the new API for paddlenlp.
merge op of load_dataset for test, dev, and train.
LGTM
Well written overall. A few descriptions need polishing. Thanks for the hard work!
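"### 3.1 Pretrained BERT model\n",
"Poem generation is a text generation process. In practice the model cannot know content that has not been generated yet, which means that of the bidirectional relations in BERT it can capture only the forward relation and not the backward one. We can handle this restriction by adding an attention mask that blocks the backward relations, so that the model cannot attend to content that has not been generated yet and BERT can still perform text generation.\n",
"\n",
"Going further, we can simplify text generation into a BERT-based token classification model (think of it as part-of-speech tagging), i.e., assign each token a label, where the label is the token that comes next after it. Therefore, we can simply call PaddleNLP's BERT token classification model to see, noting that the number of classes is the vocabulary size."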
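i.e., assign each token a label, where the label is the token that comes next after it.
This description is not very clear; I suggest adding an example to illustrate it.
Therefore, we can simply call PaddleNLP's BERT token classification model to see,
This sentence seems to be incomplete?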
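To make the labeling concrete, here is a minimal sketch of pairing each token with the token that follows it. It is illustration only and is not taken from the notebook: the helper name next_token_labels and the sample line are hypothetical, and [CLS] and [SEP] are assumed to be the usual BERT special tokens.

```python
def next_token_labels(tokens):
    """Pair every token (except the last) with the token that follows it."""
    inputs = tokens[:-1]   # tokens the model sees
    labels = tokens[1:]    # label of each input token = the next token
    return list(zip(inputs, labels))


# Hypothetical tokenized poem line, used only for illustration.
tokens = ["[CLS]", "床", "前", "明", "月", "光", "[SEP]"]
pairs = next_token_labels(tokens)

for current, target in pairs:
    print(f"{current} -> {target}")
```

Each label is drawn from the full vocabulary, which is why the number of classes equals the vocabulary size.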
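},
"source": [
"## 4. Poem generation\n",
"Below, we define a class that uses the trained model to generate poems. During generation, we take the content generated so far as the input, encode it, and feed it to the model, which yields a classification result for every token in the input. We then take the classification result of the last token as the next token to be predicted. In the next round, the token just predicted is appended to the generated content, and prediction continues with the following token.\n",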
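We then take the classification result of the last token as the next token to be predicted
This sentence is a bit ambiguous; I suggest changing it to:
as the next token to be predicted -> as the token to be predicted from the current content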
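The loop described in that section can be sketched as greedy decoding, shown below with a toy stand-in for the model. Everything here is an assumption made for illustration: generate, toy_model, BIGRAMS, and the "<end>" token are hypothetical names, and the notebook's actual class may differ (for example, it may sample instead of taking the top-scoring token).

```python
def generate(predict_scores, prefix, end_token, max_length=16):
    """Greedy autoregressive decoding.

    predict_scores(tokens) is a stand-in for the trained model: it returns
    one dict per input token, mapping candidate next tokens to scores.
    """
    tokens = list(prefix)
    while len(tokens) < max_length:
        scores_per_position = predict_scores(tokens)
        # Only the last position predicts a token we have not generated yet.
        last_scores = scores_per_position[-1]
        next_token = max(last_scores, key=last_scores.get)
        if next_token == end_token:
            break
        # The predicted token becomes part of the input for the next round.
        tokens.append(next_token)
    return tokens


# Toy model: a fixed bigram table instead of BERT (assumption for the demo).
BIGRAMS = {"a": "b", "b": "c", "c": "<end>"}


def toy_model(tokens):
    return [{BIGRAMS.get(t, "<end>"): 1.0, "<pad>": 0.0} for t in tokens]


result = generate(toy_model, ["a"], "<end>")
print(result)
```

The model is re-run on the whole prefix each round, which matches the description of feeding the generated content back in as input.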
" self.sequence_length = input_length\r\n",
" self.lower_triangle_mask = paddle.tril(paddle.tensor.full((input_length, input_length), 1, 'float32'))\r\n",
"\r\n",
" def forward(self, token, token_type, input_mask, input_length=None):\r\n",
This block of code could use more comments to help readers understand it.
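As a possible starting point for those comments, here is a plain-Python sketch of what a lower-triangular mask encodes. It uses nested lists instead of paddle.tril so that it runs without Paddle. How the notebook combines the mask with input_mask is not visible in this hunk, so combine_with_padding is an assumption and both function names are hypothetical.

```python
def lower_triangle_mask(n):
    """n x n matrix: 1.0 on and below the diagonal, 0.0 above it.

    Row i marks which positions token i may attend to: itself and every
    earlier token, but no later (not yet generated) token.
    """
    return [[1.0 if col <= row else 0.0 for col in range(n)]
            for row in range(n)]


def combine_with_padding(causal, input_mask):
    """Assumed combination step: also hide padded positions.

    input_mask holds 1 for real tokens and 0 for padding.
    """
    return [[c * m for c, m in zip(row, input_mask)] for row in causal]


mask = lower_triangle_mask(4)
combined = combine_with_padding(mask, [1, 1, 1, 0])

for row in mask:
    print(row)
```

The first row allows attention only to the first token, and the last row allows attention to every position, which is the "forward only" restriction described in section 3.1.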
add more descriptions.
No description provided.